COMPUTER & INTERNET TIPS+TRICKS & SOFTWARE

MD5

You're probably wondering whether any technique is sufficient. I am happy to report that there is such a technique. It involves calculating the digital fingerprint, or signature, for each file. This is done utilizing various algorithms. A family of algorithms, called the MD series, is used for this purpose. One of the most popular implementations is a system called MD5.

MD5 is a utility that can generate a digital signature of a file. MD5 belongs to a family of one-way hash functions called message digest algorithms. The MD5 system is defined in RFC 1321. Concisely stated:

The algorithm takes as input a message of arbitrary length and produces as output a 128-bit "fingerprint" or "message digest" of the input. It is conjectured that it is computationally infeasible to produce two messages having the same message digest, or to produce any message having a given prespecified target message digest. The MD5 algorithm is intended for digital signature applications, where a large file must be "compressed" in a secure manner before being encrypted with a private (secret) key under a public-key cryptosystem such as RSA.

Cross Reference: RFC 1321 is located at http://www.freesoft.org/Connected/RFC/1321/1.html.

When one runs a file through an MD5 implementation, the signature emerges as a 32-character value. It looks like this:

2d50b2bffb537cc4e637dd1f07a187f4

Many sites that distribute security fixes for the UNIX operating system employ this technique. Thus, as you browse their directories, you can examine the original digital signature of each file. If, upon downloading that file, you find that the signature is different, there is a 99.9% chance that something is terribly amiss.

MD5 performs a one-way hash function. You may be familiar with these operations from other forms of encryption, including those used to encrypt password files.

Some very extreme security programs use MD4 and MD5 algorithms. One such program is S/Key, which is a registered trademark of Bell Laboratories. S/Key implements a one-time password scheme. One-time passwords are nearly unbreakable. S/Key is used primarily for remote logins and to offer advanced security along those channels of communication (as opposed to using little or no security by initiating a normal, garden-variety Telnet or Rlogin session). The process works as described in "S/Key Overview" (author unknown):

S/Key uses either MD4 or MD5 (one-way hashing algorithms developed by Ron Rivest) to implement a one-time password scheme. In this system, passwords are sent cleartext over the network; however, after a password has been used, it is no longer useful to the eavesdropper. The biggest advantage of S/Key is that it protects against eavesdroppers without modification of client software and only marginal inconvenience to the users.

Cross Reference: Read "S/Key Overview" at http://medg.lcs.mit.edu/people/wwinston/skey-overview.html.

With or without MD5, object reconciliation is a complex process. True, on a single workstation with limited resources, one could technically reconcile each file and directory by hand (I would not recommend this if you want to preserve your sanity). However, in larger networked environments, this is simply impossible. So, various utilities have been designed to cope with this problem. The most celebrated of these is a product aptly named TripWire.

TripWire

TripWire (written in 1992) is a comprehensive system-integrity tool. It is written in classic Kernhigan and Ritchie C (you will remember from Chapter 7, "Birth of a Network: The Internet," that I discussed the portability advantages of C; it was this portability that influenced the choice of language for the authors of TripWire).

TripWire is well designed, easily understood, and implemented with minimal difficulty. The system reads your environment from a configuration file. That file contains all filemasks (the types of files that you want to monitor). This system can be quite incisive. For example, you can specify what changes can be made to files of a given class without TripWire reporting the change (or, for more wholesale monitoring of the system, you can simply flag a directory as the target of the monitoring process). The original values (digital signatures) for these files are kept within a database file. That database file (simple ASCII) is accessed whenever a signature needs to be calculated. Hash functions included in the distribution are

MD5
MD4
CRC32
MD2
Snefru (Xerox secure hash function)
SHA (The NIST secure hash algorithm)

It is reported that by default, MD5 and the Xerox secure hash function are both used to generate values for all files. However, TripWire documentation suggests that all of these functions can be applied to any, a portion of, or all files.

Altogether, TripWire is a very well-crafted package with many options.

Cross Reference: TripWire (and papers on usage and design) can be found at ftp://coast.cs.purdue.edu/pub/tools/unix/TripWire/.

TripWire is a magnificent tool, but there are some security issues. One such issue relates to the database of values that is generated and maintained. Essentially, it breaks down to the same issue discussed earlier: Databases can be altered by a cracker. Therefore, it is recommended that some measure be undertaken to secure that database. From the beginning, the tool's authors were well aware of this:

The database used by the integrity checker should be protected from unauthorized modifications; an intruder who can change the database can subvert the entire integrity checking scheme.

Cross Reference: Before you use TripWire, read "The Design and Implementation of TripWire: A File System Integrity Checker" by Gene H. Kim and Eugene H. Spafford. It is located at ftp://ftp.cs.purdue.edu/pub/spaf/security/Tripwire.PS.Z.

One method of protecting the database is extremely sound: Store the database on read-only media. This virtually eliminates any possibility of tampering. In fact, this technique is becoming a strong trend in security. In Chapter 21, "Plan 9 from Bell Labs," you will learn that the folks at Bell Labs now run their logs to one-time write or read-only media. Moreover, in a recent security consult, I was surprised to find that the clients (who were only just learning about security) were very keen on read-only media for their Web-based databases. These databases were quite sensitive and the information, if changed, could be potentially threatening to the security of other systems.

Kim and Spafford (authors of TripWire) also suggest that the database be protected in this manner, though they concede that this could present some practical, procedural problems. Much depends upon how often the database will be updated, how large it is, and so forth. Certainly, if you are implementing TripWire on a wide scale (and in its maximum application), the maintenance of a read-only database could be formidable. Again, this breaks down to the level of risk and the need for increased or perhaps optimum security.

Subscribe in a reader