CompareFile Method  
 

Compares the contents of two files, or compares the contents of a file against a message digest.

Syntax

object.CompareFile( File1, [File2], [Options], [Digest] )

Parameters

File1
A string which specifies the name of the first file to compare. This parameter must specify the name of a file which exists and can be opened for read access by the current process. If the file name specifies a directory or device name, the method will fail. It is permitted to use environment variables in the file name by surrounding the variable name with the percent symbol. If the file name contains any leading or trailing space characters after the variable expansion, they will be removed. This parameter cannot be a zero-length string.
File2
An optional string which specifies the name of the second file to compare. If this parameter is omitted or a zero-length string, the Digest parameter must specify a valid message digest and the contents of the first file will be compared against that digest. If this parameter specifies a file name, the file must exist and the current process must be able to open it for read access. If the file name specifies a directory or device name, the method will fail. It is permitted to use environment variables in the file name by surrounding the variable name with the percent symbol. If the file name contains any leading or trailing space characters after the variable expansion, they will be removed.
Options
An integer value which specifies one or more option flags that control how the files are compared. This parameter is constructed by combining any of the following values with the bitwise Or operator:
Value Description
fileCompareDefault Use the default comparison option, which is to perform a byte comparison of both files. If the File2 parameter is omitted or a zero-length string, the method will compare the message digest against the contents of the first file. If the Digest parameter is included and no specific hash algorithm has been specified as one of the options, the hash type will be automatically determined based on its value.
fileCompareContents Perform a byte comparison of the two files. This option requires that both files be specified, so the File2 parameter must specify a valid file name. The current process must be able to open both files for exclusive read access. If either file is already open for write access, the method will fail.
fileCompareCreated Compare the file creation times for both files and only consider them to be identical if the creation times are identical. If this option is specified and the creation times are different, the method will return an error even if the contents of the two files are identical. The File2 parameter must specify a valid file name if this option is used.
fileCompareModified Compare the file modification times for both files and only consider them to be identical if the modification times are identical. If this option is specified and the modification times are different, the method will return an error even if the contents of the two files are identical. The File2 parameter must specify a valid file name if this option is used.
&H100 fileCompareMD5 Use the MD5 algorithm to compute a 128-bit hash value for the contents of the first file and compare this against the specified digest string. This algorithm is the most efficient method to compute a hash to verify data integrity; however, it is not considered cryptographically secure and hash collisions are possible. When this option is used, the Digest parameter must specify a valid MD5 hash as a 32-digit hexadecimal string.
&H200 fileCompareSHA1 Use the SHA-1 algorithm to compute a 160-bit hash value for the contents of the first file and compare this against the specified digest string. This algorithm is comparable to the MD5 algorithm and has a lower chance of hash collision; however, it is not considered cryptographically secure and hash collisions are possible. When this option is used, the Digest parameter must specify a valid SHA-1 hash as a 40-digit hexadecimal string.
&H400 fileCompareSHA256 Use the SHA-256 algorithm to compute a 256-bit hash value for the contents of the first file and compare this against the specified digest string. This is a cryptographically secure algorithm which is more compute-intensive than either MD5 or SHA-1, but the chance of a hash collision is extremely low and typically not a concern for most applications. When this option is used, the Digest parameter must specify a valid SHA-256 hash as a 64-digit hexadecimal string.
Digest
An optional string which specifies a sequence of hexadecimal digits that is compared against the computed hash value for the file. If this parameter is used and no option is specified to indicate the hash type, the type will be inferred from the length of the string. If this parameter is omitted or a zero-length string, it will be ignored. The case of the hexadecimal digits in the string is ignored and it is permitted, but not required, to prefix the string with either "0x" or "&H" to indicate it is a hexadecimal value. If the string does not specify a valid hexadecimal number or is not a value computed using one of the supported hash algorithms, the method will fail.
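
The length-based inference described for the Digest parameter can be illustrated with a short sketch. This is not the control's implementation; the function name GuessDigestType is hypothetical, and the code only assumes the digest lengths listed in the Options table (32, 40 and 64 hexadecimal digits) and the optional "0x" or "&H" prefix.

```vb
' Hypothetical helper (not part of the control): guess the hash algorithm
' from the number of hexadecimal digits in a digest string.
' Returns an empty string if the digest is not recognized.
Public Function GuessDigestType(ByVal strDigest As String) As String
    Dim strHex As String
    Dim strChar As String
    Dim nIndex As Long

    GuessDigestType = ""

    ' The case of the hexadecimal digits is not significant
    strHex = UCase$(strDigest)

    ' An optional "0x" or "&H" prefix is permitted
    If Left$(strHex, 2) = "0X" Or Left$(strHex, 2) = "&H" Then
        strHex = Mid$(strHex, 3)
    End If

    ' Every remaining character must be a hexadecimal digit
    For nIndex = 1 To Len(strHex)
        strChar = Mid$(strHex, nIndex, 1)
        If InStr(1, "0123456789ABCDEF", strChar) = 0 Then Exit Function
    Next nIndex

    ' Map the digit count to the hash type
    Select Case Len(strHex)
        Case 32
            GuessDigestType = "MD5"
        Case 40
            GuessDigestType = "SHA-1"
        Case 64
            GuessDigestType = "SHA-256"
    End Select
End Function
```

A check like this could be used to validate a digest entered by a user before calling CompareFile, so that a malformed value is reported separately from a genuine mismatch.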

Return Value

A return value of zero indicates there were no errors and the files are identical using the specified criteria. A non-zero return value specifies an error code which indicates the reason for the failure.

Remarks

The CompareFile method can be used to either compare the contents of two files or compare the contents of a file against a pre-computed hash value to determine if they match. In addition to comparing file contents, the method can also check the file creation and/or modification times to ensure they are the same.
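
As a minimal sketch of combining a content comparison with a timestamp check, the option flags can be joined with the Or operator. The file names below are hypothetical and used only for illustration.

```vb
' Compare the contents and the modification times of two files
' (the file names are placeholders for illustration)
Dim strOriginal As String
Dim strCopy As String
Dim nError As Long

strOriginal = "%USERPROFILE%\Documents\Report.dat"
strCopy = "%TEMP%\Report.dat"

' Option flags are combined using the bitwise Or operator
nError = FileEncoder1.CompareFile(strOriginal, strCopy, _
                                  fileCompareContents Or fileCompareModified)

If nError = 0 Then
    MsgBox "The files have the same contents and modification time", vbInformation
Else
    MsgBox "The files do not match, error " & CStr(nError), vbExclamation
End If
```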

In most cases, an application will specify either two files to compare or a message digest value. If you call the method using two file names and a message digest, the contents of both files must be identical and the computed hash value must match the specified message digest. If any of the requirements are not satisfied, the method will return an error code. For example, if the actual contents of the two files are identical, but the value specified by the Digest parameter does not match the computed hash value of the files, the method will still fail.
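
A call which supplies both file names and a digest might look like the following sketch. The function name VerifyBackup and its parameters are hypothetical, and the code assumes it is placed in a form that contains a control named FileEncoder1. The expected digest is passed in by the caller rather than hard-coded.

```vb
' Hypothetical helper: returns True only if both files are identical
' and their contents match the expected SHA-256 digest
Public Function VerifyBackup(ByVal strSource As String, _
                             ByVal strBackup As String, _
                             ByVal strExpectedDigest As String) As Boolean
    Dim nError As Long

    ' Both conditions must be satisfied for the method to return zero
    nError = FileEncoder1.CompareFile(strSource, strBackup, _
                                      fileCompareSHA256, strExpectedDigest)

    VerifyBackup = (nError = 0)
End Function
```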

This method performs the file comparison in a way that minimizes disk I/O whenever possible. First, the sizes of the two files are compared and if they are different, no further checks are performed. Next, if either the fileCompareCreated or fileCompareModified options have been specified, the file timestamps are compared and no further checks are performed if they do not match. A byte comparison of both files will only be performed after the other comparison criteria have been satisfied.
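
The cheapest-check-first ordering described above can be sketched using only intrinsic Visual Basic statements. This is an illustration of the general technique, not the control's actual implementation. The function name FilesMatch is hypothetical, the paths must be plain file names (environment variables are not expanded), and because FileLen returns a Long the sketch assumes files smaller than 2 GB.

```vb
' Hypothetical sketch: compare size, then timestamp, then bytes
Public Function FilesMatch(ByVal strFile1 As String, _
                           ByVal strFile2 As String, _
                           ByVal bCheckModified As Boolean) As Boolean
    Const CHUNK_SIZE As Long = 65536
    Dim hFile1 As Integer, hFile2 As Integer
    Dim cbRemain As Long, cbChunk As Long
    Dim buf1() As Byte, buf2() As Byte
    Dim nIndex As Long

    FilesMatch = False

    ' Step 1: files of different sizes cannot be identical
    If FileLen(strFile1) <> FileLen(strFile2) Then Exit Function

    ' Step 2: optional timestamp check, still without reading any data
    If bCheckModified Then
        If FileDateTime(strFile1) <> FileDateTime(strFile2) Then Exit Function
    End If

    ' Step 3: byte comparison, performed only if the cheaper checks pass
    hFile1 = FreeFile
    Open strFile1 For Binary Access Read As #hFile1
    hFile2 = FreeFile
    Open strFile2 For Binary Access Read As #hFile2

    cbRemain = LOF(hFile1)
    Do While cbRemain > 0
        If cbRemain > CHUNK_SIZE Then
            cbChunk = CHUNK_SIZE
        Else
            cbChunk = cbRemain
        End If

        ReDim buf1(0 To cbChunk - 1)
        ReDim buf2(0 To cbChunk - 1)
        Get #hFile1, , buf1
        Get #hFile2, , buf2

        For nIndex = 0 To cbChunk - 1
            If buf1(nIndex) <> buf2(nIndex) Then GoTo CleanUp
        Next nIndex

        cbRemain = cbRemain - cbChunk
    Loop

    FilesMatch = True

CleanUp:
    Close #hFile1
    Close #hFile2
End Function
```

The point of the ordering is that the size and timestamp checks only read file metadata, so two files that differ in either respect are rejected without reading their contents.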

Comparing the contents of a file against a message digest value is the most expensive operation in terms of memory and processor utilization. The larger the files are, and the more complex the hash algorithm, the more compute-intensive the operation will be. In most cases, the SHA-1 algorithm provides a good balance between resource utilization and minimizing the possibility of hash collisions where different files could potentially generate the same hash value.

If it is critically important to minimize the possibility of a hash collision, use the SHA-256 algorithm; however, be aware of the impact it can have on performance when used with very large files. The current thread will block while the hash is being computed and if a large amount of data is being processed, this may cause the application to become non-responsive.

This method will normalize the file names provided by the caller and perform checks to make sure the names do not contain illegal characters. If a name includes quotes, wildcard characters or other invalid symbols, the method will fail and return stErrorInvalidFileName. It is permitted to use forward slashes in path names and these will be converted to the standard backslash character used with Windows paths. This method cannot be used to compare the contents of NTFS alternate data streams.
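
The following sketch shows one way an application might distinguish an invalid file name from other failures. The file names are hypothetical and use forward slashes to illustrate the path conversion described above.

```vb
' Check for an invalid file name separately from other errors
' (the file names are placeholders for illustration)
Dim nError As Long

' Forward slashes are accepted and converted to backslashes
nError = FileEncoder1.CompareFile("%TEMP%/Export/Data.bin", _
                                  "%TEMP%/Export/Data.bak")

Select Case nError
    Case 0
        MsgBox "The two files are identical", vbInformation
    Case stErrorInvalidFileName
        MsgBox "One of the file names contains invalid characters", vbCritical
    Case Else
        MsgBox "The comparison failed with error " & CStr(nError), vbExclamation
End Select
```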

Example

' Compare the contents of a file against an SHA-1 hash value
Dim strFileName As String
Dim strDigest As String
Dim nError As Long

strFileName = "%USERPROFILE%\Documents\Projects\MyDocument.docx"
strDigest = "541E43FDC612E41C02B770A2FD87EBB4A12EA800" ' SHA-1 hash value

nError = FileEncoder1.CompareFile(strFileName, , fileCompareSHA1, strDigest)

If nError = 0 Then
    MsgBox "The file contents match the SHA-1 digest", vbInformation
Else
    MsgBox "The file contents do not match the SHA-1 digest", vbExclamation
End If

' Compare the contents of two files using the default options
Dim strFile1 As String
Dim strFile2 As String
Dim nError As Long

strFile1 = "%USERPROFILE%\AppName\ProjectData.db"
strFile2 = "%TEMP%\BackupData.db"

nError = FileEncoder1.CompareFile(strFile1, strFile2)

If nError = 0 Then
    MsgBox "The two files are identical", vbInformation
Else
    MsgBox "The two files are different", vbExclamation
End If

See Also

CompressFile Method, DecodeFile Method, EncodeFile Method, ExpandFile Method