Part of the bash & shell article series: <http://www.cnblogs.com/f-ck-need-u/p/7048359.html#blogshell>

To check whether two files have exactly the same contents, the simplest tool is the diff command. For example:
diff file1 file2 &>/dev/null;echo $?
However, diff accepts only two file arguments (a directory also counts as one argument), so it cannot compare many files in a single run. It is also inefficient on non-text files and on very large files.
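
What makes the one-liner above usable in a script is diff's exit status: 0 when the files are identical, 1 when they differ, 2 when something went wrong. A minimal sketch of that idea follows; the `same_content` helper and the temporary files are made up for illustration and are not part of the original article.

```shell
#!/bin/bash
# Hypothetical helper: succeeds (status 0) when two files have identical contents.
same_content() {
    diff "$1" "$2" >/dev/null 2>&1
}

# Throwaway test files (illustrative names only).
tmpdir=$(mktemp -d)
printf 'hello\n' >"$tmpdir/a"
printf 'hello\n' >"$tmpdir/b"
printf 'world\n' >"$tmpdir/c"

if same_content "$tmpdir/a" "$tmpdir/b"; then r1=same; else r1=different; fi
if same_content "$tmpdir/a" "$tmpdir/c"; then r2=same; else r2=different; fi

echo "a vs b: $r1"
echo "a vs c: $r2"
rm -rf "$tmpdir"
```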

md5sum can be used instead. Compared with diff's line-by-line comparison, md5sum is much faster.
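
As a rough illustration of the idea, two files can be compared by computing the MD5 digest of each and comparing the two strings. The file names below are throwaway examples, and awk is used here only to pick the first field. Matching digests strongly suggest, but do not strictly prove, identical content.

```shell
#!/bin/bash
# Sketch: compare two files through their MD5 digests.
tmpdir=$(mktemp -d)
printf 'same text\n' >"$tmpdir/f1"
printf 'same text\n' >"$tmpdir/f2"

# md5sum prints "<digest>  <file>"; keep only the digest (first field).
sum1=$(md5sum "$tmpdir/f1" | awk '{print $1}')
sum2=$(md5sum "$tmpdir/f2" | awk '{print $1}')

if [ "$sum1" = "$sum2" ]; then
    verdict="identical"
else
    verdict="different"
fi
echo "$verdict"
rm -rf "$tmpdir"
```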

For details on md5sum, see: MD5 checksums of files in Linux <http://www.cnblogs.com/f-ck-need-u/p/7430264.html>.

However, md5sum only lets you compare files indirectly, by looking at their MD5 values. To compare a batch of files automatically, it has to be wrapped in a loop. The script is as follows:
#!/bin/bash
###########################################################
#  description: compare many files one time               #
#  author     : Golden Dragon                             #
#  blog       : http://www.cnblogs.com/f-ck-need-u/       #
###########################################################

# filename: md5.sh
# Usage: $0 file1 file2 file3 ...

IFS=$'\n'
declare -A md5_array

# A "while read" loop fed through a pipe runs in a subshell, so the
# array assigned inside it is empty again after the loop. A for loop
# is used instead, which is why IFS is set to $'\n'.

# md5sum format: MD5  /path/to/file
# such as: 80748c3a55b726226ad51a4bafa1c4aa  /etc/fstab

for line in `md5sum "$@"`
do
    index=${line%% *}
    file=${line##* }
    md5_array[$index]="$file ${md5_array[$index]}"
done

# Traverse the md5_array
for i in ${!md5_array[@]}
do
    echo -e "the same file with md5: $i\n--------------\n`echo ${md5_array[$i]}|tr ' ' '\n'`\n"
done
To test the script, first make several copies of a file, then modify a few of them. For example:
[~]# for i in `seq -s' ' 6`;do cp -a /etc/fstab /tmp/fs$i;done
[~]# echo ha >>/tmp/fs4
[~]# echo haha >>/tmp/fs5
/tmp now contains six files, fs1 through fs6. fs4 and fs5 have been modified; the remaining four files have identical contents.
[tmp]# ./md5.sh /tmp/fs[1-6]
the same file with md5: a612cd5d162e4620b442b0ff3474bf98
--------------------------
/tmp/fs6
/tmp/fs3
/tmp/fs2
/tmp/fs1

the same file with md5: 80748c3a55b726226ad51a4bafa1c4aa
--------------------------
/tmp/fs4

the same file with md5: 30dd43dba10521c1e94267bbd117877b
--------------------------
/tmp/fs5

A more general use: comparing files with the same name across multiple directories.
[tmp]# find /tmp -type f -name "fs[0-9]" -print0 | xargs -0 ./md5.sh
the same file with md5: a612cd5d162e4620b442b0ff3474bf98
--------------------------
/tmp/fs6
/tmp/fs3
/tmp/fs2
/tmp/fs1

the same file with md5: 80748c3a55b726226ad51a4bafa1c4aa
--------------------------
/tmp/fs4

the same file with md5: 30dd43dba10521c1e94267bbd117877b
--------------------------
/tmp/fs5

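
The find invocation above only touches a single directory, so here is a small sketch of the multi-directory case. The directory layout and the file name `app.conf` are invented for the example, and the file list is piped straight into md5sum rather than md5.sh so that the snippet runs on its own.

```shell
#!/bin/bash
# Sketch: collect same-named files from several directories and hash them.
top=$(mktemp -d)
mkdir -p "$top/d1" "$top/d2" "$top/d3"
printf 'v1\n' >"$top/d1/app.conf"
printf 'v1\n' >"$top/d2/app.conf"
printf 'v2\n' >"$top/d3/app.conf"

# -print0 and -0 pass file names NUL-separated, so unusual characters survive.
sums=$(find "$top" -type f -name 'app.conf' -print0 | xargs -0 md5sum)
echo "$sums"

# Count distinct digests: d1 and d2 hold the same content, d3 differs.
distinct=$(printf '%s\n' "$sums" | awk '{print $1}' | sort -u | wc -l)
echo "distinct contents: $distinct"
rm -rf "$top"
```
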
Notes on the script:

(1). md5sum prints each result in the format "MD5 /path/to/file". The output has to show each MD5 value together with every file that shares it, so an array is a natural choice.

(2). At first I used a while loop that read the md5sum result for each file from standard input. The statement was:
md5sum "$@" | while read index file;do
    md5_array[$index]="$file ${md5_array[$index]}"
done
But because of the pipe, the while statement runs in a subshell, so the values assigned to md5_array inside the loop are lost when the loop ends. It can therefore be rewritten as:
while read index file;do
    md5_array[$index]="$file ${md5_array[$index]}"
done <<<"$(md5sum "$@")"
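
The difference between the two forms is easy to observe with a plain counter instead of an array. The sketch below assumes bash with its default settings (the `lastpipe` option off); the variable names are illustrative.

```shell
#!/bin/bash
# Sketch: a variable set inside a piped while loop is lost afterwards,
# while the same loop fed by a here-string keeps it.
count_pipe=0
printf 'a\nb\nc\n' | while read -r line; do
    count_pipe=$((count_pipe + 1))   # incremented in a subshell only
done

count_here=0
while read -r line; do
    count_here=$((count_here + 1))   # incremented in the current shell
done <<<"$(printf 'a\nb\nc\n')"

echo "after pipe:        $count_pipe"   # 0
echo "after here-string: $count_here"   # 3
```
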
In the end, though, I used the more cumbersome for loop:
IFS=$'\n'
for line in `md5sum "$@"`
do
    index=${line%% *}
    file=${line##* }
    md5_array[$index]="$file ${md5_array[$index]}"
done
However, each line of md5sum output has two columns, and a for loop with the default IFS would split them into two separate values. That is why IFS is set to $'\n': each whole line is then assigned to the loop variable in one iteration.
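
The effect of IFS on the for loop can be checked with two sample lines laid out like md5sum output. The digests `aaaa` and `bbbb` below are made-up placeholders, not real MD5 values.

```shell
#!/bin/bash
# Two sample lines in md5sum's "<digest>  <path>" layout (made-up values).
sample='aaaa  /tmp/fs1
bbbb  /tmp/fs2'

# Default IFS (space, tab, newline): every column becomes its own item.
n_default=0
for word in $sample; do
    n_default=$((n_default + 1))
done

# IFS set to newline only: each line stays in one piece.
old_ifs=$IFS
IFS=$'\n'
n_newline=0
for word in $sample; do
    n_newline=$((n_newline + 1))
done
IFS=$old_ifs

echo "default IFS: $n_default items"   # 4
echo "newline IFS: $n_newline items"   # 2
```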

(3). The index and file variables split each line of md5sum output into two parts: the MD5 part becomes the array index, and the file part becomes part of the value of that array element. The array assignment statement is therefore:
md5_array[$index]="$file ${md5_array[$index]}"
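
The split and the assignment can be traced by hand on sample lines. The sketch below reuses the example line from the script's comment plus two made-up entries under the short key `aaaa`. It also shows a limitation: `${line##* }` keeps only the text after the last space, so a path that contains spaces is cut short. The `${line#*  }` alternative is an assumption of this sketch and relies on the two-space separator that md5sum prints in text mode.

```shell
#!/bin/bash
# Split one md5sum-style line with parameter expansion.
line='80748c3a55b726226ad51a4bafa1c4aa  /etc/fstab'
index=${line%% *}   # drop the longest suffix starting at a space -> digest
file=${line##* }    # drop the longest prefix ending in a space   -> path
echo "index=$index file=$file"

# Prepending under the same key builds up a space-separated file list.
declare -A md5_array
md5_array[aaaa]="/tmp/fs1 ${md5_array[aaaa]}"
md5_array[aaaa]="/tmp/fs2 ${md5_array[aaaa]}"
echo "aaaa -> ${md5_array[aaaa]}"

# Limitation: a path with a space loses everything before the last space.
spaced='aaaa  /tmp/my file'
short=${spaced##* }    # "file"
full=${spaced#*  }     # "/tmp/my file" (strip through the first two spaces)
echo "short=$short full=$full"
```
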
(4). Once the array has been filled, it is traversed. There are many ways to do this; I iterate over the array's index list, which holds the MD5 value from each line.
# Traverse the md5_array
for i in ${!md5_array[@]}
do
    echo -e "the same file with md5: $i\n--------------\n`echo ${md5_array[$i]}|tr ' ' '\n'`\n"
done
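
For comparison, here is a variant sketch of the same grouping idea. It is not the script above: it swaps the for loop and IFS change for `while read` fed by process substitution, and it stores one path per line inside each array value so that file names containing spaces survive. It assumes bash 4 or later; `group_by_md5`, `groups` and the test files are hypothetical names.

```shell
#!/bin/bash
# Sketch: group files by MD5 digest, one path per line inside each value.
declare -A groups

# Hypothetical helper, not part of the original md5.sh.
group_by_md5() {
    local digest path
    # Process substitution keeps the loop in the current shell,
    # so assignments to "groups" survive after the loop.
    while read -r digest path; do
        groups[$digest]+="$path"$'\n'
    done < <(md5sum "$@")
}

tmpdir=$(mktemp -d)
printf 'one\n' >"$tmpdir/a"
printf 'one\n' >"$tmpdir/b copy"
printf 'two\n' >"$tmpdir/c"

group_by_md5 "$tmpdir"/*

# Traverse the index list (the digests) and print each group.
for digest in "${!groups[@]}"; do
    printf 'the same file with md5: %s\n--------------\n%s\n' \
        "$digest" "${groups[$digest]}"
done
rm -rf "$tmpdir"
```

Quoting `"${!groups[@]}"` and using printf instead of `echo | tr` is a design choice of this sketch: it avoids relying on word splitting when the values are printed.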