This article is a translation of the officially recommended rsync technical report
<https://rsync.samba.org/tech_report/tech_report.html>.
It covers the principles of the rsync algorithm and how rsync implements them. During translation, the translator added his own notes to passages that are hard to follow.

Collection of my translations: <http://www.cnblogs.com/f-ck-need-u/p/7048359.html#mytranslations>

Below is the rsync series:
  1. rsync (1): Basic commands and usage <http://www.cnblogs.com/f-ck-need-u/p/7220009.html>
  2. rsync (2): inotify+rsync in detail, and sersync
<http://www.cnblogs.com/f-ck-need-u/p/7220193.html>
  3. rsync algorithm principles and workflow analysis <http://www.cnblogs.com/f-ck-need-u/p/7226781.html>
  4. rsync technical report (translation) <http://www.cnblogs.com/f-ck-need-u/p/7220753.html>
  5. rsync working mechanism (translation) <http://www.cnblogs.com/f-ck-need-u/p/7221535.html>
  6. man rsync translation (the rsync command manual in Chinese)
<http://www.cnblogs.com/f-ck-need-u/p/7221713.html>

The rsync algorithm

Andrew Tridgell and Paul Mackerras
Department of Computer Science 
Australian National University 
Canberra, ACT 0200, Australia


1.1 Abstract

This report describes an algorithm for updating a file on one machine to be consistent with a file on another machine, assuming the two machines communicate over a low-bandwidth, high-latency bidirectional link. The algorithm identifies the parts of the source file and destination file that are the same (translator's note: the matching data blocks), and then sends only the parts that do not match (translator's note: that is, the parts that differ between the files at the two ends). In effect, the algorithm computes a set of differences between two files on two machines. It is very efficient when the two files are similar, but it also works correctly and reasonably efficiently when they are quite different.

<>

1.2 The problem

Suppose you have two files A and B, and you want to update B so that it becomes identical to A. The most direct way is to copy A onto B.


But imagine that the two files are on machines connected by an extremely slow communication link, for example a dial-up IP link. If file A is large, copying it to B will be very slow. To speed things up, you could compress A before sending it, but that usually gains only 20% to 40%.


Now suppose that A and B are quite similar, perhaps both derived from the same original file. To get real speed, you need to take advantage of this similarity. A common approach is to send only the differences between A and B across the link, and then reassemble the file from this list of differences (translator's note: so you must create the difference list, send it, and finally apply the difference list to reassemble the file).


The problem with this general approach is that creating the set of differences between two files relies on being able to open and read both of them. So this method requires that the file first be fetched from the other end of the link before the two files can be read. If the file cannot be obtained from the other end, the algorithm fails (and if you actually could fetch the file from the other end, both files would then be on the same machine, and you could simply copy directly; there would be no need to compare differences at all). This is the problem that rsync addresses.


The rsync algorithm efficiently computes which parts of the source file match parts of the existing destination file (translator's note: that is, which data blocks can be matched). These matching parts do not need to be sent over the link; they only need to be referenced against the corresponding parts of the destination file (translator's note: the basis file). Only the unmatched parts of the source file are sent literally, byte by byte. Afterwards, the receiver can construct a copy of the source file from the existing matching parts and the literal data it received.

Generally speaking, the data sent to the receiver can additionally be compressed with any common compression algorithm for further speedup.


1.3 The rsync algorithm

Suppose we have two computers α and β. α has access to a file A, and β has access to a file B, where A and B are similar. α and β communicate over a low-speed link.

The rsync algorithm consists of the following steps:

1. β splits file B into a series of non-overlapping, fixed-size data blocks of S bytes (translator's note: the authors note that 500 to 1000 bytes is suitable). The last block may, of course, be shorter than S bytes.


2. For each such block, β computes two checksums: a 32-bit weak rolling checksum and a 128-bit strong MD4 checksum (translator's note: current rsync uses a 128-bit MD5 checksum). A sketch of steps 1 and 2 appears after this list.

3. β sends these checksums to α.

4. α searches file A to find all blocks of length S bytes (at any offset, not just multiples of S) that have the same weak and strong checksums as one of the blocks of B. This search and comparison can be done in a single pass very quickly, thanks to a special property of the rolling checksum.


(translator's note: take the string 123456 as an example. To search for all substrings of length 3 at any offset, the most basic method is to start from position 1 to get 123, from position 2 to get 234, from position 3 to get 345, and so on until the search completes. That is what "any offset" means: searching for blocks of length S starting from every possible position.)


(translator's further note: why search at any offset? Consider the following situation. There are two identical files A and B, and a piece of data is inserted into the middle of A. From that point on, the offsets of all subsequent data blocks of A are shifted back by some amount. If fixed-length blocks were only compared at fixed offsets, then from the newly inserted data onward, no block of A would match any block of B; in rsync terms, all of those blocks would have to be transmitted, even though A and B actually differ only by the segment inserted in the middle.)


5. α sends β a sequence of instructions for constructing a copy of file A. Each instruction is either a reference to a block of B, or literal data. Literal data is sent only for those parts of A that do not match any block of B (translator's note: that is, the blocks of A that differ from B).
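(translator's illustration: a minimal sketch of steps 1 and 2 in Python, assuming hashlib for the strong checksum; real rsync is written in C and originally used MD4. The function names are the translator's own. weak_checksum() computes the s(k,l) checksum defined in section 1.4 below, in its direct, non-rolling form.)

```python
import hashlib

M = 1 << 16  # modulus used by the weak checksum (see section 1.4)

def weak_checksum(block: bytes) -> int:
    """s(k,l) = a + 2^16 * b, computed directly (non-rolling form)."""
    a = sum(block) % M
    b = sum((len(block) - i) * x for i, x in enumerate(block)) % M
    return a + (b << 16)

def block_signatures(data: bytes, S: int = 700):
    """Steps 1-2: split B into fixed-size blocks and checksum each one."""
    sigs = []
    for idx, off in enumerate(range(0, len(data), S)):
        block = data[off:off + S]  # the last block may be shorter than S
        sigs.append((idx,                           # block index in B
                     weak_checksum(block),          # 32-bit weak checksum
                     hashlib.md5(block).digest()))  # strong checksum
    return sigs  # step 3: beta sends this list to alpha
```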


The end result is that β obtains a copy of A, while only the parts of A that are not present in B were sent over the link (plus a small amount of checksum and block-index data). The algorithm also requires only one round trip over the link (translator's note: the first transfer is β sending the checksums to α; the second is α sending the instructions, checksums, indexes, and the literal data of A that is not present in B back to β), which minimizes the impact of link latency.

The most important details of the algorithm are the rolling checksum and the associated multi-alternate search mechanism, which together allow the all-offsets checksum search (that is, the search from every offset described in step 4 above) to proceed very quickly. These are discussed in greater detail below.

(translator's note: if you do not want to read algorithm theory, you can skip the following sections and go straight to the conclusions; it is enough to understand what the rsync algorithm does.)


1.4 Rolling checksum

The weak rolling checksum used in the rsync algorithm needs to be very cheap to compute: given the checksum of a buffer $X_1 \ldots X_n$ and the byte values $X_1$ and $X_{n+1}$, the checksum of the buffer $X_2 \ldots X_{n+1}$ must be obtainable at very low cost.

The weak checksum algorithm we use was inspired by Mark Adler's adler-32 checksum. Our checksum is defined by

$$a(k,l) = \left(\sum_{i=k}^{l} X_i\right) \bmod M$$

$$b(k,l) = \left(\sum_{i=k}^{l} (l-i+1)\,X_i\right) \bmod M$$

$$s(k,l) = a(k,l) + 2^{16}\,b(k,l)$$

where $s(k,l)$ is the rolling checksum of the bytes $X_k \ldots X_l$. For simplicity and speed of computation, we use $M = 2^{16}$.

The most important property of this checksum is that successive values can be computed very efficiently using the recurrence relations

$$a(k+1,l+1) = \left(a(k,l) - X_k + X_{l+1}\right) \bmod M$$

$$b(k+1,l+1) = \left(b(k,l) - (l-k+1)\,X_k + a(k+1,l+1)\right) \bmod M$$

Thus the checksum of a block of length S can be calculated at every possible offset within a file with very little computation at each point.
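(translator's illustration: a minimal Python sketch of the rolling property; the function names are the translator's own, not from the rsync sources. roll() derives the checksum of $X_{k+1} \ldots X_{l+1}$ from that of $X_k \ldots X_l$ in constant time, and the assertion confirms it agrees with recomputing from scratch.)

```python
M = 1 << 16

def weak_sums(block: bytes) -> tuple[int, int]:
    """Compute a(k,l) and b(k,l) directly for one window."""
    a = sum(block) % M
    b = sum((len(block) - i) * x for i, x in enumerate(block)) % M
    return a, b

def roll(a: int, b: int, out_byte: int, in_byte: int, S: int):
    """Slide the window one byte: drop X_k, append X_{l+1}.
    a(k+1,l+1) = (a(k,l) - X_k + X_{l+1}) mod M
    b(k+1,l+1) = (b(k,l) - (l-k+1)*X_k + a(k+1,l+1)) mod M
    where l-k+1 = S, the window length."""
    a_new = (a - out_byte + in_byte) % M
    b_new = (b - S * out_byte + a_new) % M
    return a_new, b_new

data = b"the quick brown fox jumps over the lazy dog"
S = 16
a, b = weak_sums(data[:S])
for k in range(1, len(data) - S + 1):
    a, b = roll(a, b, data[k - 1], data[k - 1 + S], S)
    assert (a, b) == weak_sums(data[k:k + S])  # rolled == recomputed
```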

This checksum is simple, but it is adequate as the first level of the check used when matching blocks of the two files. We have found in practice that when block contents differ, the probability of the weak rolling checksums matching is very low. This is important, because the much more expensive strong checksum must be calculated for every block whose weak checksum matches, and computing strong checksums is costly. (translator's note: in other words, blocks with different contents may still, with low probability, have the same weak checksum, i.e., the rolling checksums collide; so the strong checksum must be computed for blocks whose weak checksums match, to confirm the match.)


1.5 Checksum searching


Once α has received the list of checksums for the blocks of file B, it must search file A (at every offset) for any blocks that match the checksum of some block of B. The basic strategy is to compute, at each successive byte offset of A, the 32-bit weak rolling checksum of the S-byte block starting there, and for each checksum computed, to look for a match in the list of checksums of file B. In our algorithm, a simple three-level search scheme is used (translator's note: a three-step search process).


At the first level, a 16-bit hash is computed from each 32-bit weak rolling checksum, and a hash table with 2^16 entries is built. The checksum list (i.e., the set of block checksums received from file B) is sorted according to this 16-bit hash. Each entry in the hash table points to the first element of the checksum list with that hash value (translator's note: a block ID), or is a null value when no checksum in the list has that hash value (translator's note: the reason null checksums and corresponding null hash values can exist is that β sets to null the checksums of files that exist on α but not on β, and sends them to α; that way α knows while searching that the file does not exist on β, and sends the entire file to β directly).

For each offset in file A, the 32-bit rolling checksum and its 16-bit hash are computed. If the hash table entry for that hash value is non-null, the second-level check is invoked.

(translator's note: in other words, the first level compares the 16-bit hash values; a match means the block is potentially identical, and the second-level check then starts from the position of that block.)
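(translator's illustration: a minimal sketch of this first-level structure, reusing the signature list (index, weak, strong) from the earlier sketch. The 16-bit hash function here, folding the two halves of the weak checksum together, is merely a plausible choice of the translator's; the report does not specify the exact function rsync uses.)

```python
def hash16(weak: int) -> int:
    # One plausible 16-bit hash of the 32-bit weak checksum; the report
    # does not say which function rsync actually uses.
    return (weak ^ (weak >> 16)) & 0xFFFF

def build_table(signatures):
    """Sort B's checksum list by 16-bit hash and build the 2^16-entry
    hash table. Entry h points at the first list element whose hash is
    h, or stays None when no checksum in the list has that hash."""
    sorted_sigs = sorted(signatures, key=lambda s: hash16(s[1]))
    table = [None] * (1 << 16)
    for pos, sig in enumerate(sorted_sigs):
        h = hash16(sig[1])
        if table[h] is None:
            table[h] = pos
    return table, sorted_sigs
```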


The second-level check scans the sorted checksum list, starting from the element the hash table entry points to (translator's note: the entry corresponding to the block matched by the first-level check), looking for a 32-bit rolling checksum equal to the current one. The scan terminates when it reaches an equal 32-bit weak rolling checksum, or when it reaches an entry whose 16-bit hash no longer matches (translator's note: because repeated hash values and repeated weak checksums are both rare, the scan usually only has to examine one or two more entries before hitting a non-matching weak rolling checksum). If a match is found, the third-level check is invoked.


(translator's note: in other words, the second level compares the 32-bit weak rolling checksums; a match means the block is still potentially identical, and the third-level check then starts from that position. If no matching weak rolling checksum is found, it means the 16-bit hash values merely collided, and fortunately the weak checksum comparison rules this out.)


The third-level check computes the strong checksum of the block at the current offset in the file and compares it with the strong checksum in the checksum list. If the two strong checksums match, we assume that the block of A and the block of B are identical. In theory the blocks could still differ, but the probability of that is extremely small, so in practice we consider this a reasonable assumption.


When a match is found, α sends β the offset of this block in file A and the end offset of the previous matched block, together with the index of the matched block in B (translator's note: the block ID, or chunk number). This data (translator's note: including the reconstruction instructions for matched blocks, and the literal data of the unmatched bytes between two matched blocks) is sent as soon as the match is found, which lets us carry out communication in parallel with further computation.


If no match is found at the current offset in the file, the weak checksum rolls down to the next offset and the search continues (translator's note: that is, it rolls forward one byte). If a match is found, the weak checksum search restarts from the end offset of the matched block (translator's note: that is, it rolls forward one whole block).
For two files that are almost identical, which is the most common case, this strategy saves a great deal of computation. Moreover, in that common case, where one stretch of A matches a whole run of consecutive blocks of B, encoding the block indexes becomes a very simple matter.
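(translator's illustration: putting the three levels together, a compressed sketch of the search over A, reusing weak_checksum() from the first sketch and hash16()/build_table() from the sketch above; find_match() and delta() are the translator's names. For clarity it recomputes the weak checksum at each offset instead of rolling it, which a real implementation would never do, and it emits (literal, bytes) and (block, index) tuples in place of rsync's actual instruction encoding.)

```python
import hashlib

def find_match(table, sorted_sigs, weak, block):
    """Levels 1-3: hash-table probe, weak-checksum scan, strong check."""
    h = hash16(weak)
    pos = table[h]
    if pos is None:
        return None                          # level 1: no tag hit
    while pos < len(sorted_sigs) and hash16(sorted_sigs[pos][1]) == h:
        idx, w, strong = sorted_sigs[pos]
        if w == weak:                        # level 2: weak checksums equal
            if hashlib.md5(block).digest() == strong:
                return idx                   # level 3: strong checksums equal
        pos += 1
    return None

def delta(a_data: bytes, table, sorted_sigs, S: int):
    """Scan A at every offset, emitting block references and literals."""
    out, lit_start, k = [], 0, 0
    while k + S <= len(a_data):
        block = a_data[k:k + S]
        idx = find_match(table, sorted_sigs, weak_checksum(block), block)
        if idx is None:
            k += 1                           # no match: roll forward one byte
        else:
            if lit_start < k:                # flush unmatched bytes first
                out.append(("literal", a_data[lit_start:k]))
            out.append(("block", idx))       # reference to a block of B
            k += S                           # restart after the matched block
            lit_start = k
    if lit_start < len(a_data):
        out.append(("literal", a_data[lit_start:]))
    return out
```

β can then rebuild A by replaying the tuples in order: for each ("block", idx), copy the idx-th S-byte block of B; for each ("literal", data), append the raw bytes.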


1.6 Pipelining

The preceding sections described the process of building a copy of a single file on a remote system. If we want to copy a series of files, we can pipeline the process to gain a considerable latency advantage.


This requires starting two independent processes on β. One of them generates and sends the checksums to α, while the other receives the difference data from α and reconstructs the file copies. (translator's note: i.e., generator --> sender --> receiver)

If the communication on the link is buffered, the two processes can proceed independently of each other, and for most of the time the link can be kept fully utilized in both directions.
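(translator's illustration: a minimal sketch of this pipelining, assuming Python's multiprocessing with queues as a stand-in for the buffered link; real rsync forks processes and speaks its own protocol over a socket or pipe. All names and the placeholder payloads are the translator's. The point shown is simply that β's generator can keep streaming checksums for later files while β's receiver is still rebuilding earlier ones.)

```python
from multiprocessing import Process, Queue

def generator(files, to_alpha: Queue):
    # First beta process: stream the checksum list for every file.
    for name, data in files:
        to_alpha.put((name, len(data)))       # stand-in for a checksum list
    to_alpha.put(None)                        # end of the file list

def alpha(to_alpha: Queue, to_receiver: Queue):
    # Alpha: turn each checksum list into a delta for the receiver.
    while (item := to_alpha.get()) is not None:
        name, _sigs = item
        to_receiver.put((name, b"...delta..."))  # stand-in for a real delta
    to_receiver.put(None)

def receiver(to_receiver: Queue):
    # Second beta process: rebuild each file as its delta arrives.
    while (item := to_receiver.get()) is not None:
        name, _delta = item
        print("rebuilt", name)                # stand-in for reconstruction

if __name__ == "__main__":
    q1, q2 = Queue(), Queue()
    files = [("a.txt", b"aaaa"), ("b.txt", b"bbbb")]
    procs = [Process(target=generator, args=(files, q1)),
             Process(target=alpha, args=(q1, q2)),
             Process(target=receiver, args=(q2,))]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```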


1.7 Results


To test the algorithm, tar files were created of the source code of two different Linux kernel versions, 1.99.10 and 2.0.0. Each tar file is about 24 MB in size, and the two versions are separated by 5 released patch levels.

Of the 2441 files in the 1.99.10 release, 291 files had been changed in the 2.0.0 release, 19 files had been removed, and 25 files had been added.

Running the standard GNU diff program on the two tar files produced over 32,000 lines of output, totaling 2.1 MB.

The following table shows the results of rsync between the two files for a range of block sizes.


block size    matches    tag hits    false alarms    data       written    read

300           64247      3817434     948             5312200    5629158    1632284
500           46989      620013      64              1091900    1283906    979384
700           33255      571970      22              1307800    1444346    699564
900           25686      525058      24              1469500    1575438    544124
1100          20848      496844      21              1654500    1740838    445204

In each case, the CPU time used was less than the time it took to run diff directly on the two files.

The columns of the table are as follows:

block size: the size in bytes of the checksummed blocks.

matches: the number of times a block of B was found somewhere in A. (translator's note: as opposed to the false alarms below, these are true matches)

tag hits: the number of times a 16-bit hash computed from file A matched an entry in B's hash table.

false alarms: the number of times the 32-bit weak rolling checksum matched but the strong checksum did not. (translator's note: i.e., the number of times the rolling checksums of different blocks coincided by low-probability chance)

data: the file data transferred literally, byte by byte, in bytes.

written: the total number of bytes written by α, including protocol overhead. This is almost entirely file data.

read: the total number of bytes read by α, including protocol overhead. This is almost entirely checksum information.


The results show that for block sizes above 300 bytes, only a small fraction (about 5%) of the file's data was transferred. Moreover, the amount of data rsync transferred is far less than the diff/patch method of updating the remote file would require.


Each checksum pair occupies 20 bytes: 4 bytes for the weak rolling checksum plus 16 bytes for the 128-bit MD4 checksum. So the checksums themselves occupy a fair amount of space, although much less than the literal data that would otherwise have to be transferred.

The number of false alarms was less than 1/1000 of the number of true matches, which shows that the 32-bit rolling checksum is quite good at distinguishing different blocks.

The tag hits figures indicate that the second level of the checksum search algorithm was invoked roughly once every 50 characters (translator's note: taking the block size = 1100 row as an example, roughly 500,000 × 50 / 1024 / 1024 ≈ 23 MB). This is already quite high, because the total number of blocks in the file, and hence the size of the hash table, is very large. For smaller files we would expect the tag hit ratio to be much closer to the number of matches. In the extreme case of enormous files, the size of the hash table should reasonably be increased.


The next table shows the results for a much smaller set of files. In this case, the files were not archived into a single tar file; instead, rsync was invoked with an option that makes it recurse down the whole directory tree. The files were extracted from two versions of another software package called Samba; the source code is 1.7 MB in size, and running diff on the two versions produced 4155 lines of output totaling 120 kB.


block size    matches    tag hits    false alarms    data      written    read

300           3727       3899        0               129775    153999     83948
500           2158       2325        0               171574    189330     50908
700           1517       1649        0               195024    210144     36828
900           1156       1281        0               222847    236471     29048
1100          921        1049        0               250073    262725     23988