Notes:

(1). This translation covers only the useful parts of info sort. To view the full content, run info sort yourself.

(2). In the translation, text in parentheses that is marked as a note is my own addition rather than original content; it is there to help with understanding and explanation.

(3). The sort command described here runs on CentOS 7.2, and its version is sort (GNU coreutils) 8.22. Some options, such as "--debug", are not supported on CentOS 6.

(4). If you do not yet understand how sort handles fields and how its sorting mechanism works, it is strongly recommended not to read man sort.

(5). Complete usage of the sort command: The king of text sorting: mastering the sort command
<http://www.cnblogs.com/f-ck-need-u/p/7442886.html>.

Collection of my translations: http://www.cnblogs.com/f-ck-need-u/p/7048359.html
<http://www.cnblogs.com/f-ck-need-u/p/7048359.html#mytranslations>

7.1 'sort': Sort text files
===========================


The sort command sorts, merges, or compares all the lines of the given files (more than one may be given). If no input file is given, or the input file is "-", it reads standard input. By default, sort writes its results to standard output.

Synopsis:

sort [OPTION]... [FILE]...

sort has 3 operation modes: sort (the default), merge, and check whether the input is already sorted. The following options change the operation mode:

'-c'
'--check'
'--check=diagnose-first'

Check whether the given file is already sorted. If unsorted input is detected, print a diagnostic containing the first out-of-order line and exit with status 1. Otherwise, exit successfully. At most one input file can be given.

'-C'
'--check=quiet'
'--check=silent'

Similar to "-c", but no diagnostic is printed. Exit successfully if the file is sorted; otherwise exit with status 1. At most one input file can be given.

'-m'
'--merge'

Merge the given files by sorting them as a group. Each input file must already be individually sorted. Sorting always works in place of merging; merge mode is provided because it is faster when it applies.
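
As a minimal sketch of merge mode, the following merges two files that are each already sorted. The file names and their contents are invented for illustration, and LC_ALL=C is set only to make the ordering predictable.

```shell
# Create two individually sorted input files in a scratch directory.
tmpdir=$(mktemp -d)
printf '1\n3\n5\n' > "$tmpdir/odd.txt"
printf '2\n4\n6\n' > "$tmpdir/even.txt"

# -m interleaves the already sorted inputs instead of sorting from scratch.
merged=$(LC_ALL=C sort -m "$tmpdir/odd.txt" "$tmpdir/even.txt")
echo "$merged"

rm -rf "$tmpdir"
```

If either input were not sorted, the merged output would not be reliably sorted either.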

 


sort compares a pair of lines as follows: it compares each pair of key fields, in the order given on the command line and according to the ordering options attached to each key, until a difference is found or no keys are left. If no key is given (Note: a key is the value specified with -k), the entire line is used as the key. Finally, when all the given keys compare equal, sort falls back to comparing the entire lines as if no ordering options other than "-r" had been specified (Note: that is, in ascending order by default, or in descending order with "-r"). This is called the "last-resort comparison". The "-s" option disables the last-resort comparison, so that lines whose keys all compare equal keep their original relative order. The "-u" option also disables the last-resort comparison.

Unless otherwise specified, all comparisons use the character collating sequence given by the "LC_COLLATE" locale.

 

Exit status:

0 if no error occurred
1 if invoked with "-c" or "-C" and the input is not sorted
2 if an error occurred
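
The three exit statuses can be observed with a short script such as the following sketch. The path /nonexistent/input.txt is a hypothetical name chosen only because it should not exist; the variable names are likewise illustrative.

```shell
# Sorted input: -C reports success (status 0) and prints nothing.
if printf 'a\nb\nc\n' | LC_ALL=C sort -C; then sorted_status=0; else sorted_status=$?; fi

# Unsorted input: -C exits with status 1, still without a diagnostic.
if printf 'b\na\nc\n' | LC_ALL=C sort -C; then unsorted_status=0; else unsorted_status=$?; fi

# An input file that cannot be read is an error (status 2).
if LC_ALL=C sort /nonexistent/input.txt 2>/dev/null; then error_status=0; else error_status=$?; fi

echo "sorted=$sorted_status unsorted=$unsorted_status error=$error_status"
```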

If the environment variable "TMPDIR" is set, sort uses its value as the directory for temporary files instead of the default "/tmp". The "-T" option overrides the value set by the environment variable.

The following options affect the ordering of the output lines. They can be specified as global options or as part of a specific key. If no key is specified, the global options apply to the comparison of entire lines. Otherwise, each key inherits the global options, unless the key specifies options of its own (Note: in which case that key's options override the global options).

For portability, it is recommended to specify global options before "-k" (or "--key").

'-b'
'--ignore-leading-blanks'

Ignore leading blanks (spaces and tabs) when finding the key in each line. When this option is not given, blanks affect the character positions specified with "-k" (Note: for example, the 2nd character designated by "-k 2.2" may be a blank).
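
The effect of leading blanks on a key can be seen in the following sketch. The two input lines are invented; they differ in how many blanks precede the second field, and LC_ALL=C is assumed so that a space compares lower than a letter.

```shell
# Two lines whose second field starts with a different number of blanks.
lines=$(printf 'a by\na  bz')

# Without b, the blanks are part of the key: "  bz" sorts before " by".
with_blanks=$(printf '%s\n' "$lines" | LC_ALL=C sort -k 2)

# With b, the key starts at the first non-blank character: "by" sorts before "bz".
without_blanks=$(printf '%s\n' "$lines" | LC_ALL=C sort -k 2b)

printf '%s\n---\n%s\n' "$with_blanks" "$without_blanks"
```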

'-f'
'--ignore-case'


Fold lowercase characters into the equivalent uppercase characters when comparing, so that, for example, "b" and "B" sort as equal. When used together with "-u" (Note: which outputs duplicate lines only once), the lowercase equivalent lines are discarded (Note: in other words, the uppercase line is the one that is output). (There is currently no way to discard the uppercase equivalents instead, not even with "-r", because "-r" only reverses the final result after the discarding has taken place; it does not affect the sorting process.)
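
A small sketch of "-f" combined with "-u" follows. It only checks that one line per letter survives; which case variant is kept is deliberately not asserted here, and the input letters are invented.

```shell
# "a"/"A" and "b"/"B" compare equal under -f, so -u keeps one line of each pair.
folded=$(printf 'b\nB\na\nA\n' | LC_ALL=C sort -f -u)
echo "$folded"

# Count the surviving lines.
line_count=$(printf '%s\n' "$folded" | wc -l)
echo "lines kept: $line_count"
```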

'-h'
'--human-numeric-sort'
'--sort=human-numeric'


Sort human-readable sizes numerically. Values are ordered first by sign (negative < zero < positive), then by size suffix (none < k = K < M < G < T ...), and finally by numeric value. It does not matter whether the suffixes denote powers of 1000 or of 1024, as long as the values are consistently scaled to the nearest suffix (Note: no unit conversion takes place. For example, both 999M and 1023M sort before 1G, simply because the suffix M precedes G).
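
The suffix-first ordering can be checked with a sketch like the one below; the sizes are invented sample values.

```shell
# The suffix is compared before the number, so 1023M still sorts before 1G.
human_sorted=$(printf '1G\n999M\n2K\n1023M\n' | LC_ALL=C sort -h)
echo "$human_sorted"
```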

'-M'
'--month-sort'
'--sort=month'

Sort by month name abbreviation.
An initial string, consisting of any amount of blanks, followed by a month
name abbreviation, is folded to UPPER case and compared in the order 'JAN' <
'FEB' < ... < 'DEC'. Invalid names compare low to valid names.

'-n'
'--numeric-sort'
'--sort=numeric'

Sort numerically. A key with no number in it, such as the empty string "" or "\0", is treated as 0. Numeric comparison is exact; there is no rounding error.

(Note:
Numeric sorting differs from the default collation in that the numeric comparison ends as soon as a non-numeric character (a blank, a letter, a special character, and so on) is encountered in the key; internally, sort treats the rest as not matching. In other words, "-k 2" and "-k 2n" are different. Both keys extend to the end of the line, but the former compares everything from the second field to the end of the line in character-set order, while the latter may effectively compare only the 2nd field, because the special character between the second and third fields ends the numeric comparison.

Therefore, for input such as "abc 100 200" with a space as the field separator, specifying "-k 2n" makes the key "100 200", but because of the blank, the comparison of that key ends within the second field. If the input is "abc 100\0200 200", then with "-k 2n" the field looks like 100200, but only 100 is compared. In other words, if another line has 110 as its 2nd field, the seemingly larger 100200 sorts before 110. Test command:
echo -e "b 100:200 200\na 110 300" | tr ':' '\0'|sort -t ' ' -k2n -k1
So "-n" can never cross a field boundary, whereas the default collation does span fields.)
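
The difference between "-k 2" and "-k 2n" described in the note can be reproduced with the following sketch. The two input lines are invented, and LC_ALL=C is assumed so that the character comparison is byte-wise.

```shell
lines=$(printf 'b 100 1\na 20 9')

# Character comparison of " 100 1" and " 20 9": "1" sorts before "2".
by_chars=$(printf '%s\n' "$lines" | LC_ALL=C sort -k 2)

# Numeric comparison stops at the first non-numeric character: 20 < 100.
by_number=$(printf '%s\n' "$lines" | LC_ALL=C sort -k 2n)

printf '%s\n---\n%s\n' "$by_chars" "$by_number"
```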

'-r'
'--reverse'

Reverse the result of comparison, so that lines with greater key values appear earlier in the output. (Note: "-r" does not change how sorting is done; it reverses the output after sorting, so it only affects the final result.)

'-k POS1[,POS2]'
'--key=POS1[,POS2]'

Specify a sort key, that is, the start and end positions of the part of each line used for sorting (if POS2 is omitted, the key ends at the end of the line).

POS has the form "F[.C][OPTS]", where F is the field number and C is the character position within the field. Both fields and character positions are numbered starting from 1. A character position of 0 in POS2 denotes the last character of the POS2 field. If ".C" is omitted from POS1, it defaults to 1 (the first character of the field); if ".C" is omitted from POS2, it defaults to 0 (the last character of the field). OPTS are ordering options; they override the global options, so that the key can be sorted with its own independent options. Keys can span multiple fields.

(Note: OPTS has the same effect whether it is attached to POS1 or POS2, because one "-k" defines one key and the OPTS in either position apply to the whole key. The "b" option is the exception; see below.)

Example: to sort on the second field, use "--key=2,2" (-k 2,2). The "--debug" option can help you see and work out which part of each line is actually used for sorting.
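
A minimal sketch of the "F[.C]" notation follows, using invented colon-separated records. The first command keys on the whole second field numerically; the second keys on a single character of the third field.

```shell
records=$(printf 'x:30:b\ny:4:a\nz:4:c')

# Key = field 2 only, compared numerically; the tie between the two 4s
# is resolved by the whole-line comparison.
by_second_field=$(printf '%s\n' "$records" | LC_ALL=C sort -t : -k 2,2n)

# Key = the 1st character of field 3, compared as text.
by_third_field=$(printf '%s\n' "$records" | LC_ALL=C sort -t : -k 3.1,3.1)

printf '%s\n---\n%s\n' "$by_second_field" "$by_third_field"
```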

'--debug'

Highlight the part of each line that is used for sorting. Additional information is also printed.

'-o OUTPUT-FILE'
'--output=OUTPUT-FILE'


Write the sorted output to OUTPUT-FILE. Normally, sort reads all input before opening OUTPUT-FILE, so you can safely save the result to one of the input files, as in "sort -o file1 file1" and "cat file1 | sort -o file1". However, with "-m" sort opens the output file before reading the input, so the following command is not safe:
"cat file1 | sort -m -o file1 -"

'-s'
'--stable'

Make sort stable by disabling the last-resort comparison. This option has no effect if no fields and no global ordering options other than "-r" are specified.
(Note: Last-resort comparison: when all keys compare equal, sort by default compares the entire lines once more, that is, in ascending character order. This is called the last-resort comparison.
If no options are specified at all, the sort is entirely the default one, so no final comparison is needed. If "-r" is specified, it also applies to the last-resort comparison, because "-r" reverses the final result.)
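
The following sketch contrasts the default behaviour with "-s" on invented input in which two lines share the same numeric key.

```shell
input=$(printf 'b 1\na 1\nc 0')

# Default: the tie between "b 1" and "a 1" is broken by comparing whole lines.
default_order=$(printf '%s\n' "$input" | LC_ALL=C sort -k 2,2n)

# -s: the tied lines keep their input order ("b 1" before "a 1").
stable_order=$(printf '%s\n' "$input" | LC_ALL=C sort -s -k 2,2n)

printf '%s\n---\n%s\n' "$default_order" "$stable_order"
```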

'-t SEPARATOR'
'--field-separator=SEPARATOR'

Use the character SEPARATOR as the field separator when finding the keys in each line. By default, fields are separated by the empty string between a non-blank character and a blank character.

So, given the input line " foo bar", sort splits it by default into the two fields " foo" and " bar" (Note: the separating empty strings sit at the start of the line and right after "foo"). The field separator is not part of the fields it separates, so "sort -t ' '" splits " foo bar" into 3 fields: an empty field, "foo", and "bar". However, keys that extend to the end of the line, such as "-k 2", and keys that cover a range of fields, such as "-k 2,3", retain the field separators that lie between their endpoints.
(Note: taking sort -t ' ' as an example, "-k 2" actually denotes "foo bar": it extends to the end of the line and keeps the field separator in the middle. "-k 1,2" actually denotes " foo", because the key is explicitly specified to end at the end of the second field, but the separator in between is kept.)
To specify ASCII NUL as the field separator, use the two-character string "\0", for example "sort -t '\0'".
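
How the choice of separator changes which text "-k 2,2" selects can be seen in this sketch. The input is invented: the first line starts with a space, so with "-t ' '" it has an empty first field.

```shell
lines=$(printf ' b 1\na 2')

# Default splitting: field 2 is " 1" and " 2", so " b 1" sorts first.
default_fields=$(printf '%s\n' "$lines" | LC_ALL=C sort -k 2,2)

# With -t ' ': field 2 is "b" for the first line and "2" for the second,
# and "2" sorts before "b" in the C locale.
space_fields=$(printf '%s\n' "$lines" | LC_ALL=C sort -t ' ' -k 2,2)

printf '%s\n---\n%s\n' "$default_fields" "$space_fields"
```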

'--parallel=N'

Set the number of sorts run in parallel to N. By default, N is the number of available CPUs, limited to a maximum of 8, because the performance gains diminish beyond that.

'-u'
'--unique'

Normally, "-u" outputs only the first of a run of lines that compare equal. This option also disables the last-resort comparison (Note: see the earlier translation).

"sort -u" and "sort | uniq" are equivalent, but the equivalence does not extend to arbitrary sort options. For example, "sort -n -u" checks uniqueness only on the leading numeric part, whereas with "sort -n | uniq", uniq checks the entire line after sort has ordered the lines numerically.
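
The difference between "sort -n -u" and "sort -n | uniq" shows up as soon as two lines share a leading number but differ afterwards, as in this sketch with invented input. Only the line counts are compared.

```shell
lines=$(printf '1 a\n1 b\n2 c')

# Uniqueness is judged on the numeric key only: "1 a" and "1 b" are duplicates.
unique_numeric=$(printf '%s\n' "$lines" | LC_ALL=C sort -n -u)

# uniq compares whole lines, so all three lines survive.
unique_lines=$(printf '%s\n' "$lines" | LC_ALL=C sort -n | uniq)

numeric_count=$(printf '%s\n' "$unique_numeric" | wc -l)
lines_count=$(printf '%s\n' "$unique_lines" | wc -l)
echo "sort -n -u: $numeric_count lines; sort -n | uniq: $lines_count lines"
```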

'-z'
'--zero-terminated'

Delimit lines with "\0" (a zero byte) instead of a newline.
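
As a small sketch, the following sorts three NUL-terminated items (invented values) and then converts the terminators to newlines purely for display.

```shell
# Each item ends with a zero byte; sort -z treats that byte as the line end.
sorted_items=$(printf 'b\0a\0c\0' | LC_ALL=C sort -z | tr '\0' '\n')
echo "$sorted_items"
```

This form pairs naturally with tools that emit NUL-terminated records, such as find with -print0.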

 

A key specified with "-k" can have options from "bfhgnr" and others appended to it; in that case, the key does not inherit the global options. Except for "b", every option applies to the whole key, whether it is attached to POS1 or POS2. The "b" option applies independently to POS1 or POS2, whichever it is attached to, but if it is inherited from the global "-b", it applies to both. If input lines contain leading blanks and "-t" is not used, "-k" is usually combined with "-b" or with one of the options that implicitly ignore leading blanks (ghn); otherwise, the leading blanks can lead to very confusing fields.

If the field or character position specified in POS lies beyond the end of the line or field, the key is empty. If "-b" is specified, the ".C" part is counted from the first non-blank character of the field.

Here are some examples that illustrate combinations of different options:

*   Sort in descending (reverse) numeric order.
sort -n -r
*   Sort alphabetically, skipping the first and second fields and the leading blanks of the third field. This uses a single key that starts at the first non-blank character of the third field and extends to the end of the line. The whole key is compared alphabetically.
sort -k 3b
*   Sort numerically on the second field, and break ties by sorting alphabetically on the 3rd and 4th characters of the fifth field. Use ":" as the field separator.
sort -t : -k 2,2n -k 5.3,5.4
(Note: whenever you want to sort on a single field only, it is recommended to specify both the start and the end position.)

Note that if "-k 2n" had been written instead of "-k 2,2n", the key would extend from the second field all the way to the end of the line and be used as the primary numeric key, with the secondary key "-k 5.3,5.4" sorted alphabetically on top of it. In most cases, letting a numeric key extend over more than one field is not the behavior you expect.

Also note that the "n" modifier is attached to the end position of the first key. It would be equivalent to write "-k 2n,2" or "-k 2n,2n". All modifiers except "b" apply to the whole key, whether they are attached to POS1 or POS2.

(Note: because the n option cannot cross a field boundary, writing "-k 2n" is in fact equivalent here as well. The following two commands, however, are different:
sort -t : -k 2 -k 5.3,5.4n
sort -t : -k 2,2 -k 5.3,5.4n

Because the default collation does span fields, the first key of the first command starts at the 2nd field and runs to the end of the line. The whole of that key is compared by character first, and only then is the second key compared numerically.
Another example: even when the primary key's field comes after the secondary key's field, the secondary key still runs across the primary key's field, because it is compared by character set:
sort -t : -k 5n -k 2)
*   Sort /etc/passwd on the 5th field, ignoring leading blanks. Lines with equal values in the 5th field are further sorted numerically on the uid in the 3rd field. The field separator is ":".
sort -t : -k 5b,5 -k 3,3n /etc/passwd
sort -t : -n -k 5b,5 -k 3,3 /etc/passwd
sort -t : -b -k 5,5 -k 3,3n /etc/passwd

The three commands above are equivalent. The first specifies that POS1 of the first key ignores leading blanks and that the second key is sorted numerically. In the other two commands, the keys that lack options inherit the global options. Inheritance works correctly here because "-k 5b,5b" and "-k 5b,5" are equivalent.

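*   Sort a set of log files, using the IPv4 address as the primary key and the timestamp as the secondary key. If two lines have identical primary and secondary keys, output them in the relative order in which they were read. The log files contain lines that look roughly like this: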
4.150.156.3 - - [01/Apr/2004:06:31:51 +0000] message 1
211.24.3.231 - - [24/Apr/2004:20:17:39 +0000] message 2

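The fields can be split precisely on a single space. Sort the IPv4 address column lexicographically, so that, for example, 212.61.52.2 sorts before 212.129.233.201 because 61 is less than 129.
sort -s -t ' ' -k 4.9n -k 4.5M -k 4.2n -k 4.14,4.21 file*.log |
sort -s -t '.' -k 1,1n -k 2,2n -k 3,3n -k 4,4n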

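This example cannot be done with a single sort command, because the IPv4 address needs "." as the separator while the timestamp needs a space. Two sort commands are therefore used: the first sorts by timestamp and the second by IPv4 address. The first sort command uses "-k" to isolate each field, sorting by year first, then by month, then by day, and finally by "hour:minute:second". Apart from the "hour:minute:second" key, there is no need to specify where each key ends, because the "n" and "M" options compare a leading prefix that cannot cross a field boundary. The second sort command sorts the IPv4 addresses lexicographically. It uses "-s" so that lines with the same primary key stay in the order established by the secondary key; the first sort command uses "-s" so that the combination of the two commands is stable.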

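(Note: the n option cannot cross non-numeric characters, so it is tempting to drop the key end positions and shorten the second sort command as follows:
sort -s -t '.' -n -k1 -k2 -k3 -k4
This is not equivalent when the separator is ".", however. In a numeric comparison in the C locale, "." also acts as the decimal point, so an open-ended key such as "-k1" reads "4.150" as one decimal number instead of stopping at the end of the first field. Keep the explicit end positions, as in "-k 1,1n".)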
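
The two-pass approach for the log example can be tried on a few sample lines, as in the sketch below. The three log lines are invented in the format shown earlier, the input is fed from a variable instead of file*.log, and LC_ALL=C is assumed so that "-M" recognizes the English month abbreviations.

```shell
# Three sample log lines (contents are invented).
log=$(cat <<'EOF'
211.24.3.231 - - [24/Apr/2004:20:17:39 +0000] message 2
4.150.156.3 - - [01/Apr/2004:06:31:51 +0000] message 1
4.150.156.3 - - [01/Mar/2004:06:31:51 +0000] message 3
EOF
)

# Pass 1: order by timestamp (year, month, day, then hh:mm:ss).
# Pass 2: stable sort on the four address components, so the timestamp
# order survives among lines that share an address.
sorted=$(printf '%s\n' "$log" |
  LC_ALL=C sort -s -t ' ' -k 4.9n -k 4.5M -k 4.2n -k 4.14,4.21 |
  LC_ALL=C sort -s -t '.' -k 1,1n -k 2,2n -k 3,3n -k 4,4n)

# Show only the trailing message number of each line.
message_order=$(printf '%s\n' "$sorted" | awk '{print $NF}')
echo "$message_order"
```

The two lines from 4.150.156.3 come first, in timestamp order (March before April), followed by the line from 211.24.3.231.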