comm

The comm command in the Unix family of computer operating systems is a utility that compares two files for common and distinct lines. comm is specified in the POSIX standard. It has been widely available on Unix-like operating systems since the mid to late 1980s.

Contents

1 Usage

2 Return code

3 Example

4 Comparison to diff

5 Other options

6 Limits

7 See also

8 References

Usage

comm reads two files as input, treating each as a sequence of text lines, and writes a single output containing three columns. The first two columns contain lines unique to the first and second file, respectively. The last column contains lines common to both. This functionality is similar to that of diff.

Columns are typically distinguished with the <tab> character. If the input files contain lines beginning with the separator character, the output columns can become ambiguous.

For efficiency, standard implementations of comm expect both input files to be sorted lexically in the same collation order. The sort command can be used for this purpose.

The comm algorithm makes use of the collating sequence of the current locale. If the lines in the files are not both collated in accordance with the current locale, the result is undefined.
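The sorted-input requirement is what allows the comparison to be done in a single pass over both files. The following Python sketch illustrates the idea; it is not the code of any actual comm implementation. The function name comm_lines is hypothetical, and the sketch assumes plain code-point string comparison (roughly what the C locale gives) rather than the collating sequence of the current locale.

```python
def comm_lines(a, b):
    """Merge two sorted lists of lines into comm-style output lines.

    Lines only in `a` get no prefix, lines only in `b` get one tab,
    and lines found in both get two tabs.
    Assumption: both lists are already sorted by code point.
    """
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:        # line present in both inputs
            out.append("\t\t" + a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:       # a[i] cannot occur later in b
            out.append(a[i])
            i += 1
        else:                   # b[j] cannot occur later in a
            out.append("\t" + b[j])
            j += 1
    out.extend(a[i:])                           # leftovers unique to a
    out.extend("\t" + line for line in b[j:])   # leftovers unique to b
    return out


foo = ["apple", "banana", "eggplant"]
bar = ["apple", "banana", "banana", "zucchini"]
print("\n".join(comm_lines(foo, bar)))
```

If either list is not sorted, the comparisons in the loop can skip past matching lines, which is one way to see why unsorted input gives unreliable results.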

Return code

Unlike diff, the return code from comm has no logical significance concerning the relationship of the two files. A return code of 0 indicates success; a return code greater than 0 indicates that an error occurred during processing.
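A script can observe this difference directly. The sketch below assumes that comm and diff are installed and found on the PATH; the helper name run_status and the file names in the usage note are hypothetical.

```python
import subprocess


def run_status(command, file_a, file_b):
    """Run `command file_a file_b` and return its exit status.

    Output is discarded because only the return code is of interest here.
    """
    result = subprocess.run(
        [command, file_a, file_b],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode
```

For two readable, sorted files that differ, run_status("comm", "foo", "bar") is expected to return 0, whereas run_status("diff", "foo", "bar") returns 1, because diff uses its exit status to report that differences were found.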

Example

File foo

    apple
    banana
    eggplant

File bar

    apple
    banana
    banana
    zucchini

comm foo bar

                    apple
                    banana
            banana
    eggplant
            zucchini

This shows that both files have one banana, but only bar has a second banana.

In more detail, the output looks as follows. The column of each line is determined by its number of leading tab characters: none for the first column, one for the second, and two for the third. Here \t represents a tab character and \n represents a newline (C language notation).

    \t\tapple\n
    \t\tbanana\n
    \tbanana\n
    eggplant\n
    \tzucchini\n
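A script that consumes this output can recover the three columns by counting leading tabs. The Python sketch below shows one way to do it; the function name split_columns is hypothetical, and the sketch assumes that all three columns were printed and that no input line itself begins with a tab (otherwise the columns become ambiguous, as noted under Usage).

```python
def split_columns(output):
    """Classify each comm output line by its number of leading tabs.

    Returns (only_first, only_second, common).
    Assumption: no input line starts with a tab character.
    """
    only_first, only_second, common = [], [], []
    for line in output.splitlines():
        if line.startswith("\t\t"):     # two tabs: third column
            common.append(line[2:])
        elif line.startswith("\t"):     # one tab: second column
            only_second.append(line[1:])
        else:                           # no tab: first column
            only_first.append(line)
    return only_first, only_second, common


sample = "\t\tapple\n\t\tbanana\n\tbanana\neggplant\n\tzucchini\n"
print(split_columns(sample))
# → (['eggplant'], ['banana', 'zucchini'], ['apple', 'banana'])
```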

Comparison to diff

In general terms, diff is a more powerful utility than comm. The simpler comm is best suited for use in scripts.

The primary distinction between comm and diff is that comm discards information about the order of the lines prior to sorting.

A minor difference between comm and diff is that comm will not try to indicate that a line has "changed" between the two files; each line is shown in exactly one of the "from file #1", "from file #2", or "in both" columns. This can be useful if one wishes two lines to be considered different even if they differ only subtly.
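Because line order carries no information once the inputs are sorted, the way comm classifies lines can be thought of as multiset arithmetic. The Python sketch below illustrates that view with collections.Counter; it is an analogy rather than a description of how comm is implemented, and the function name classify is hypothetical.

```python
from collections import Counter


def classify(a, b):
    """Return (only_a, only_b, common) as multisets of lines."""
    count_a, count_b = Counter(a), Counter(b)
    # Counter subtraction drops non-positive counts; & takes the minimum.
    return count_a - count_b, count_b - count_a, count_a & count_b


# Deliberately unsorted: the classification does not depend on order.
foo = ["eggplant", "apple", "banana"]
bar = ["zucchini", "banana", "apple", "banana"]
only_foo, only_bar, both = classify(foo, bar)
print(only_foo, only_bar, both)
```

Here only_foo holds one eggplant, only_bar holds one banana and one zucchini, and both holds one apple and one banana. diff, by contrast, would report different results for differently ordered inputs.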

Other options

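comm has command-line options to suppress any of the three columns: -1, -2 and -3 suppress the first, second and third column, respectively, and they can be combined (for example, -12 prints only the lines common to both files). This is useful for scripting.

Either file operand (but not both) can be given as - to read that input from standard input.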
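When a column is suppressed, the columns to its right lose the corresponding leading tab, so the indentation depends on which of the POSIX options -1, -2 and -3 are given. The Python sketch below models that rule; the function name format_rows and the (column, line) input format are assumptions made for illustration.

```python
def format_rows(rows, suppress=()):
    """Render (column, line) pairs the way comm indents them.

    `rows` holds pairs such as (3, "apple"); `suppress` lists the column
    numbers to hide, mirroring the -1, -2 and -3 options.
    """
    out = []
    for column, line in rows:
        if column in suppress:
            continue
        # One leading tab for every visible column to the left.
        tabs = sum(1 for c in range(1, column) if c not in suppress)
        out.append("\t" * tabs + line)
    return out


rows = [(3, "apple"), (3, "banana"), (2, "banana"),
        (1, "eggplant"), (2, "zucchini")]
print(format_rows(rows, suppress=(1, 2)))   # like comm -12: common lines only
print(format_rows(rows, suppress=(3,)))     # like comm -3: unique lines only
```

With columns 1 and 2 suppressed, the common lines are printed with no leading tabs at all, which is why comm -12 is convenient for feeding further commands in a pipeline.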

Limits

Up to a full line must be buffered from each input file during line comparison, before the next output line is written.

Some implementations read lines with the function readlinebuffer(), which imposes no line-length limit as long as system memory suffices.

Other implementations read lines with the function fgets(). This function requires a fixed buffer. For these implementations, the buffer is often sized according to the POSIX macro LINE_MAX.

See also

Comparison of file comparison tools

List of Unix programs

cmp (Unix) -- character-oriented file comparison

cut (Unix) -- splitting column-oriented files

References

comm: select or reject lines common to two files – Commands & Utilities Reference, The Single UNIX® Specification, Issue 7 from The Open Group

comm(1): compare two sorted files line by line – Linux User Commands Manual
