Remove duplicate successive lines using the uniq utility

The uniq utility removes duplicate lines only when they are successive. If a file contains a run of identical adjacent lines, uniq discards all but one line of that run. Suppose a file contains the following lines.

# vi student_data.txt
Roll number is 12345
His Name is SHAIK
His Name is SHAIK
He is 24 years old.
Roll number is 12345

Running uniq on the file as below yields the following result.

# uniq student_data.txt
Roll number is 12345
His Name is SHAIK
He is 24 years old.
Roll number is 12345

Note that the file contains two duplicated lines: "Roll number is 12345" and "His Name is SHAIK". In the uniq output only the repeated "His Name is SHAIK" line is dropped, because its two copies are adjacent. The second "Roll number is 12345" line is kept because, although it is identical to the first, it is not adjacent to it. So uniq removes adjacent identical lines only.
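
The adjacency rule can be checked with a small self-contained sketch. The scratch file created with mktemp and the variable names tmp and result are only assumptions for this illustration; the -c option, which prefixes each output line with the number of adjacent repeats, makes it easy to see which lines were collapsed.

```shell
# Hypothetical scratch file, used only for this sketch
tmp=$(mktemp)
printf '%s\n' \
  'Roll number is 12345' \
  'His Name is SHAIK' \
  'His Name is SHAIK' \
  'He is 24 years old.' \
  'Roll number is 12345' > "$tmp"

# Adjacent duplicates collapse into one line; the two "Roll number"
# lines both survive because they are not next to each other
result=$(uniq "$tmp")
printf '%s\n' "$result"

# -c prefixes every output line with its count of adjacent repeats
uniq -c "$tmp"

rm -f "$tmp"
```

Here the plain uniq call should print four lines, with "Roll number is 12345" appearing as both the first and the last line.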

Combined with the "sort" command, uniq can remove all duplicate lines in a file, whether they are successive or not, because sorting places identical lines next to each other. The following example removes all duplicate lines from student_data.txt and saves the result as sort_student.txt. Note that the result is in sorted order, not in the original line order.

# sort student_data.txt | uniq > sort_student.txt

# cat sort_student.txt
He is 24 years old.
His Name is SHAIK
Roll number is 12345
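
Two common alternatives are worth knowing, shown here as a sketch rather than as the only way to do it. sort -u does the sorting and the duplicate removal in one step, and the awk idiom below keeps the first occurrence of each line while preserving the original order. The scratch file and the variable names tmp, sorted and ordered are assumptions made for this illustration.

```shell
# Hypothetical scratch file with the same lines as the example above
tmp=$(mktemp)
printf '%s\n' \
  'Roll number is 12345' \
  'His Name is SHAIK' \
  'His Name is SHAIK' \
  'He is 24 years old.' \
  'Roll number is 12345' > "$tmp"

# sort -u: sort and drop duplicates in a single command,
# equivalent in effect to "sort file | uniq"
sorted=$(sort -u "$tmp")
printf '%s\n' "$sorted"

# awk prints a line only the first time it is seen, so duplicates
# are removed without changing the original order of the lines
ordered=$(awk '!seen[$0]++' "$tmp")
printf '%s\n' "$ordered"

rm -f "$tmp"
```

Use the awk form when the order of the lines matters, and sort -u when a sorted result is acceptable.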
