How to keep only every nth line of a file

~ $ awk 'NR == 1 || NR % 3 == 0' yourfile
Line 1
Line 3
Line 6

NR (number of records) holds the number of the current record. Because the default record separator (RS) is a newline, this is the same as the current line number. Both the pattern and the action are optional in awk's general form 'pattern {action}'. When only a pattern is given, awk prints the whole record ($0) for every record on which the pattern is true.
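If the step should not be hard-coded, the same pattern can take it from a variable. This is only a sketch: the function name every_nth and the awk variable n are names I made up for illustration, while -v is awk's standard option for setting a variable from the command line.

```shell
# Hypothetical helper (every_nth and n are illustrative names):
# prints line 1 plus every line whose number is a multiple of n.
every_nth() {
    awk -v n="$1" 'NR == 1 || NR % n == 0'
}

# Usage: feed ten numbered lines through the filter with n = 3.
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 | every_nth 3
# prints 1, 3, 6 and 9, one per line
```

Dropping the NR == 1 part leaves only the multiples of n.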


sed can also do this:

$ sed -n '1p;0~3p' input.txt
Line 1
Line 3
Line 6

man sed explains ~ as:

first~step Match every step'th line starting with line first. For example, ``sed -n 1~2p'' will print all the odd-numbered lines in the input stream, and the address 2~5 will match every fifth line, starting with the second. first can be zero; in this case, sed operates as if it were equal to step. (This is an extension.)
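To see the difference between a first of 1 and a first of 0, here is a small sketch. It assumes GNU sed, since first~step is an extension and other sed implementations may reject it. The sample function is just a stand-in input I made up.

```shell
# Assumes GNU sed: the first~step address is a GNU extension.
# Hypothetical sample input: the numbers 1 to 10, one per line.
sample() { printf '%s\n' 1 2 3 4 5 6 7 8 9 10; }

sample | sed -n '1~3p'   # prints lines 1, 4, 7, 10
sample | sed -n '0~3p'   # prints lines 3, 6, 9
```

That is why the command above needs the extra 1p: 0~3 on its own never matches the first line.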


Perl can do this too:

while (<>) {
    print  if $. % 3 == 1;
}

This program will print the first line of its input, and every third line afterwards.

To explain it a bit, <> is the line input operator, which iterates over the input lines when used in a while loop like this. The special variable $. contains the number of lines read so far, and % is the modulus operator.

This code can be written even more compactly as a one-liner, using the -n and -e switches:

perl -ne 'print if $. % 3 == 1'  < input.txt  > output.txt

The -e switch takes a piece of Perl code to execute as a command line parameter, while the -n switch implicitly wraps the code in a while loop like the one shown above.


Edit: To actually get lines 1, 3, 6, 9, ... as in the example, rather than lines 1, 4, 7, 10, ... as I first assumed you wanted, replace $. % 3 == 1 with $. == 1 or $. % 3 == 0.
