How to print the longest line in a file?

cat ./text | awk ' { if ( length > x ) { x = length; y = $0 } }END{ print y }'

Update: summarizing all the advice in the comments

awk 'length > max_length { max_length = length; longest_line = $0 } END { print longest_line }' ./text

cat filename | awk '{ print length }' | sort -n | tail -1
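Note that the last pipeline prints only the length of the longest line, not the line itself. If both are wanted, a single awk pass can report them together. The following is a minimal sketch, assuming a small sample file created with mktemp purely for illustration; the variable names sample and result are made up for this example.

```shell
# Hypothetical sample file, created only for this illustration.
sample=$(mktemp)
printf '%s\n' 'short' 'the longest line here' 'medium line' > "$sample"

# Remember the longest line seen so far; print its length and text at the end.
result=$(awk 'length($0) > max { max = length($0); line = $0 }
              END { print max ": " line }' "$sample")

echo "$result"
rm -f "$sample"
```

Because the comparison is a strict "greater than", the first of several equally long lines is the one reported.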

Grep the first longest line

grep -Em1 "^.{$(wc -L <file.txt)}\$" file.txt

The command is unusually hard to read without practice because it mixes shell and regexp syntax.
To explain it, I will use simplified pseudocode first. The lines starting with ## do not run in the shell.
This simplified code uses the file name F and leaves out quoting and parts of the regexps for readability.

How it works

The command has two parts, a grep invocation and a wc invocation:

## grep "^.{$( wc -L F )}$" F

The wc is used in a command substitution, $( ... ), so it runs before grep. It calculates the length of the longest line. The shell substitution syntax is mixed with the regular expression syntax in a confusing way, so I will decompose the command substitution:

## wc -L F
42
## grep "^.{42}$" F

Here, the command substitution was replaced with the value it would return, creating the grep command line that is actually used. We can now read the regular expression more easily: it matches exactly from the start (^) to the end ($) of the line. The expression between them matches any character, repeated 42 times. Combined, that selects lines that consist of exactly 42 characters.
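The same decomposition can be tried in a real shell by storing the intermediate values in variables. This is only a sketch under some assumptions: the sample file and the names len, pattern and first are invented for the example, and wc -L is the GNU coreutils option, so it may be missing or behave differently elsewhere.

```shell
# Hypothetical sample file; two lines tie for the maximum length of 6.
f=$(mktemp)
printf '%s\n' 'fig' 'banana' 'kiwi' 'cherry' > "$f"

# Step 1: let wc compute the length of the longest line (GNU wc assumed).
len=$(wc -L < "$f")

# Step 2: build the regular expression from that number.
# Inside double quotes, \$ becomes a literal $ for grep.
pattern="^.{${len}}\$"

# Step 3: let grep print the first line that matches the pattern.
first=$(grep -E -m1 "$pattern" "$f")

echo "len=$len pattern=$pattern first=$first"
rm -f "$f"
```

Printing the pattern before running grep makes it easy to see which part came from the shell and which part is the regular expression.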

Now, back to real shell commands: the grep option -E (--extended-regexp) lets you write the {} without escaping them, which helps readability. The option -m 1 (--max-count=1) makes grep stop after the first matching line. The < in the wc command redirects the file to wc's stdin, which prevents wc from printing the file name together with the length.
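The effect of the redirection is easy to check by running wc both ways. This is a small sketch, assuming GNU wc and a throwaway sample file; the names with_name and without_name are made up for the illustration.

```shell
# Hypothetical sample file; its longest line has 5 characters.
f=$(mktemp)
printf '%s\n' 'one' 'three' > "$f"

# File given as an argument: wc prints the length followed by the file name.
with_name=$(wc -L "$f")

# File given on stdin: wc has no name to print, so only the length remains.
without_name=$(wc -L < "$f")

echo "$with_name"
echo "$without_name"
rm -f "$f"
```

Only the second form yields a bare number that can be dropped into the {...} repetition of the regular expression.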

Which longest lines?

Because the file name occurs twice in each command, I will use a variable f for it to keep the examples readable. Each $f in the examples could be replaced by the file name.

f="file.txt"

Show the first longest line - the first line that is as long as the longest line:

grep -E -m1 "^.{$(wc -L <"$f")}\$" "$f"

Show all longest lines - all lines that are as long as the longest line:

grep -E "^.{$(wc -L <"$f")}\$" "$f"

Show the last longest line - the last line that is as long as the longest line:

tac "$f" | grep -E -m1 "^.{$(wc -L <"$f")}\$"

Show the single longest line - the longest line longer than all other lines, or fail:

[ $(grep -E "^.{$(wc -L <"$f")}\$" "$f" | wc -l) = 1 ] && grep -E "^.{$(wc -L <"$f")}\$" "$f"

(The last command is even more inefficient than the others, as it repeats the complete grep command. It should be decomposed so that the output of wc and the lines written by grep are saved to variables.
Note that "all longest lines" may actually be all lines of the file. To decide whether the longest line is unique, only the first two matching lines need to be kept in the variable.)
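One possible decomposition along those lines is sketched below. The function name single_longest_line is hypothetical, GNU wc -L is assumed, and local is used although it is not strictly POSIX (bash and dash both accept it). grep is limited to two matches with -m2, which is enough to tell a unique longest line from a tie.

```shell
# Hypothetical helper: print the longest line only if no other line is as long.
single_longest_line() {
    local file len matches
    file=$1
    # Run wc once and keep the maximum line length (GNU wc -L assumed).
    len=$(wc -L < "$file") || return 2
    # Keep at most two matching lines; a second one already proves a tie.
    matches=$(grep -E -m2 "^.{${len}}\$" "$file")
    # No match at all, for example with an empty file.
    [ -n "$matches" ] || return 1
    # Two lines of output mean that several lines share the maximum length.
    [ "$(printf '%s\n' "$matches" | wc -l)" -eq 1 ] || return 1
    printf '%s\n' "$matches"
}

# Usage on a throwaway sample file with one clearly longest line.
demo=$(mktemp)
printf '%s\n' 'ab' 'abcdef' 'abc' > "$demo"
single_longest_line "$demo"
rm -f "$demo"
```

The function returns a non-zero status on a tie, so it can be used in the same && style as the one-liner above. A file whose longest line is empty is treated as "no match" by this sketch.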
