How to delete a line if it is longer than XY?

sed '/^.\{2048\}./d' input.txt > output.txt

Here's a solution that deletes lines that have 2049 or more characters:

sed '/.\{2049\}/d' <file.in >file.out

The regular expression .\{2049\} matches any line that contains a substring of 2049 characters (another way of saying "at least 2049 characters"). The d command deletes those lines from the input, leaving only the shorter lines in the output.
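The same "at least 2049 characters" test can be sketched with Python's re module, which may help when checking the logic on a few sample lines. This is only an illustration, not part of the sed answer; the names LIMIT, too_long and keep are made up for the example.

```python
import re

LIMIT = 2048  # longest line length to keep (assumed from the question)

# ".{2049}" matches any run of LIMIT + 1 characters. "." never matches "\n",
# so the trailing newline is not counted toward the length.
too_long = re.compile(".{%d}" % (LIMIT + 1))

def keep(line):
    """Return True when the line has at most LIMIT characters."""
    return too_long.search(line) is None

print(keep("x" * 2048 + "\n"))  # True
print(keep("x" * 2049 + "\n"))  # False
```

As with the sed expression, no anchoring is needed: a line either contains a run of 2049 characters or it does not.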

BSD sed (on e.g. macOS) can only handle repetition counts of up to 255 in the \{...\} operator (the value of RE_DUP_MAX; see getconf RE_DUP_MAX in the shell). On these systems, you may instead use awk:

awk 'length <= 2048' <file.in >file.out

Mimicking the sed solution literally with awk:

awk 'length >= 2049 { next } { print }' <file.in >file.out

Note that any awk implementation is only guaranteed to be able to handle records of lengths up to LINE_MAX bytes (see getconf LINE_MAX in the shell), but may support longer ones. On macOS, LINE_MAX is 2048.


Something like this should work in Python.

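# Copy "orig" to "new", keeping only lines of at most 2048 characters.
with open("orig") as src, open("new", "w") as dst:
    for line in src:
        # The trailing newline is not counted toward the length.
        if len(line.rstrip("\n")) <= 2048:
            dst.write(line)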
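If you would rather have something that works on any pair of file-like objects (so it can be used as a filter, like the sed and awk commands), a rough sketch could look like this. The function name filter_lines and its limit parameter are invented for the example, and the length is counted in characters without the trailing newline.

```python
import io

def filter_lines(src, dst, limit=2048):
    """Copy lines from src to dst, skipping lines longer than limit characters."""
    for line in src:
        # Strip the newline before measuring so it is not counted.
        if len(line.rstrip("\n")) <= limit:
            dst.write(line)

# Small in-memory demonstration: one short line, one line of 2049 characters.
src = io.StringIO("short\n" + "x" * 2049 + "\n")
dst = io.StringIO()
filter_lines(src, dst)
print(repr(dst.getvalue()))  # 'short\n'
```

Calling filter_lines(sys.stdin, sys.stdout) (after import sys) should make it usable with shell redirection in the same way as the commands above.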
