How can I fix lines broken in wrong places?

With awk:

awk -v ORS= '{print (NR == 1 ? "" : /^[[:lower:]]/ ? " " : RS) $0}
             END {if (NR) print RS}'

That is, do not append the record separator to each line (ORS is empty). Instead, prepend a record separator to the current line if it is not the first line and does not start with a lowercase letter. If it does start with a lowercase letter, prepend a space character instead. Nothing is prepended to the first line, and the END block prints the final newline if there was any input.
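As a quick check of the behaviour described above, here is a small sketch that wraps the same awk command in a shell function and feeds it a made-up sample. The function name fix_breaks and the sample text are only illustrative assumptions, not part of the original answer.

```shell
# fix_breaks is a hypothetical wrapper around the awk command above.
# It reads the named files, or standard input if none are given.
fix_breaks() {
  awk -v ORS= '{print (NR == 1 ? "" : /^[[:lower:]]/ ? " " : RS) $0}
               END {if (NR) print RS}' "$@"
}

# Made-up sample: the first sentence is broken before "in".
printf '%s\n' 'This line was broken' 'in the wrong place.' 'A new sentence.' |
  fix_breaks
# Output:
# This line was broken in the wrong place.
# A new sentence.
```

The second line starts with a lowercase letter, so it is joined to the first with a space. The third starts with an uppercase letter, so it stays on its own line.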


Try:

awk '$NF !~ /\.$/ { printf "%s ",$0 ; next ; } {print;}' file

where

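  • $NF !~ /\.$/ matches lines whose last field does not end with a dot,
  • { printf "%s ",$0 prints such a line with a trailing space and no line feed,
  • next ; } skips the remaining rule and moves on to the next line,
  • {print;} prints any other line (one ending with a dot) as is, followed by a line feed.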

I am sure there will be a sed option.

Note: this works when every complete line ends in a dot. It does not test whether the following line begins with an upper-case or a lower-case letter. For that condition, see Stéphane Chazelas's answer.
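To see how the dot-based test behaves in practice, here is a small sketch using the same awk command on a made-up sample. The function name join_until_dot and the sample text are illustrative assumptions only.

```shell
# join_until_dot is a hypothetical wrapper around the awk command above.
join_until_dot() {
  awk '$NF !~ /\.$/ { printf "%s ",$0 ; next ; } {print;}' "$@"
}

# Made-up sample: a heading with no dot, then a sentence broken before "here.".
printf '%s\n' 'A heading' 'Body text broken' 'here.' 'Next sentence.' |
  join_until_dot
# Output:
# A heading Body text broken here.
# Next sentence.
```

The heading is merged into the following sentence because only the trailing dot is checked, not the case of the first letter of the next line.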


In Perl:

#!/usr/bin/perl -w
use strict;
my $input = join("", <>);
$input =~ s/\n([a-z])/ $1/g;
print $input;

Technically you wanted to replace "newline followed by a lower-case letter" with "space followed by that lower-case letter", which is what the core of the above Perl script does:

  1. Read the whole input into a string, $input.
  2. Update $input with the result of the search-and-replace operation.
  3. Print the new value.