A script that deletes extra spaces between letters in text

Use wordsegment, a pure-Python word segmentation NLP package:

$ pip install wordsegment
$ python2.7 -m wordsegment <<<"T h e b o o k a l s o h a s a n a n a l y t i c a l p u r p o s e w h i c h i s m o r e i m p o r t a n t"
the book also has an analytical purpose which is more important

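The following substitution removes the first space from every run of spaces: the single space between letters disappears, and the double space between words collapses to one. That should do the job.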

s/ ( *)/\1/g

With Perl, for example:

perl -i -pe 's/ ( *)/\1/g' infile.txt

...rewrites infile.txt in place with the extra spaces removed.
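
If Perl is not an option, the same substitution can be sketched in Python with the standard re module. The function name drop_first_space is only an illustrative choice, and the sketch assumes the text is separated by plain ASCII spaces as in the question.

```python
import re


def drop_first_space(text: str) -> str:
    """Remove the first space from every run of spaces."""
    # " ( *)" matches one space followed by any further spaces;
    # the replacement keeps only the further spaces (group 1).
    return re.sub(r" ( *)", r"\1", text)


if __name__ == "__main__":
    spaced = "T h e  b o o k  a l s o  h a s"
    print(drop_first_space(spaced))  # The book also has
```

A run of n spaces always comes out as n-1 spaces, so this only gives clean output when words are separated by exactly two spaces.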


Because the input has double spaces between words, there is a much simpler solution: change each double space to a character that does not occur in the text, remove the remaining spaces, then change that character back to a space:

echo "T h e  b o o k  a l s o  h a s  a n  a n a l y t i c a l  p u r p o s e  w h i c h  i s  m o r e  i m p o r t a n t  " | sed 's/  /-/g;s/ //g;s/-/ /g'

...outputs:

The book also has an analytical purpose which is more important
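
For comparison, here is a rough Python sketch of the same placeholder trick. The function name collapse_letter_spacing and the NUL byte as default placeholder are assumptions made for illustration; any character that is guaranteed not to occur in the text would do.

```python
def collapse_letter_spacing(text: str, placeholder: str = "\x00") -> str:
    """Turn double spaces into word breaks and drop all other spaces."""
    # The trick only works if the placeholder is truly unused in the input.
    if placeholder in text:
        raise ValueError("placeholder already occurs in the text")
    return (
        text.replace("  ", placeholder)  # mark the word boundaries
        .replace(" ", "")                # remove the spaces between letters
        .replace(placeholder, " ")       # restore one space per boundary
    )


if __name__ == "__main__":
    spaced = "T h e  b o o k  a l s o  h a s"
    print(collapse_letter_spacing(spaced))  # The book also has
```

Checking for the placeholder up front avoids the quiet failure mode of the one-liner, where a character already present in the text (a hyphen, say) would be turned into a space.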