How to clean up output of linux 'script' command

If you want to view the file, then you can send the output through col -bp; this interprets the control characters. Then you can pipe through less, if you like.

col -bp typescript | less -R

On some systems col does not accept a filename argument. In that case, redirect the file to standard input instead:

col -bp <typescript | less -R

cat typescript | perl -pe 's/\e([^\[\]]|\[.*?[a-zA-Z]|\].*?\a)//g' | col -b > typescript-processed

Here is how to read the string passed to perl:

  • s/pattern//g replaces every match of pattern with nothing. The g option means every match on each input line is replaced, instead of stopping after the first substitution

Here is how to read the regex pattern:

  • \e matches the special "escape" control character (ASCII 0x1B)
  • ( and ) are the beginning and end of a group
  • | separates the alternatives; the group matches any one of these three patterns:
    • [^\[\]] or
    • \[.*?[a-zA-Z] or
    • \].*?\a
  • [^\[\]] means
    • match any single character that is not [ or ]
  • \[.*?[a-zA-Z] means
    • match a string starting with [, then a non-greedy .*? up to and including the first alphabetic character
  • \].*?\a means
    • match a string starting with ], then a non-greedy .*? up to and including the special control character called "the alert (bell) character"
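
For anyone who finds the pattern easier to follow outside a one-liner, here is a rough Python translation of the same regex. It is only a sketch: the names ANSI_ESCAPE and strip_escapes are made up for illustration, and it covers just the perl step, not the col -b step that follows it in the pipeline. Python's re module has no \e escape, so ESC is written as \x1b and the bell as \x07.

```python
import re

# Same three alternatives as the perl pattern, one per line.
# ANSI_ESCAPE and strip_escapes are illustrative names, not part of any tool.
ANSI_ESCAPE = re.compile(
    r"\x1b(?:"
    r"[^\[\]]"           # ESC followed by one character that is neither [ nor ]
    r"|\[.*?[a-zA-Z]"    # ESC [ ... up to and including the first letter
    r"|\].*?\x07"        # ESC ] ... up to and including the bell character
    r")"
)


def strip_escapes(text: str) -> str:
    """Remove every escape sequence matched by ANSI_ESCAPE."""
    return ANSI_ESCAPE.sub("", text)


if __name__ == "__main__":
    # Hypothetical typescript line: a window-title sequence, a colored prompt, a command.
    sample = "\x1b]0;user@host: ~\x07\x1b[01;32muser@host\x1b[0m:~$ ls\r\n"
    print(repr(strip_escapes(sample)))  # 'user@host:~$ ls\r\n'
```

The carriage return is still there afterwards; in the original pipeline that kind of leftover is what col -b is for.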

For a large quantity of script output, I'd hack a perl script together iteratively. Otherwise hand edit with a good editor.

There is unlikely to be an existing automated method of removing control characters from script output in a way that reproduces what was displayed on the screen at certain important moments (such as when the host was waiting for that first character of some user input).

For example, the screen might be blank except for the prompt Andrew $. If you then typed rm /* and pressed backspace twelve times (far more than needed), what is shown on the screen at the end depends on which shell was running, what your current stty settings are (which you might change partway through a session), and probably some other factors too.
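
To make the ambiguity concrete, here is a small Python sketch. It is a simplified model only, in the spirit of what col -b is documented to do (keep the last character written to each column); the function name apply_backspaces and both recordings are hypothetical, not taken from a real session.

```python
def apply_backspaces(line: str) -> str:
    """Resolve backspaces in one line, keeping the last character written to each column."""
    cells = []  # character currently stored at each column
    col = 0     # cursor column
    for ch in line:
        if ch == "\b":
            col = max(col - 1, 0)  # the cursor cannot move left of column 0
        elif ch == "\r":
            col = 0                # carriage return goes back to the first column
        else:
            if col < len(cells):
                cells[col] = ch    # overwrite what was already in this column
            else:
                cells.append(ch)
            col += 1
    return "".join(cells)


if __name__ == "__main__":
    typed = "Andrew $ rm /*"
    # Hypothetical recording 1: each erase was echoed as backspace, space, backspace.
    erased = typed + "\b \b" * 5
    # Hypothetical recording 2: only bare backspaces were recorded.
    bare = typed + "\b" * 12
    print(repr(apply_backspaces(erased).rstrip()))
    print(repr(apply_backspaces(bare)))
```

Under these assumptions the first recording reduces to the bare prompt, while the second still contains rm /* because nothing was ever overwritten. The same keystrokes can therefore clean up to different text depending on what the terminal actually echoed.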

The above applies to any automated method of continuously capturing input and output. The main alternative is taking "screen shots" or cutting and pasting the screen at appropriate times during the session (which is what I do for user guides, notes for a day-log, etc).
