Find only GUIDs in file - Bash

With the GNU implementation of grep (or compatible):

<your-file grep -Ewo '[[:xdigit:]]{8}(-[[:xdigit:]]{4}){3}-[[:xdigit:]]{12}' |
  while IFS= read -r guid; do
    your-action "$guid"
    sleep 5
  done

That finds those GUIDs wherever they are in the input, provided they are neither immediately preceded nor immediately followed by word characters.

GNU grep has a -o option that prints the non-empty matches of the regular expression.

-w is another non-standard extension, coming I believe from SysV, that matches whole words only. A match counts only if it is neither immediately preceded nor immediately followed by a word character (word characters being alphanumerics and underscore). That guards against matching inside things like:

aaaaaaaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaaaaaaaaaaa
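To make the effect of -w concrete, here is a small sketch that runs the same pattern with and without it on that string. The variable names and the sample GUID are made up for illustration, and it assumes a grep that supports -o and -w (GNU grep does).

```shell
pattern='[[:xdigit:]]{8}(-[[:xdigit:]]{4}){3}-[[:xdigit:]]{12}'
too_long='aaaaaaaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaaaaaaaaaaa'
good='id=01234567-89ab-cdef-0123-456789abcdef;'   # hypothetical input line

# Without -w, grep extracts a GUID-shaped substring from the over-long string.
without_w=$(printf '%s\n' "$too_long" | grep -Eo "$pattern")

# With -w, the candidate is surrounded by more word characters, so nothing
# is printed (grep exits 1 on no match, hence the || true).
with_w=$(printf '%s\n' "$too_long" | grep -Ewo "$pattern" || true)

# A GUID delimited by non-word characters (= and ;) is still found.
found=$(printf '%s\n' "$good" | grep -Ewo "$pattern")

printf 'without -w: %s\nwith -w:    %s\nfound:      %s\n' \
  "$without_w" "$with_w" "$found"
```

The first result is a 36-character slice cut out of the middle of the long string, which is exactly the false positive -w is there to prevent.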

The rest is standard POSIX syntax. Note that [[:xdigit:]] matches on ABCDEF as well. You can replace it with [0123456789abcdef] if you want to match only lower case GUIDs.
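As a rough illustration of that substitution, the sketch below runs both bracket expressions over the same two made-up GUIDs, one upper case and one lower case. The variable names (sample, h, any_case, lower_only) are only there for the example.

```shell
# Two made-up GUIDs: the first upper case, the second lower case.
sample='A1B2C3D4-E5F6-7A8B-9C0D-E1F2A3B4C5D6
a1b2c3d4-e5f6-7a8b-9c0d-e1f2a3b4c5d6'

# [[:xdigit:]] accepts both cases, so both lines come back.
any_case=$(printf '%s\n' "$sample" |
  grep -Ewo '[[:xdigit:]]{8}(-[[:xdigit:]]{4}){3}-[[:xdigit:]]{12}')

# The explicit lower-case list only accepts the second line.
h='[0123456789abcdef]'
lower_only=$(printf '%s\n' "$sample" |
  grep -Ewo "$h{8}(-$h{4}){3}-$h{12}")

printf 'any case:\n%s\nlower only:\n%s\n' "$any_case" "$lower_only"
```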

While I love regular expressions, I prefer to avoid over-specifying. For this particular data set (known data format, one GUID per line, plus a header and a footer), I'd just strip out the header and footer:

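$ grep -Ev 'GUIDs|--|rows|^$' guids.txt |
    # header, separator, footer and blank lines are dropped above;
    # read -r keeps any backslashes in the input literal
    while read -r guid ; do
      some_command "$guid"
      sleep 5
    done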

Alternatively, I'd grep out the lines I want, but also keep the regexp as simple as possible for the current data set:

grep -E '^[0-9a-f-]{36}$'
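Here is a small sketch of both approaches on sample data. The sample is an assumption about the shape described above (a header, a separator, one GUID per line, a blank line, a row-count footer), and the GUIDs in it are made up. grep -E is used as the equivalent of egrep.

```shell
# Hypothetical sample input; the real file's header and footer may differ.
sample='GUIDs
-----
0f8fad5b-d9cb-469f-a165-70867728950e
7c9e6679-7425-40de-944b-e07fc1f90ae7

(2 rows affected)'

# Approach 1: throw away everything that looks like header or footer.
by_exclusion=$(printf '%s\n' "$sample" | grep -Ev 'GUIDs|--|rows|^$')

# Approach 2: keep only 36-character lines made of lower-case hex and hyphens.
by_inclusion=$(printf '%s\n' "$sample" | grep -E '^[0-9a-f-]{36}$')

printf '%s\n' "$by_inclusion"

# Caveat: a line of exactly 36 hyphens also fits the second pattern.
dashes=$(printf '%036d' 0 | tr 0 -)
dash_hits=$(printf '%s\n' "$dashes" | grep -Ec '^[0-9a-f-]{36}$')
```

On this sample both approaches return the same two lines. The last check shows the price of the simpler pattern: if the separator under the header happens to be 36 hyphens wide, it would be passed along as if it were a GUID, so it is worth checking the actual file before relying on it.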


