Index a string in bash

Simple as this.

(bash)

for i in * ; do mv -- "$i" "${i:0:5}" ; done

Voila.
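One caveat: mv silently overwrites an existing file, so two names that share the same first 5 characters will collide. A more cautious variant might look like the sketch below. shorten_names is a made-up helper name, and skipping on collision is just one possible policy, not something the original one-liner does.

```shell
# shorten_names is a hypothetical helper (bash), not part of the answer above.
# It renames every entry in the current directory to the first N characters
# of its name and skips any rename whose target already exists.
shorten_names() {
    local len=$1 f new
    for f in *; do
        [ -e "$f" ] || continue            # unmatched glob leaves a literal '*'
        new=${f:0:len}                     # first $len characters of the name
        [ "$new" = "$f" ] && continue      # name is already short enough
        if [ -e "$new" ]; then
            printf 'skipping %s: %s already exists\n' "$f" "$new" >&2
            continue
        fi
        mv -- "$f" "$new"
    done
}

# Usage (run inside the directory to process):
#   shorten_names 5
```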

And an explanation from Advanced Bash-Scripting Guide (Chapter 10. Manipulating Variables), (with extra NOTEs inline to highlight the errors in that manual):

Substring Extraction

${string:position}

Extracts substring from $string at $position.

If the $string parameter is "*" or "@", then this extracts the positional parameters, starting at $position.

${string:position:length}

Extracts $length characters of substring from $string at $position.

NOTE: the examples below are missing quotes around the parameter expansions, and echo should not be used for arbitrary data.

stringZ=abcABC123ABCabc
#       0123456789.....
#       0-based indexing.

echo ${stringZ:0}                       # abcABC123ABCabc
echo ${stringZ:1}                       # bcABC123ABCabc
echo ${stringZ:7}                       # 23ABCabc 

echo ${stringZ:7:3}                     # 23A
                                        # Three characters of substring.


# Is it possible to index from the right end of the string?

echo ${stringZ:-4}                      # abcABC123ABCabc
# Defaults to full string, as in ${parameter:-default}.
# However . . . 

echo ${stringZ:(-4)}                    # Cabc
echo ${stringZ: -4}                     # Cabc
# Now, it works.
# Parentheses or added space "escape" the position parameter.

The position and length arguments can be "parameterized," that is, represented as a variable, rather than as a numerical constant.
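To illustrate the parameterized form, here is a small bash sketch. The variable names pos and len are mine, not from the guide, and printf is used instead of echo in line with the NOTE above.

```shell
stringZ=abcABC123ABCabc
pos=3
len=6

a=${stringZ:pos:len}          # ABC123  (bare names are evaluated arithmetically)
b=${stringZ:$pos:$len}        # ABC123  (an explicit $ works too)
c=${stringZ:pos+1:len-2}      # BC12    (offset and length are arithmetic expressions)
d=${stringZ: -len}            # ABCabc  (the space keeps ":-" from meaning "default")

printf '%s\n' "$a" "$b" "$c" "$d"
```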


If the $string parameter is "*" or "@", then this extracts a maximum of $length positional parameters, starting at $position.

echo ${*:2}          # Echoes second and following positional parameters.
echo ${@:2}          # Same as above.

echo ${*:2:3}        # Echoes three positional parameters, starting at second.
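A quoted version of the same idea, as a bash sketch: with "${@:2}" each argument survives as its own word even when it contains spaces. slice_demo and the array names are made up for the illustration, and the joined result assumes the default IFS.

```shell
# slice_demo is a hypothetical function used only to get some positional
# parameters to slice; the results are left in global variables.
slice_demo() {
    rest=("${@:2}")        # every argument from the second one on
    three=("${@:2:3}")     # at most three arguments, starting at the second
    joined="${*:2:3}"      # the same slice joined into a single string
}

slice_demo "first arg" two "three 3" four five
printf '<%s>\n' "${three[@]}"
```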

NOTE: expr substr is a GNU extension.

expr substr $string $position $length

Extracts $length characters from $string starting at $position.

stringZ=abcABC123ABCabc
#       123456789......
#       1-based indexing.

echo `expr substr $stringZ 1 2`           # ab
echo `expr substr $stringZ 4 3`           # ABC

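NOTE: That echo is redundant and makes it even less reliable. Use expr substr + "$stringZ" 1 2 (with GNU expr, the + makes the next argument be taken as a string even if it looks like an operator or keyword).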

NOTE: expr will return with a non-zero exit status if the output is 0 (or -0, 00...).
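If bash is available anyway, the 1-based extraction can be sketched without expr at all, which sidesteps both the GNU dependency and the exit status surprise. substr1 is a hypothetical helper name, and it only mirrors expr substr for positions of 1 or more.

```shell
# substr1 is a hypothetical helper: 1-based substring via bash parameter
# expansion. Intended for positions >= 1 only.
substr1() {
    local s=$1 pos=$2 len=$3
    printf '%s\n' "${s:pos-1:len}"     # convert the 1-based position to 0-based
}

stringZ=abcABC123ABCabc
substr1 "$stringZ" 1 2     # ab
substr1 "$stringZ" 4 3     # ABC
substr1 0123 1 1           # 0, and the exit status is still 0
```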


BTW, the book is available in the official Ubuntu repository as the abs-guide package.


In POSIX sh,

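  • "${var%?????}" is $var stripped of its last 5 characters (or $var unchanged if it contains fewer than 5 characters)

  • "${var%"${var#??????????}"}" is the first 10 characters of $var (beware: it expands to the empty string if $var contains fewer than 10 characters, because the inner expansion then leaves $var untouched and the outer one strips all of it).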

  • "${var%_*}" is $var stripped of the shortest string that matches _* at the end of $var (foo_bar_baz -> foo_bar).
  • "${var%%_*}": same but longest match instead of shortest match (foo_bar_baz -> foo).
  • if you wanted to get foo_bar_: "${var%"${var##*_}"}" (${var##pattern} is the same as ${var%%pattern} but looking for the pattern at the beginning of $var instead of the end).
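The expansions in this list can be tried out with the sample value foo_bar_baz. This is only a sketch; the variable names on the left are made up, and everything here should work in any POSIX sh.

```shell
var=foo_bar_baz

all_but_last_5=${var%?????}          # foo_ba
rest=${var#??????????}               # z  (what is left after the first 10 characters)
first_10=${var%"$rest"}              # foo_bar_ba
shortest=${var%_*}                   # foo_bar
longest=${var%%_*}                   # foo
with_sep=${var%"${var##*_}"}         # foo_bar_

printf '%s\n' "$all_but_last_5" "$first_10" "$shortest" "$longest" "$with_sep"
```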

With zsh:

  • $var[1,-6] for first character to 6th from the end (so all but the last 5).
  • $var[1,10] for first 10 characters.

With ksh, bash or zsh:

  • "${var:0:10}": first 10 characters of $var

With bash or zsh:

  • "${var:0:-5}": all but the last 5 characters (gives an error and exits the script if $var is set but contains fewer than 5 characters, also when $var is not set with zsh).
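To avoid the error on short values, the length can be checked first. The bash sketch below uses a hypothetical helper, drop_last, and computes a non-negative length itself, so it should also work in bash versions that predate negative lengths (added in 4.2, as far as I know). Returning an empty line for short input is my choice, not something the expansion does.

```shell
# drop_last is a hypothetical helper: print $1 without its last $2 characters,
# or an empty line when $1 is not longer than $2 characters.
drop_last() {
    local s=$1 n=$2
    if (( ${#s} > n )); then
        printf '%s\n' "${s:0:${#s}-n}"   # length = total length minus n
    else
        printf '\n'
    fi
}

drop_last foo_bar_baz 5    # foo_ba
drop_last abc 5            # (empty line, no error)
```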

If you need Bourne sh compatibility, it's very difficult to do reliably. If you can guarantee the result won't end in newline characters you can do:

first_10=`expr " $var" : ' \(.\{1,10\}\)'` # beware the exit status
                                           # may be non-zero if the
                                           # result is 0 or 0000000000

all_but_last_5=`expr " $var" : ' \(.*\).\{5\}'`

You'll also have a limit on the length of $var (varying between systems).
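Because of that exit status, a script running under set -e could abort on a perfectly good result. One way to guard against it is sketched below; the sample value is made up, and the || : is my addition.

```shell
var='hello world, this is a long value'

# "|| :" discards expr's exit status, which is non-zero when the result is
# empty or zero, so that a script using "set -e" does not abort here.
first_10=`expr " $var" : ' \(.\{1,10\}\)'` || :
all_but_last_5=`expr " $var" : ' \(.*\).\{5\}'` || :

printf '[%s]\n' "$first_10" "$all_but_last_5"
```

The brackets in the output make the trailing space of the second result visible.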

In all those solutions, if $var contains bytes that can't form part of valid characters, YMMV.


sh doesn't provide a direct substring expansion (only the pattern-stripping operators shown above), but with bash you may do

${i:0:10}

This will give you the first ten characters of the value of the variable i.

The general format is ${variable:offset:length}.
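With a length of 1, the same format indexes single characters, which is probably the closest bash gets to s[i]. A small sketch (the sample string and the chars array are just for illustration):

```shell
s='a b*c'
chars=()

# Walk the string by offset and collect one character per array element.
for (( i = 0; i < ${#s}; i++ )); do
    chars+=("${s:i:1}")
done

printf '[%s]' "${chars[@]}"; echo     # [a][ ][b][*][c]
```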

Tags: String, Shell, Bash