Is piping, shifting, or parameter expansion more efficient?

  • First rule of software optimization: Don't.

    Until you know the speed of the program is an issue, there's no need to think about how fast it is. If your list is about that length, or even ~100-1000 items long, you probably won't notice how long it takes. There's a chance you're spending more time thinking about the optimization than the optimization would ever save.

  • Second rule: Measure.

    That's the sure way to find out, and the one that gives answers for your system. Especially with shells, there are so many, and they aren't all identical. An answer for one shell might not apply for yours.

    In larger programs, profiling goes here too. The slowest part might not be the one you think it is.

  • Third, the first rule of shell script optimization: Don't use the shell.

    Yeah, really. Many shells aren't made to be fast (they don't have to be, since their main job is launching external programs), and some might even parse the lines of the source code again each time they run them.

    Use something like awk or Perl instead. In a trivial micro-benchmark I did, awk was dozens of times faster than any common shell in running a simple loop (without I/O).

    However, if you do use the shell, use the shell's builtin features instead of external commands. Here, you're using expr, which isn't a builtin in any of the shells I found on my system, but which can be replaced with standard arithmetic expansion, e.g. i=$((i+1)) instead of i=$(expr $i + 1) to increment i. Your use of cut in the last example might also be replaceable with standard parameter expansions.

    See also: Why is using a shell loop to process text considered bad practice?
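
To make the builtin replacements from the third rule concrete, here is a minimal POSIX sh sketch. The variable names (i, first, rest) and the cut pipeline shown in the comment are only illustrative, since the exact commands from the question aren't reproduced here; the list values are the ones used elsewhere on this page.

```shell
# Arithmetic expansion instead of forking expr
i=5
i=$((i + 1))            # instead of: i=$(expr $i + 1)

# Parameter expansion instead of forking cut
list='1 2 3 5 9 8 6 90 84 9 3 2 15 75 55'
first=${list%% *}       # instead of: first=$(echo "$list" | cut -d' ' -f1)
rest=${list#* }         # everything after the first space

echo "$i"
echo "$first"
echo "$rest"
```

Neither expansion starts a new process, which is where most of the saving would come from.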

Steps #1 and #2 should apply to your question.


Pretty simple with awk. This will get you the value of every third field (fields 1, 4, 7, and so on) for input of any length:

$ awk -F' ' '{ for (i = 1; i <= NF; i += 3) { printf("%s%s", $i, OFS) }; printf("\n") }' <<< "$list"
1 5 6 9 15

This works by leveraging built-in awk variables such as NF (the number of fields in the record) and a simple for loop that steps along the fields, so you get the ones you want without needing to know ahead of time how many there will be.

Or, if you do indeed just want those specific fields as specified in your example:

$ awk -F' ' '{ print $1, $4, $7, $10, $13 }' <<< "$list"
1 5 6 9 15

As for the question about efficiency, the simplest route would be to test this or each of your other methods and use time to show how long it takes; you could also use tools like strace to see how the system calls flow. Usage of time looks like:

$ time ./script.sh

real    0m0.025s
user    0m0.004s
sys     0m0.008s

You can compare that output between varying methods to see which is the most efficient in terms of time; other tools can be used for other efficiency metrics.
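
As a rough sketch of what such a comparison could look like, the snippet below writes the same counting loop two ways so that each one can be timed separately. The function names (count_with_expr, count_with_arith) and the iteration count of 200 are made up for illustration; they are not from the question.

```shell
# Same loop, once with an external command and once with a shell builtin.
count_with_expr() {
  i=0
  while [ "$i" -lt "$1" ]; do
    i=$(expr "$i" + 1)    # starts an external process on every iteration
  done
  echo "$i"
}

count_with_arith() {
  i=0
  while [ "$i" -lt "$1" ]; do
    i=$((i + 1))          # handled inside the shell, no new process
  done
  echo "$i"
}

# Both must produce the same answer before their timings are worth comparing.
expr_result=$(count_with_expr 200)
arith_result=$(count_with_arith 200)
echo "$expr_result $arith_result"
```

In shells where time is a keyword (bash, ksh, zsh), you can prefix a call directly, e.g. time count_with_expr 200. Otherwise, put each variant in its own script and time the script as shown above.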


I'm only going to give some general advice in this answer, and not benchmarks. Benchmarks are the only way to reliably answer questions about performance. But since you don't say how much data you're manipulating and how often you perform this operation, there's no way to do a useful benchmark. What's more efficient for 10 items and what's more efficient for 1000000 items is often not the same.

As a general rule of thumb, invoking external commands is more expensive than doing something with pure shell constructs, as long as the pure shell code doesn't involve a loop. On the other hand, a shell loop that iterates over a large string or a large number of strings is likely to be slower than one invocation of a special-purpose tool. For example, your loop invoking cut could well be noticeably slow in practice, but if you find a way to do the whole thing with a single cut invocation, that's likely to be faster than doing the same thing with string manipulation in the shell.
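
For a list of known length, a single cut invocation might look like the sketch below. The field numbers are hard-coded here on the assumption that you want fields 1, 4, 7, 10 and 13 of the 15-element example list; for a list of unknown length you'd have to build that field list first.

```shell
list='1 2 3 5 9 8 6 90 84 9 3 2 15 75 55'

# One external process for the whole list instead of one per element.
picked=$(printf '%s\n' "$list" | cut -d ' ' -f 1,4,7,10,13)
echo "$picked"
```

cut keeps the input delimiter between the selected fields, so the result is a space-separated string again.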

Do note that the cutoff point can vary a lot between systems. It can depend on the kernel, on how the kernel's scheduler is configured, on the filesystem containing the external executables, on how much CPU vs memory pressure there is at the moment, and many other factors.

Don't call expr to perform arithmetic if you're at all concerned about performance. In fact, don't call expr to perform arithmetic at all. Shells have built-in arithmetic, which is clearer and faster than invoking expr.

You seem to be using bash, since you're using bash constructs that don't exist in sh. So why on earth would you not use an array? An array is the most natural solution, and it's likely to be the fastest, too. Note that array indices start at 0.

list=(1 2 3 5 9 8 6 90 84 9 3 2 15 75 55)
for ((count = 0; count < ${#list[@]}; count += 3)); do
  echo "${list[$count]}"
done

Your script may well be faster if you use sh, if your system has dash or ksh as sh rather than bash. If you use sh, you don't get named arrays, but you still get one array, the positional parameters, which you can set with set. To access an element at a position that is not known until runtime, you need to use eval (take care to quote things properly!).

# List elements must not contain whitespace or ?*\[
list='1 2 3 5 9 8 6 90 84 9 3 2 15 75 55'
set $list
count=1
while [ $count -le $# ]; do
  eval "value=\${$count}"
  echo "$value"
  count=$((count+3))
done

If you only ever want to access the array once and are going from left to right (skipping some values), you can use shift instead of variable indices.

# List elements must not contain whitespace or ?*\[
list='1 2 3 5 9 8 6 90 84 9 3 2 15 75 55'
set $list
while [ $# -ge 1 ]; do
  echo "$1"
  shift && shift && shift
done

Which approach is faster depends on the shell and on the number of elements.
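
If you'd rather not overwrite the script's own positional parameters, one option is to put the shift loop inside a function, since a function gets its own set of positional parameters for the duration of the call. This is only a sketch: the function name every_third is made up, and the guarded shift is an addition so that a list whose length isn't a multiple of 3 doesn't make shift fail.

```shell
# Prints every third argument, starting with the first.
every_third() {
  while [ "$#" -ge 1 ]; do
    echo "$1"
    # Shift by 3, or by whatever is left when fewer than 3 arguments remain.
    if [ "$#" -ge 3 ]; then
      shift 3
    else
      shift "$#"
    fi
  done
}

# List elements must not contain whitespace or ?*\[
list='1 2 3 5 9 8 6 90 84 9 3 2 15 75 55'
every_third $list    # unquoted on purpose, so the shell splits the list into arguments
```

The caller's "$@" is untouched after the call.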

Another possibility is to use string processing. It has the advantage of not using the positional parameters, so you can use them for something else. It'll be slower for large amounts of data, but that's unlikely to make a noticeable difference for small amounts of data.

# List elements must be separated by a single space (not arbitrary whitespace)
list='1 2 3 5 9 8 6 90 84 9 3 2 15 75 55'
while [ -n "$list" ]; do
  echo "${list%% *}"
  case "$list" in *\ *\ *\ *) :;; *) break;; esac
  list="${list#* * * }"
done