Why doesn't the command "ls | file" work?

The fundamental issue is that file expects file names as command-line arguments, not on its stdin. When you write ls | file, the output of ls is passed to file as its input, not as its arguments.

What's the difference?

  • Command-line arguments are the flags and file names you write after a command, as in cmd arg1 arg2 arg3. In shell scripts these arguments are available as the positional parameters $1, $2, $3, etc. In C you access them through the char **argv and int argc parameters of main().

  • Standard input, stdin, is a stream of data. Some programs, like cat or wc, read from stdin when they are not given any command-line arguments. In a shell script you can use read to get a single line of input; in C you can use scanf() or getchar(), among other options. The small demonstration after this list shows the difference.
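
To make the distinction concrete, here is a minimal sketch. The script name demo.sh is only illustrative, not an existing tool:

#!/bin/sh
# "$1" is the first command-line argument; read pulls one line from stdin
echo "first argument: $1"
read -r line
echo "first line of stdin: $line"

$ echo "piped text" | sh demo.sh some-argument
first argument: some-argument
first line of stdin: piped text

The argument travels on the command line; the piped text only shows up because the script explicitly reads its stdin.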

file does not normally read from stdin. It expects at least one file name to be passed as an argument, and that is why it prints its usage message when you write ls | file: you didn't pass it any arguments.

You could use xargs to convert stdin into arguments, as in ls | xargs file. Still, as terdon mentions, parsing ls is a bad idea. The most direct way to do this is simply:

file *
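
For instance, with two plain-text files in the current directory, both forms report the same thing (the file names and detected types here are only illustrative):

$ file *
file1: ASCII text
file2: ASCII text

$ ls | xargs file
file1: ASCII text
file2: ASCII text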

Because, as you say, the input of file has to be filenames. The output of ls, however, is just text. That it happens to be a list of file names doesn't change the fact that it is simply text and not the location of files on the hard drive.

When you see output printed on the screen, what you see is text. Whether that text is a poem or a list of filenames makes no difference to the computer. All it knows is that it is text. This is why you can pass the output of ls to programs that take text as input (although you really, really shouldn't):

$ ls / | grep etc
etc

So, to use the output of a command that lists file names as text (such as ls or find) as input for a command that takes filenames, you need to use some tricks. The typical tool for this is xargs:

$ ls
file1 file2

$ ls | xargs wc
 9  9 38 file1
 5  5 20 file2
14 14 58 total

As I said before, though, you really don't want to be parsing the output of ls. Something like find is better (-print0 prints a \0 instead of a newline after each file name, and the -0 of xargs lets it deal with such input; this is a trick to make your commands work with file names containing newlines):

$ find . -type f -print0 | xargs -0 wc
 9  9 38 ./file1
 5  5 20 ./file2
14 14 58 total
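
To see why the plain ls | xargs form is fragile, imagine the same directory also contained a file named 'my file', with a space in its name. By default xargs splits its input on any whitespace, so wc would be handed the non-existent names my and file, and the run would fail roughly like this, whereas the -print0/-0 pair above passes the name through intact:

$ ls | xargs wc
 9  9 38 file1
 5  5 20 file2
wc: my: No such file or directory
wc: file: No such file or directory
14 14 58 total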

find also has its own way of doing this, without needing xargs at all:

$ find . -type f -exec wc {} +
 9  9 38 ./file1
 5  5 20 ./file2
14 14 58 total

Finally, you can also use a shell loop. Note, however, that in most cases xargs will be much faster and more efficient, since it runs one wc per batch of file names rather than one per file. For example:

$ for file in *; do wc "$file"; done
 9  9 38 file1
 5  5 20 file2

"I learned that '|' (pipeline) is meant to redirect the output from a command to the input of another one."

It doesn't "redirect" the output; it takes the output of one program and uses it as the input of another. file, however, doesn't take its input from stdin: it takes file names as arguments and then tests the files they name. Neither redirection nor piping passes those file names as arguments, and piping is what you are doing here.
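
A redirection runs into the same problem (list.txt is only an example name):

$ ls > list.txt
$ file < list.txt

Even though file's stdin is now connected to list.txt, it still received no filename argument, so it only prints its usage message.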

What you can do is read the file names from a file with the --files-from option, if you have a file that lists all the files you want to test; otherwise just pass the paths to your files as arguments.
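
A minimal sketch of the --files-from route, assuming a hypothetical names.txt with one path per line (the detected types shown are only illustrative):

$ printf '%s\n' file1 file2 > names.txt
$ file --files-from names.txt
file1: ASCII text
file2: ASCII text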