Count the words in a text and display them

grep and coreutils: 42 (previously 44)

grep -io '[a-z0-9]*'|sort|uniq -c|sort -nr

Test:

printf "This is a text and a number: 31." |
grep -io '[a-z0-9]*'|sort|uniq -c|sort -nr

Results in:

  2 a
  1 This
  1 text
  1 number
  1 is
  1 and
  1 31

Update

  • Use case-insensitive option and shorter regex. Thanks Tomas.

Java 8: 289

That is pretty good, since Java is not a golf-friendly language.

import java.util.stream.*;class C{static void main(String[]a){Stream.of(a).flatMap(s->of(s.split("[\\W_]+"))).collect(Collectors.groupingBy(x->x,Collectors.counting())).entrySet().stream().sorted(x,y->x.getValue()-y.getValue()).forEach(e->System.out.println(e.getKey()+":"+e.getValue()));}

Ungolfed:

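import java.util.stream.*;
class C {
    // main must be public for the JVM to launch it
    public static void main(String[] args) {
        Stream.of(args)
            // split each argument on runs of non-alphanumeric characters
            .flatMap(arg -> Stream.of(arg.split("[\\W_]+")))
            // map each word to the number of times it occurs
            .collect(Collectors.groupingBy(word -> word, Collectors.counting()))
            .entrySet().stream()
            // the counts are Long values, so compare them instead of subtracting
            .sorted((x, y) -> Long.compare(x.getValue(), y.getValue()))
            .forEach(entry -> System.out.println(entry.getKey() + ":" + entry.getValue()));
    }
}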

Run from the command line:

java -jar wordCounter.jar This is a text and a number: 31.

APL (57)

⎕ML←3⋄G[⍒,1↓⍉G←⊃∪↓Z,⍪+⌿∘.≡⍨Z←I⊂⍨(I←⍞)∊⎕D,⎕A,⎕UCS 96+⍳26;]

e.g.

      ⎕ML←3⋄G[⍒,1↓⍉G←⊃∪↓Z,⍪+⌿∘.≡⍨Z←I⊂⍨(I←⍞)∊⎕D,⎕A,⎕UCS 96+⍳26;]
This is a text and a number: 31.
 a       2
 This    1
 is      1
 text    1
 and     1
 number  1
 31      1

Explanation:

  • ⎕D,⎕A,⎕UCS 96+⍳26: numbers, uppercase letters, lowercase letters
  • (I←⍞)∊: read input, store in I, see which ones are alphanumeric
  • Z←I⊂⍨: split I in groups of alphanumeric characters, store in Z
  • +⌿∘.≡⍨Z: for each element in Z, see how often it occurs
  • Z,⍪: match each element in Z pairwise with how many times it occurs
  • G←⊃∪↓: select only the unique pairs, store in G
  • ⍒,1↓⍉G: get the indices that sort the occurrence counts in descending order
  • G[...;]: reorder the lines of G by the given indices
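
For readers who do not read APL, the steps above can be sketched in Java. This is only an illustration of the same idea, not a translation of the golfed code: the class name WordCountSketch and the method countWords are made up for this example, and the character test assumes ASCII letters and digits, as in the APL character list.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class, named only for this illustration.
class WordCountSketch {
    // Returns one "word count" line per distinct word, most frequent first.
    static List<String> countWords(String input) {
        // Split the input into runs of ASCII letters and digits.
        List<String> words = new ArrayList<>();
        StringBuilder run = new StringBuilder();
        // The appended space flushes a run that ends at the end of the input.
        for (char c : (input + " ").toCharArray()) {
            boolean alnum = (c >= '0' && c <= '9')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= 'a' && c <= 'z');
            if (alnum) {
                run.append(c);
            } else if (run.length() > 0) {
                words.add(run.toString());
                run.setLength(0);
            }
        }

        // Count each word; LinkedHashMap keeps one entry per word in first-seen order.
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum);
        }

        // Sort by count, descending. List.sort is stable, so ties keep first-seen order.
        List<Map.Entry<String, Integer>> rows = new ArrayList<>(counts.entrySet());
        rows.sort((x, y) -> Integer.compare(y.getValue(), x.getValue()));

        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Integer> e : rows) {
            lines.add(e.getKey() + " " + e.getValue());
        }
        return lines;
    }

    public static void main(String[] args) {
        countWords("This is a text and a number: 31.").forEach(System.out::println);
    }
}
```

With the sample sentence this prints "a 2" first, followed by the remaining six words with a count of 1 in the order they first appear, which matches the order of the APL output above.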
