My daughter's alphabet

GolfScript, 28 / 34 chars

n/:a{|}*{a{.[2$]--}%*$-1=}%$

The 28-character program above assumes that all the input letters are in the same case. If this is not necessarily so, we can force them into upper case by prepending {95&}% to the code, for a total of 34 chars:

{95&}%n/:a{|}*{a{.[2$]--}%*$-1=}%$

Notes:

  • For correct operation, the input must include at least one newline. This will be true for normal text files with newlines at the end of each line, but might not be true if the input consists of just one line with no trailing newline. This could be fixed at the cost of two extra chars, by prepending n+ to the code.

  • The uppercasing used in the 34-character version is really crude — it maps lowercase ASCII letters to their uppercase equivalents (and spaces to NULs), but makes a complete mess of numbers and most punctuation. I'm assuming that the input will not include any such characters.

  • The 28-character version treats all input characters (except newlines and NULs) equally. In particular, if the input contains any spaces, some will also appear in the output; conveniently, they will sort before any other printable ASCII characters. The 34-character version, however, does ignore spaces (because it turns out I can do that without it costing me any extra chars).
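
To make the side effects of the crude uppercasing concrete, here is a minimal JavaScript sketch of the same "& 95" mask. The function name crudeUpper is made up for illustration and is not part of the golfed answer.

```javascript
// Illustrative only: apply the same "& 95" bit mask to every character code.
// crudeUpper is a hypothetical name, not part of the GolfScript program.
function crudeUpper(text) {
  return Array.from(text, ch => String.fromCharCode(ch.charCodeAt(0) & 95)).join('');
}

console.log(JSON.stringify(crudeUpper('love'))); // "LOVE"
console.log(JSON.stringify(crudeUpper('a b')));  // "A\u0000B" (the space became a NUL)
console.log(JSON.stringify(crudeUpper('\n')));   // "\n" (newlines survive)
console.log(JSON.stringify(crudeUpper('7!')));   // digits and punctuation turn into control characters
```

Since bit 32 is already clear in uppercase letters and in the newline character, those pass through unchanged; everything else in the range 32 to 63 is mapped down into the control-character range.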

Explanation:

  • The optional {95&}% prefix uppercases the input by zeroing out the sixth bit of the ASCII code of each input byte (95 = 64 + 31 = 1011111 in binary). This maps lowercase ASCII letters to uppercase, spaces to null bytes, and leaves newlines unchanged.

  • n/ splits the input at newlines, and :a assigns the resulting array into the variable a. Then {|}* computes the set union of the strings in the array, which (assuming that the array has at least two elements) yields a string containing all the unique (non-newline) characters in the input.

  • The following { }% loop then iterates over each of these unique characters. Inside the loop body, the inner loop a{.[2$]--}% iterates over the strings in the array a, removing from each string all characters not equal to the one the outer loop is iterating over.

    The inner loop leaves the ASCII code of the current character on the stack, below the filtered array. We make use of this by repeating the filtered array as many times as indicated by the ASCII code (*) before sorting it ($) and taking the last element (-1=). In effect, this yields the longest string in the filtered array (as they all consist of repeats of the same character, lexicographic sorting just sorts them by length), except if the character has ASCII code zero, in which case it yields nothing.

  • Finally, the $ at the end just sorts the output alphabetically.
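
For readers who do not speak GolfScript, the steps above can be sketched roughly in JavaScript. This is an approximation under stated assumptions rather than a faithful translation: the name alphabetByLongestRun is hypothetical, a Set stands in for the {|}* union, and the trick of repeating the array by the ASCII code (which drops NUL characters) is left out.

```javascript
// Hypothetical helper that mirrors the explanation step by step.
function alphabetByLongestRun(input) {
  const lines = input.split('\n');                 // n/:a  split into lines
  const unique = [...new Set(lines.join(''))];     // {|}*  unique non-newline characters
  return unique
    .map(ch => lines
      // a{.[2$]--}%  keep only the characters equal to ch in each line
      .map(line => [...line].filter(c => c === ch).join(''))
      .sort()                                      // $    runs of one character sort by length
      .pop())                                      // -1=  take the longest run
    .sort()                                        // final $
    .join('');
}

console.log(alphabetByLongestRun('HELLO\nLOVE'));  // EHLLOV
```

As in the 28-character version, spaces are treated like any other character here, so input containing spaces produces leading spaces in the result.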


J - 37 chars

Reads from stdin, outputs to console.

dlb#&a.>./+/"2=/&a.tolower;._2[1!:1]3

1!:1]3 is the call to stdin. tolower;._2 performs double duty by splitting up the lines and making them lowercase simultaneously. Then we count how many times a character occurs in each row with +/"2=/&a., and take the pointwise maximum over all lines with >./.

Finally, we use #&a. to pull that many copies of each character out of the alphabet. The result includes spaces, which all sit at the front because of their low ASCII value, so we just delete the leading blanks with dlb.
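
The same count-table idea can be sketched in JavaScript. This is only an approximation of the J pipeline: the name alphabetByCountTable is hypothetical, a 256-slot array stands in for the alphabet a., and the mapping of each step to a J fragment in the comments is a reading of the explanation above, not a precise translation.

```javascript
// Hypothetical helper: per-line character counts, pointwise maximum, replicate, trim.
function alphabetByCountTable(input) {
  const lines = input.toLowerCase().split('\n');   // roughly tolower;._2
  const maxCounts = new Array(256).fill(0);        // one slot per byte value, like a.
  for (const line of lines) {
    const counts = new Array(256).fill(0);         // occurrences of each character in this line
    for (const ch of line) {
      const code = ch.charCodeAt(0);
      if (code < 256) counts[code] += 1;           // characters outside a. are ignored
    }
    counts.forEach((n, code) => {                  // >./  pointwise maximum over lines
      if (n > maxCounts[code]) maxCounts[code] = n;
    });
  }
  const replicated = maxCounts
    .map((n, code) => String.fromCharCode(code).repeat(n)) // #&a.  copy each character n times
    .join('');
  return replicated.replace(/^ +/, '');            // dlb  delete leading blanks
}

console.log(alphabetByCountTable('HELLO\nI LOVE CAT\nI LOVE DOG\nI LOVE MOMMY\nMOMMY LOVE DADDY\n'));
// acdddeghillmmmootvyy
```

Because the characters come out in alphabet order, no explicit sort is needed, which is the main difference from the GolfScript approach.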


JavaScript (ECMAScript 6) - 135 Characters (previously 148, then 139)

Version 2:

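Updated to use an array comprehension (draft ES6 syntax that was dropped from the final standard, so it only runs in engines that implemented the draft):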

[a[i][0]for(i in a=[].concat(...s.split('\n').map(x=>x.split(/ */).sort().map((x,i,a)=>x+(a[i-1]==x?++j:j=0)))).sort())if(a[i-1]<a[i])]

Version 1:

[].concat(...s.split('\n').map(x=>x.split(/ */).sort().map((x,i,a)=>x+(a[i-1]==x?++j:j=0)))).sort().filter((x,i,a)=>a[i-1]!=x).map(x=>x[0])

Assumes that:

  • The input string is in the variable s;
  • We can ignore the case of the input (as specified by the question - i.e. it is all in either upper or lower case);
  • The output is an array of characters (which is about as close as JavaScript can get to the OP's requirement of a list of characters); and
  • The output is to be displayed on the console.

With comments:

var l = s.split('\n')             // split the input up into sentences
         .map(x=>x.split(/ */)   // split each sentence up into letters, dropping any
                                  // spaces
                  .sort()         // sort the letters in each sentence alphabetically
                  .map((x,i,a)=>x+(a[i-1]==x?++j:j=0)))
                                  // append the frequency of previously occurring identical
                                  // letters in the same sentence to each letter.
                                  // I.e. "HELLO WORLD" =>
                                  // ["D0","E0","H0","L0","L1","L2","O0","O1","R0","W0"]
[].concat(...l)                   // Flatten the array of arrays of letters+frequencies
                                  // into a single array.
  .sort()                         // Sort all the letters and appended frequencies
                                  // alphabetically.
  .filter((x,i,a)=>a[i-1]!=x)     // Remove adjacent duplicates; the array stays sorted.
  .map(x=>x[0])                   // Get the first letter of each entry (removing the
                                  // frequencies) and return the array.

If you want to:

  • Return it as a string then add .join('') on the end;
  • Take input from a user then replace the s variable with prompt(); or
  • Write it as a function f then add f=s=> to the beginning.
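
As a sketch of two of these options combined (the function form with .join('')), the following ungolfed variant of Version 1 should run in any standard ES6 engine. The one deliberate change is an assumption about intent: the counter j is declared locally, whereas the golfed code relies on it being created as an implicit global, which would throw in strict mode.

```javascript
// Version 1 wrapped as a function f that returns a string.
const f = s => {
  let j = 0; // declared here; the golfed version leaves j as an implicit global
  return [].concat(...s.split('\n').map(x => x.split(/ */).sort()
      .map((x, i, a) => x + (a[i - 1] == x ? ++j : j = 0))))
    .sort()
    .filter((x, i, a) => a[i - 1] != x)
    .map(x => x[0])
    .join('');
};

console.log(f("HELLO\nI LOVE CAT\nI LOVE DOG\nI LOVE MOMMY\nMOMMY LOVE DADDY"));
// ACDDDEGHILLMMMOOTVYY
```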

Running:

s="HELLO\nI LOVE CAT\nI LOVE DOG\nI LOVE MOMMY\nMOMMY LOVE DADDY";
[].concat(...s.split('\n').map(x=>x.split(/ */).sort().map((x,i,a)=>x+(a[i-1]==x?++j:j=0)))).sort().filter((x,i,a)=>a[i-1]!=x).map(x=>x[0])

Gives the output:

["A","C","D","D","D","E","G","H","I","L","L","M","M","M","O","O","T","V","Y","Y"]
