Morse Decode Golf

Haskell, 296 characters

  • Dictionary file: must be a text file named "d"
  • Input: stdin, may have a trailing newline but no internal whitespace
main=do f<-readFile"d";getLine>>=mapM(putStrLn.unwords).(words f&)
i!""=[i]
(i:j)!(p:q)|i==p=j!q
_!_=[]
_&""=[[]]
d&i=do
w<-d
j<-i!(w>>=((replicate 97"X"++words".- -... -.-. -.. . ..-. --. .... .. .--- -.- .-.. -- -. --- .--. --.- .-. ... - ..- ...- .-- -..- -.-- --..")!!).fromEnum)
n<-d&j
[w:n]

Explanation of elements:

  • main reads the dictionary, reads stdin, executes &, and formats the output of & with appropriate whitespace.
  • (replicate 97"X"++words".- -... -.-. -.. . ..-. --. .... .. .--- -.- .-.. -- -. --- .--. --.- .-. ... - ..- ...- .-- -..- -.-- --..")!!) (an expression inside the definition of &) is a lookup into a list whose indices are character codes (97 is the code of 'a', so the first 97 entries are the filler "X") and whose values are Morse sequences.
  • ! (a function named as an infix operator) matches a string against a prefix; if the prefix is present it returns the remainder in a one-element list (success in the list monad), otherwise the empty list (failure in the list monad)
  • & uses the list monad for “nondeterministic” execution; it

    1. picks an entry of d (a dictionary word),
    2. uses ! to match the Morse form of that word (w>>=((…)!!).fromEnum, which is equivalent to concatMap (((…)!!) . fromEnum) w) against the input string i,
    3. calls itself (d&j) to match the rest of the string, and
    4. returns the possible result as a list of words w:n, in the list monad [w:n] (which is the shorter, concrete equivalent to return (w:n)).

    Note that every line after line 6 of the code (d&i=do) is part of the do expression started on that line; this takes exactly the same number of characters as using semicolons on a single line, but is more readable, though you can only do it once in a program.
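For readers less familiar with Haskell, the steps above can be sketched in ungolfed Python 3. This is only an illustration of the same search, not part of the golfed answer: the names strip_prefix and decodings are made up for this sketch, the word list is a toy stand-in for the dictionary file, and any character outside a-z is mapped to "X" to mirror the filler entries of the Haskell table.

```python
# Morse codes for 'a'..'z', the same table the Haskell program embeds.
CODES = (".- -... -.-. -.. . ..-. --. .... .. .--- -.- .-.. -- -. "
         "--- .--. --.- .-. ... - ..- ...- .-- -..- -.-- --..").split()
MORSE = dict(zip("abcdefghijklmnopqrstuvwxyz", CODES))

def strip_prefix(text, prefix):
    """Counterpart of `!`: a one-element list with the remainder, or []."""
    return [text[len(prefix):]] if text.startswith(prefix) else []

def decodings(words, text):
    """Counterpart of `&`: yield every word list whose Morse form is `text`."""
    if text == "":
        yield []                      # _&"" = [[]]
        return
    for w in words:                   # w <- d
        # Characters outside a-z become "X", which never occurs in the input.
        encoded = "".join(MORSE.get(c, "X") for c in w)
        for rest in strip_prefix(text, encoded):   # j <- i ! (Morse form of w)
            for tail in decodings(words, rest):    # n <- d & j
                yield [w] + tail                   # [w:n]

# Toy dictionary (hypothetical); "a" and "et" share the Morse form ".-".
words = ["a", "et", "code", "golf"]
print([" ".join(ws) for ws in decodings(words, ".-")])
```

The nested for loops play the role of the list monad: each loop either offers several alternatives or none, and a result is produced only when all of them succeed.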

This program is extremely slow. It can easily be made faster (and slightly longer) by storing the morsified words next to the originals in a list rather than recomputing them at each pattern match. The next step would be to store the words in a binary tree keyed by Morse symbols (a 2-ary trie) so as to avoid trying unnecessary branches.
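A minimal Python 3 sketch of the trie idea follows. It is an assumption about how such a structure could look, not code from the answer: build_trie and decodings are hypothetical names, the trie is represented as nested dicts keyed by "." and "-", and words containing characters outside a-z are simply skipped.

```python
CODES = (".- -... -.-. -.. . ..-. --. .... .. .--- -.- .-.. -- -. "
         "--- .--. --.- .-. ... - ..- ...- .-- -..- -.-- --..").split()
MORSE = dict(zip("abcdefghijklmnopqrstuvwxyz", CODES))

def build_trie(words):
    """Nested dicts keyed by '.' and '-'; key 'words' lists words ending at a node."""
    root = {}
    for w in words:
        if not all(c in MORSE for c in w):
            continue                  # assumption: ignore words outside a-z
        node = root
        for symbol in "".join(MORSE[c] for c in w):
            node = node.setdefault(symbol, {})
        node.setdefault("words", []).append(w)
    return root

def decodings(root, text, start=0):
    """Yield every word list whose concatenated Morse form equals text[start:]."""
    if start == len(text):
        yield []
        return
    node = root
    for pos in range(start, len(text)):
        node = node.get(text[pos])
        if node is None:
            return                    # no dictionary word continues this way
        for w in node.get("words", ()):
            for tail in decodings(root, text, pos + 1):
                yield [w] + tail

root = build_trie(["a", "et", "e", "t"])
for ws in decodings(root, ".-"):
    print(" ".join(ws))
```

Each input symbol is followed once per starting position, so words that share a Morse prefix are matched together and a branch is abandoned as soon as no word can continue it.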

It could be made slightly shorter if the dictionary file did not contain unused symbols such as "-", allowing removal of replicate 97"X"++ in favor of doing .(-97+) before the !!.


Ruby, 210

(1..(g=gets).size).map{|x|puts IO.read(?d).split.repeated_permutation(x).select{|p|p.join.gsub(/./,Hash[(?a..?z).zip"(;=/%513':07*)29@-+&,4.<>?".bytes.map{|b|('%b'%(b-35))[1,7].tr'01','.-'}])==g}.map{|r|r*' '}}

If there exists such a practice as "over-golfing", I suspect I have partaken this time around. This solution generates an array of arrays of repeated permutations of all of the dictionary words, from length 1 up to the length of the input. Given that "a" is the shortest word in the dictionary file and its code is two characters long, it would have been sufficient to generate permutations of length up to half the size of the input, but adding /2 is tantamount to verbosity in this domain, so I refrained.

Once the permutation array has been generated (NB: it is of length 45404104 in the case of the pangrammatic example input), each permutation array is concatenated, and its alphabetic characters are replaced with their Morse code equivalents via the rather convenient (Regexp, Hash) variant of the #gsub method; we've found a valid decoding if this string is equal to the input.
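The packed string in the Hash literal can be puzzling at first sight, so here is a hedged Python 3 sketch of what the Ruby expression ('%b'%(b-35))[1,7].tr'01','.-' computes for each byte. The names PACKED, unpack and TABLE are made up for this illustration.

```python
# The 26-character string from the Ruby answer, one character per letter a..z.
PACKED = "(;=/%513':07*)29@-+&,4.<>?"

def unpack(ch):
    """Decode one packed character into a Morse sequence."""
    bits = format(ord(ch) - 35, "b")   # Ruby: '%b' % (b - 35)
    # Drop the leading 1 (it only marks the length), then map 0 -> '.', 1 -> '-'.
    return bits[1:].translate(str.maketrans("01", ".-"))

TABLE = {letter: unpack(ch)
         for letter, ch in zip("abcdefghijklmnopqrstuvwxyz", PACKED)}

print(TABLE["a"], TABLE["s"], TABLE["z"])
```

For example, "(" has byte value 40, and 40 - 35 = 5 is 101 in binary; dropping the leading 1 leaves 01, which becomes ".-", the code for "a". The offset of 35 keeps every encoded value inside the printable ASCII range.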

The dictionary is read (several times) from a file named "d", and the input must not contain a newline.

Example run (with a dictionary that'll give the program a fighting chance at ending before the heat death of the universe):

$ cat d
puzzles
and
code
dummy
golf
programming
$ echo -n .--..-.-----..-..-----..-.--..--...---..--...-.......--.-..-.-.----...--.---.-....-. | ruby morse.rb
programming puzzles and code golf
^C

Python - 345 (previously 363)

Code:

D,P='-.';U,N='-.-','.-.'
def s(b,i):
 if i=='':print b
 for w in open('d').read().split():
  C=''.join([dict(zip('abcdefghijklmnopqrstuvwxyz-\'23',[P+D,D+3*P,U+P,'-..',P,D+N,'--.',4*P,2*P,P+3*D,U,N+P,2*D,D+P,D*3,'.--.',D+U,N,P*3,D,'..-',3*P+D,'.--','-..-',U+D,'--..']+['']*4))[c]for c in w]);L=len(C)
  if i[:L]==C:s(b+' '+w,i[L:])
s('',input())

Explanation:

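The dictionary must be stored as a plain text file named "d". The code is Python 2 (note the print statement), where input() evaluates what it reads, so the Morse string has to be entered as a quoted string literal.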

D, P, U and N are just some helper variables for a shorter definition of the morse lookup table.

s(b,i) is a recursive function that takes the previously translated message part b and prints each valid translation of the remaining code part i: if i is empty, we have reached the end of the code and b contains the whole translation, so we simply print it. Otherwise we check each word w in the dictionary file d, translate it into Morse code C and, if the remaining code i starts with C, append the word w to the translated beginning b and call s recursively on the remainder.

Note on efficiency:

This is a pretty slow but golfed version. Loading the dictionary and constructing the Morse lookup table (dict(zip(...))) in each iteration (to avoid more variables) is especially costly. It would also be more efficient to translate all words in the dictionary file once in advance rather than on demand in each recursion. These ideas lead to the following version, which is 40 characters longer but significantly faster:

d=open('d').read().split()
D,P='-.';U,N='-.-','.-.'
M=dict(zip('abcdefghijklmnopqrstuvwxyz-\'23',[P+D,D+3*P,U+P,'-..',P,D+N,'--.',4*P,2*P,P+3*D,U,N+P,2*D,D+P,D*3,'.--.',D+U,N,P*3,D,'..-',3*P+D,'.--','-..-',U+D,'--..']+['']*4))
T=[''.join([M[c]for c in w])for w in d]
def s(b,i):
 if i=='':print b
 for j,w in enumerate(d):
  C=T[j];L=len(C)
  if i[:L]==C:s(b+' '+w,i[L:])
s('',input())