Morse code to standard output

Drat, I was hoping to get here before the GolfScripters arrived :-(

Anyhoo...

C: 228 characters:

char n,t,m[9],*c=" etianmsurwdkgohvf l pjbxcyzq  54 3   2& +    16=/   ( 7   8 90    $       ?_    \"  .    @   '  -        ;! )     ,    :";
main(){while(scanf("%s",m)>0){for(t=m[6]=n=0;m[n];n++)t+=t+1+(m[n]&1);putchar(c[t]);}}

I thought I'd add an explanation of how this works.

The input data is parsed according to the tree data in *c, which can be expanded as follows (using · to represent a vacant node):

                     dot <-- (start) --> dash
                e                               t
        i               a               n               m
    s       u       r       w       d       k       g       o
  h   v   f   ·   l   ·   p   j   b   x   c   y   z   q   ·   ·
 5 4 · 3 · · · 2 & · + · · · · 1 6 = / · · · ( · 7 · · · 8 · 9 0
····$·······?_····"··.····@···'··-········;!·)·····,····:·······

Starting at the top of the tree, work your way down while moving to the left for a dot and to the right for a dash. Then output whatever character you happen to be at when the input string ends (i.e., when a whitespace character is encountered). So for example, three dots and a dash will take you to v via e, i and s. Instead of explicitly checking for dots (ASCII \x2e) and dashes (ASCII \x2d), we only need to check the last bit (m[n]&1), which is 0 for . and 1 for -.

Six rows are enough to encode everything except $, which has 7 dots/dashes: ...-..-. Since the input data is guaranteed to be valid, this is easily fixed by truncating the input at 6 characters (m[6]=0) and interpreting ...-.. as $ instead. We can also cut away the last 7 bytes from the tree data, since they are all empty and aren't needed if the input is valid.
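For anyone who would rather not unpick the golfed C, here is a minimal ungolfed sketch of the same tree walk in Python. The names TREE, decode_symbol and decode are mine and do not appear in the original program; the table is the string c copied unchanged, and the slice to 6 characters stands in for m[6]=0.

```python
# The tree from the C string c, stored row by row (breadth-first), so the
# children of node t live at 2*t+1 (dot) and 2*t+2 (dash).
TREE = " etianmsurwdkgohvf l pjbxcyzq  54 3   2& +    16=/   ( 7   8 90    $       ?_    \"  .    @   '  -        ;! )     ,    :"


def decode_symbol(code):
    """Walk the tree for one Morse symbol such as '...-'."""
    t = 0
    for ch in code[:6]:            # truncate at 6 symbols, like m[6]=0
        # '.' is 0x2e (low bit 0), '-' is 0x2d (low bit 1)
        t = 2 * t + 1 + (ord(ch) & 1)
    return TREE[t]


def decode(message):
    """Decode whitespace-separated Morse symbols into a string."""
    return "".join(decode_symbol(tok) for tok in message.split())
```

As in the C version, the input is assumed to be valid Morse. An invalid code either lands on a blank slot or, since the last 7 slots were cut away, raises an IndexError here.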


GolfScript (97 chars; earlier versions were 116 and 113)

This includes non-printable characters used in a lookup table, so I'm giving it as xxd output:

0000000: 6e2d 2720 272f 7b60 7b5c 6261 7365 2035
0000010: 3925 2210 a9cd 238d 57aa 8c17 d25c d31b
0000020: 432d 783e 277a 3823 e146 e833 6423 23ac
0000030: e72a 39d5 021c 4e33 3b30 3138 dc51 2044
0000040: 3aa7 d001 df4b 2032 333f 36ae 51c3 223d
0000050: 7d2b 5b35 2d34 5d2f 2b32 3333 257d 256e
0000060: 2b

This decodes to a program equivalent to

n-' '/{`{\base 59%"\x10\xA9\xCD#\x8DW\xAA\x8C\x17\xD2\\\xD3\eC-x>'z8#\xE1F\xE83d##\xAC\xE7*9\xD5\x02\x1CN3;018\xDCQ D:\xA7\xD0\x01\xDFK 23?6\xAEQ\xC3"=}+[5-4]/+233%}%n+

which is essentially

n-' '/{`{\base 59%"MAGIC STRING"=}+[5-4]/+233%}%n+

This uses a (non-minimal) perfect hash based on the core idea of "An optimal algorithm for generating minimal perfect hash functions" (Czech, Havas and Majewski, 1992). Their basic idea is that you use two hash functions, f1 and f2, together with a lookup table g, and the perfect hash is (g[f1(str)] + g[f2(str)]) % m (where m is the number of strings we wish to distinguish); the clever bit is the way they build g. Consider all of the values f1(str) and f2(str) for strings str of interest as nodes in an undirected graph, and add an edge between f1(str) and f2(str) for each string. They require not only that each edge be distinct, but that the graph be acyclic; then it is just a DFS to assign weights to the nodes (i.e. to populate the lookup table g) such that the weights at the two ends of each edge add up, mod m, to the value required for that string.
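To make the DFS step concrete, here is a rough Python sketch of how g could be filled in for an acyclic graph. It is an illustration of the idea as described above, not code from the paper or from my search program; build_g and its parameters are hypothetical names, and the edges are passed in precomputed rather than derived from real hash functions.

```python
def build_g(edges, targets, num_nodes, m):
    """edges[i] = (f1(str_i), f2(str_i)); targets[i] = wanted hash of str_i.

    Returns a table g with (g[u] + g[v]) % m == targets[i] for every edge,
    or None if the graph contains a cycle (including a self-loop or a
    repeated edge).
    """
    adj = [[] for _ in range(num_nodes)]
    for i, (u, v) in enumerate(edges):
        adj[u].append((v, i))
        adj[v].append((u, i))

    g = [None] * num_nodes
    for root in range(num_nodes):
        if g[root] is not None:
            continue
        g[root] = 0                      # free choice, one per component
        stack = [(root, -1)]             # (node, edge we arrived by)
        while stack:
            u, via = stack.pop()
            for v, i in adj[u]:
                if i == via:
                    continue             # don't walk straight back
                if g[v] is not None:
                    return None          # second route to v: a cycle
                g[v] = (targets[i] - g[u]) % m
                stack.append((v, i))
    return g
```

For example, build_g([(0, 1), (1, 2), (3, 4)], [0, 1, 2], 5, 3) labels two small trees, while adding an edge (2, 0) closes a cycle and makes it return None.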

Czech et al. generate random functions f1 and f2 which are expressed via lookup tables, but that's clearly no good in a golfed program, so I searched for a suitable hash using simple base conversions with two distinct bases from -10 to 9. I also relaxed the acyclic requirement. I didn't want to assign the strings to values from 0 to 54, but to the corresponding ASCII codes, so rather than taking (g[f1(str)] + g[f2(str)]) % m I wanted (g[f1(str)] + g[f2(str)]) % N for some N > 'z'. That leaves the freedom to try various N and see whether any of them allows a valid lookup table g, regardless of whether there are cycles. Unlike Czech et al. I don't care if the search for the perfect hash function is O(n^4).
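As a rough illustration of what "simple base conversions" means here, the Python sketch below mimics the two hashes used in the program above (bases 5 and -4, each reduced mod 59). It assumes that GolfScript's base treats the string as a list of ASCII codes, most significant digit first, and that % yields a non-negative result for a positive modulus; from_base and edge are names I've made up for the sketch.

```python
def from_base(code, base):
    """Read the ASCII codes of `code` as digits in `base` (assumed semantics)."""
    n = 0
    for ch in code:
        n = n * base + ord(ch)
    return n


def edge(code, bases=(-4, 5), buckets=59):
    """The pair of graph nodes (f1, f2) that a Morse string hashes to."""
    # Python's % is non-negative for a positive modulus, which is the
    # behaviour assumed of GolfScript here.
    return tuple(from_base(code, b) % buckets for b in bases)
```

Under these assumptions a one-symbol code such as "." hashes to the same node twice, since a single digit has the same value in every base, so the graph cannot be acyclic with this family of hashes.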

The graph generated by -4base and 5base mod 59 is:

[Figure: graph rendered by dot with some minor tweaks]

which is fairly nice apart from the biggest connected component, which has three cycles of length 1 (self-loops). We have to go up to N=233 before we can find a g which is consistent.
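Once cycles are allowed, a consistent g may or may not exist for a given N. One brute-force way to test that is sketched below in Python. This is not necessarily how the actual search was done: the function names are hypothetical, and the strategy (try every starting value for one node per component, propagate along the edges, reject on any contradiction) is just the simplest thing that works.

```python
def propagate(adj, targets, root, x, N):
    """Fix g[root] = x and push values along edges; None on contradiction."""
    trial = {root: x}
    stack = [root]
    while stack:
        u = stack.pop()
        for v, i in adj[u]:
            want = (targets[i] - trial[u]) % N
            if v not in trial:
                trial[v] = want
                stack.append(v)
            elif trial[v] != want:       # a cycle that doesn't close up
                return None
    return trial


def solve_g(edges, targets, num_nodes, N):
    """Return g with (g[u] + g[v]) % N == targets[i] for all edges, or None."""
    adj = [[] for _ in range(num_nodes)]
    for i, (u, v) in enumerate(edges):
        adj[u].append((v, i))
        adj[v].append((u, i))
    g = [None] * num_nodes
    for root in range(num_nodes):
        if g[root] is not None:
            continue
        for x in range(N):               # brute force over the free value
            trial = propagate(adj, targets, root, x, N)
            if trial is not None:
                break
        else:
            return None                  # no starting value works for this N
        for node, value in trial.items():
            g[node] = value
    return g


def smallest_modulus(edges, targets, num_nodes, start, limit):
    """First N in [start, limit) admitting a consistent g, with that g."""
    for N in range(start, limit):
        g = solve_g(edges, targets, num_nodes, N)
        if g is not None:
            return N, g
    return None
```

A toy case shows why N matters: with a self-loop on node 0 that must produce 'e' (101) and an edge 0-1 that must produce 't' (116), the self-loop needs 2*g[0] to equal 101 mod N, which is impossible for even N, so N=124 fails while N=123 works.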


Mathematica 62

Mathematica allows us to cheat

f=ToLowerCase@StringDrop[WolframAlpha[". .- "<>#,"Result"],2]&

f@"."
f@". -..- .- -- .--. .-.. . .-.-.-"
f@".... .- ...- .  -.-- --- ..-  -- --- --- . -..  - --- -.. .- -.-- ..--.."

e

example.

have you mooed today?

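The two leading symbols . and .- are prepended to every query because they are necessary for small codes to be interpreted correctly; StringDrop[..., 2] then removes the two characters they contribute to the result.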