Golfing ASCII-art

Ruby, 107

I thought I'd actually try to "generate" the image in code (instead of using some existing compression function):

S=?\s*351+?^*60
"⠀鰇렜렟쐺⠾롖鱠롢⡷뢋鲝⢰룁⣩볲룸⤢봪봬뤯鴰⥛".chars{|q|S[511&r=q.ord]=(r>>10).chr}
puts S

There are some non-printable characters in that array literal.

Here is the hex view of the file, to show the non-printable characters, too:

0000000: 533d 3f5c 732a 3335 312b 3f5e 2a36 300a  S=?\s*351+?^*60.
0000010: 22e2 a080 e9b0 87f0 9780 88eb a09c eba0  "...............
0000020: 9ff0 9f80 b8ef a0b9 ec90 baee 80bb efa0  ................
0000030: bcef a0bd e2a0 bef0 9781 87eb a196 e9b1  ................
0000040: a0eb a1a2 f09f 81b6 e2a1 b7f0 93b1 bfef  ................
0000050: a280 efa2 81eb a28b e9b2 9df0 9bb2 9ef0  ................
0000060: 9f82 afe2 a2b0 f097 82b9 eba3 81f0 9f83  ................
0000070: a8e2 a3a9 ebb3 b2f0 9783 b3eb a3b8 f09f  ................
0000080: 84a1 e2a4 a2eb b4aa ebb4 aceb a4af e9b4  ................
0000090: b0f0 9f85 9ae2 a59b f09a a59d f099 b59e  ................
00000a0: f09c b59f f098 85a7 222e 6368 6172 737b  ........".chars{
00000b0: 7c71 7c53 5b35 3131 2672 3d71 2e6f 7264  |q|S[511&r=q.ord
00000c0: 5d3d 2872 3e3e 3130 292e 6368 727d 0a70  ]=(r>>10).chr}.p
00000d0: 7574 7320 53                             uts S

Thanks to Ventero for some major improvements! (He basically reduced the code by 50%.)


CJam, 62 characters

"Ⴀ지尦렒>Ä΀ྀ㸀⡅쇋蒧ʸ鿀ʃ케袧Ƽ蟀ʄ導뤷쀂萯Ű⋥ἀ਎밊耧台 ⢙⿶ꝍ㕟劢햟騤꩏脽啎"2G#b128b:c~

Try it online.

Test run

$ base64 -d > golf.cjam <<< IgHhgqDsp4DlsKbroJLujJ8+w4TOgOC+gOO4gOKhheyHi+iSp8q46b+AyoPsvIDvoIPuhKvooqfGvOifgMqE5bCO66S37ICC6JCvxbDii6XhvIDgqI7rsIrvgYvogKflj7DCoOKimeK/tuqdjeOVn+WKou2Wn+mopO+em+qpj+iEve6arOWVjiIyRyNiMTI4Yjpjfg==
$ wc -m golf.cjam
62 golf.cjam
$ cjam golf.cjam

      '\                   .  .                        |>18>>
        \              .         ' .                   |
       O>>         .                 'o                |
        \       .                                      |
        /\    .                                        |
       / /  .'                                         |
 jgs^^^^^^^`^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
$

How it works

2G#b converts the preceding string into an integer by considering it a base-65536 number.

128b:c converts that integer back to a string (110 bytes) by considering it a base-128 number, which ~ then executes:

"
^F'^@\^S.^B.^X|^@>^@1^@8^@>^@>^@
^H\^N.^I'^A.^S|^@
^GO^@>^@>^I.^Q'^@o^P|^@
^H\^G.&|^@
^H/^@\^D.(|^@
^G/^A/^B.^@')|^@
^A"2/{)iS*}%"jgs"'^7*'`'^51*N

(caret notation)

2/{)iS*}%

splits the string into pairs of two characters and does the following for each pair: Pop the second character of the string, convert it into an integer and repeat the string " " that many times.

For example, ".(" becomes ". ", because the ASCII character code of ( is 40.

Finally,

"jgs"'^7*'`'^51*N

pushes the string "jgs", the character ^ repeated 7 times, the character `, the character ^ repeated 51 times and a linefeed.


bash + iconv + DosBox/x86 machine code (104 97 96 95 characters)

echo ↾각슈삨੽₲ɻ庲錿ʴ⇍罋곹삄ૃ蘊尧⺓⺂粘ㄾ㸸ਾ岈⺎➉⸠粓蜊㹏褾鄮漧粐蠊蝜꘮੼⾈葜꠮੼⾇⼠⺂ꤧ੼樠獧惇૳|iconv -futf8 -tucs2>o.com;dosbox o*

I suggest to put this in a script in an empty directory, it's almost guaranteed that copy-pasting it in a terminal will break everything; even better, you can grab the script here ready-made.

Expected output: expected output

How it works

The bash part is just a launcher that uses iconv to "decompress" a .com file from the UTF-8 characters of the script and launches it with DosBox.

Notice that this poses some limitation on the content, as not all the input sequences can be interpreted as UCS-2 with iconv not complaining; for example, for some reason many operations involving the bx register broke havoc depending on the place where I used them, so I had to work around this problem several times.

Now, the Unicode thing is just to take advantage of the "character count" rules; the actual size (in bytes) of the script is way larger than the original .COM file.

The extracted .com file is

00000000  be 21 01 ac 88 c2 a8 c0  7d 0a b2 20 7b 02 b2 5e  |.!......}.. {..^|
00000010  83 e0 3f 93 b4 02 cd 21  4b 7f f9 ac 84 c0 75 e4  |..?....!K.....u.|
00000020  c3 0a 0a 86 27 5c 93 2e  82 2e 98 7c 3e 31 38 3e  |....'\.....|>18>|
00000030  3e 0a 88 5c 8e 2e 89 27  20 2e 93 7c 0a 87 4f 3e  |>..\...' ..|..O>|
00000040  3e 89 2e 91 27 6f 90 7c  0a 88 5c 87 2e a6 7c 0a  |>...'o.|..\...|.|
00000050  88 2f 5c 84 2e a8 7c 0a  87 2f 20 2f 82 2e 27 a9  |./\...|../ /..'.|
00000060  7c 0a 20 6a 67 73 c7 60  f3 0a 0a 00              ||. jgs.`....|
0000006c

and is 108 bytes long. The NASM source for it is:

    org 100h

start:
    ; si: pointer to current position in data
    mov si,data
    ; load the character in al
    lodsb
mainloop:
    ; bx: repetition count
    ; - zero at startup
    ; - -1 after each RLE run
    ; - one less than each iteration after each "literal" run
    ; the constant decrement is not really a problem, as print
    ; always does at least one print, and there aren't enough
    ; consecutive literal values to have wraparound

    ; if the high bit is not set, we have a "literal" byte;
    ; we prepare it in dl just in case
    mov dl,al
    ; then check if it's not set and branch straight to print
    ; notice that bx=0 is fine, as print prints always at least one character
    ; test the top two bits (we need the 6th bit below)
    test al,0xc0
    ; to see if the top bit was set, we interpret it as the sign bit,
    ; and branch if the number is positive or zero (top bit not set)
    jge print
rle:
    ; it wasn't a literal, but a caret/space with a repetition count
    ; space if 6th bit not set, caret otherwise
    mov dl,' '
    ; exploit the parity bit to see if the 6th bit was set
    jnp nocaret
    mov dl,'^'
nocaret:
    ; lower 6 bits: repetition count
    ; and away the top bits and move in bx
    ; we and ax and not al because we have to get rid of the 02h in ah
    and ax,3fh
    xchg ax,bx
print:
    ; print bx times
    mov ah,2
    int 21h
    dec bx
    jg print
    ; read next character
    lodsb
    test al,al
    ; rinse & repeat unless we got a zero
    jnz mainloop
end:
    ret
data:
    ; here be data
    incbin "compressed.dat"
    ; NUL terminator
    db 0

All this is just a decompresser for compressed.dat whose format is as follows:

  • if the high bit is not set, print the character as-is;
  • otherwise, the low 6 bits are the repetition count, and the second-highest bit specifies if it has to print a space (bit not set) or a caret (bit set).

compressed.dat in turn is generated using a Python script from the original text.

The whole thing can be found here.