Palindrome Compression

JavaScript (ES6), 3.143 (81 bytes saved, 664 byte program)

D=s=>{s=s[R](Z,c=>W(256+T(c),1))+V;M=r=>(s=s[R](p=s.match(`^${r}|`)[0],''),p);for([,a]=M`1.|0`,t=u=i='';!M`111`;)t+=W(X(M`.{5}`)-~8,0,36);for(t+=W(Y(t),a?a/0:1);p;)u+=M`0(?=00)|00?1`?(c=t[i++])?+p[1]?c[U]():c:'':M`10`?' ':M`11`&&S(X(M`.{7}`));return u+W(t,i)}

Now that I'm fairly satisfied with this program (and the scoring system), I'll write a bit of an explanation.

The basic idea is to compress the input into a string of bits, then pack each group of 8 bits into a byte. For the purposes of explanation, I'll just manipulate the bit string.
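To make the bit-to-byte step concrete, here is a minimal Python 3 sketch (not the golfed JavaScript). The names `pack_bits` and `unpack_bits` are hypothetical, and padding the final byte with zeros is an assumption on my part.

```python
def pack_bits(bits):
    """Pack a string of '0'/'1' characters into bytes.

    Assumption: the last byte is padded on the right with zeros.
    """
    padded = bits + "0" * (-len(bits) % 8)
    return bytes(int(padded[i:i + 8], 2) for i in range(0, len(padded), 8))


def unpack_bits(data):
    """Expand bytes back into a bit string; padding zeros stay at the end."""
    return "".join(format(byte, "08b") for byte in data)


bits = "10101000111101111010000111110100111"  # a 35-bit string
packed = pack_bits(bits)
print(len(packed))  # 5
```

The byte count is simply the bit count divided by 8, rounded up, which is how the sizes in the test case list follow from the bit lengths.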

The bit string can be separated into several sections:

input  -> Taco Cat.
output -> 0101000000100011011111110100001100100011101011100000000

0      | 10100 00001 00011 01111 111 | 01 00001 10 01 0001 110101110
header | letter data                 | styling data

The header is a very simple mapping:

0  -> odd-length palindrome
10 -> even-length palindrome
11 -> non-palindrome
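As a rough illustration of the header choice, here is a Python 3 sketch. `letters_only` and `header_bits` are hypothetical names, and the palindrome test is done on the letters only, ignoring case, as described for the letter data.

```python
def letters_only(text):
    """Uppercase the text and keep only the letters A-Z."""
    return "".join(ch for ch in text.upper() if "A" <= ch <= "Z")


def header_bits(text):
    """Return the header bits for the three cases in the mapping above."""
    letters = letters_only(text)
    if letters != letters[::-1]:
        return "11"  # non-palindrome
    return "0" if len(letters) % 2 else "10"  # odd-length / even-length


print(header_bits("Taco Cat."))     # 0
print(header_bits("toohottohoot"))  # 10
```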

Letter data is also fairly simple. First, all non-letters are removed from the string, and all letters are converted to uppercase. If the resulting string is a palindrome, the mirrored second half is dropped (the middle letter of an odd-length palindrome is kept). Then this mapping is applied:

A -> 00001
B -> 00010
C -> 00011
D -> 00100
...
Z -> 11010
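The letter mapping is just each letter's position in the alphabet written as 5 bits. A minimal Python 3 sketch, with `letter_data` as a hypothetical name; it returns only the 5-bit codes, without the header or any terminator:

```python
def letter_data(text):
    """Encode the letters of `text` as 5-bit codes (A=1 ... Z=26)."""
    letters = "".join(ch for ch in text.upper() if "A" <= ch <= "Z")
    if letters == letters[::-1]:
        # Palindrome: keep the first half, including the middle letter
        # when the length is odd.
        letters = letters[:(len(letters) + 1) // 2]
    return "".join(format(ord(ch) - ord("A") + 1, "05b") for ch in letters)


print(letter_data("Taco Cat."))  # 10100000010001101111
```

For `Taco Cat.` the kept half is `TACO`, so the output is the four codes 10100 00001 00011 01111 run together.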

This section is terminated with 111. After that comes the styling data, which stores upper/lower-case data and non-letters. This works like so:

01 -> next letter as uppercase
0...01 (n 0s, n >= 2) -> next (n-1) letters as lowercase
10 -> space
11xxxxxxx -> character with code point 0bxxxxxxx
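Read in the encoding direction, these rules could be sketched like this in Python 3. `styling_bits` is a hypothetical name; the sketch assumes ASCII input and writes every lowercase run out in full.

```python
def styling_bits(text):
    """Produce styling bits for `text` following the four rules above."""
    bits = []
    i = 0
    while i < len(text):
        ch = text[i]
        if "a" <= ch <= "z":
            # A run of k lowercase letters becomes k+1 zeros and a one.
            j = i
            while j < len(text) and "a" <= text[j] <= "z":
                j += 1
            bits.append("0" * (j - i + 1) + "1")
            i = j
            continue
        if "A" <= ch <= "Z":
            bits.append("01")
        elif ch == " ":
            bits.append("10")
        else:
            if ord(ch) > 127:
                raise ValueError("only 7-bit characters fit in 11xxxxxxx")
            bits.append("11" + format(ord(ch), "07b"))
        i += 1
    return "".join(bits)


print(styling_bits("Taco Cat."))  # 010000110010001110101110
```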

So going through the example shown above, we have

header: 0 -> odd-length palindrome
letter data: 10100 00001 00011 01111 111 -> taco
styling data:
  01        -> T
  00001     -> aco
  10        -> <space>
  01        -> C
  0001      -> at
  110101110 -> .

When the end of the bit string is reached, all remaining characters from the letter data are appended to the result. This saves us from having to do one last 000...001 and allows us to truncate these bits from the string.
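Putting the three sections together, a decoder for the bit string might look like the following Python 3 sketch. It is written from the description above rather than translated from the golfed JavaScript. `decode` is a hypothetical name, and treating a trailing run of zeros with no closing 1 as padding is my assumption.

```python
def decode(bits):
    """Decode a bit string laid out as header | letter data | styling data."""
    # Header: 0 = odd-length palindrome, 10 = even-length, 11 = non-palindrome.
    if bits.startswith("0"):
        kind, pos = "odd", 1
    elif bits.startswith("10"):
        kind, pos = "even", 2
    else:
        kind, pos = "none", 2

    # Letter data: 5-bit codes until the 111 terminator.
    half = []
    while bits[pos:pos + 3] != "111":
        code = int(bits[pos:pos + 5], 2)
        half.append(chr(ord("a") + code - 1))
        pos += 5
    pos += 3
    letters = "".join(half)
    if kind == "odd":
        letters += letters[-2::-1]  # mirror everything except the middle letter
    elif kind == "even":
        letters += letters[::-1]

    # Styling data.
    out = []
    used = 0  # number of letters already emitted
    while pos < len(bits):
        if bits[pos] == "0":
            one = bits.find("1", pos)
            if one == -1:
                break  # assumption: only padding zeros remain
            zeros = one - pos
            if zeros == 1:
                out.append(letters[used:used + 1].upper())
                used += 1
            else:
                out.append(letters[used:used + zeros - 1])
                used += zeros - 1
            pos = one + 1
        elif bits[pos:pos + 2] == "10":
            out.append(" ")
            pos += 2
        else:  # 11xxxxxxx
            out.append(chr(int(bits[pos + 2:pos + 9], 2)))
            pos += 9
    out.append(letters[used:])  # leftover letters are appended as-is
    return "".join(out)


print(decode("010100000010001101111111"))  # tacocat
```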

Going through the test cases:

tacocat -> 3 bytes (-4)
    24 bits: 010100000010001101111111
toohottohoot -> 5 bytes (-7)
    35 bits: 10101000111101111010000111110100111
todderasesareddot -> 7 bytes (-10)
    49 bits: 0101000111100100001000010110010000011001100101111
amanaplanacanalpanama -> 8 bytes (-13)
    59 bits: 00000101101000010111000001100000110000001011100000100011111
wasitacaroracatisaw? -> 11 bytes (-9)
    84 bits: 010111000011001101001101000000100011000011001001111111000000000000000000001110111111
Bob -> 2 bytes (-1)
    16 bits: 0000100111111101
IManAmRegalAGermanAmI -> 13 bytes (-8)
    98 bits: 00100101101000010111000001011011001000101001110000101100111010100010100101000001010100000010100101
DogeeseseeGod -> 7 bytes (-6)
    54 bits: 000100011110011100101001011001100101111010000000000101
A Santa at NASA -> 8 bytes (-7)
    63 bits: 100000110011000010111010100000011110110010000011000011001010101
Go hang a salami! I'm a lasagna hog. -> 20 bytes (-16)
   154 bits: 1000111011110100000001011100011100001100110000101100000010110101001111010011000000110001100000000111010000110011101001110011000110000000001100000111010111

Python 2: 2.765 (70 bytes saved, 641 byte program)

I changed my approach a little. It now works well on imperfect palindromes. No compressed string is ever longer than its input. Perfect even-length palindromes always compress to 50% of the original size.

A=lambda x:chr(x).isalpha()
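def c(s):
 r=bytearray(s);q=len(r);L=0;R=q-1;v=lambda:R+1<q and r[R+1]<15
 while L<=R:
  while not A(r[L])and L<R:L+=1
  while not A(r[R])and R:
   if v()and r[R]==32:r[R]=16+r.pop(R+1)
   R-=1 # step left past the non-letter
  j=r[L];k=r[R]
  if A(j)*A(k):
   if L!=R and j&31==k&31:
    r[L]+=(j!=k)*64;r[R]=1 # flag a case flip on the left; the right letter becomes a pop marker
    if v():r[R]+=r.pop(R+1)
   else:r[L]|=128;r[R]|=128 # middle letter or mismatch: keep both letters
  L+=1;R-=1
 while r[-1]<16:r.pop()
 return r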
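def d(s):
 r='';t=[] # output string and stack of letters still to be mirrored
 for o in s:
  if 15<o<32:r+=' ';o-=16
  while 0<o<16:r+=chr(t.pop());o-=1
  if o==0:continue
  if 127<o<192:o-=64;t+=[o^32]
  elif o>192:o-=128
  elif A(o):t+=[o]
  r+=chr(o) # emit the (possibly adjusted) character
 while t:r+=chr(t.pop())
 return r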


'tacocat' <==> 'tac\xef'
4/7 (3 bytes saved)
'toohottohoot' <==> 'toohot'
6/12 (6 bytes saved)
'todderasesareddot' <==> 'todderas\xe5'
9/17 (8 bytes saved)
'amanaplanacanalpanama' <==> 'amanaplana\xe3'
11/21 (10 bytes saved)
'wasitacaroracatisaw?' <==> 'wasitacar\xef\x09?'
12/20 (8 bytes saved)
'Bob' <==> '\x82\xef'
2/3 (1 byte saved)
'IManAmRegalAGermanAmI' <==> 'I\x8d\xa1n\x81m\x92e\xa7\xa1\xec'
11/21 (10 bytes saved)
'Dogeeseseegod' <==> '\x84ogees\xe5'
7/13 (6 bytes saved)
'A Santa at NASA' <==> 'A S\xa1\xaeta\x12\x14'
9/15 (6 bytes saved)
"Go hang a salami! I'm a lasagna hog." <==> "\x87o hang a salam\xa9!\x11'\x01\x11\x17\x13."
24/36 (12 bytes saved)

As a bonus, it also saves 6 bytes on the incorrect palindrome I had before.

'wasitacaroraratisaw?' <==> 'wasita\xe3ar\xef\x02\xf2\x06?'
14/20 (6 bytes saved)


Decompression uses a stack. Codepoints 32-127 are treated literally; if the character is a letter, it is also pushed onto the stack. Values 128-191 mark case-flipped letters: subtracting 64 gives the letter to output, and its case-flipped counterpart (o^32, because of how ASCII is laid out) is pushed onto the stack. Values 193-255 add a letter (the value minus 128) without pushing to the stack; this is used when letters do not match and for the middle letter of odd-length palindromes. Codepoints 1-15 indicate that the stack should be popped that number of times. Codepoints 17-31 are similar, but they print a space before popping (the pop count is the value minus 16). There is also an implicit "pop until empty" instruction at the end of the input.
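For readers who find the golfed version hard to follow, here is an ungolfed Python 3 sketch of the same stack machine. The names `decompress` and `is_letter` are mine, not from the program above. It takes `bytes` and checks for ASCII letters explicitly, since `chr(x).isalpha()` also accepts non-ASCII letters in Python 3.

```python
def is_letter(value):
    """True for the ASCII codes of A-Z and a-z."""
    return 65 <= value <= 90 or 97 <= value <= 122


def decompress(data):
    out = []
    stack = []  # letters waiting to be mirrored on the right-hand side
    for value in data:
        if 16 <= value < 32:
            out.append(" ")  # space first, then pop (value - 16) letters
            value -= 16
        if value < 16:
            for _ in range(value):
                out.append(chr(stack.pop()))
            continue
        if 128 <= value < 192:
            value -= 64               # the letter as it appears on the left
            stack.append(value ^ 32)  # its case-flipped mirror
        elif value > 192:
            value -= 128              # letter without a mirror
        elif is_letter(value):
            stack.append(value)       # plain letter mirrors itself
        out.append(chr(value))
    while stack:                      # implicit "pop until empty"
        out.append(chr(stack.pop()))
    return "".join(out)


print(decompress(b"tac\xef"))  # tacocat
```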

The compressor works from both ends and folds in matching letters as values 1-31. It skips over non-letters. When the letters match but the case does not, it adds 64 to the left letter and increments the right letter. This allows it to save space on IManAmRegalAGermanAmI. At the middle or when the letters do not match, it ORs 128 into both sides. I can't add there because I need to avoid the special case where left == right (adding would apply 128 twice to the same byte, while OR is harmless). When folding neighboring pop markers on the right side, I have to check that the neighboring one won't overflow into codepoint 16 because I need that for spaces. (This is not actually an issue for any of the test case strings.)
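An ungolfed Python 3 sketch of the two-pointer compressor follows. It is my reading of the description, checked only against the sample outputs listed above, so treat it as an approximation rather than the exact golfed logic. The names `compress`, `is_letter` and `can_fold` are hypothetical, and ASCII input is assumed.

```python
def is_letter(value):
    """True for the ASCII codes of A-Z and a-z."""
    return 65 <= value <= 90 or 97 <= value <= 122


def compress(text):
    r = bytearray(text, "ascii")
    size = len(r)
    left, right = 0, size - 1

    def can_fold():
        # The byte just right of `right` is a pop marker with room to grow.
        return right + 1 < size and r[right + 1] < 15

    while left <= right:
        while not is_letter(r[left]) and left < right:
            left += 1
        while not is_letter(r[right]) and right:
            if can_fold() and r[right] == 32:
                r[right] = 16 + r.pop(right + 1)  # space + pop marker
            right -= 1
        a, b = r[left], r[right]
        if is_letter(a) and is_letter(b):
            if left != right and a & 31 == b & 31:
                if a != b:
                    r[left] += 64  # same letter, different case
                r[right] = 1       # right letter becomes "pop one"
                if can_fold():
                    r[right] += r.pop(right + 1)
            else:
                r[left] |= 128     # middle letter or mismatch
                r[right] |= 128
        left += 1
        right -= 1
    while r and r[-1] < 16:
        r.pop()  # trailing pops are implied at end of input
    return bytes(r)


print(compress("tacocat"))  # b'tac\xef'
```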

EDIT 1: No more ungolfed version.