How to read Base64 VLQ code?

Even after reading the answers, the explanations were still not clear to me. Here is an explanation in plain English in case it helps someone:

something like ;;AAAA,IAAM,WAAW,SAAX;... means <line0 info>;<line1 info>;...

so for ;;AAAA,IAAM,WAAW,SAAX;..., line 0 and line 1 of the generated code don't have any mapping info (empty lines etc.)

then for line 2 we have AAAA,IAAM,WAAW,SAAX
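
A minimal sketch of this splitting step in Python. The variable names are made up for illustration, and the string is just the example above without the trailing "...":

```python
# Example string from above: two empty lines, then one line with four groups.
mappings = ";;AAAA,IAAM,WAAW,SAAX"

# ';' separates generated lines, ',' separates the groups within one line.
lines = [line.split(",") if line else [] for line in mappings.split(";")]

for line_number, groups in enumerate(lines):
    print(line_number, groups)
# 0 []
# 1 []
# 2 ['AAAA', 'IAAM', 'WAAW', 'SAAX']
```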

we convert each of these groups to binary using the base64 character mapping:

BASE64_ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

so we basically find the index of each character in the BASE64_ALPHABET above, and convert that index to 6-bit binary (6 bits because 2^6 = 64). e.g. the index of A is 0, so in 6-bit binary it's 000000. so AAAA would be 000000 000000 000000 000000

then if we do this with IAAM we get: 001000 000000 000000 001100.
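
The lookup can be sketched in a few lines of Python. to_sextets is a hypothetical helper name, not something from a library:

```python
BASE64_ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

def to_sextets(group):
    """Return the 6-bit binary string for each base64 character in group."""
    return [format(BASE64_ALPHABET.index(ch), "06b") for ch in group]

print(to_sextets("AAAA"))  # ['000000', '000000', '000000', '000000']
print(to_sextets("IAAM"))  # ['001000', '000000', '000000', '001100']
```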

this bit representation is the VLQ-encoded version of 4 numbers. we start from the leftmost block, strip off the continuation bit (the first bit) and the sign bit (the last bit), and keep the 4 bits in between. while the continuation bit is 1, we keep reading the next block and add its bits to the same number.

e.g. 001000 splits into 0 (cont), 0100 (bits), 0 (sign)
so cont = 0 (no other block will be added to this number)
sign = 0 (it's positive)
bits = 0100 --> so it is 4 in decimal

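-- note that we only remove a sign bit from the first block of a number; every following block holds 5 data bits. also, the first block holds the lowest bits, so the bits of each following block go in front of what we have so far. so if we had
101000 001000
we would say
0100 (cont=1, sign=0) 01000 (cont=0)
so we would have had +01000 0100 = 010000100 = 132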

when we keep doing this, we will get these 4 numbers (a number is complete each time the continuation bit is 0, so it should be 0 exactly 4 times).

  • AAAA would map to (0,0,0,0)
  • IAAM would map to (4,0,0,6)
  • WAAW would map to (11,0,0,11) ...
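
Here is a minimal sketch of this decoding step with bit operations instead of strings. decode_vlq is a hypothetical helper name; the sketch assumes the usual source map convention that the first block of a number holds its lowest bits:

```python
BASE64_ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

def decode_vlq(group):
    """Decode one base64 VLQ group such as 'IAAM' into a list of signed integers."""
    numbers = []
    value = 0  # bits collected so far for the current number (sign bit included)
    shift = 0  # how far the next 5 data bits have to be shifted to the left
    for ch in group:
        sextet = BASE64_ALPHABET.index(ch)
        value |= (sextet & 0b011111) << shift  # the low 5 bits are data
        shift += 5
        if not sextet & 0b100000:              # continuation bit is 0: number is complete
            sign = -1 if value & 1 else 1      # lowest bit of the whole number is the sign
            numbers.append(sign * (value >> 1))
            value = 0
            shift = 0
    return numbers

print(decode_vlq("AAAA"))  # [0, 0, 0, 0]
print(decode_vlq("IAAM"))  # [4, 0, 0, 6]
print(decode_vlq("WAAW"))  # [11, 0, 0, 11]
print(decode_vlq("gB"))    # [16] (two blocks, one number)
```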

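now, each of these numbers is relative to the previous group, so we add them up to get the actual positions (the generated column starts again from 0 on each new line, the other three keep running across lines):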

  • AAAA actually points to: (0,0,0,0)
  • IAAM actually points to: (0+4, 0+0, 0+0, 0+6) = (4,0,0,6)
  • WAAW actually points to: (4+11, 0+0, 0+0, 6+11) = (15,0,0,17) // we added it to where IAAM was actually pointing

...
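
The running sum can be sketched like this. It assumes all totals start at 0, which holds here because line 2 is the first line that has any groups:

```python
# Decoded (relative) groups for generated line 2, taken from the list above.
relative_groups = [(0, 0, 0, 0), (4, 0, 0, 6), (11, 0, 0, 11)]

absolute_groups = []
current = [0, 0, 0, 0]  # running totals, assumed to start at 0
for group in relative_groups:
    current = [total + delta for total, delta in zip(current, group)]
    absolute_groups.append(tuple(current))

print(absolute_groups)  # [(0, 0, 0, 0), (4, 0, 0, 6), (15, 0, 0, 17)]
```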

so numbers (n1, n2, n3, n4) here stand for

  • n1: column in generated code
  • n2: index of the corresponding source file in the "sources" array of the source map
  • n3: line number in original code
  • n4: column number in original code

we already know which generated line each group belongs to (from counting the semicolons). so using the information we have above, we learned:

  • AAAA: line 2, column 0 of generated code points to sources[0], line 0, column 0
  • IAAM: line 2, column 4 of generated code points to sources[0], line 0, column 6
  • WAAW: line 2, column 15 of generated code points to sources[0], line 0, column 17 ...
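
Putting the steps together, a rough sketch of a complete decoder could look like the following. The function names are hypothetical, groups with fewer than 4 numbers only advance the generated column, and an optional fifth number is ignored, so treat it as an illustration rather than a full source map parser:

```python
BASE64_ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

def decode_vlq(group):
    """Decode one base64 VLQ group into a list of signed integers."""
    numbers, value, shift = [], 0, 0
    for ch in group:
        sextet = BASE64_ALPHABET.index(ch)
        value |= (sextet & 0b011111) << shift  # 5 data bits, lowest bits come first
        shift += 5
        if not sextet & 0b100000:              # continuation bit is 0: number is complete
            numbers.append(-(value >> 1) if value & 1 else value >> 1)
            value, shift = 0, 0
    return numbers

def decode_mappings(mappings):
    """Return (generated_line, generated_column, source_index, original_line, original_column) tuples."""
    result = []
    source_index = original_line = original_column = 0  # carried across lines
    for generated_line, line in enumerate(mappings.split(";")):
        generated_column = 0  # starts again on every generated line
        for group in filter(None, line.split(",")):
            fields = decode_vlq(group)
            generated_column += fields[0]
            if len(fields) >= 4:
                source_index += fields[1]
                original_line += fields[2]
                original_column += fields[3]
                result.append((generated_line, generated_column,
                               source_index, original_line, original_column))
    return result

for mapping in decode_mappings(";;AAAA,IAAM,WAAW"):
    print(mapping)
# (2, 0, 0, 0, 0)
# (2, 4, 0, 0, 6)
# (2, 15, 0, 0, 17)
```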

two good sources about this:

  • more on VLQ encoding
  • more on how to interpret/decode

I found an example at http://www.thecssninja.com/javascript/source-mapping, under the section "Base64 VLQ and keeping the source map small".

The above diagram AAgBC once processed further would return 0, 0, 32, 16, 1 – the 32 being the continuation bit that helps build the following value of 16. B purely decoded in Base64 is 1. So the important values that are used are 0, 0, 16, 1. This then lets us know that line 1 (lines are kept count by the semi colons) column 0 of the generated file maps to file 0 (array of files 0 is foo.js), line 16 at column 1.
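
To see where the 32, the 16 and the 1 in that quote come from, here is a small sketch that does the arithmetic for AAgBC by hand. The variable names are only for illustration:

```python
BASE64_ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

# Plain base64 indices of the five characters.
raw = [BASE64_ALPHABET.index(ch) for ch in "AAgBC"]
print(raw)  # [0, 0, 32, 1, 2]

# g = 32 = 100000: continuation bit 1, data bits 0000, sign bit 0,
# so the third number continues into B = 1 = 000001.
low_bits = (raw[2] & 0b011110) >> 1  # the 4 data bits of the first block
high_bits = raw[3] & 0b011111        # the 5 data bits of the continuation block
third = (high_bits << 4) | low_bits
print(third)  # 16

# C = 2 = 000010: continuation bit 0, data bits 0001, sign bit 0.
fourth = (raw[4] & 0b011110) >> 1
print(fourth)  # 1
```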


Despite the examples I could find, it took me quite a while to understand how the coding / decoding really works. So I thought I'd learn best by trying to make something myself in a very explicit, step-by-step way. I started out with the explanation of VLQ at this blog.

I use the following Python functor to generate sourcemaps for Transcrypt. The code is simple and I think gives good insight into how the coding/decoding works in principle. To achieve speed despite its simplicity, it caches the first 256 numbers, which are used most often in generating a v3 sourcemap.

import math

class GetBase64Vlq:
    def __init__ (self):
        self.nBits32 = 5    # Data bits per base64 digit, the 6th bit is the continuation flag
        self.encoding = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'
        self.prefabSize = 256
        self.prefab = [self (i, True) for i in range (self.prefabSize)]    # Cache for the most used numbers

    def __call__ (self, anInteger, init = False):
        if not init and 0 <= anInteger < self.prefabSize:
            return self.prefab [anInteger]
        else:
            # Magnitude in binary, with the sign appended as least significant bit
            signed = bin (abs (anInteger)) [2 : ] + ('1' if anInteger < 0 else '0')
            nChunks = math.ceil (len (signed) / float (self.nBits32))
            # Left-pad with zeros to a whole number of 5 bit chunks
            padded = (self.nBits32 * '0' + signed) [-nChunks * self.nBits32 : ]
            # Emit the least significant chunk first, all chunks but the last emitted one get continuation bit 1
            chunks = [('1' if iChunk else '0') + padded [iChunk * self.nBits32 : (iChunk + 1) * self.nBits32] for iChunk in range (nChunks - 1, -1, -1)]
            return ''.join ([self.encoding [int (chunk, 2)] for chunk in chunks])

getBase64Vlq = GetBase64Vlq ()

Example of use:

while True:
    print (getBase64Vlq (int (input ('Give number:'))))
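
For cross-checking without typing numbers by hand, here is a small bitwise variant of the same encoding. It is not the functor above, just a sketch of the same idea with a hypothetical function name encode_vlq:

```python
BASE64_ALPHABET = 'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

def encode_vlq(number):
    """Encode one signed integer as base64 VLQ using bit operations."""
    value = (abs(number) << 1) | (1 if number < 0 else 0)  # sign goes into the lowest bit
    digits = []
    while True:
        sextet = value & 0b011111  # next 5 least significant bits
        value >>= 5
        if value:
            sextet |= 0b100000     # more bits follow: set the continuation bit
        digits.append(BASE64_ALPHABET[sextet])
        if not value:
            return "".join(digits)

for n in (0, 4, 6, 11, 16, -1):
    print(n, encode_vlq(n))
# 0 A
# 4 I
# 6 M
# 11 W
# 16 gB
# -1 D

# A whole group is just the encoded numbers glued together.
print("".join(encode_vlq(n) for n in (4, 0, 0, 6)))  # IAAM
```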