"Vaultless" or "Reversible" Tokenization - Is it really just encryption with a fancy marketing name?

The line between Encryption and Tokenization is blurry.

The answer to your question is that it depends. There are different ways of accomplishing vaultless tokenization, and they fall into two main categories: lookup table techniques and format-preserving encryption (FPE).

FPE is encryption at its core, so calling it tokenization is partly a marketing play. The justification for calling the output a token has to do with what it produces, not how it was created. FPE, although it is encryption, can produce a ciphertext that has the same length and format as the input. In this way, FPE produces "tokens" in the aesthetic sense. However, these tokens were generated using a mode of AES encryption, so from a security standpoint they must be treated like encrypted data.
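To make the "same length and format" point concrete, here is a minimal sketch of a format-preserving cipher over decimal strings. It is a toy Feistel network that uses HMAC-SHA256 as the round function, which is my own simplification for illustration; it is not an AES-based standardized mode such as FF1 from NIST SP 800-38G, and it should not be used to protect real data. The function names, the round count and the demo key are all assumptions.

```python
import hashlib
import hmac


def _round_value(key: bytes, round_no: int, half: str, modulus: int) -> int:
    """Keyed round function: HMAC-SHA256 over (round, half), reduced mod `modulus`."""
    msg = bytes([round_no]) + half.encode("ascii")
    digest = hmac.new(key, msg, hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % modulus


def fpe_encrypt(key: bytes, digits: str, rounds: int = 10) -> str:
    """Toy Feistel cipher: digits in, the same number of digits out."""
    if not digits.isdigit() or len(digits) < 2 or rounds % 2:
        raise ValueError("need at least 2 digits and an even round count")
    mid = len(digits) // 2
    left, right = digits[:mid], digits[mid:]
    for r in range(rounds):
        # Add a keyed value to the left half (mod 10^len), then swap halves.
        m = 10 ** len(left)
        mixed = (int(left) + _round_value(key, r, right, m)) % m
        left, right = right, str(mixed).zfill(len(left))
    return left + right


def fpe_decrypt(key: bytes, digits: str, rounds: int = 10) -> str:
    """Undo fpe_encrypt by running the rounds backwards with subtraction."""
    if not digits.isdigit() or len(digits) < 2 or rounds % 2:
        raise ValueError("need at least 2 digits and an even round count")
    mid = len(digits) // 2  # an even round count restores the original split
    left, right = digits[:mid], digits[mid:]
    for r in reversed(range(rounds)):
        m = 10 ** len(right)
        prev_left = (int(right) - _round_value(key, r, left, m)) % m
        left, right = str(prev_left).zfill(len(right)), left
    return left + right


if __name__ == "__main__":
    key = b"demo key, not a real secret"
    token = fpe_encrypt(key, "4111111111111111")
    print(token, fpe_decrypt(key, token))
```

The output looks like a 16-digit card number, but it is a ciphertext: anyone holding the key can reverse it, and nothing is stored per token.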

The other breed of vaultless maps inputs to pseudo-random values through lookup tables. How is this method considered vaultless? To eliminate the need for a vault (i.e., a big database), you must reduce the number of entries in the lookup tables so that they stay small. If you can keep the table size down while also maintaining the reversibility of the lookup process, then you can justify calling it vaultless. Most lookup-table varieties of vaultless use tables small enough to be stored in a cache or a similar in-memory structure.
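As a rough sketch of the size argument, assume a 16-digit input. A vault keyed on the full value could need up to 10^16 rows, while a random permutation over 4-digit chunks needs only 10^4 entries and is still reversible. The chunk size and function names below are my own assumptions, and substituting each chunk independently is deliberately naive: equal input chunks produce equal token chunks, so a real product has to mix or chain the lookups in some way.

```python
import secrets

CHUNK = 4                  # digits per table lookup (assumed for this sketch)
TABLE_SIZE = 10 ** CHUNK   # 10,000 entries instead of up to 10**16 vault rows


def make_table():
    """Return (forward, inverse): a random permutation of 0..9999 and its inverse."""
    rng = secrets.SystemRandom()
    forward = list(range(TABLE_SIZE))
    rng.shuffle(forward)
    inverse = [0] * TABLE_SIZE
    for plain, token in enumerate(forward):
        inverse[token] = plain
    return forward, inverse


def substitute(table, digits: str) -> str:
    """Replace every CHUNK-digit group with its table entry, keeping leading zeros."""
    if not digits.isdigit() or len(digits) % CHUNK:
        raise ValueError("input must be digits, length a multiple of CHUNK")
    out = []
    for i in range(0, len(digits), CHUNK):
        out.append(str(table[int(digits[i:i + CHUNK])]).zfill(CHUNK))
    return "".join(out)


if __name__ == "__main__":
    forward, inverse = make_table()
    token = substitute(forward, "4111111111111111")
    print(token, substitute(inverse, token))
```

Note that the table itself now plays the role of the secret key: whoever has it can detokenize everything.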


There isn't a definitional difference. The vaultless tokenization solution that its vendor fastidiously insists is "not encryption" really is encryption, definitionally speaking.

There is a substantial technical difference, however: that solution doesn't use a format-preserving encryption mode on top of a block cipher, as some other solutions do, but rather a format-preserving primitive built on top of tables of random numbers. This conference submission abstract briefly describes it:

A Novel Approach to the Tokenization of Credit Card Numbers

Bart Preneel, COSIC, Katholieke Universiteit Leuven, Belgium
Ulf Mattsson, Protegrity, USA

Encryption techniques are used to ensure the confidentiality of sensitive data. They are typically defined as mappings on bitstrings; they can be defined as a mode of operation of a block cipher (a keyed random permutation on strings of 64 or 128 bits) or based on a stream cipher (that typically operates at the level of bits, bytes or 32-bit words). In order to satisfy strict security definitions, encryption schemes need to be randomized, which means that the ciphertext is larger than the plaintext. For some applications, such as the protection of credit card numbers in certain contexts, both constraints are undesirable: the plaintext space consists of digits rather than bits and the mapping from plaintext to ciphertext has to be a permutation, hence there is no room for randomization. The encryption operation is also called tokenization. It is definitely possible to define a secure tokenization based on a block cipher such as triple-DES or AES. In this submission, we present a completely new approach, in which a highly efficient block cipher is designed from scratch by using S-boxes defined on strings of n digits (n is typically 5 to 7), that can be interpreted as large keys. We will show that if the number of plaintexts encrypted with a single key is limited, a very high security level can be obtained using this approach. It can be proven that under realistic constraints, the security of the scheme is equal to an “ideal” tokenization scheme.

Note how they say:

  • "The encryption operation is also called tokenization." The authors are aware that tokenization is, definitionally speaking, a form of encryption.
  • They label their system a "highly efficient block cipher [...] designed from scratch." I.e., encryption.
  • Their block cipher, instead of working on binary bitstrings like conventional ones, is "designed from scratch by using S-boxes defined on strings of n digits (n is typically 5 to 7), that can be interpreted as large keys." I.e., instead of using an FPE mode to build a format-preserving cipher from a conventional bitstring block cipher like AES, they build a format-preserving cipher directly.
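The abstract gives no construction details, so the following is only my own toy illustration of what "S-boxes defined on strings of n digits that act as the key" could look like; it is not the authors' design and makes no security claim. Each S-box is a secret random permutation of all 5-digit values, and each round slides it across overlapping windows of the input so that digits influence their neighbours. All names and the round structure are assumptions.

```python
import secrets

N = 5  # digits per S-box input (the abstract says n is typically 5 to 7)


def make_sbox(rng):
    """A secret random permutation of all N-digit values, plus its inverse."""
    fwd = list(range(10 ** N))
    rng.shuffle(fwd)
    inv = [0] * len(fwd)
    for x, y in enumerate(fwd):
        inv[y] = x
    return fwd, inv


def _sub(box, s: str, i: int) -> str:
    """Substitute the N-digit window starting at position i."""
    return s[:i] + str(box[int(s[i:i + N])]).zfill(N) + s[i + N:]


def digit_encrypt(sboxes, digits: str) -> str:
    if not digits.isdigit() or len(digits) < N:
        raise ValueError("need at least N digits")
    positions = range(len(digits) - N + 1)
    s = digits
    for rnd, (fwd, _) in enumerate(sboxes):
        # Even rounds sweep left to right, odd rounds right to left.
        order = positions if rnd % 2 == 0 else reversed(positions)
        for i in order:
            s = _sub(fwd, s, i)
    return s


def digit_decrypt(sboxes, digits: str) -> str:
    if not digits.isdigit() or len(digits) < N:
        raise ValueError("need at least N digits")
    positions = range(len(digits) - N + 1)
    s = digits
    for rnd in reversed(range(len(sboxes))):
        inv = sboxes[rnd][1]
        # Undo each sweep in the opposite direction with the inverse table.
        order = reversed(positions) if rnd % 2 == 0 else positions
        for i in order:
            s = _sub(inv, s, i)
    return s


if __name__ == "__main__":
    rng = secrets.SystemRandom()
    key = [make_sbox(rng) for _ in range(4)]  # the tables are the key
    token = digit_encrypt(key, "4111111111111111")
    print(token, digit_decrypt(key, token))
```

There is no AES anywhere in this sketch, yet it is a keyed, invertible permutation on digit strings, which is exactly what a cipher is.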
