Storing UUID in HSQLDB database

HSQLDB has a built-in UUID type. Use that

CREATE TABLE t (
  id UUID PRIMARY KEY
);

  1. I would recommend char(36) instead of varchar(36). Not sure about hsqldb, but in many DBMS char is a little faster.

  2. For lookups, if the DBMS is smart, then you can use an integer value to "get closer" to your UUID.

For example, add an int column to your table as well as the char(36). When you insert into your table, insert the uuid.hashCode() into the int column. Then your searches can be like this

WHERE intCol = ? and uuid = ?

As I said, if hsqldb is smart like mysql or sql server, it will narrow the search by the intCol and then only compare at most a few values by the uuid. We use this trick to search through million+ record tables by string, and it is essentially as fast as an integer lookup.


You have a few options:

  • Store it as a VARCHAR(36), as you already have suggested. This will take 36 bytes (288 bits) of storage per UUID, not counting overhead.
  • Store each UUID in two BIGINT columns, one for the least-significant bits and one for the most-significant bits; use UUID#getLeastSignificantBits() and UUID#getMostSignificantBits() to grab each part and store it appropriately. This will take 128 bits of storage per UUID, not counting any overhead.
  • Store each UUID as an OBJECT; this stores it as the binary serialized version of the UUID class. I have no idea how much space this takes up; I'd have to run a test to see what the default serialized form of a Java UUID is.

The upsides and downsides of each approach is based on how you're passing the UUIDs around your app -- if you're passing them around as their string-equivalents, then the downside of requiring double the storage capacity for the VARCHAR(36) approach is probably outweighed by not having to convert them each time you do a DB query or update. If you're passing them around as native UUIDs, then the BIGINT method probably is pretty low-overhead.

Oh, and it's nice that you're looking to consider speed and storage space issues, but as many better than me have said, it's also good that you recognize that these might not be critically important given the amount of data your app will be storing and maintaining. As always, micro-optimization for the sake of performance is only important if not doing so leads to unacceptable cost or performance. Otherwise, these two issues -- the storage space of the UUIDs, and the time it takes to maintain and query them in the DB -- are reasonably low-importance given the cheap cost of storage and the ability of DB indices to make your life much easier. :)

Tags:

Java

Uuid

Hsqldb