Once a HiLo is in use, what happens if you change the capacity (maximum Lo)?

In general, a HiLo algorithm maps a pair of integers to a single integer ID, and guarantees that each pair of numbers is unique per database. Typically, the next step is to guarantee that each unique pair of numbers maps to a unique integer ID.

A nice explanation of how HiLo conceptually works is given in this previous SO answer

Changing the max_lo will preserve the property that your pair of numbers is unique. But will it also keep the mapped ID unique and collision-free?

Let's look at Hibernate's implementation of HiLo. From what I've gathered, the algorithm they appear to use is (and I might be off on a technicality):

h = high sequence (starting at 0)
l_size = size of low block
l = low sequence (starting at 1)

ID = h*l_size + l

So, if your low block is, say, 100, your reserved ID blocks would go 1-100, 101-200, 201-300, 301-400...
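The mapping above can be sketched in a few lines. This is a minimal illustration of the formula as described, not Hibernate's actual code:

```python
# Sketch of the HiLo mapping described above: each high value h
# reserves a block of l_size consecutive IDs.
def hilo_id(h, l, l_size):
    # h starts at 0; l runs from 1 to l_size within each block
    return h * l_size + l

def reserved_block(h, l_size):
    # the inclusive ID range reserved by high value h
    return (hilo_id(h, 1, l_size), hilo_id(h, l_size, l_size))

# With l_size = 100, the blocks come out as 1-100, 101-200, 201-300, ...
for h in range(4):
    print(reserved_block(h, 100))
```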

Suppose your high sequence is now 3, after those four blocks. What happens if you suddenly change your l_size to 10? On the next block, the high is incremented to 4, and you'd get 4*10+1 = 41.

Oops. This new value falls squarely within the "reserved block" of 1-100. Someone holding a high sequence of 0 would think, "Well, I have the range 1-100 reserved just for me, so I'll put a row at 41, because I know it's safe."

So there is a very, very high chance of collision when lowering your l_size.

What about the opposite case, raising it?

Back to our example, let's raise our l_size to 500 instead, turning the next key into 4*500+1 = 2001 and reserving the range 2001-2500.

It looks like collisions will be avoided, in this particular implementation of HiLo, when raising your l_size.
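Both cases can be checked with the same formula. This sketch assumes the mapping described above (ID = h*l_size + l) with high values 0-3 already used at l_size = 100:

```python
def hilo_id(h, l, l_size):
    return h * l_size + l

# Blocks already allocated with l_size = 100: high values 0..3, IDs 1-400.
# Lowering l_size to 10: the next block (h = 4) starts at 41,
# inside the old reserved block 1-100 -- a collision waiting to happen.
next_after_lowering = hilo_id(4, 1, 10)

# Raising l_size to 500: the next block starts at 2001,
# past every ID issued so far.
next_after_raising = hilo_id(4, 1, 500)

print(next_after_lowering, next_after_raising)
```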

Of course, you should run some tests of your own to make sure that this is the actual implementation, or close to it. One way would be to set l_size to 100 and record the first few keys, then set it to 500 and record the next ones. If you see a large jump like the one described here, you are probably safe.

However, I am not by any means suggesting that it is best practice to raise your l_size on an existing database.

Use your own discretion; the HiLo algorithm wasn't exactly made with a varying l_size in mind, and your results may in the end be unpredictable depending on your exact implementation. Maybe someone who has raised their l_size and run into trouble can confirm or correct this account.

So, in conclusion: even though Hibernate's HiLo implementation will, in theory, most likely avoid collisions when l_size is raised, it probably still isn't good practice. You should code as if l_size were never going to change.

But if you're feeling lucky...


See the Linear Chunk table allocator -- this is a logically simpler and more correct approach to the same problem.

What's the Hi/Lo algorithm?

By allocating ranges from the number space and representing the NEXT value directly, rather than complicating the logic with high words or multiplied numbers, you can see directly which keys will be generated.

Essentially, the "Linear Chunk" allocator uses addition rather than multiplication. If NEXT is 1000 and the configured range size is 20, NEXT advances to 1020 and we hold keys 1000-1019 for allocation.

The range size can be tuned or reconfigured at any time, without loss of integrity. There is a direct relationship between the allocator's NEXT field, the generated keys, and MAX(ID) existing in the table.

(By comparison, "Hi-Lo" uses multiplication. If NEXT is 50 and the multiplier is 20, then you're allocating keys around 1000-1019. There is no direct correlation between NEXT, the generated keys, and MAX(ID) in the table; it is difficult to adjust NEXT safely, and the multiplier can't be changed without disturbing the current allocation point.)

With "Linear Chunk", you can configure how large each range/chunk is -- a size of 1 is equivalent to the traditional table-based "single allocator" and hits the database to generate each key; a size of 10 is 10x faster, since it allocates a range of 10 at once; a size of 50 or 100 is faster still.
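The scheme described above can be sketched as follows. This is a hedged illustration, not a reference implementation: in a real system the NEXT value would live in a database row and be advanced with an atomic UPDATE, while here a plain field stands in for it.

```python
# Sketch of a "Linear Chunk" allocator: NEXT advances by addition,
# and the chunk size can be changed at any time without collisions.
class LinearChunkAllocator:
    def __init__(self, next_value=1, chunk_size=20):
        self.next_value = next_value      # the NEXT field (DB-backed in practice)
        self.chunk_size = chunk_size      # reconfigurable at any time
        self._lo = self._hi = next_value  # local range, initially empty

    def allocate(self):
        if self._lo >= self._hi:
            # grab a fresh chunk: [NEXT, NEXT + chunk_size)
            self._lo = self.next_value
            self.next_value += self.chunk_size
            self._hi = self.next_value
        key = self._lo
        self._lo += 1
        return key

alloc = LinearChunkAllocator(next_value=1000, chunk_size=20)
keys = [alloc.allocate() for _ in range(3)]  # 1000, 1001, 1002; NEXT is now 1020
alloc.chunk_size = 50  # safe: future chunks simply start at NEXT = 1020
```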

A size of 65536 generates ugly-looking keys, wastes vast numbers of keys on server restart, and is equivalent to Scott Ambler's original HI-LO algorithm.

In short, Hi-Lo is a needlessly complex and flawed approach to what should have been conceptually trivial -- allocating ranges along a number line.


I tried to unearth the behaviour of the HiLo algorithm through a simple hello-world-ish Hibernate application.

I tried a hibernate example with

<generator class="hilo">
    <param name="table">HILO_TABLE</param>
    <param name="column">TEST_HILO</param>
    <param name="max_lo">40</param>
</generator>

A table named "HILO_TABLE" was created with a single column, "TEST_HILO". Initially I set the value of the TEST_HILO column to 8:

update HILO_TABLE set TEST_HILO=8;

I observed that the pattern used to create IDs is

hivalue * lowvalue + hivalue

hivalue is the column value in the DB (i.e. select TEST_HILO from HILO_TABLE); lowvalue is from the config XML (40).

so in this case IDs started from 8*40 + 8 = 328

In my Hibernate example I added 200 rows in one session, so rows were created with IDs 328 to 527, and in the DB the hivalue was incremented to 13. The increment logic seems to be:

new hivalue in DB = initial value in DB + ceil(rows_inserted / lowvalue)

= 8 + ceil(200/40) = 8 + 5 = 13

Now if I run the same Hibernate program again to insert rows, the IDs should start from 13*40 + 13 = 533.

When I ran the program, this was confirmed.
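The observed arithmetic can be reproduced in a few lines. This is a sketch of the pattern deduced above, not Hibernate's source; `first_id` and `new_hivalue` are hypothetical helper names for the two formulas:

```python
import math

# id = hivalue * max_lo + hivalue, with hivalue read from HILO_TABLE
# and max_lo taken from the mapping (40 here).
def first_id(hivalue, max_lo):
    return hivalue * max_lo + hivalue

# hivalue advances once per block of max_lo rows, rounded up.
def new_hivalue(initial, rows_inserted, max_lo):
    return initial + math.ceil(rows_inserted / max_lo)

print(first_id(8, 40))          # 328, the first observed ID
print(new_hivalue(8, 200, 40))  # 13, matching the value seen in the DB
print(first_id(13, 40))         # 533, where the next run started
```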
