Should I use UUID as well as ID

UUIDs are a performance disaster for very large tables. (200K rows is not "very large".)

Your #3 is really bad when the CHARCTER SET is utf8 -- CHAR(36) occupies 108 bytes! Update: There are ROW_FORMATs for which this will stay 36.

UUIDs (GUIDs) are very "random". Using them as either a UNIQUE or a PRIMARY key on large tables is very inefficient. This is because of having to jump around the table/index each time you INSERT a new UUID or SELECT by UUID. When the table/index is too large to fit in cache (see innodb_buffer_pool_size, which must be smaller than RAM, typically 70%), the 'next' UUID may not be cached, hence a slow disk hit. When the table/index is 20 times as big as the cache, only 1/20th (5%) of hits are cached -- you are I/O-bound. Generalization: The inefficiency applies to any "random" access -- UUID / MD5 / RAND() / etc

So, don't use UUIDs unless either

  • you have "small" tables, or
  • you really need them because of generating unique ids from different places (and have not figured out another way to do it).

More on UUIDs: http://mysql.rjweb.org/doc.php/uuid (It includes functions for converting between standard 36-char UUIDs and BINARY(16).) Update: MySQL 8.0 has a builtin function for such.

Having both a UNIQUE AUTO_INCREMENT and a UNIQUE UUID in the same table is a waste.

  • When an INSERT occurs, all unique/primary keys must be checked for duplicates.
  • Either unique key is sufficient for InnoDB's requirement of having a PRIMARY KEY.
  • BINARY(16) (16 bytes) is somewhat bulky (an argument against making it the PK), but not that bad.
  • The bulkiness matters when you have secondary keys. InnoDB silently tacks the PK onto the end of each secondary key. The main lesson here is to minimize the number of secondary keys, especially for very large tables. Elaboration: For one secondary key, the bulkiness debate usually ends in a draw. For 2 or more secondary keys, a fatter PK usually leads to a bigger disk footprint for the table including its indexes.

For comparision: INT UNSIGNED is 4 bytes with range of 0..4 billion. BIGINT is 8 bytes.

Italics Updates/etc were added Sep, 2017; nothing critical changed.

Addressing Ivan's answer and the link he provided.

  • Yes, there are benefits of a distributed id-generator. But...
  • There are other ways to avoid the delay of a centralized ID server -- Dispense 100 consecutive ids at a time to the client. Or use a client-generated concatenation of client_id and time.
  • If the entire index is cached in RAM, then much of my argument is moot. But once it spills to disk, performance suffers severely. His article does briefly point out that UUIDv4 is slower than UUIDv1, but fails to point out that the bits need rearranging to achieve the benefit.
  • I agree with squeezing a 36-byte UUID into a 16-byte BINARY. MySQL 8 even has a function to do such. Furthermore, it (and my blog) rearrange the bits so that UUIDv1 has the temporal characteristic of an auto_increment.
  • Sharding, I contend, is not normally done against auto_increment ids, so switching to UUIDs does not buy anything.

'Rick James' said in the accepted answer:

"Having both a UNIQUE AUTO_INCREMENT and a UNIQUE UUID in the same table is a waste".

But this test (I did it on my machine) show different facts.

For example: with the test (T2) I make table with (INT AUTOINCREMENT) PRIMARY and UNIQUE BINARY(16) and another field as title, then I insert more than 1.6M rows with very good performance, but with another testing (T3) I did the same but the result is slow after inserting 300,000 rows only.

This is my testing result:

T1:
char(32) UNIQUE with auto increment int_id
after: 1,600,000
10 sec for inserting 1000 rows
select + (4.0)
size:500mb

T2:
binary(16) UNIQUE with auto increment int_id
after: 1,600,000
1 sec for inserting 1000 rows
select +++ (0.4)
size:350mb

T3:
binary(16) UNIQUE without auto increment int_id
after: 350,000
5 sec for inserting 1000 rows
select ++ (0.3)
size:118mb (~ for 1,600,000 will be 530mb)

T4:
auto increment int_id without binary(16) UNIQUE
++++

T5:
uuid_short() int_id without binary(16) UNIQUE
+++++*

So binary(16) UNIQUE with auto increment int_id is better than binary(16) UNIQUE without auto increment int_id.

Update:

I make the same test again and record more details. this is full code and results comparison between (T2) and (T3) as explained above.

(T2) create tbl2 (mysql):

CREATE TABLE test.tbl2 (
  int_id INT(11) NOT NULL AUTO_INCREMENT,
  rec_id BINARY(16) NOT NULL,
  src_id BINARY(16) DEFAULT NULL,
  rec_title VARCHAR(255) DEFAULT NULL,
  PRIMARY KEY (int_id),
  INDEX IDX_tbl1_src_id (src_id),
  UNIQUE INDEX rec_id (rec_id)
)
ENGINE = INNODB
CHARACTER SET utf8
COLLATE utf8_general_ci;

(T3) create tbl3 (mysql):

CREATE TABLE test.tbl3 (
  rec_id BINARY(16) NOT NULL,
  src_id BINARY(16) DEFAULT NULL,
  rec_title VARCHAR(255) DEFAULT NULL,
  PRIMARY KEY (rec_id),
  INDEX IDX_tbl1_src_id (src_id)
)
ENGINE = INNODB
CHARACTER SET utf8
COLLATE utf8_general_ci;

This is full testing code, it is inserting 600,000 records into tbl2 or tbl3 (vb.net code):

Public Class Form1

    Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
        Dim res As String = ""
        Dim i As Integer = 0
        Dim ii As Integer = 0
        Dim iii As Integer = 0

        Using cn As New SqlClient.SqlConnection
            cn.ConnectionString = "Data Source=.\sql2008;Integrated Security=True;User Instance=False;MultipleActiveResultSets=True;Initial Catalog=sourcedb;"
            cn.Open()
            Using cmd As New SqlClient.SqlCommand
                cmd.Connection = cn
                cmd.CommandTimeout = 0
                cmd.CommandText = "select recID, srcID, rectitle from textstbl order by ID ASC"

                Using dr As SqlClient.SqlDataReader = cmd.ExecuteReader

                    Using mysqlcn As New MySql.Data.MySqlClient.MySqlConnection
                        mysqlcn.ConnectionString = "User Id=root;Host=localhost;Character Set=utf8;Pwd=1111;Database=test"
                        mysqlcn.Open()

                        Using MyCommand As New MySql.Data.MySqlClient.MySqlCommand
                            MyCommand.Connection = mysqlcn

                            MyCommand.CommandText = "insert into tbl3 (rec_id, src_id, rec_title) values (UNHEX(@rec_id), UNHEX(@src_id), @rec_title);"
                            Dim MParm1(2) As MySql.Data.MySqlClient.MySqlParameter
                            MParm1(0) = New MySql.Data.MySqlClient.MySqlParameter("@rec_id", MySql.Data.MySqlClient.MySqlDbType.String)
                            MParm1(1) = New MySql.Data.MySqlClient.MySqlParameter("@src_id", MySql.Data.MySqlClient.MySqlDbType.String)
                            MParm1(2) = New MySql.Data.MySqlClient.MySqlParameter("@rec_title", MySql.Data.MySqlClient.MySqlDbType.VarChar)

                            MyCommand.Parameters.AddRange(MParm1)
                            MyCommand.CommandTimeout = 0

                            Dim mytransaction As MySql.Data.MySqlClient.MySqlTransaction = mysqlcn.BeginTransaction()
                            MyCommand.Transaction = mytransaction

                            Dim sw As New Stopwatch
                            sw.Start()

                            While dr.Read
                                MParm1(0).Value = dr.GetValue(0).ToString.Replace("-", "")
                                MParm1(1).Value = EmptyStringToNullValue(dr.GetValue(1).ToString.Replace("-", ""))
                                MParm1(2).Value = gettitle(dr.GetValue(2).ToString)

                                MyCommand.ExecuteNonQuery()

                                i += 1
                                ii += 1
                                iii += 1

                                If i >= 1000 Then
                                    i = 0

                                    Dim ts As TimeSpan = sw.Elapsed
                                    Me.Text = ii.ToString & " / " & ts.TotalSeconds

                                    Select Case ii
                                        Case 10000, 50000, 100000, 200000, 300000, 400000, 500000, 600000, 700000, 800000, 900000, 1000000
                                            res &= "On " & FormatNumber(ii, 0) & ": last inserting 1000 records take: " & ts.TotalSeconds.ToString & " second." & vbCrLf
                                    End Select

                                    If ii >= 600000 Then GoTo 100
                                    sw.Restart()
                                End If
                                If iii >= 5000 Then
                                    iii = 0

                                    mytransaction.Commit()
                                    mytransaction = mysqlcn.BeginTransaction()

                                    sw.Restart()
                                End If
                            End While
100:
                            mytransaction.Commit()

                        End Using
                    End Using
                End Using
            End Using
        End Using

        TextBox1.Text = res
        MsgBox("Ok!")
    End Sub

    Public Function EmptyStringToNullValue(MyValue As Object) As Object
        'On Error Resume Next
        If MyValue Is Nothing Then Return DBNull.Value
        If String.IsNullOrEmpty(MyValue.ToString.Trim) Then
            Return DBNull.Value
        Else
            Return MyValue
        End If
    End Function

    Private Function gettitle(p1 As String) As String
        If p1.Length > 255 Then
            Return p1.Substring(0, 255)
        Else
            Return p1
        End If
    End Function

End Class

The Result for (T2):

On 10,000: last inserting 1000 records take: 0.13709 second.
On 50,000: last inserting 1000 records take: 0.1772109 second.
On 100,000: last inserting 1000 records take: 0.1291394 second.
On 200,000: last inserting 1000 records take: 0.5793488 second.
On 300,000: last inserting 1000 records take: 0.1296427 second.
On 400,000: last inserting 1000 records take: 0.6938583 second.
On 500,000: last inserting 1000 records take: 0.2317799 second.
On 600,000: last inserting 1000 records take: 0.1271072 second.

~3 Minutes ONLY! to insert 600,000 records.
table size: 128 mb.

The Result for (T3):

On 10,000: last inserting 1000 records take: 0.1669595 second.
On 50,000: last inserting 1000 records take: 0.4198369 second.
On 100,000: last inserting 1000 records take: 0.1318155 second.
On 200,000: last inserting 1000 records take: 0.1979358 second.
On 300,000: last inserting 1000 records take: 1.5127482 second.
On 400,000: last inserting 1000 records take: 7.2757161 second.
On 500,000: last inserting 1000 records take: 14.3960671 second.
On 600,000: last inserting 1000 records take: 14.9412401 second.

~40 Minutes! to insert 600,000 records.
table size: 164 mb.