What does having the primary key as the last column in a composite secondary index in an InnoDB table do?

No. If your table has the InnoDB engine and the PRIMARY KEY is (pet_id), then defining a secondary index as (person_id) or (person_id, pet_id) makes no difference.

The index includes the pet_id column as well so values are sorted as (person_id, pet_id) in both cases.

A query like the one you have:

SELECT pet_id FROM yourtable 
WHERE person_id = 127 
ORDER BY pet_id ;

will need to access only the index to get the values and even more, it won't need to do any sort, as the pet_id values are already sorted in the index. You can verify this by looking at the execution plans (EXPLAIN):

First, we try with a MyISAM table:

 CREATE TABLE table pets 
 ( pet_id int not null auto_increment PRIMARY KEY, 
   person_id int not null, 
   INDEX person_ix (person_id)
 ) ENGINE = myisam ;

INSERT INTO pets (person_id) 
VALUES (1),(2),(3),(1),(2),(3),(4),(1),(8),(1),(2),(3) ;

mysql> EXPLAIN SELECT pet_id FROM pets 
               WHERE person_id = 2  
               ORDER BY pet_id asc \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: pets
         type: ref
possible_keys: person_ix
          key: person_ix
      key_len: 4
          ref: const
         rows: 3
        Extra: Using where; Using filesort
1 row in set (0.00 sec)

Notice the filesort!

Now, MyISAM with composite index:

 DROP TABLE IF EXISTS pets ;

 CREATE TABLE table pets 
 ( pet_id int not null auto_increment PRIMARY KEY, 
   person_id int not null, 
   INDEX person_ix (person_id, pet_id)            -- composite index
 ) ENGINE = myisam ;

INSERT INTO pets (person_id) 
VALUES (1),(2),(3),(1),(2),(3),(4),(1),(8),(1),(2),(3) ;


mysql> EXPLAIN SELECT pet_id FROM pets 
               WHERE person_id = 2  
               ORDER BY pet_id asc \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: pets
         type: ref
possible_keys: person_ix
          key: person_ix
      key_len: 4
          ref: const
         rows: 3
        Extra: Using where; Using index
1 row in set (0.00 sec)

Filesort is gone, as expected.

Now lets try the same with InnoDB engine:

 DROP TABLE IF EXISTS pets ;

 CREATE TABLE table pets 
 ( pet_id int not null auto_increment PRIMARY KEY, 
   person_id int not null, 
   INDEX person_ix (person_id)            -- simple index
 ) ENGINE = innodb ;                      -- InnoDB engine

INSERT INTO pets (person_id) 
VALUES (1),(2),(3),(1),(2),(3),(4),(1),(8),(1),(2),(3) ;

mysql> EXPLAIN SELECT pet_id FROM pets 
               WHERE person_id = 2  
               ORDER BY pet_id asc \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: pets
         type: ref
possible_keys: person_ix
          key: person_ix
      key_len: 4
          ref: const
         rows: 3
        Extra: Using where; Using index
1 row in set (0.00 sec)

No filesort either! Even though the index does not explicitly have the pet_id column, the values are there and sorted. You can check that if you define the index with (person_id, pet_id), the EXPLAIN is identical.

Lets actually do it, with InnoDB and the composite index:

 DROP TABLE IF EXISTS pets ;

 CREATE TABLE table pets 
 ( pet_id int not null auto_increment PRIMARY KEY, 
   person_id int not null, 
   INDEX person_ix (person_id, pet_id)    -- composite index
 ) ENGINE = innodb ;                      -- InnoDB engine

INSERT INTO pets (person_id) 
VALUES (1),(2),(3),(1),(2),(3),(4),(1),(8),(1),(2),(3) ;

mysql> EXPLAIN SELECT pet_id FROM pets 
               WHERE person_id = 2  
               ORDER BY pet_id asc \G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: pets
         type: ref
possible_keys: person_ix
          key: person_ix
      key_len: 4
          ref: const
         rows: 3
        Extra: Using where; Using index
1 row in set (0.00 sec)

Identical plans with the previous case.

To be 100% sure, I also run the last 2 cases (InnoDB engine, with single and composite indexes) enabling the file_per_table setting and adding a few thousands rows in the table:

DROP TABLE IF EXISTS ... ;
CREATE TABLE ... ;

mysql> INSERT INTO pets (person_id) 
       VALUES (1),(2),(3),(1),(2),(3),(4),(1),(8),(1),(2),(3) ;
Query OK, 12 rows affected (0.00 sec)
Records: 12  Duplicates: 0  Warnings: 0

mysql> INSERT INTO pets (person_id) 
       VALUES (1),(2),(3),(1),(2),(3),(4),(1),(8),(1),(2),(3),(127) ;
Query OK, 13 rows affected (0.00 sec)
Records: 13  Duplicates: 0  Warnings: 0

mysql> INSERT INTO pets (person_id) 
       VALUES (1),(2),(3),(1),(2),(3),(4),(1),(8),(1),(2),(3),(127) ;
Query OK, 13 rows affected (0.00 sec)
Records: 13  Duplicates: 0  Warnings: 0

mysql> INSERT INTO pets (person_id) 
       SELECT a.person_id+b.person_id-1 
       FROM pets a CROSS JOIN pets b CROSS JOIN pets c ;
Query OK, 54872 rows affected (0.47 sec)
Records: 54872  Duplicates: 0  Warnings: 0

In both cases, checking the actual file sizes, yields identical results:

ypercube@apollo:~$ sudo ls -la /var/lib/mysql/x/ | grep pets
-rw-rw----  1 mysql mysql     8604 Apr 21 07:25 pets.frm
-rw-rw----  1 mysql mysql 11534336 Apr 21 07:25 pets.ibd

According to the MySQL Documentation on the Clustered and Secondary Indexes

How Secondary Indexes Relate to the Clustered Index

All indexes other than the clustered index are known as secondary indexes. In InnoDB, each record in a secondary index contains the primary key columns for the row, as well as the columns specified for the secondary index. InnoDB uses this primary key value to search for the row in the clustered index.

If the primary key is long, the secondary indexes use more space, so it is advantageous to have a short primary key.

Therefore, adding the PRIMARY KEY to a secondary index is definitely redundant. Your index entry would like (person_id, pet_id, pet_id). This would also needlessly bloat the secondary index by having 2 copies of the PRIMARY KEY.

For the index with (person_id), if you were to run a query like this

SELECT * FROM yourtable WHERE person_id = 127 ORDER BY pet_id;

The PRIMARY KEY would be fully engaged in this query and produces the results ordered by PRIMARY KEY anyway. From a physical standpoint, the rows are ordered by insertion order. If the pet_id is AUTO_INCREMENT, then it is order by the auto number.

Tip 1:

PRIMARY KEY(x, id),
INDEX(id) -- where `id` is `AUTO_INCREMENT`

is perfectly valid. It has the performance advantage of being more efficient when many queries need to find multiple rows WHERE x = 123. That is, it is slightly more efficient than the 'obvious'

PRIMARY KEY(id),
INDEX(x, id)

The only rule about AUTO_INCREMENT (for InnoDB) is that id must be the first column in some index. Note that that rule says nothing about PRIMARY or UNIQUE or 'only column'.

The tip is useful for huge tables that are often fetched by x together with other stuff.

Tip 2: Suppose you have

SELECT name FROM tbl WHERE person_id = 12 AND pet_id = 34;

This is a "covering" index:

INDEX(person_id, pet_id, name)

That is, the entire query can be done inside the index's BTree. The EXPLAIN will say "Using index".

What does having the primary key as the last column in a composite secondary index in an InnoDB table do?

Tags:

Mysql

Primary Key

Innodb

Clustered Index

Related

Recent Posts