Finding and fixing InnoDB index corruption

Thanks Rolando and Michael for your responses.

To close the loop, here's the answer I came up with for my original questions:

  • Q: How can I determine if other tables have similar index corruption?
  • A: Use CHECK TABLE. I ran mysqlcheck -c on all of the relevant InnoDB tables to find out which ones had index corruption.

  • Q: What's the most efficient way to fix the corrupt indexes?

  • A: Use OPTIMIZE TABLE to rebuild the InnoDB tables that have corrupt indexes. This causes a complete table rebuild, which fixes the corruption.
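As a rough illustration of the check step above, the sketch below filters mysqlcheck -c output down to the tables that reported an error. It assumes the usual mysqlcheck layout, where a healthy table prints "db.table OK" on one line and a problem table prints its name alone, followed by "type : message" lines. The function name flag_corrupt_tables is made up for this example.

```shell
# Print the names of tables for which mysqlcheck reported an error.
# Assumed input layout (verify against your own mysqlcheck output):
#   db.table    OK              <- healthy table, single line
#   db.table                    <- problem table, name on its own line
#   error    : Corrupt          <- followed by "type : message" lines
flag_corrupt_tables() {
  awk '
    # A bare "db.table" line starts a message group for that table.
    NF == 1 && $1 ~ /\./ { current = $1; next }
    # "db.table ... OK" on a single line: nothing to report.
    NF >= 2 && $NF == "OK" && $1 ~ /\./ { current = ""; next }
    # An "error : ..." line inside a group flags the current table once.
    tolower($1) == "error" && current != "" { print current; current = "" }
  '
}
```

Typical use would be something like mysqlcheck -c mydb | flag_corrupt_tables (mydb is a placeholder), after which each listed table can be rebuilt with OPTIMIZE TABLE or mysqlcheck -o.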

You may have the easiest solution there is. However, I would love to clarify some things:

Making a snapshot of a running MySQL Instance could affect the one file that is responsible for secondary index manipulation: ibdata1.

The system tablespace ibdata1 is the home of 7 classes of InnoDB Information

  • Data Pages (if innodb_file_per_table disabled)
  • Index Pages (if innodb_file_per_table disabled)
  • Data Dictionary (Included List of Tables and Their TableSpace IDs)
  • Double Write Buffer (Provides Checksum Info to Prevent Data Corruption)
  • Insert Buffer (Changes to Secondary Indexes)
  • Redo Logs
  • Undo Logs

The pivotal classes I would worry about are the Double Write Buffer and the Insert Buffer. Doing a Live Snapshot with either of these not properly written will introduce data corruption.

Doing FLUSH TABLES WITH READ LOCK; does not halt writes to ibdata1 as one would think. I wrote about this subject before. I used to think it did, until it was pointed out by fellow DBA.SE member @ShlomiNoach.

Think about the InnoDB Buffer Pool. You would have to flush every dirty page out of it to get everything quiesced to disk. The following forces all dirty pages out on a per-table basis:

  • SET GLOBAL innodb_fast_shutdown = 0; followed by service mysql stop
  • SET GLOBAL innodb_max_dirty_pages_pct = 0; and wait until less than 1% of the Buffer Pool is dirty
  • mysqldump
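For the second option, one way to watch the dirty pages drain is to poll the Innodb_buffer_pool_pages_dirty and Innodb_buffer_pool_pages_total status variables. The following is only a sketch: the function names are invented for illustration, and it assumes the mysql client can log in without prompting (for example via ~/.my.cnf).

```shell
# Percentage of buffer pool pages that are dirty, given the two page counts.
dirty_pct() {
  awk -v dirty="$1" -v total="$2" \
    'BEGIN { printf "%.2f\n", (total > 0 ? 100 * dirty / total : 0) }'
}

# Read one numeric status variable from the server.
# Assumption: credentials come from ~/.my.cnf or similar.
status_value() {
  mysql -N -B -e "SHOW GLOBAL STATUS LIKE '$1'" | awk '{ print $2 }'
}

# Ask InnoDB to flush aggressively, then poll until the dirty share
# falls below the threshold (default: 1 percent).
wait_for_clean_pool() {
  threshold=${1:-1}
  mysql -e "SET GLOBAL innodb_max_dirty_pages_pct = 0" || return 1
  while :; do
    pct=$(dirty_pct "$(status_value Innodb_buffer_pool_pages_dirty)" \
                    "$(status_value Innodb_buffer_pool_pages_total)")
    echo "dirty pages: ${pct}%"
    # awk exits 0 (success) once pct is below the threshold.
    if awk -v p="$pct" -v t="$threshold" 'BEGIN { exit !(p < t) }'; then
      break
    fi
    sleep 5
  done
}
```

Remember to restore innodb_max_dirty_pages_pct to its previous value afterwards, since leaving it at 0 keeps InnoDB flushing aggressively.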

Also, do not forget that binary logs and relay logs depend on the OS for flushing.

An EC2 Snapshot is not MySQL-Aware in these respects, no more than an LVM snapshot would be. That is why backup software such as CDP's R1Soft has a MySQL Module for such occasions.

Contrariwise, an Amazon RDS MySQL Instance is aware and built for such InnoDB-centric scenarios. Only if there are active MyISAM tables in the RDS Instance would FLUSH TABLES WITH READ LOCK; be a necessary evil to perform manually.

In regards to your original question, when you ran ALTER TABLE my_table ENGINE=InnoDB; you simply rebuilt index pages reading from data pages of the table, most likely bypassing the ibdata1 Insert Buffer. This is why that worked for you.

If you can do a mysqldump with --single-transaction --master-data=1 on the Master, scp the mysqldump to the slave, and do it without Financial Charges, that would be a safer method for setting up the EC2 Slave.
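The dump-and-copy approach could be scripted roughly as follows. This is a sketch under assumptions: --all-databases, the dump path, and the slave host name are placeholders not taken from the question, and credentials are expected to come from ~/.my.cnf and SSH keys. Setting DRY_RUN=1 prints the commands instead of running them, which is handy for reviewing before touching the Master.

```shell
# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Dump the Master consistently and copy the file to the slave.
#   $1 = slave host (placeholder), $2 = dump file path (placeholder default)
seed_slave_from_dump() {
  slave_host=$1
  dump_file=${2:-/tmp/master_dump.sql}
  # --single-transaction: consistent InnoDB snapshot without locking tables
  # --master-data=1: writes the CHANGE MASTER TO coordinates into the dump
  # --all-databases: an assumption; narrow this to the schemas you need
  run mysqldump --single-transaction --master-data=1 --all-databases \
      --result-file="$dump_file" || return 1
  run scp "$dump_file" "$slave_host:$dump_file"
}
```

For example, DRY_RUN=1 with seed_slave_from_dump slave.example.com prints the two commands that would be executed. On the slave, loading the dump and running START SLAVE completes the setup.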

If you must do the snapshot thing, please do this on the Slave:

  • Run SET GLOBAL innodb_fast_shutdown = 0;
  • service mysql stop
  • Do your snapshot
  • service mysql start
  • Add innodb_fast_shutdown = 0 to /etc/my.cnf
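The snapshot steps above can be wrapped in a small script so that MySQL is always restarted, even if the snapshot itself fails. This is a minimal sketch: the snapshot command is passed in by the caller rather than hard-coded, the function name is invented for illustration, and DRY_RUN=1 only prints what would run.

```shell
# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Usage: snapshot_with_clean_shutdown <snapshot command and its arguments>
snapshot_with_clean_shutdown() {
  # Force a full purge and insert buffer merge at shutdown.
  run mysql -e "SET GLOBAL innodb_fast_shutdown = 0" || return 1
  run service mysql stop || return 1
  # Take the snapshot with whatever tool the caller supplied.
  run "$@"
  snap_status=$?
  # Bring MySQL back regardless of how the snapshot went.
  run service mysql start || return 1
  return "$snap_status"
}
```

Run it on the Slave with your own snapshot command as the arguments. A dry run first shows the exact sequence without stopping anything.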

Additional Information: The smaller ib_logfile0 and ib_logfile1 are, the faster the shutdown.

I hope this explains a lot.

UPDATE 2013-01-14 10:36 EDT

You recently asked in the comment section

how can I tell if a database has index corruption?

Keep in mind you are using EC2 and not RDS. With RDS, Amazon is responsible for the holistic state of the VM and the MySQL Instance. With EC2, Amazon is responsible for the VM only. The holistic state of the MySQL Instance now rests with you. You may want to port the database to RDS because it comes with the additional bells and whistles to guard against such corruption.
