Error Code 1117: Too many columns; MySQL's column limit on a table

Why would you need to create a table with even 20 columns, let alone 2000?

Granted, denormalized data can prevent having to do JOINs to retrieve many columns of data. However, if you have over 10 columns, you should stop and think about what would happen under the hood during data retrieval.

If a 2000-column table undergoes SELECT * FROM ... WHERE, you generate large temporary tables during processing, fetch columns that are unnecessary, and create many scenarios where communication packets (max_allowed_packet) are pushed to the brink on every query.

In my earlier days as a developer, I worked at a company back in 1995 where DB2 was the main RDBMS. The company had a single table with 270 columns and dozens of indexes, and it had performance issues retrieving data. They contacted IBM and had consultants look over the architecture of their system, including this one monolithic table. The company was told: "If you do not normalize this table in the next 2 years, DB2 will fail on queries doing Stage2 Processing (any queries requiring sorting on non-indexed columns)." That was said to a multi-trillion-dollar company about normalizing a 270-column table. How much more so a 2000-column table.

In terms of MySQL, you would have to compensate for such bad design by setting options comparable to DB2 Stage2 Processing. In this case, those options would be:

  • max_allowed_packet
  • tmp_table_size
  • max_tmp_tables
  • max_heap_table_size
  • max_length_for_sort_data
  • max_sort_length
  • sort_buffer_size
  • myisam_max_sort_file_size
  • myisam_sort_buffer_size
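As a rough illustration only, such compensation might look like this in my.cnf. The values below are arbitrary placeholders, not recommendations, and note that some of these variables (e.g. max_tmp_tables) have been deprecated or removed in newer MySQL versions:

```ini
[mysqld]
# Arbitrary illustrative values -- tune to your workload and available RAM
max_allowed_packet        = 256M
tmp_table_size            = 1G
max_heap_table_size       = 1G   # in-memory temp tables are capped by the
                                 # smaller of this and tmp_table_size
max_length_for_sort_data  = 8192
max_sort_length           = 8192
sort_buffer_size          = 64M
myisam_max_sort_file_size = 100G
myisam_sort_buffer_size   = 1G
```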

Tweaking these settings to make up for the presence of dozens, let alone hundreds, of columns works well only if you have TBs of RAM.

This problem multiplies geometrically if you use InnoDB, as you will have to deal with MVCC (Multiversion Concurrency Control) trying to protect tons of columns on each SELECT, UPDATE, and DELETE through transaction isolation.

CONCLUSION

There is no substitute or band-aid that can make up for bad design. Please, for the sake of your future sanity, normalize that table today!


I'm having trouble imagining anything where the data model could legitimately contain 2000 columns in a properly normalised table.

My guess is that you're probably doing some sort of "fill in the blanks" denormalised schema, where you're actually storing all different sorts of data in the one table, and instead of breaking the data out into separate tables and making relations, you've got various fields that record what "type" of data is stored in a given row, and 90% of your fields are NULL. Even then, though, to want to get to 2000 columns... yikes.

The solution to your problem is to rethink your data model. If you're storing a great pile of key/value data that's associated with a given record, why not model it that way? Something like:

CREATE TABLE master (
    id INT PRIMARY KEY AUTO_INCREMENT
    -- plus the fields that really do relate to the
    -- master records on a 1-to-1 basis
);

CREATE TABLE sensor_readings (
    id INT PRIMARY KEY AUTO_INCREMENT,
    master_id INT NOT NULL,   -- The id of the record in the
                              -- master table this field belongs to
    sensor_id INT NOT NULL,
    value VARCHAR(255)
);

CREATE TABLE sensors (
    id INT PRIMARY KEY AUTO_INCREMENT
    -- plus the fields relating to sensors
);

Then to get all of the sensor entries associated with a given "master" record, you can just SELECT sensor_id,value FROM sensor_readings WHERE master_id=<some master ID>. If you need to get the data for a record in the master table along with all of the sensor data for that record, you can use a join:

SELECT master.*, sensor_readings.sensor_id, sensor_readings.value
FROM master
INNER JOIN sensor_readings ON master.id = sensor_readings.master_id
WHERE master.id = <some ID>;

And then further joins if you need details of what each sensor is.
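For instance, that extra join might look something like this (assuming, hypothetically, that the sensors table carries a name column, which is not shown above):

```sql
-- Hypothetical example: assumes sensors has a 'name' column
SELECT sensor_readings.value, sensors.name
FROM sensor_readings
INNER JOIN sensors ON sensors.id = sensor_readings.sensor_id
WHERE sensor_readings.master_id = <some ID>;
```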


It's a measurement system with 2000 sensors

Ignore all the comments shouting about normalization: what you are asking for could be sensible database design (in an ideal world) and perfectly well normalized. It is just very unusual, and, as pointed out elsewhere, RDBMSs are simply not designed for this many columns.

Although you are not hitting the MySQL hard limit, one of the other factors mentioned in the link is probably preventing you from going higher.

As others suggest, you could work around this limitation by having a child table with id, sensor_id, sensor_value. More simply, you could create a second table containing just the columns that will not fit in the first (and use the same PK).
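A minimal sketch of that second approach (vertical partitioning), using hypothetical table and column names:

```sql
-- Split the columns across two tables that share the same primary key
CREATE TABLE readings_part1 (
    master_id   INT PRIMARY KEY,
    sensor_0001 FLOAT,
    -- ... as many columns as the row format allows ...
    sensor_1000 FLOAT
);

CREATE TABLE readings_part2 (
    master_id   INT PRIMARY KEY,
    sensor_1001 FLOAT,
    -- ...
    sensor_2000 FLOAT
);

-- Reassemble a full logical row with a join on the shared key
SELECT *
FROM readings_part1
INNER JOIN readings_part2 USING (master_id)
WHERE master_id = <some ID>;
```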
