How to store schema-less data efficiently in a relational database?

Some really nasty problems would happen with these tables (idea)

  1. Data Redundancy (duplicated data) what you need to keep in synchronisation (and you do you check this because MySQL doesn't have an nice JSON to records function?)

  2. You can't force correct values on the database (say goodbye to data integrity and hello to garbage-in garbage-out data) example key birthdate could have value "hello"

And you need some kind of pivot query to fetch the key/values

If you really need dynamic key values (EAV) storage there are some more options:

  • Maybe MySQL 5.6 NoSQL using the InnoDB memcached Plugin
  • XML -> query with xpath
  • RDF
  • NoSQL like MongoDB
  • MariaDB dynamic columns and MariaDB also supports json.

From your sample data i actually see a schema for the table

UserID | UserName | City | Birthdate

If you're using a relational database with the right indexing a search on that would be very fast. If some values could be null then allow nullable columns.

A key value store is great in situations but can become very cumbersome when it comes to data validation. If you are using a relational database product put it to work and let it do the validation on the input side with properly typed columns.

If you really want to use a key value store, a NoSQL product optimized for them would probably be the better way to go.


Given your requirements (MySQL and "schema-less"), I think it might be worth considering Option 1 with a twist. Consider XML instead of JSON.

Why? Because MySql doesn't give you an easy way to process JSON, but it does with xml.

Consider this:

id  |  name   | meta  
1      Bob      <city>GoTown</city><birthdate>1980-08-14&</birthdate>
2      Alice    <city>Streamville</city><birthdate>1986-06-06<birthdate>

Now you can do a sql query using xpath expressions on the content of the meta field. Like so:

SELECT name, extractValue( meta, '//birthdate' ) as birthdate
from tbl

You can even do selects based on the the city (which of course will be VERY inefficient since they will have to parse the contents of the fields that match the where clause, but:

SELECT name<BR>
from tbl<br>
where id < 100
having extractValue( meta, '//city' ) = 'Streamille'

You can even use UpdateXML (another MySQL function) to make changes to the content of your field all within MySQL.

While I know my response is late, I hope someone finds this useful / interesting. :)