What to set innodb_buffer_pool_size to, and why?
The biggest table you have makes up 16.47% (28/170) of the total data. Even if that table were heavily written to and heavily read from, not all 28G of it would be loaded in the buffer pool at any given moment. What you need to calculate is how much of the InnoDB Buffer Pool is in use at any given moment on the current DB Server.
Here is a more granular way to determine innodb_buffer_pool_size for a new DB Server given the dataset currently loaded in the current DB Server's InnoDB Buffer Pool.
Run the following on your current MySQL Instance (server you are migrating from)
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_data';  -- IBPDataPages
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_pages_total'; -- IBPTotalPages
SHOW GLOBAL STATUS LIKE 'Innodb_page_size';               -- IPS
Run the formula
IBPPctFull = IBPDataPages * 100.0 / IBPTotalPages.
SET @IBPDataPages = (SELECT VARIABLE_VALUE FROM information_schema.global_status WHERE VARIABLE_NAME = 'Innodb_buffer_pool_pages_data');
-- SELECT @IBPDataPages;
SET @IBPTotalPages = (SELECT VARIABLE_VALUE FROM information_schema.global_status WHERE VARIABLE_NAME = 'Innodb_buffer_pool_pages_total');
-- SELECT @IBPTotalPages;
SET @IBPPctFull = CAST(@IBPDataPages * 100.0 / @IBPTotalPages AS DECIMAL(5,2));
SELECT @IBPPctFull;
If IBPPctFull is 95% or more, you should set innodb_buffer_pool_size to 75% of the DB Server's RAM.
If IBPPctFull is less than 95%, run this formula:
IBPSize = IPS * IBPDataPages / (1024*1024*1024) * 1.05
SET @IPS = (SELECT VARIABLE_VALUE FROM information_schema.global_status WHERE VARIABLE_NAME = 'Innodb_page_size');
-- SELECT @IPS;
SET @IBPDataPages = (SELECT VARIABLE_VALUE FROM information_schema.global_status WHERE VARIABLE_NAME = 'Innodb_buffer_pool_pages_data');
-- SELECT @IBPDataPages;
SET @IBPSize = concat(ROUND(@IPS * @IBPDataPages / (1024*1024*1024) * 1.05, 2), ' GB');
SELECT @IBPSize;
The number for IBPSize (in GB) is the number that more closely fits your actual working dataset.
Now, if IBPSize is still too big for the biggest Amazon EC2 RAM Config, use 75% of the RAM for the Amazon EC2 DB Server.
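For anyone who prefers to plug the numbers in by hand, the steps above can be condensed into a small helper. This is only a sketch in Python, not part of the original procedure: the function name `recommend_buffer_pool_gb` and its parameters are hypothetical, and it reads "too big" as "larger than 75% of the target server's RAM", which is an assumption on my part.

```python
def recommend_buffer_pool_gb(data_pages, total_pages, page_size, ram_gb):
    """Suggest innodb_buffer_pool_size in GB, following the steps above.

    data_pages  -- Innodb_buffer_pool_pages_data  (IBPDataPages)
    total_pages -- Innodb_buffer_pool_pages_total (IBPTotalPages)
    page_size   -- Innodb_page_size in bytes      (IPS)
    ram_gb      -- RAM of the new DB server in GB (hypothetical input)
    """
    ram_cap_gb = 0.75 * ram_gb

    # IBPPctFull = IBPDataPages * 100.0 / IBPTotalPages
    pct_full = data_pages * 100.0 / total_pages
    if pct_full >= 95.0:
        # The pool is effectively full, so fall back to 75% of RAM.
        return round(ram_cap_gb, 2)

    # IBPSize = IPS * IBPDataPages / (1024*1024*1024) * 1.05
    working_set_gb = page_size * data_pages / (1024 ** 3) * 1.05

    # Assumption: never recommend more than 75% of the server's RAM.
    return round(min(working_set_gb, ram_cap_gb), 2)
```

Calling `recommend_buffer_pool_gb(655360, 1310720, 16384, 64)` describes a pool that is half full of 16K pages (10 GB of data) on a server with 64 GB of RAM, so the helper takes the second branch and adds the 5% headroom.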
I am providing this answer as complementary information to Rolando's answer below.
Before the server is in production
Calculate innodb_buffer_pool_size based on the largest tables that are most often used by MySQL. To identify the largest tables in the database, you could use this script:
select table_schema, table_name, round(data_length/1024/1024,2) as size_mb
from information_schema.tables
where table_schema like 'my_database'
order by size_mb desc;

+--------------+---------------+---------+
| table_schema | table_name    | size_mb |
+--------------+---------------+---------+
| heavybidder  | user          |  522.55 |
| heavybidder  | bid           |  121.52 |
| heavybidder  | item_for_sale |   10.52 |
| heavybidder  | account_user  |    5.02 |
| heavybidder  | txn_log       |    4.02 |
| heavybidder  | category      |    0.02 |
+--------------+---------------+---------+
Now that we know which tables are the largest in our database, we need to determine which ones are the most frequently used. To do that, I would use a profiling program like Jet Profiler (JP), which shows which tables are being accessed the most. Here is a screenshot from that section in JP
With this in mind, I now know that the user and bid tables take up about 640MB of disk space and, according to JP, are very frequently used. That means MySQL is going to store their indexes and data in the buffer pool, as Rolando mentions below in his comments.
To make sure MySQL has enough memory to store data for my largest and most frequently used tables, I would then set innodb_buffer_pool_size to 640MB.
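As a quick check of that arithmetic, here is a minimal sketch in Python that adds up the sizes of the hot tables. The figures are copied from the sample output above; the variable names are hypothetical, and the list of hot tables is whatever your profiler reports. Keep in mind that the query only reads data_length, so index pages (the index_length column) are not included in this total.

```python
# Sizes in MB, copied from the sample query output above.
table_sizes_mb = {
    "user": 522.55,
    "bid": 121.52,
    "item_for_sale": 10.52,
    "account_user": 5.02,
    "txn_log": 4.02,
    "category": 0.02,
}

# Tables the profiler reported as most frequently accessed.
hot_tables = ["user", "bid"]

# Sum the data size of the hot tables only.
hot_total_mb = sum(table_sizes_mb[name] for name in hot_tables)
print(f"Hot tables need about {hot_total_mb:.2f} MB")
```

The sum comes to 644.07 MB, which is where the rounded 640MB figure comes from.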
There are some additional considerations, but they don't apply to the innodb_buffer_pool_size.
Is this a 32-bit or 64-bit system? On a 32-bit system, you are limited to 4GB unless you activate PAE. In Windows, this means running Windows Enterprise or Datacenter editions.
How much memory do the other processes running on your system need? On a dedicated MySQL server, I will leave between 5% and 10% for the OS. In Windows you can use Process Explorer to analyze memory usage. In Linux, you have sysstat, free, htop, top and vmstat.
Is the database made up of only Innodb tables or a mixture of Innodb and MyISAM? If it is a mixture of the two, then I will set aside memory for the key_cache, join variables, query cache, etc. You can later calculate your MyISAM hit ratio once the server is in production.
After the server is in production
What is the current hit ratio for Innodb?
1 - (innodb_buffer_pool_reads / innodb_buffer_pool_read_requests).
What is the Key Cache Hit Ratio?
1 - (Key_reads/Key_read_requests)
I typically try to get the ratio as close to 100% as possible.
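Both ratios have the same shape, so one helper covers them. This is a small sketch in Python with a hypothetical function name; the two arguments are the counters from SHOW GLOBAL STATUS (Innodb_buffer_pool_reads and Innodb_buffer_pool_read_requests for InnoDB, Key_reads and Key_read_requests for the key cache).

```python
def hit_ratio_pct(disk_reads, read_requests):
    """Return the cache hit ratio as a percentage.

    disk_reads    -- requests that had to go to disk
                     (Innodb_buffer_pool_reads or Key_reads)
    read_requests -- all logical read requests
                     (Innodb_buffer_pool_read_requests or Key_read_requests)
    """
    if read_requests == 0:
        # No reads yet (e.g. right after a restart), so there is no ratio.
        return None
    return (1 - disk_reads / read_requests) * 100.0
```

For example, 1,500 disk reads out of 1,000,000 read requests gives a ratio of roughly 99.85%. The counters are cumulative since server start, so the figure means more after the server has been up and serving normal traffic for a while.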
How well do your tables fit in the buffer pool
You can also look at how well your table data fits in your buffer pool by referring to this link, which provides a way to show "how many pages are in buffer pool for given table (cnt), how many of them are dirty (dirty), and what is the percentage of index fits in memory (fit_pct)." This applies to Percona Server only.