MySQL commits taking time
Since your commits are stalling, it's pretty safe to assume you're running with
innodb_flush_log_at_trx_commit=1 (see #4 here), so one obvious change you could do is switch to
innodb_flush_log_at_trx_commit=0, however that may not be the right choice for you - see the previous link.
In any case,
COMMITs don't usualy stall due to other activity on the server (like queries), unless that other activity locks some kind of mutex for far too long.
If I had to guess without any further evidence, I would say that your writes are so intensive, that MySQL is unable to flush fast enough using adaptive checkpointing, so it exceeds 75% of the log capacity and starts sharp checkpointing (which blocks server activity for a short period).
There are few quick ways to make MySQL handle writes better (besides one already mentioned above):
- Increase innodb log files (innodb_log_file_size) - you can go up to 4GB or so for your logs (esimating based on zero information given on the amount of writes),
- Increase pIOPS capacity for the data drive
- Increase innodb_io_capacity (up to ~70-80% of pIOPS capacity)
Otherwise it would be really useful to see disk activity from AWS, amount of data server currently writes (you can calculate it using
Log sequence number delta between two consecutive
show engine innodb status\G runs)
There are other ways too, but for that, I'd need a lot more details on the actual workload.