More on understanding sort_buffer_size
There have been a few posts by Sheeri and Baron today on the MySQL sort_buffer_size variable. I wanted to add some more information about this buffer, what is impacted when it is changed, and what to do about it?
The first thing you need to know is the sort_buffer_size is a per session buffer. That is this memory is assigned per connection/thread. I’ve seen clients that set this assuming it’s a global buffer Don’t Assume – Per Session Buffers.
Second, internally in the OS usage independently of MySQL, there is a threshold > 256K. From Monty Taylor “if buffer is set to over 256K, it uses
malloc()for memory allocation. Actually – this is a libc malloc thing and is tunable, but defaults to 256k. From the manual:” . He goes on in a further to shows that impact > 256K for a buffer is 37x slower. This applies to all per session buffers, not just sort buffer. Now I have heard recently about this limit being 512K. I wasn’t able to nail down the specific speaker to see if this was a newer library or kernel or OS.
With MySQL instrumentation and the sort_buffer_size we are lucky, there is the Sort_merge_passes status variable. While it’s not perfect, it does indicate if the size of the buffer is in-sufficient, however even if we use a sort_buffer_size of say 256K, and you see Sort_merge_passes increasing slowly, does not indicate you have to increase the buffer.
So, all this does not tell you how to tune the buffer? Unfortunately with MySQL there is no actual easy answer. You do need to monitor the mysqld memory usage overall, especially if you are using persistent connections. A connection/thread will not release the memory assigned until it is closed, so it’s important to monitor for memory creep of the PGA, knowing what your initial SGA is. Morgan Tocker wrote a patch in Bug #33540 to create a RESET CONNECTION type command.
You do need to look for memory as a bottleneck. You need to learn how MySQL use memory, not just the sort_buffer_size. I actually started many years ago to write global/session variables to indicate when buffers were used, and how much and I started with the sort_buffer_size which was buried down in some very old filesort code. When I sought the input of an expert C coder around this, they wondered how the code, especially a loop handler even actually worked.
Nobody knows what the optimal setting is, and that’s the problem. In certain areas especially memory usage the MySQL instrumentation is simply non-existent, and I’d like to see this as something that is fixed.
In conclusion, if I ever see a sort_buffer_size above 256K, e.g. 1M or 2M, I always reset it to 256K. My reasoning is simple. Until you have evidence in your specific environment increasing the buffer makes performance better, it’s better to use a smaller value. There are bigger wins, like not using sorting, or better design, or even better simplifying or eliminating SQL.