Don't get hooked on hitrates

When people ask for advice on a performance problem they are often told to provide global database statistics like the shared buffers hitrate. If the hitrate is low (according to the rating scale of the voluntary helper), an increase in the number of shared buffers is normally suggested as a fix. This will probably help with the hitrate but might not solve the original problem. Read on and find out why a good hitrate does not guarantee a good end-user performance.

I will use the following query as an example to look at the shared buffers hitrate for your PostgreSQL database cluster:

SELECT datname,
       100.0*blks_hit/NULLIF(blks_hit+blks_read, 0) AS hitrate
  FROM pg_stat_database;

The output might look something like this:

  datname  |     hitrate
 postgres  | 99.7533232835411813
 dev       | 95.9511568123393316
 prd       | 99.8631058250007681
 template1 | 99.5835395906420162
 template0 |
(5 rows)

But what does that tell us? Except for my development database the numbers are pretty high and therefore the performance must be stunning, right?

The rate obviously only tells you that you have a large number of cache hits in relation to the total number of requested blocks. In other words: any problem could be hiding behind that figure if the number of cache hits is only big enough. A high hitrate may even indicate a lot of useless work going on in your database.

If your database performs a lot of sequential scans due to missing indices, you might have a slow application and unhappy users. But since the blocks are mostly found in the cache you will see a high hitrate. Unfortunately this doesn’t help your users at all. In the end it is not about the hitrate in your database but the response time your users experience.

Let’s look at a practical example. For Oracle there is a nifty Hitrate Generator available. For demonstration purposes I have used that as a guideline to create the following PL/pgSQL function:

CREATE OR REPLACE FUNCTION improve_hit_ratio(integer)
  l_counter integer := $1;
  WHILE (l_counter > 0) LOOP
    l_counter := l_counter - 1;
    -- Generate lots of buffer hits
    PERFORM 1 FROM pg_class LIMIT 10;
$$ LANGUAGE plpgsql;

Call the function with a reasonable large parameter (100000 is a good ballpark figure) and observe how the buffer hitrate improves with every run. We use a lot of useless work to make our database look healthy.

At last you can have any hitrate you like in your database. 😃

Conclusion: The hitrate hides too many important facts and as long as you can’t be sure that your application does not contain a similar functionality in disguise, you should not use a high buffer hitrate as indicator for good performance.

Hitrate is nothing, user response time is everything.