pg_dropbuffers extension 2018-11-29

When comparing the execution time of a query executed multiple times, cache effects can't come into play, as the first execution (with a cold cache) will obviously be slower than the subsequent executions (with a warm cache), since fewer disk reads will be necessary (that is assuming the dataset doesn't fit in RAM; if it does, then it is better to do your benchmark once all the data is cached).

Therefore, it is needed to drop both the database cache and the system cache between two executions. An easy way to do that is to restart postgres, and to drop the system cache (search for drop_caches in that link). But rjuju and I figured that simply writing SELECT pg_drop_caches() would be more convenient, so we wrote a toy extension, called pg_dropbuffers, that is available here.

Please note that this extension is NOT SAFE to use; you may lose some data while using it. Only use it if your data has no value whatsoever.

This is because some buffers could be dirty when they are discarded. Even though dirty buffers are flushed before dropping the cache, concurrent queries could make other buffers dirty again in between. Even if no UPDATEs happen, a simple SELECT can mark your buffers dirty, if hint bits are set (more details).

Let's set up a test environment to see it in action. After the extension is compiled and installed, let's load it in the database along with others:

CREATE EXTENSION IF NOT EXISTS pg_prewarm;
CREATE EXTENSION IF NOT EXISTS pg_buffercache;
CREATE EXTENSION IF NOT EXISTS pg_dropbuffers;

Then we create a dummy table big enough that we can see it in the cache (this one takes 692MB):

CREATE TABLE IF NOT EXISTS foo (
    id SERIAL PRIMARY KEY,
    a INT
);
INSERT INTO foo (a) SELECT generate_series(1, 20000000);

It is now possible to see the hint bits in action:

CHECKPOINT;
-- CHECKPOINT

SELECT count(*) FROM pg_buffercache WHERE isdirty;
--  count
-- -------
--      0
-- (1 row)

SELECT count(*) FROM foo;
--   count
-- ----------
--  20000000
-- (1 row)

SELECT count(*) FROM pg_buffercache WHERE isdirty;
--  count
-- -------
--   6683
-- (1 row)

6683 buffers were dirtied by a SELECT statement, which is why you shouldn't use this extension for anything important, even on a read-only database.

We create a view to see the contents of the shared buffers (credits go to this link for the query):

 CREATE OR REPLACE VIEW shared_buffers_content AS
 SELECT
     c.relname,
     pg_size_pretty(count(*) * 8192) as buffered,
     round(100.0 * count(*) / (SELECT setting FROM pg_settings WHERE name='shared_buffers')::integer, 1) AS buffer_percent,
     round(100.0 * count(*) * 8192 / pg_table_size(c.oid), 1) AS percent_of_relation
 FROM pg_class c
 INNER JOIN pg_buffercache b ON b.relfilenode = c.relfilenode
 INNER JOIN pg_database d ON (b.reldatabase = d.oid AND d.datname = current_database())
 GROUP BY c.oid, c.relname
 ORDER BY 3 DESC
 LIMIT 10;

And this is how we can drop the database cache without restarting it:

SELECT pg_prewarm('foo', 'buffer');
--  pg_prewarm
-- ------------
--       88496
-- (1 row)

SELECT * FROM shared_buffers_content;
--                     relname                    |  buffered  | buffer_percent | percent_of_relation
-- -----------------------------------------------+------------+----------------+---------------------
--  foo                                           | 128 MB     |           99.8 |                18.5
--  pg_constraint_conrelid_contypid_conname_index | 8192 bytes |            0.0 |                50.0
-- (2 rows)

SELECT pg_drop_current_db_buffers();
--  pg_drop_current_db_buffers
-- ----------------------------
--
-- (1 row)

SELECT * FROM shared_buffers_content;
--                     relname                    |  buffered  | buffer_percent | percent_of_relation
-- -----------------------------------------------+------------+----------------+---------------------
--  pg_constraint_conrelid_contypid_conname_index | 8192 bytes |            0.0 |                50.0

As for the system cache, the extension simply does a system call to sysctl (the postgres user must be able to run that command without password, so you'll have to update your sudoers file as indicated in the readme).

SELECT pg_drop_system_cache();

Let's take a look at the cache column of vmstat (units are MB):

vmstat -S M 1
# procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
#  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
#  6  0      0  10323      0   <span class="code-emphasis">3160</span>    0    0   348   452  795 2472  5  3 92  0  0
#  2  0      0  12697      0    <span class="code-emphasis">786</span>    0    0  5832     0 1540 2897 12 14 73  0  0

As you can see, the amount of data in the system cache has significantly dropped (the cache drop happened between the two rows I included).

For convenience, we also wrote a wrapper to drop both the database and the system cache at once:

SELECT pg_drop_caches();

Feel free to give us feedback, or to contribute.

Here is the SQL script I used for this article:

CREATE EXTENSION IF NOT EXISTS pg_prewarm;
CREATE EXTENSION IF NOT EXISTS pg_buffercache;
CREATE EXTENSION IF NOT EXISTS pg_dropbuffers;

CREATE TABLE IF NOT EXISTS foo (
   id SERIAL PRIMARY KEY,
   a INT
);
INSERT INTO foo (a) SELECT generate_series(1, 20000000);

CHECKPOINT;
SELECT count(*) FROM pg_buffercache WHERE isdirty;
SELECT count(*) FROM foo;
SELECT count(*) FROM pg_buffercache WHERE isdirty;

CREATE OR REPLACE VIEW shared_buffers_content AS
SELECT
   c.relname,
   pg_size_pretty(count(*) * 8192) as buffered,
   round(100.0 * count(*) / (SELECT setting FROM pg_settings WHERE name='shared_buffers')::integer, 1) AS buffer_percent,
   round(100.0 * count(*) * 8192 / pg_table_size(c.oid), 1) AS percent_of_relation
FROM pg_class c
INNER JOIN pg_buffercache b ON b.relfilenode = c.relfilenode
INNER JOIN pg_database d ON (b.reldatabase = d.oid AND d.datname = current_database())
GROUP BY c.oid, c.relname
ORDER BY 3 DESC
LIMIT 10;

SELECT pg_prewarm('foo', 'buffer');
SELECT * FROM shared_buffers_content;
SELECT pg_drop_current_db_buffers();
SELECT * FROM shared_buffers_content;

SELECT pg_drop_system_cache();
SELECT pg_drop_caches();