D.I.Y. MySQL 5.1 monitoring

I wrote recently about using events and the new processlist table in MySQL 5.1 to keep track of the number of connected processes.  Although having the PROCESSLIST available as an INFORMATION SCHEMA table is usefull, it seemed to me that having SHOW GLOBAL STATUS exposed in a similar fashion would be far more useful.  So at the MySQL UC last year, I asked Brian Aker if that would be possible.  I know how many enhancement requests MySQL has to deal with, so I was really happy to see that table appear in the latest 5.1 build (5.1.14 beta). 

This table, together with the EVENT scheduler, lets us keep track of the values of status variables over time without having to have any external deamon running.  This won't come anywhere near to matching what MySQL have made avalable in Merlin, but still could be fairly useful.  So lets build a simple system using events to keep track of "interesting" status variables....

I also thought I’d take the opportunity of working with the MySQL Workbench to design the tables involved.   Unfortunately the product is effectively unusable on my system until at least rc7 (due when?) and b/c it only support windows, I’m not up for a download the source and compile.  Oh well.  I used our own (Quest Software's) Toad Data Modeller instead.   

The idea is to schedule some events that trap (snapshot) the contents of the PROCESSLIST and GLOBAL_STATUS variables into tables that can be used to track trends and monitor performance in a simple way.  I created three tables to hold the data:

Toaddatamodeller_2

  1. GHSNAP_STATUS_VARIABLES lists the variables names we consider "interesting".  If the variable_name is listed here and CAPTURE_FLAG=1, then we will capture the value of the variable.  VARIABLE_SOURCE indicates the table from which the data is obtained - I only implemented GLOBAL_VARIABLES for now
  2. GHSNAP_SNAPSHOTS contains one row for each snapshot of the table we take.  The UPTIME column is used to detect server restarts so we don't calculate nonsense deltas or rates
  3. GHSNAP_SNAPSHOT_VALUES contains one row for each variable for each snapshot.  We capture the raw value of the variable, its change since the last snapshot and the rate of change per second.

This file installs the tables, stored procedures and events.  You may have to run it from the MySQL query browser to avoid errors caused by the command line client not processing DELIMITER statements properly.  The script creates a new database GH_SNAPSHOTS.  Three stored procedures are created:

  1. ghsnap_populate_variables creates default entries in the GHSNAP_STATUS_VARIABLES table and is run only during installation.  By default I set it up to capture all the INNODB%, COM% and QCACHE% variables but you might want to change the CAPTURE_FLAG for those you do/don't want.
  2. ghsnap_take_snapshot takes a snapshot of GLOBAL_STATUS and stores it in the snapshot tables.  It also works out if the server has been restarted and if not, calculates rates and deltas.
  3. ghsnap_delete_snapshots deletes all snapshots more than a certain number of days old.

There's two events defined:

  1. ghsnap_take_snap_event runs (by default) every five minutes and simply executes ghsnap_take_snapshot.
  2. ghsnap_delete_snap_event runs every 5 hours (by default) and deletes snapshots more than 1 day old (you can edit the installation script to change this default).

There's some obvious additional things that we could do with this (optimze deletes via partitioning, calculate ratios, capture other state, etc) but this probably has some real value already.  Now I can issue queries such as this:

select snapshot_id,snapshot_timestamp ,
       sum( case variable_name when 'COM_COMMIT' then variable_value_rate end
          ) commit_ps,
       sum( case variable_name when 'COM_SELECT' then variable_value_rate end
          ) select_ps,
       sum( case variable_name when 'COM_INSERT' then variable_value_rate end
          ) insert_ps,
       sum( case variable_name when 'COM_DELETE' then variable_value_rate end
          ) delete_ps,
       sum( case variable_name when 'COM_UPDATE' then variable_value_rate end
          ) update_ps
  from ghsnap_snapshots join ghsnap_snapshot_values using (snapshot_id)
where snapshot_id>70
group by snapshot_id,snapshot_timestamp

To view the activity on my server.  Using a reporting or charting tool I can generate usefull charts, etc.  For instance, using the BIRT module in eclipse I created the following chart showing my SQL execution rates:

Birt

Nothing earth-shattering but still useful and - with the event scheduler, stored procedures and the new INFORMATION_SCHEMA tables we don't need any external infrastructure to capture this information. 

We'll probably implement something like this in the next version of our  freeware Spotlight on MySQL as well as better diagnostics on replication and exploiting the fact that in 5.1 you can get SELECT access to logs.  Feel free to check it out.   

P.S. don't forget that to set EVENT_SCHEDULER=1 to enable events on your 5.1 server.  And watch out on Windows where the event scheduler in 5.1.14 still seems a bit unstable.