Oracle 10g introduced the metrics views. Kyle Hailey has blogged about getting a quick view of the load on the system by using the wait class metrics. Using these ideas I started to work on the visualization of Oracle performance metrics using R, an open source programming language and software environment for data analysis and graphics.
Oracle Enterprise Manager and ASH are a nice toolset for performance monitoring. Unfortunately I am not allowed to use OEM at my current workplace and we have no license for the diagnostic and tuning packs either. So I had to find another way to visualize a history of database performance metrics.
For some of the metrics there is an associated history view. These views
normally allow looking into the past for about an hour or so. This is
obviously unusable for long term analysis but it is all I have and at least
allows investigating recent incidents. So I am using the Oracle views
to get data about the recent system and wait class metrics.
The R package
ggplot2 is used to generate graphs from the data
while the package
RJDBC handles the database connection to
Oracle. Check my R script
containing the functions I wrote to create the graphs shown below.
Database time and CPU time
My R function
metric.time() was used for the first diagram. The
graph contains two parts with the upper part showing the time used by the
Oracle instance. It is a stacked graph where the violet color represents the
CPU time used by the Oracle instance while the green color accounts for the
time spent waiting.
The data is displayed as seconds per second. So in this case the violet color shows that every second Oracle uses about 10 seconds of CPU time, thus indicating a demand for ten CPU cores. The green color represents the time the database has accounted as wait time. Again this shows that every second there were about 50 or 60 waiting sessions on average.
The bottom part has a graph of the CPU utilization of the server. In this example the server has been busy between 45% and almost 80%. This includes not just the Oracle instance itself but all processes running on the server.
Both graphs share the time axis and therefore allow an easy correlation between CPU utilization and Oracle CPU demand.
Database wait class time
My R function
metric.waitclass() generates another graph. The
example below (this time from another instance) shows the wait times broken
down by wait classes. Wait classes are used by Oracle to group the large
number of wait events and create understandable clusters. You can check
V$EVENT_NAME for the mapping. The data here is also displayed as
waited seconds per second. The wait class Idle has been omitted from the
It is easy to see that user I/O contributes the biggest part. The wait class Network mostly contains access to remote databases over database links. Problems on the remote database would be visible here. The sudden peak in the Administrative class was caused by rman saving archived logs to tape.