Tablespace Utilization Trend and Forecasting

As a DBA, everyday we have a responsibility to keep sufficient space in tablespace across the fleet. And when the fleet is large, you have thousands of tablespace to monitor. And sometimes it is unpredictable behavior in growth due to change in business requirement, bad code or bad planning by application teams, which result in incidents. Everyone has their own workflow and tools. Some uses Enterprise Manager, some uses their own custom solutions.

To ease the monitoring and forecasting, I use following workflow:

    1. Data Collection – collect the utilization every day from all the databases and put in repository.
    2. Visualization – collected data is then visualized in a lattice graph (one graph per instance with top tablespaces as series in each plot). You can use excel, google charts etc to do this.
    3. Monitor – you can choose to manually look into the plot and take necessary actions
    4. Drill Down and Automatic prediction using linear regression – When I found anything suspicious in monitoring, I further drill down and use oracle linear regression function to predict 100% date. You can get the SQL from my github repository which generates the Plot and Provide you top objects and also the prediction date.  One can also create reports like Top 20 tablespaces, Top 20 fastest growing tablespace etc.


The math behind the prediction is quite simple. I am using oracle provided analytics function like REGR_INT, REGR_R2 and REGR_INTERCEPT which can be used to perform linear regression on the timeseries data like tablespace utilization.  Also note that you need to convert the timestamp data into Julian formatted date to use regression functions.

The visualization is done by using Google charts. I was inspired by Carlos Sierra’s blog post on creating line charts using google charts API.

SELECT DDATE, instance_name, tablespace_name,
 megs_alloc number_column_1,
 megs_used number_column_2,
 --regr_intercept(ddate - to_date('19700101 00:00:00','YYYYMMDD HH24:MI:SS'), megs_alloc - megs_used) over (partition by tablespace_name) as regr_int,
 (to_date('19700101 00:00:00','YYYYMMDD HH24:MI:SS')+ regr_intercept(ddate - to_date('19700101 00:00:00','YYYYMMDD HH24:MI:SS'), megs_alloc - megs_used) over (partition by tablespace_name)) as predicted_date,
 regr_r2(ddate - to_date('19700101 00:00:00','YYYYMMDD HH24:MI:SS'), megs_alloc - megs_used) over (partition by tablespace_name) as coeff_of_determination  
instance_name  = substr('&&metric_selected.', 0,instr('&&metric_selected.',':')-1) 
AND tablespace_name  = substr('&&metric_selected.', instr('&&metric_selected.',':')+1)      AND ddate >  (SYSDATE) - nvl(&&report_period., 1) 
order by DDATE asc 

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s