Database Buddha: Introducing GIMPI

here is a clipping from the wiki page I created for the developers.....

I will try to include some code samples in my next posting.

GIMPI (Getty Images Missing Index Production Information)

This set of reports are designed to provide developers with the information they need to identify poorly performing current indexes and/or missing indexes in the production environment. The normal difficulties are that Developers generally do not have access to the environments where the real activity is going on. And that is the only place where accurate statistics and performance data is being generated. We have developed a system that gathers this data from our production servers daily and aggregates it into a server that the dev teams do have access to. To make things more usable, we have also created reports that are designed to spell out what the data is saying in a more understandable format. Even this can still be a little confusing and overwhelming so we have put together this Wiki to explain the process, navigation, and results so that you can quickly use this data for everyones benefit.

Server Level

The first page of the report lists the servers that we are currently gathering data for.

On this page Each server will have a little plus symbol next to it which, when clicked, will expand that server to show the databases on it. The data showed on this page includes the following:

Date : This is the date that the statistical data was generated on the remote server.
Missing : The number of missing index identified at that level.
To Be Dropped : The number of indexes that have been identified as needing to be removed.
Realigned : The number of Clustered indexes that have been identified as inefficient and needing changed.
Low Read : The number of indexes that are rarely read from and are causing more of a burden then a benefit.
High Page Splits: the number of indexes that are poorly designed and causing page splits on inserts at an extremely high rate.
Duplicate Indexes : Indexes that are redundant in whole with another index. These indexes are more burden than benefit and should be removed.
Overlapping Indexes : Indexes that are redundant in part with another index. These indexes are more burden than benefit and should be removed.

A this point, each database name is a link to the next report at database level. Clicking on any database name will then move you to that report. using the browser back button or the Blue Arrow will bring you back to the server level report where you can make a different database choice.

[edit]Database Level

The Database Level Report starts with each table grouped into ranges based on their row counts. in general the larger the row count is, the more impact will be seen in index design. The Distribution is currently:

HUGE
LARGE
MEDIUM
SMALL
TINY

Beside each class name is another plus sign which will expand each group accordingly. The values shown on this page are similar to the previous page with a few exceptions.

Tables : When each class is collapsed, it shows the number of tables in that class.
Rows : The number of Data Rows
Data Pages : The number of 8K Database pages used by the data.
Idx Pages : The number of 8K Database pages used by the Current Indexes.
Wasted Idx Pages : The number of 8K Database pages used by the Current Indexes that are recommended to be dropped.
Blends : New recommended indexes that have the potential of being combined with existing indexes or other recommended indexes.
Creates : Recommended New Indexes.
Drops : Indexes Recommended for removal.
ReAlign's : Clustered Indexes that should be replaced. This is most often caused by blindly clustering the primary key which is used for uniqueness (Identity Column) when the clustered index should be a column used when accessing ranges of records. Single record access sees no benefit from clustered to non clustered.
IDx's : The Number of current Indexes

The chart below the grid summarizes much of the data in the grid and converts pages to GBs to give an idea of space gained if indexes are dropped.

[edit]Table Level

Once a table is selected, you will see the following grid which shows all of the existing and recommended indexes for that table.

The values in this grid show more detailed and specific values related to each index. Not all columns are populated for every index because some values are related to recommendations while others are related to statistics which won't be available for indexes that do not yet exist. Most of the counters for stats reset at each time the SQL Service is started so the absolute value is not always valuable directly. Most values are only meaningful in comparison to other indexes in the same table. The impact values are manufactured values that have no meaning on their own and Only have meaning in order to weight the recommendations against each other. Best results are usually found by making changes one at a time starting with the largest impact and then let the system rebuild the statistics for a period of time before rechecking for additional recommendations.

Action: This identifies the action if any for the index.
Unique: This will show true for an existing index that is defined as unique.
Desc.: self explanatory.
Imp Est Cur: Impact Estimate Currently
Imp Est Cache: Impact Estimate from Cache.
Table Read %: what percent of reads to this table use this index.
Table Write %: what percent of writes to this table use this index.
Index Read %: what is the ratio of reads on this index. only compares activity on this index.
Index Write %: what is the ratio of writes on this index. only compares activity on this index.
Page Splits: Number of page splits in this index.

The Gague at the bottom of the page shows you the distribution of activity on this table. keep in mind seeks are better than scans and huge updates with little to no reads of any kind show an index that is of no use at all.

When you expand an index by pressing its plus sign you are presented with even more details about that index. Again, Existing and recommended indexes wil use the grid differently and all cells wont be used by all indexes.

The newly exposed cells are in two sections, 'Lock History' and 'User Access'. 'Lock History' shows a summary of locks and waits for row and page locks. The 'User Access' section summarizes the exact reads and writes experience by this index.

Recommended indexes will not show any statistics, Duplicates will show other duplicate indexes, Overlaps will show which indexes it overlaps with....

on the lower left or the index grid there will be two links. One provides you with the create script and the other with the Drop Script for this index using our recommended naming convention and wrapper.

If you click the text "SHOW Cached Query Plan Information", just above the create and drop scripts, you will be taken to the Plan Level portion of the report.

[edit]Plan Level

At Plan Level, you will be shown example queries pulled from cached query plans, which are usually from stored procedure calls. Each entry will show you the portion of the sproc with the query related to this portion of the plan. This will give you an idea of which sproc's and generally which query in the sproc will see the impact from this change.