Data Warehouse Object Health State Data Dedicated Maintenance Recovery State


A customer of mine had his SCOM 2007 R2 CU6 Root Management Server that was in a critical health state. When looking at the health explorer, we saw this:


 

SCOM performance its own database maintenance. Kevin Holman wrote a nice article about what maintenance is done automatically by SCOM and what maintenance you could configure additionally: http://blogs.technet.com/b/kevinholman/archive/2008/04/12/what-sql-maintenance-should-i-perform-on-my-opsmgr-databases.aspx .

When looking at the state change events, we saw the following description:

Failed to store data in the Data Warehouse. Exception ‘SqlException’: Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding. One or more workflows were affected by this. Workflow name: Microsoft.SystemCenter.DataWarehouse.StandardDataSetMaintenance Instance name: State data set Instance ID: {GUID} Management group: MG

 What does this mean? SCOM is trying to perform its maintenance on state data, but unfortunately, there is too much maintenance work to do, so the rule times out. Best would be to now check why this is timing out. Maybe you have corrupt tables? Or you had a sudden flow of a lot of data that was inserted in the datawarehouse? For this customer, we had issues with config churn which we just fixed. So this time-out could be easily explained.

How to fix this? First, disable the rule performing the maintenance so it will not interfere with our manual procedure. Open the SCOM console

  1. Go to the authoring pane
  2. Select Authoring => Management Pack Objects => Rules
  3. Change the scope to ‘Standard Data Set’
  4. Right-click The only rule there is: Standard Data Warehouse Data Set maintenance rule
  5. Select Override the rule for all objects of class: Standard Data Set
  6. Disable the rule by ticking the row with the parameter name Enabled and by changing the Override Value to false


  7. Select an appropriate override management pack (not the default management pack!) and click on apply.

Now we will be triggering the stored procedure used by this rule manually, and this with no timeout!

  1. Open SQL Management Studio
  2. Select the instance where your OperationsManagerDW database is residing
  3. Once opened click on new query and select you OperationsManagerDW database.
  4. Remember that our state change event description mentioned issues with state data, so we first need to get the ID of the state data set. To do this, click on new query, select your datawarehouse database and enter the following command:

    SELECT DatasetID FROM vDataset WHERE DatasetDefaultName = ‘State data set’


    Click on execute and Copy the resulting ID.

  5. Click on new query, select your datawarehouse database and enter the following command:

    EXEC StandardDataSetMaintenance “GUID”, with GUID the ID that resulted from the previous query.


  6. Wait until the command finishes successfully.

Go back to the authoring pane of the SCOM console to remove our previously defined override.

  1. Go to the authoring pane
  2. Select Authoring => Management Pack Objects => Rules
  3. Change the scope to ‘Standard Data Set’
  4. Right-click The only rule there is: Standard Data Warehouse Data Set maintenance rule
  5. Select Overrides summary
  6. Delete the previously defined override:

And the error disappeared! If this error comes back frequently, you should check for corrupt tables, config churn, database performance or other issues why the maintenance is timing out.

2 Responses to Data Warehouse Object Health State Data Dedicated Maintenance Recovery State

  1. B. Wright says:

    What’s “config churn” sounds like what I’m experiencing?

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.