Apply monitors or rules to a specific group of machines

November 14, 2013

Say we want to apply a rule or monitor to only a specific group of machines. How do we do that?

Seems like a common question, let’s try.

First some semantics: http://technet.microsoft.com/en-us/library/hh457603.aspx

What I remember:

Rules: can generate alerts, can collect data sets, and can be used for historical reporting.

Monitors: determine the health “state” of a component, have up to 3 states (healthy, warning and critical), and can create an alert when the state changes.

Okay, nice. Now back to the question: how do we apply a specific monitor or rule to only one group of machines?

Step 1: create a group containing the machines you want.



Now modify the required monitors. In this case we only want to monitor the scheduled tasks on the TS servers.

Disable the monitor for all objects of the class.



Now create an override that enables the monitor for the newly created group.


Now check the Health Explorer for a member of the group:


And now check it for a non-member:


Okay, that’s what we wanted.
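To make the logic of this approach explicit, here is a minimal sketch in Python. It is only a simplified model of the idea (disabled for the class, enabled again for one group), not SCOM's actual override engine, which has more precedence rules. The names Monitor, is_enabled, and the machine names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Monitor:
    name: str
    # Class-level setting: applies to every machine unless a group override matches.
    enabled_for_class: bool = True
    # Group name -> enabled flag; a group override wins over the class-level setting.
    group_overrides: dict = field(default_factory=dict)


def is_enabled(monitor, machine, groups):
    """Return True if the monitor runs on `machine`.

    `groups` maps a group name to the set of its member machine names.
    """
    for group_name, enabled in monitor.group_overrides.items():
        if machine in groups.get(group_name, set()):
            return enabled
    return monitor.enabled_for_class


# Hypothetical example data: the monitor is disabled for the whole class
# and re-enabled for the "TS Servers" group only.
groups = {"TS Servers": {"ts01", "ts02"}}
monitor = Monitor(
    name="Scheduled Tasks",
    enabled_for_class=False,
    group_overrides={"TS Servers": True},
)
```

With this data, is_enabled(monitor, "ts01", groups) is true for the group member, while a machine outside the group falls back to the class-level setting and stays unmonitored.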

Enjoy.



SCOM: Elevate your BizTalk monitoring

September 4, 2013

BizTalk Application context

The existing BizTalk Management Pack does a good job of discovering all BizTalk components. It also includes a number of useful monitors and rules, but it still isn’t very friendly to helpdesk staff and other non-BizTalk users.

How can you overcome this problem? By creating your own custom BizTalk Addon Management Pack.
A typical BizTalk Application consists of “Receive port(s) & Receive location(s)”, an “Orchestration” which processes the incoming messages, and one or more “Send ports” to forward the processed messages to another system.

Knowing this makes it possible to create a “Distributed Application” for each one of your BizTalk Applications.


Creating your “Distributed Application”

Once you have a good understanding of how your BizTalk Application is constructed, you can go into SCOM Authoring and create your DA. In my example I have added a TCP Port check to ensure availability of the Receive port that is used.
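For readers who want to see what such a TCP port check boils down to, here is a minimal Python sketch. It is not how SCOM implements the check; it only illustrates the idea of "can I open a TCP connection to this host and port within a timeout". The function name tcp_port_open is hypothetical.

```python
import socket


def tcp_port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host, connects, and raises OSError on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False
```

You would call it with the host and port of your Receive location, for example tcp_port_open("biztalk01.example.local", 8080), where both values are placeholders.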



After Authoring your DA, you can view your Distributed Application under the Monitoring Pane. My BizTalk application is a bit faulty.



Exposing your work

The next step is to expose your work to your colleagues. You can do this by creating a “BizTalk Addon Management Pack” folder. In this folder you can create alert and diagram views.

In the BizTalk 1st line Alerts view, you should filter alerts. This way your 1st line helpdesk isn’t spammed with all the BizTalk alerts. You can use a group and the condition criteria to do this.

There’s a lot more you can do to improve and elevate your BizTalk monitoring, e.g. a scripted monitor that counts the number of suspended messages, a Distributed Application for the shared BizTalk components, etc. Common sense and someone who has mastered BizTalk will help you do this.

Hope you get the idea of elevated BizTalk Monitoring.
Samuel.


AD FS Management Pack undocumented required configuration

October 5, 2012

When implementing the AD FS management pack in System Center Operations Manager and looking at the guide, the only required configuration you find is:

  • Install SCOM 2007 R2 or 2007 SP1 agents on AD FS servers
  • Enable Agent Proxy on all AD FS servers.
  • Using the Add Role Services Wizard in Windows Server 2008, verify that the IIS 6 Management Compatibility and IIS 6 Metabase Compatibility role services are installed. (Some AD FS 2.0 scripts depend on Internet Information Services (IIS) Windows Management Instrumentation (WMI) objects being installed.)

We did just that: we installed SCOM 2012 (hey, MS told us the SCOM 2007 R2 management packs work in SCOM 2012!) and enabled agent proxy.

Next, on the AD FS servers, we added the IIS 6 Management Compatibility and IIS 6 Metabase Compatibility role services.

The discovery scripts kicked in, and our AD FS servers were discovered. Next, we noticed some alerts on AD FS:

AD FS 2.0 application pool Is Not Running On The Federation Server


But the alert is false. The application pool is running!



When looking at the monitor, you see that a PowerShell script is executed that tries to connect to root/MicrosoftIISv2. Using wbemtest, you will notice that root/MicrosoftIISv2 is not available.

To ensure this script works, add the following IIS role services:

  • IIS Management Scripts and Tools
    • To enable managing IIS using WMI
  • IIS 6 WMI Compatibility
    • To enable the provider root/MicrosoftIISv2

No restart required. Just wait a few moments and the alerts will magically disappear!


Creating a Runbook to update Aged Alerts

October 3, 2012

Server administrators often use SCOM notifications to receive an e-mail for critical alerts. A common problem with this is that the e-mail is sent only once, and if no action is taken, the alert stays in the SCOM console until there is a system failure. After the failure everyone goes looking for a scapegoat. Why didn’t the SCOM operators take action? To whom was the e-mail sent? Why didn’t this person take action? Blahblahblah… too late…

With this runbook we can change the ResolutionState of an alert that reaches a certain age and for which no action was taken. This can be a trigger to send out mail to somebody higher up the ladder. This person can then find out who’s not doing his/her job correctly.

 Ingredients:

System Center Orchestrator with the SC 2012 Operations Manager Integration Pack

  • A valid Microsoft System Center Operations Manager connection (Options > SC 2012 Operations Manager)
  • The following runbook activities
    – Monitor Date/Time
    – Run .Net Script
    – Get Alert
    – Update Alert
  • A PowerShell command
    – $AlertDateBefore = (Get-Date).AddDays(-2) | Get-Date -Format 'yyyy-MM-ddTHH:mm:ss'

System Center Operations Manager

  • A new alert resolution state (e.g. “AlertAge Passed 1 day”)
  • An active alert to test the runbook
  • The MonitoringRuleId of the active alert if you want to update only certain alerts (not really necessary)
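If it helps to see what the PowerShell one-liner produces, here is a rough Python equivalent. It is only an illustration of the timestamp format; the runbook itself uses the PowerShell command. The function name alert_cutoff and its parameters are hypothetical.

```python
from datetime import datetime, timedelta


def alert_cutoff(now=None, days=2):
    """Return the cutoff timestamp as a string like 2012-10-01T09:30:00.

    Mirrors (Get-Date).AddDays(-2) formatted as 'yyyy-MM-ddTHH:mm:ss'.
    """
    if now is None:
        now = datetime.now()
    return (now - timedelta(days=days)).strftime("%Y-%m-%dT%H:%M:%S")
```

Alerts raised before this timestamp are the ones considered "aged". Changing days changes how long an alert may stay untouched.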

 Let’s start

In your Orchestrator Runbook designer add the four runbook activities and connect them.


 
 

1) In the “Monitor Date/Time” activity you can set when and how frequently you want to run the runbook.

2) In the “Run .Net Script” activity, paste the PowerShell command.


 

3) In the “Get Alert” activity, specify the filters to select exactly which alerts you want to update.
– MonitoringRuleId: you can get this using the “Get-SCOMAlert” cmdlet plus some parameters
– TimeRaised: for this one you have to subscribe to the published data from our previous step. Just right-click in the blank field 😉
– ResolutionState: we are only going to update New alerts.
– Severity: we are only going to update Critical alerts.
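The combined effect of these filters can be sketched in a few lines of Python. This is a model of the selection logic only, not the Integration Pack's implementation. It assumes the TimeRaised filter means "raised before the cutoff", and the alert dictionaries and IDs are made-up sample data.

```python
from datetime import datetime


def select_aged_alerts(alerts, cutoff, rule_id=None):
    """Return the Ids of New, Critical alerts raised before `cutoff`.

    `rule_id` is optional, matching the optional MonitoringRuleId filter.
    """
    selected = []
    for alert in alerts:
        if alert["ResolutionState"] != "New":
            continue
        if alert["Severity"] != "Critical":
            continue
        if alert["TimeRaised"] >= cutoff:
            continue  # still too recent
        if rule_id is not None and alert["MonitoringRuleId"] != rule_id:
            continue
        selected.append(alert["Id"])
    return selected


# Hypothetical sample alerts.
sample_alerts = [
    {"Id": "a1", "ResolutionState": "New", "Severity": "Critical",
     "TimeRaised": datetime(2012, 9, 30, 8, 0), "MonitoringRuleId": "r1"},
    {"Id": "a2", "ResolutionState": "New", "Severity": "Critical",
     "TimeRaised": datetime(2012, 10, 2, 8, 0), "MonitoringRuleId": "r1"},
    {"Id": "a3", "ResolutionState": "Closed", "Severity": "Critical",
     "TimeRaised": datetime(2012, 9, 29, 8, 0), "MonitoringRuleId": "r1"},
    {"Id": "a4", "ResolutionState": "New", "Severity": "Warning",
     "TimeRaised": datetime(2012, 9, 29, 8, 0), "MonitoringRuleId": "r2"},
]
```

Only alerts that pass every filter are handed to the next activity, which is why a narrow filter here keeps the runbook from touching alerts someone is already working on.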


4) In the “Update Alert” activity, right-click the Alert ID field. Choose Subscribe > Published Data and select “Id” from the Get Alert activity. Use “Select field…” to add the Resolution State item. You can then select the new resolution state you’ve created in SCOM (see ingredients).

That’s it!! Now test your runbook.
Samuel.


Authenticate SCOM console on a proxy

September 27, 2012

When going to client environments I often face this issue: internet access is only possible when authenticating against a proxy. Now, this is especially a problem when you want to be able to download management packs using the SCOM console, or at least check whether you are up-to-date.

Following a hint from my colleague Thomas Vuylsteke (see his excellent blog at http://setspn.blogspot.com/), I quickly found the “proxy configuration” page on MSDN, stating that it is possible to configure proxy usage in the .exe.config file: http://msdn.microsoft.com/en-us/library/dkwyc043.aspx

This is actually a quick fix:

On SCOM 2012, open C:\Program Files\System Center 2012\Operations Manager\Console\Microsoft.EnterpriseManagement.Monitoring.Console.exe.config

Add the following code inside the <configuration> element:

<system.net>
<defaultProxy enabled="true" useDefaultCredentials="true">
<proxy usesystemdefault="True" />
</defaultProxy>
</system.net>
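If you need to make this change on several consoles, the edit can be scripted. Below is a minimal Python sketch that adds the same element to a config document held in a string. It is an assumption-laden illustration: the function name add_default_proxy is hypothetical, the standard-library parser drops XML comments and the XML declaration, and you should back up the original file before overwriting it.

```python
import xml.etree.ElementTree as ET


def add_default_proxy(config_xml):
    """Return config_xml with the <system.net> proxy section appended.

    If a <system.net> element already exists directly under the root,
    the input is returned unchanged.
    """
    root = ET.fromstring(config_xml)
    if any(child.tag == "system.net" for child in root):
        return config_xml

    # Append at the end of <configuration>, as in the post.
    system_net = ET.SubElement(root, "system.net")
    default_proxy = ET.SubElement(
        system_net,
        "defaultProxy",
        {"enabled": "true", "useDefaultCredentials": "true"},
    )
    ET.SubElement(default_proxy, "proxy", {"usesystemdefault": "True"})
    return ET.tostring(root, encoding="unicode")
```

To apply it, read the .exe.config file into a string, pass it through add_default_proxy, and write the result back. Running it twice is harmless because of the existence check.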

The bottom of your config file should look like this:


And now it is possible to use the online catalog!



A quick fix for a nasty issue! I didn’t test it yet for the SCOM 2007 R2 console, but I’m pretty sure it will also work for that version.

UPDATE: apparently, the quotes didn’t pass very well in this blog post. I updated it so now you should be able to copy-paste the code.

UPDATE2: I confirmed that this works with non-MS proxy servers.


Error when uninstalling a SCOM agent

September 14, 2012

Several customers of mine have come across SCOM agents that cannot be uninstalled. This can show up when you uninstall an agent manually or when you want to upgrade the agent to 2012. It can occur with SCOM 2007 RTM, SP1 or R2 agents. I haven’t come across this issue on SCOM 2012 yet, but you never know!

In this case, I wanted to uninstall an agent using Add/Remove Programs:


When trying to uninstall the agent, I stumbled across the following issue:


The patch package could not be opened. Verify that the patch package exists and that you can access it, or contact the application vendor to verify this is a valid Windows Installer patch package.

What does this mean? You probably installed an agent patch on this server, either a separate KB or a cumulative update. The problem is that, when uninstalling the agent, the uninstaller looks for the location of the install files for this patch. To find out which patch was installed, open the registry editor regedit.exe.

If it is a SCOM 2007 pre-R2 agent, go to HKEY_CLASSES_ROOT\Installer\Products\C9A0067E2876122489E4BA987C08CDD2\Patches

If it is a SCOM 2007 R2 agent, go to: HKEY_CLASSES_ROOT\Installer\Products\7779052F1B26F94BAD9C107B86962A2\Patches

If it is a SCOM 2012 agent, go to: HKEY_CLASSES_ROOT\Installer\Products\9D603783EC87E0E49B25825AC08C3BEE\Patches

(thanks binaryoverflow.wordpress.com for pointing out the location for SCOM 2012!)

Open the multi-string value Patches. In my case, I saw the following 3 lines:


By removing the contents of this REG_MULTI_SZ:


I was able to uninstall the agent. Problem solved!

[EDIT 11-October-2012] I just discovered that Microsoft released a KB for this issue! http://support.microsoft.com/kb/971187 [/EDIT]


Data Warehouse Object Health State Data Dedicated Maintenance Recovery State

August 1, 2012

A customer of mine had a SCOM 2007 R2 CU6 Root Management Server that was in a critical health state. When looking at the Health Explorer, we saw this:


 

SCOM performs its own database maintenance. Kevin Holman wrote a nice article about what maintenance is done automatically by SCOM and what maintenance you could configure additionally: http://blogs.technet.com/b/kevinholman/archive/2008/04/12/what-sql-maintenance-should-i-perform-on-my-opsmgr-databases.aspx .

When looking at the state change events, we saw the following description:

Failed to store data in the Data Warehouse. Exception ‘SqlException’: Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding. One or more workflows were affected by this. Workflow name: Microsoft.SystemCenter.DataWarehouse.StandardDataSetMaintenance Instance name: State data set Instance ID: {GUID} Management group: MG

What does this mean? SCOM is trying to perform its maintenance on state data, but unfortunately there is too much maintenance work to do, so the rule times out. The best thing to do now is to check why it is timing out. Maybe you have corrupt tables? Or a sudden flood of data was inserted into the data warehouse? For this customer, we had issues with config churn which we had just fixed, so this time-out could be easily explained.

How to fix this? First, disable the rule performing the maintenance so it will not interfere with our manual procedure. Open the SCOM console:

  1. Go to the authoring pane
  2. Select Authoring => Management Pack Objects => Rules
  3. Change the scope to ‘Standard Data Set’
  4. Right-click the only rule listed: Standard Data Warehouse Data Set maintenance rule
  5. Select Override the rule for all objects of class: Standard Data Set
  6. Disable the rule by ticking the row with the parameter name Enabled and by changing the Override Value to false


  7. Select an appropriate override management pack (not the default management pack!) and click on apply.

Now we will manually trigger the stored procedure used by this rule, this time without a timeout!

  1. Open SQL Management Studio
  2. Select the instance where your OperationsManagerDW database is residing
  3. Once opened, click on new query and select your OperationsManagerDW database.
  4. Remember that our state change event description mentioned issues with state data, so we first need to get the ID of the state data set. To do this, click on new query, select your data warehouse database and enter the following command:

    SELECT DatasetID FROM vDataset WHERE DatasetDefaultName = 'State data set'


    Click on execute and copy the resulting ID.

  5. Click on new query, select your data warehouse database and enter the following command, replacing GUID with the ID that resulted from the previous query:

    EXEC StandardDataSetMaintenance 'GUID'


  6. Wait until the command finishes successfully.
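Steps 4 and 5 can also be combined into one T-SQL batch so you don't have to copy the ID by hand. The Python helper below only generates that batch as text for pasting into a query window; it does not connect to SQL Server. It assumes, as the EXEC call above suggests, that the dataset ID is the first parameter of StandardDataSetMaintenance. The function name build_maintenance_batch is hypothetical.

```python
def build_maintenance_batch(dataset_name="State data set"):
    """Return a T-SQL batch that looks up the dataset ID and runs maintenance on it."""
    # Double any single quotes so the name is a valid T-SQL string literal.
    safe_name = dataset_name.replace("'", "''")
    return "\n".join([
        "DECLARE @DatasetId uniqueidentifier;",
        "SELECT @DatasetId = DatasetId FROM vDataset "
        f"WHERE DatasetDefaultName = '{safe_name}';",
        "EXEC StandardDataSetMaintenance @DatasetId;",
    ])


if __name__ == "__main__":
    print(build_maintenance_batch())
```

Passing a different dataset name to the helper produces the batch for that data set instead, which may be handy if the state change events mention another one.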

Go back to the authoring pane of the SCOM console to remove our previously defined override.

  1. Go to the authoring pane
  2. Select Authoring => Management Pack Objects => Rules
  3. Change the scope to ‘Standard Data Set’
  4. Right-click the only rule listed: Standard Data Warehouse Data Set maintenance rule
  5. Select Overrides summary
  6. Delete the previously defined override:

And the error disappeared! If this error comes back frequently, you should check for corrupt tables, config churn, database performance or other reasons why the maintenance is timing out.