SCOM Advanced Authoring: PowerShell Discovery from a CSV File – Explained Using a “TCP Port Monitoring” Scenario

SCOM is an exceptional tool that lets IT administrators customize monitoring to almost any extent. To build a customized monitoring solution, one must understand the authoring capabilities in SCOM, so that the solution is easy to implement, well optimized, and adds little overhead to the SCOM management servers and agents.

In this post we will discuss PowerShell discovery from a centrally located configuration file (CSV format), with an example scenario involving TCP port monitoring. We will also discuss the impact of this method on end users and on IT/SCOM administrators.

Basics – Classes, Objects, Targets and Discoveries:

An object is the basic unit of management in Operations Manager. An object typically represents something in your computing environment, such as a computer, a logical disk, or a database. A class represents a kind of object, and every object in Operations Manager is considered an instance of a particular class. A target in the Operations console represents all instances of a particular class. A discovery is a special kind of rule to populate the class with instances.


Consider a scenario where you have a datacenter with 1000+ Windows and Unix servers. As SCOM administrators, we are asked to configure monitoring of various TCP ports on different servers in the datacenter, from different watcher nodes.

This can be accomplished using the TCP Port template in the Authoring pane of the console. But this template has several drawbacks:

Each port for each server needs to be configured manually.

Each entry creates a bunch of classes, groups and overrides, and the number of workflows increases, which impacts SCOM performance.

There is no central record of what is being monitored and with what criteria.

Future changes need to be made manually in the console.

If a watcher node is decommissioned, each port monitored by that watcher node needs to be moved to another watcher node manually.

Each time the application team has a new request to add, delete or modify an entry, a SCOM administrator needs to make the change. In a real environment this involves change requests, approvals and so on, which can consume considerable time.

Effective Solution:

To overcome these issues, it is better to keep the configuration in a central location and pull the information into SCOM at regular intervals. This way, once the initial configuration is set up:

The application team can maintain the list and follow their own approval process to add, modify or delete entries.

The list can be mass updated.

Additions, modifications and deletions are automatically synced with SCOM at regular intervals.

No new workflows are added to SCOM for every addition, so the impact on SCOM performance is minimal. With a handful of monitors and rules, thousands of objects can be monitored.

The information is available centrally.

The monitoring solution is self-supported and cost-effective in terms of support hours.

Below is the step-by-step process with XML fragments included. The entire MP XML file is attached to the blog; you can download it and test it in your lab.

Step 1: Create a new management pack “GKLab.TCP.Port.Monitoring”.

Step 2: Add “” as reference.

Here is the XML fragment for Step 1 and Step 2:

<?xml version="1.0" encoding="utf-8"?>
<ManagementPack ContentReadable="true" SchemaVersion="2.0" OriginalSchemaVersion="1.0" xmlns:xsd="" xmlns:xsl="">
  <Manifest>
    <Identity>
      <ID>GKLab.TCP.Port.Monitoring</ID>
      <Version></Version>
    </Identity>
    <Name>GKLab.TCP.Port.Monitoring</Name>
    <References>
      <Reference Alias="SystemCenter">
        <ID>Microsoft.SystemCenter.DataWarehouse.Library</ID>
        <Version>7.1.10226.0</Version>
        <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
      </Reference>
      <Reference Alias="Windows">
        <ID>Microsoft.Windows.Library</ID>
        <Version>7.5.8501.0</Version>
        <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
      </Reference>
      <Reference Alias="MicrosoftSystemCenterSyntheticTransactionsLibrary">
        <ID>Microsoft.SystemCenter.SyntheticTransactions.Library</ID>
        <Version>7.1.10226.1090</Version>
        <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
      </Reference>
      <Reference Alias="Performance">
        <ID>System.Performance.Library</ID>
        <Version>7.0.8433.0</Version>
        <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
      </Reference>
      <Reference Alias="System">
        <ID>System.Library</ID>
        <Version>7.5.8501.0</Version>
        <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
      </Reference>
      <Reference Alias="SC">
        <ID>Microsoft.SystemCenter.Library</ID>
        <Version>7.0.8433.0</Version>
        <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
      </Reference>
      <Reference Alias="Health">
        <ID>System.Health.Library</ID>
        <Version>7.0.8433.0</Version>
        <PublicKeyToken>31bf3856ad364e35</PublicKeyToken>
      </Reference>
    </References>
  </Manifest>


Step 3: Create a custom class “GKLab.TCP.Port.Monitoring.Class” to store TCP Port monitoring configuration. The base class is “Microsoft.SystemCenter.SyntheticTransactions.TCPPortCheckPerspective”

  <TypeDefinitions>
    <EntityTypes>
      <ClassTypes>
        <ClassType ID="GKLab.TCP.Port.Monitoring.Class" Accessibility="Internal" Abstract="false" Base="MicrosoftSystemCenterSyntheticTransactionsLibrary!Microsoft.SystemCenter.SyntheticTransactions.TCPPortCheckPerspective" Hosted="true" Singleton="false" Extension="false">
          <Property ID="ServerName" Type="string" AutoIncrement="false" Key="true" CaseSensitive="false" MaxLength="256" MinLength="0" Required="false" Scale="0" />
          <Property ID="Port" Type="int" AutoIncrement="false" Key="true" CaseSensitive="false" MaxLength="256" MinLength="0" Required="false" Scale="0" />
          <Property ID="NoOfRetries" Type="int" AutoIncrement="false" Key="false" CaseSensitive="false" MaxLength="256" MinLength="0" Required="false" Scale="0" />
          <Property ID="TimeWindowInSeconds" Type="int" AutoIncrement="false" Key="false" CaseSensitive="false" MaxLength="256" MinLength="0" Required="false" Scale="0" />
        </ClassType>
      </ClassTypes>
    </EntityTypes>

Step 4: Now we need to create the discovery data source. Before that, we will discuss the format of the CSV file that stores the configuration data.

We will name it “TCPPortMonitoringList.csv”. The CSV has ServerName, PortNumber, WatcherNode, IntervalSeconds, NoOfRetries and TimeWindowInSeconds as its header.

ServerName – Name of the monitored server (NetBIOS or FQDN).

PortNumber – Port number to be monitored on the monitored server.

WatcherNode – Computer/SCOM agent that monitors the port on the monitored server.

IntervalSeconds – Monitoring interval in seconds.

NoOfRetries – Number of times the monitor should fail before an alert is generated. This reduces the alerts generated due to network latency. (Minimum value – 2)

TimeWindowInSeconds – Total time interval within which the monitor has to fail NoOfRetries times to generate an alert. (Minimum value = IntervalSeconds)
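Because the application team maintains the file, it can help to check rows against the rules above before the discovery picks them up. The following is a minimal sketch and not part of the management pack: the sample rows and host names are invented, and `Test-PortConfigRow` is a hypothetical helper. It assumes only the constraints stated above (NoOfRetries of at least 2, TimeWindowInSeconds of at least IntervalSeconds).

```powershell
# Sample rows; the server and watcher names are made up for illustration.
$csvText = @"
ServerName,PortNumber,WatcherNode,IntervalSeconds,NoOfRetries,TimeWindowInSeconds
web01.contoso.local,443,watcher01.contoso.local,120,3,360
db01.contoso.local,1433,watcher02.contoso.local,300,1,600
app01.contoso.local,8080,watcher01.contoso.local,300,2,120
"@

# Hypothetical helper: returns one result object per CSV row.
function Test-PortConfigRow {
    param([Parameter(Mandatory)] [psobject] $Row)

    $problems = New-Object System.Collections.Generic.List[string]
    $port = 0; $interval = 0; $retries = 0; $window = 0

    if ([string]::IsNullOrWhiteSpace($Row.ServerName) -or [string]::IsNullOrWhiteSpace($Row.WatcherNode)) {
        $problems.Add('ServerName and WatcherNode are required')
    }
    if (-not ([int]::TryParse($Row.PortNumber, [ref]$port)) -or $port -lt 1 -or $port -gt 65535) {
        $problems.Add('PortNumber must be between 1 and 65535')
    }
    if (-not ([int]::TryParse($Row.IntervalSeconds, [ref]$interval)) -or $interval -lt 1) {
        $problems.Add('IntervalSeconds must be a positive integer')
    }
    if (-not ([int]::TryParse($Row.NoOfRetries, [ref]$retries)) -or $retries -lt 2) {
        $problems.Add('NoOfRetries must be at least 2')
    }
    if (-not ([int]::TryParse($Row.TimeWindowInSeconds, [ref]$window)) -or $window -lt $interval) {
        $problems.Add('TimeWindowInSeconds must be at least IntervalSeconds')
    }

    [pscustomobject]@{
        Config   = "$($Row.ServerName):$($Row.PortNumber)"
        IsValid  = ($problems.Count -eq 0)
        Problems = ($problems -join '; ')
    }
}

$results = $csvText | ConvertFrom-Csv | ForEach-Object { Test-PortConfigRow -Row $_ }
$results | Format-Table -AutoSize
```

With these sample rows, only the first one passes; the second has too few retries and the third has a time window shorter than its interval. For a real file, `Import-Csv` on the shared path would replace the here-string.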


Step 5: Since we will use PowerShell script discovery, create a custom data source composed of a System.SimpleScheduler data source module and a Microsoft.Windows.PowerShellDiscoveryProbe probe module.

Since we have a centralized configuration CSV file, we can run the discovery from any one management server and populate the objects. In SCOM 2012, we will target the discovery at the All Management Servers Resource Pool, so that any one management server (MS) will pick up the workflow. The discovery is thus highly available. The CSV file must be on a shared path so that it can be accessed from any MS.

Below is the XML fragment for the custom discovery module with the embedded PowerShell script.

    <ModuleTypes>
      <DataSourceModuleType ID="GKLab.TCP.Port.Monitoring.Discovery.DataSource" Accessibility="Internal" Batching="false">
        <Configuration>
          <xsd:element minOccurs="1" name="IntervalSeconds" type="xsd:integer" xmlns:xsd="" />
          <xsd:element minOccurs="1" name="SyncTime" type="xsd:string" xmlns:xsd="" />
          <xsd:element minOccurs="1" name="filePath" type="xsd:string" xmlns:xsd="" />
        </Configuration>
        <OverrideableParameters>
          <OverrideableParameter ID="IntervalSeconds" Selector="$Config/IntervalSeconds$" ParameterType="int" />
          <OverrideableParameter ID="FilePath" Selector="$Config/filePath$" ParameterType="string" />
        </OverrideableParameters>
        <ModuleImplementation Isolation="Any">
          <Composite>
            <MemberModules>
              <DataSource ID="DS" TypeID="System!System.SimpleScheduler">
                <IntervalSeconds>$Config/IntervalSeconds$</IntervalSeconds>
                <SyncTime>$Config/SyncTime$</SyncTime>
              </DataSource>
              <ProbeAction ID="Probe" TypeID="Windows!Microsoft.Windows.PowerShellDiscoveryProbe">
                <ScriptName>TCPPortMonitoringConfigDiscovery.ps1</ScriptName>
                <ScriptBody>
param(
    [string] $sourceId,
    [string] $managedEntityId,
    [string] $filePath )

#Initialize SCOM API

$api = new-object -comObject 'MOM.ScriptAPI'
$discoveryData = $api.CreateDiscoveryData(0, $SourceId, $ManagedEntityId)
write-eventlog -logname "Operations Manager" -Source "Health Service Script" -EventID 999 -Message "TCP Port Monitoring: looking for CSV file" -EntryType Information
# $filePath variable contains UNC path of CSV Config file
if (test-path $filePath) {
    write-eventlog -logname "Operations Manager" -Source "Health Service Script" -EventID 999 -Message "TCP Port Monitoring: Accessing CSV file" -EntryType Information
    $contents = Import-Csv $filePath
    try{
        $Path = (Get-ItemProperty "HKLM:SOFTWARE\Microsoft\System Center Operations Manager\12\Setup\Powershell\V2").InstallDirectory
        $Path1 = $Path + "OperationsManager\OperationsManager.psm1"
        if (Test-Path $Path1)
        {
            Import-Module $Path1
        }
        else
        {
            Import-Module OperationsManager
        }
        New-SCOMManagementGroupConnection
        #Retrieve all windows computers which can be used as watcher nodes
        $allServers = Get-SCClass | where { $_.Name -eq ("Microsoft.Windows.Computer")} | get-scommonitoringobject
    }
    catch{
        write-eventlog -logname "Operations Manager" -Source "Health Service Script" -EventID 999 -Message "TCP Port Monitoring: $_" -EntryType Information
    }
    #Read line by line from configuration file and create instance of TCP Port Monitoring Class
    $contents | ForEach-Object{
        $ServerName = $_.ServerName
        $PortNumber = $_.PortNumber
        $WatcherNode = $_.WatcherNode
        $NoOfRetries = $_.NoOfRetries
        $TimeWindowInSeconds = $_.TimeWindowInSeconds
        $Config = "$ServerName"+":"+"$PortNumber" # Will be used as display name
        write-eventlog -logname "Operations Manager" -Source "Health Service Script" -EventID 555 -Message "Checking servers" -EntryType Information
        $allServers | ForEach-Object{
            #Create instance only if the watcher node is managed by SCOM, as the instance will be hosted by the watcher node.
            #The hosting object is windows computer whose display name is equal to watcher node value from CSV
            #If there is no matching windows computer managed by SCOM, then the instance cannot be hosted. Hence the instance is not discovered.
            if((($_.DisplayName).toLower()).contains($WatcherNode.toLower())){
                write-eventlog -logname "Operations Manager" -Source "Health Service Script" -EventID 555 -Message "Creating Instance for $Config" -EntryType Information
                $instance = $discoveryData.CreateClassInstance("$MPElement[Name='GKLab.TCP.Port.Monitoring.Class']$")
                $instance.AddProperty("$MPElement[Name='GKLab.TCP.Port.Monitoring.Class']/ServerName$", $ServerName)
                $instance.AddProperty("$MPElement[Name='GKLab.TCP.Port.Monitoring.Class']/Port$", $PortNumber)
                $instance.AddProperty("$MPElement[Name='GKLab.TCP.Port.Monitoring.Class']/NoOfRetries$", $NoOfRetries)
                $instance.AddProperty("$MPElement[Name='GKLab.TCP.Port.Monitoring.Class']/TimeWindowInSeconds$", $TimeWindowInSeconds)
                #The hosting object is windows computer whose display name is equal to watcher node value from CSV
                $instance.AddProperty("$MPElement[Name='Windows!Microsoft.Windows.Computer']/PrincipalName$", $_.DisplayName)
                $instance.AddProperty("$MPElement[Name='System!System.Entity']/DisplayName$", $Config)
                $discoveryData.AddInstance($instance)
                return
            }
        }
    }
}
$discoveryData
Remove-variable api
Remove-variable discoveryData
                </ScriptBody>
                <Parameters>
                  <Parameter>
                    <Name>sourceId</Name>
                    <Value>$MPElement$</Value>
                  </Parameter>
                  <Parameter>
                    <Name>managedEntityId</Name>
                    <Value>$Target/Id$</Value>
                  </Parameter>
                  <Parameter>
                    <Name>filePath</Name>
                    <Value>$Config/filePath$</Value>
                  </Parameter>
                </Parameters>
                <TimeoutSeconds>300</TimeoutSeconds>
              </ProbeAction>
            </MemberModules>
            <Composition>
              <Node ID="Probe">
                <Node ID="DS" />
              </Node>
            </Composition>
          </Composite>
        </ModuleImplementation>
        <OutputType>System!System.Discovery.Data</OutputType>
      </DataSourceModuleType>
    </ModuleTypes>
  </TypeDefinitions>
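One design point in the script above: the watcher node is matched with a substring test (`Contains`), which lets a NetBIOS name in the CSV match an FQDN display name, but also means a value such as `web1` would match `web10.contoso.local`. A stricter variant, sketched here with made-up host names, indexes the computers by full name and by short name and then does an exact, case-insensitive lookup. `New-WatcherIndex` and `Resolve-Watcher` are hypothetical helpers, not part of the management pack.

```powershell
# Hypothetical helper: build a lookup table from computer display names.
function New-WatcherIndex {
    param([string[]] $DisplayNames)

    # PowerShell hashtable literals compare string keys case-insensitively.
    $index = @{}
    foreach ($name in $DisplayNames) {
        $index[$name] = $name                 # full name, typically the FQDN
        $short = $name.Split('.')[0]          # short (NetBIOS-style) name
        if (-not $index.ContainsKey($short)) {
            $index[$short] = $name            # first computer wins on a clash
        }
    }
    $index
}

# Hypothetical helper: return the matching display name, or $null.
function Resolve-Watcher {
    param([hashtable] $Index, [string] $WatcherNode)

    $key = $WatcherNode.Trim()
    if ($Index.ContainsKey($key)) { $Index[$key] } else { $null }
}

$index = New-WatcherIndex -DisplayNames 'web1.contoso.local', 'web10.contoso.local'
Resolve-Watcher -Index $index -WatcherNode 'WEB1'   # web1.contoso.local
Resolve-Watcher -Index $index -WatcherNode 'web'    # no match, returns nothing
```

In the discovery script, the index would be built once from `$allServers` display names, which also avoids looping over every computer for every CSV row.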

Step 6: Now that we have created the discovery data source, we will create a discovery, GKLab.TCP.Port.Monitoring.Discovery.

Below is the discovery XML fragment. The UNC path of the CSV file is specified in filePath.

  <Monitoring>
    <Discoveries>
      <Discovery ID="GKLab.TCP.Port.Monitoring.Discovery" Enabled="false" Target="SC!Microsoft.SystemCenter.AllManagementServersPool" ConfirmDelivery="true" Remotable="true" Priority="Normal">
        <Category>Discovery</Category>
        <DiscoveryTypes>
          <DiscoveryClass TypeID="GKLab.TCP.Port.Monitoring.Class" />
        </DiscoveryTypes>
        <DataSource ID="DS" TypeID="GKLab.TCP.Port.Monitoring.Discovery.DataSource">
          <IntervalSeconds>500</IntervalSeconds>
          <SyncTime>00:00</SyncTime>
          <filePath>\\SCOM2012R2\Configs\TCPMonitoringConfig.csv</filePath>
        </DataSource>
      </Discovery>
    </Discoveries>
  </Monitoring>

Step 8: Add Language Packs and close the ManagementPack tag.

  <LanguagePacks>
    <LanguagePack ID="ENU" IsDefault="true">
      <DisplayStrings>
        <DisplayString ElementID="GKLab.TCP.Port.Monitoring">
          <Name>GKLab TCP Port Monitoring</Name>
          <Description>This Management pack monitors the list of ports discovered from config file.</Description>
        </DisplayString>
        <DisplayString ElementID="GKLab.TCP.Port.Monitoring.Class">
          <Name>GKLab TCP Port Monitoring Class</Name>
          <Description>Class Contains Instances of TCP Ports that needs to be monitored from specific watcher nodes</Description>
        </DisplayString>
        <DisplayString ElementID="GKLab.TCP.Port.Monitoring.Class" SubElementID="NoOfRetries">
          <Name>No Of Retries</Name>
        </DisplayString>
        <DisplayString ElementID="GKLab.TCP.Port.Monitoring.Class" SubElementID="Port">
          <Name>Port</Name>
        </DisplayString>
        <DisplayString ElementID="GKLab.TCP.Port.Monitoring.Class" SubElementID="ServerName">
          <Name>Server Name</Name>
        </DisplayString>
        <DisplayString ElementID="GKLab.TCP.Port.Monitoring.Class" SubElementID="TimeWindowInSeconds">
          <Name>Time Window In Seconds</Name>
        </DisplayString>
        <DisplayString ElementID="GKLab.TCP.Port.Monitoring.Discovery">
          <Name>GKLab TCP Port Monitoring Discovery</Name>
          <Description>Discovers TCP Port Monitoring Configs from given CSV file.</Description>
        </DisplayString>
        <DisplayString ElementID="GKLab.TCP.Port.Monitoring.Discovery.DataSource">
          <Name>GKLab TCP Port Monitoring Discovery Data Source</Name>
          <Description>Data Source used by TCP Port Monitoring Discovery Rule</Description>
        </DisplayString>
      </DisplayStrings>
    </LanguagePack>
  </LanguagePacks>
</ManagementPack>

Step 9: Now import the management pack into SCOM and check whether the configuration entries from the CSV are discovered.

Go to Discovered Inventory in SCOM Console and change target to “GKLab TCP Port Monitoring Class” to view the discovered items.


Step 10: Now you can develop custom monitors and rules targeting this class.
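To get a feel for what such a monitor checks for each discovered ServerName and Port pair, here is a rough stand-alone sketch of a TCP connect test. It is only an illustration using the .NET `TcpClient` class and is not the module a SCOM monitor would use; `Test-TcpPort` and its timeout default are assumptions made for this example.

```powershell
# Hypothetical helper: returns $true if a TCP connection can be opened in time.
function Test-TcpPort {
    param(
        [Parameter(Mandatory)] [string] $ServerName,
        [Parameter(Mandatory)] [int]    $Port,
        [int] $TimeoutMilliseconds = 3000
    )

    $client = New-Object System.Net.Sockets.TcpClient
    try {
        # Start the connection attempt and wait up to the timeout for it to finish.
        $async = $client.BeginConnect($ServerName, $Port, $null, $null)
        if (-not $async.AsyncWaitHandle.WaitOne($TimeoutMilliseconds, $false)) {
            return $false                  # no answer within the timeout
        }
        $client.EndConnect($async)         # throws if the connection was refused
        return $true
    }
    catch {
        return $false
    }
    finally {
        $client.Close()
    }
}
```

A call such as `Test-TcpPort -ServerName 'web01.contoso.local' -Port 443` (host name invented) returns a Boolean that a script could combine with the NoOfRetries and TimeWindowInSeconds values from the class properties.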

Thus the entire configuration can be maintained in a CSV file in a shared location. For any new or changed requirement, the CSV file can be updated accordingly. No changes are required on the SCOM side unless additional header columns are added and need to be consumed by SCOM.

I will post details about monitors and rules for TCP Port monitoring in future posts.


Happy SCOMing!!!


PowerShell: Retrieve Group Policy details for Remote Computer

There are multiple scenarios in AD management where we need to retrieve Group Policy information for managed computers. There are generally two methods to get the information.

Method 1:

The most common method is to use the gpresult.exe command, which is detailed in this technet article. It works only if the user executing the command has logged in at least once on the target computer. Otherwise it throws the error below.

The user does not have RSOP Data

Method 2:

Method 2 is to use the Get-GPResultantSetOfPolicy PowerShell cmdlet, which is detailed here. This cmdlet works like Method 1 and also requires the user to have logged in at least once.

With Method 1 and Method 2, even if we want the Group Policy information only for the computer, irrespective of user, we cannot get it unless the user has logged in at least once, because the command retrieves the resultant set of policies enforced for the specified user on the target computer.


To overcome these issues, it is better to use the Group Policy Management COM object, which is the base for gpresult.exe and the Get-GPResultantSetOfPolicy cmdlet. We can use the COM object in VB or PowerShell scripting. Here we will discuss using it in PowerShell.

#Initialize Variables

$OutputFile = "C:\Temp\GPOExport.html"

$ComputerName = ""

$UserName = "john"

The first thing we do is create an instance of the GPMgmt.GPM object. We can use this object if the Group Policy Management Console is installed on the computer.

$gpm = New-Object -ComObject GPMgmt.GPM

The next step is to obtain all constants and save them in a variable.

$constants = $gpm.GetConstants()

Now create a reference RSOP object using the required constants.

$gpmRSOP = $GPM.GetRSOP($Constants.RSOPModeLogging,$null,0)

The next step is to specify the target computer and user.

$gpmRSOP.LoggingComputer = $ComputerName

$gpmRSOP.LoggingUser = $UserName

Note: If we need the RSOP data for the computer only, without user-specific Group Policy data, we set the “RsopLoggingNoUser” constant in LoggingFlags instead of setting $gpmRSOP.LoggingUser.

$gpmRSOP.LoggingFlags = $Constants.RsopLoggingNoUser

The next step is to query the target computer for RSOP GPO data.


To export the data to an output file, the command below is used.
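The commands for the query and export steps did not survive in this copy of the post. As a hedged sketch only, assuming the `CreateQueryResults` and `GenerateReportToFile` methods of the GPMC RSOP object and the `ReportHTML` constant, the whole sequence could be wrapped as follows. `Export-RsopReport` is a hypothetical helper name, and the code needs the Group Policy Management Console on a Windows machine.

```powershell
# Hypothetical wrapper around the steps described in this post.
function Export-RsopReport {
    param(
        [Parameter(Mandatory)] [string] $ComputerName,
        [string] $UserName,                       # omit for computer-only data
        [Parameter(Mandatory)] [string] $OutputFile
    )

    $gpm       = New-Object -ComObject GPMgmt.GPM
    $constants = $gpm.GetConstants()
    $gpmRSOP   = $gpm.GetRSOP($constants.RSOPModeLogging, $null, 0)

    $gpmRSOP.LoggingComputer = $ComputerName
    if ($UserName) {
        $gpmRSOP.LoggingUser = $UserName
    }
    else {
        $gpmRSOP.LoggingFlags = $constants.RsopLoggingNoUser
    }

    # Query the target computer for its RSOP data (assumed method name).
    $gpmRSOP.CreateQueryResults()

    # Write the results as an HTML report (assumed method and constant names).
    $gpmRSOP.GenerateReportToFile($constants.ReportHTML, $OutputFile) | Out-Null
}
```

It would be called with the variables initialized earlier, for example `Export-RsopReport -ComputerName $ComputerName -UserName $UserName -OutputFile $OutputFile`.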





Thus, using the GPMgmt.GPM COM object, we can obtain the Resultant Set of Policy for a target computer with or without considering the user, and without requiring the user to have logged in at least once.

Happy Scripting 🙂

PowerShell Script to Simulate Outlook Web Access URL User Logon

Recently I came across a requirement to run a user logon synthetic transaction against an Outlook Web Access URL and capture its performance. This can be accomplished with the Invoke-WebRequest PowerShell cmdlet. The cmdlet returns the form elements of the login page, which are filled with the username and password, and the login page is then invoked with the post data. The output is analyzed for a successful login and the logon results are returned.

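#Parameters Block

#URL = “

# The rest of the original parameter list is missing here. The script below
# relies on these four values, so they are declared as script parameters.
param(
    [string] $URL,
    [string] $Domain,
    [string] $Username,
    [string] $Password
)

#Initialize default values

$Result = $False
$StatusCode = 0
$Latency = 0

$Username = $Domain + "\" + $Username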

#Workaround to trust all certificates is from this post

add-type @"
using System.Net;
using System.Security.Cryptography.X509Certificates;
public class TrustAllCertsPolicy : ICertificatePolicy {
    public bool CheckValidationResult(
        ServicePoint srvPoint, X509Certificate certificate,
        WebRequest request, int certificateProblem) {
        return true;
    }
}
"@
[System.Net.ServicePointManager]::CertificatePolicy = New-Object TrustAllCertsPolicy

#Initialize Stop Watch to calculate the latency.

$StopWatch = [system.diagnostics.stopwatch]::startNew()

try {
    #Invoke the login page
    $Response = Invoke-WebRequest -Uri $URL -SessionVariable owa

    #Login Page – Fill Logon Form
    if ($Response.Forms[0].Id -eq "logonform") {
        $Form = $Response.Forms[0]
        $Form.Fields.username = $Username
        $Form.Fields.password = $Password
        $authpath = "$URL/auth/owaauth.dll"
        #Login to OWA
        $Response = Invoke-WebRequest -Uri $authpath -WebSession $owa -Method POST -Body $Form.Fields
        if ($Response.Forms[0].Id -eq "frm") {
            #Retrieve Status Code
            $StatusCode = $Response.StatusCode
            # Logoff Session
            $logoff = "$URL/auth/logoff.aspx?Cmd=logoff&src=exch"
            $Response = Invoke-WebRequest -Uri $logoff -WebSession $owa
            #Calculate Latency
            $Latency = $StopWatch.Elapsed.TotalSeconds
            $Result = $True
        }
        elseif ($Response.Forms[0].Id -eq "lngfrm") {
            #Fill Out Language Form, if it is first login
            $Form = $Response.Forms[0]

            #Set Default Values

            $langpath = "$URL/lang.owa"
            $Response = Invoke-WebRequest -Uri $langpath -WebSession $owa -Method $Form.Method -Body $Form.Fields
            #Retrieve Status Code
            $StatusCode = $Response.StatusCode
            # Logoff Session
            $logoff = "$URL/auth/logoff.aspx?Cmd=logoff&src=exch"
            $Response = Invoke-WebRequest -Uri $logoff -WebSession $owa
            #Calculate Latency
            $Latency = $StopWatch.Elapsed.TotalSeconds
            $Result = $True
        }
        elseif ($Response.Forms[0].Id -eq "logonform") {
            #We are still in LogonPage
            #Retrieve Status Code
            $StatusCode = $Response.StatusCode
            #Calculate Latency
            $Latency = $StopWatch.Elapsed.TotalSeconds
            $Result = "Failed to logon $Username. Check the password or account."
        }
    }
}
catch {
    #Catch Exception, If any
    #Retrieve Status Code
    $StatusCode = $Response.StatusCode
    if ($StatusCode -notmatch '\d\d\d') {$StatusCode = 0}
    #Calculate Latency
    $Latency = $StopWatch.Elapsed.TotalSeconds
    $Result = $_.Exception.Message
}

#Display Results

Write-Host "Status Code: $StatusCode`nResult: $Result`nLatency: $Latency Seconds"

Happy Scripting..

SCOM: Data Source and Probe Modules

Microsoft provides various data source modules and probe action modules for custom MP authoring. These default modules are effective in most scenarios, but not all. If you dig into some of the useful data source modules by unsealing the Microsoft MPs, you will find that they are combinations of core modules. By combining these core modules in different ways, we can get better results in a better way!!

SCOM authoring is not only about doing the right thing; it is about doing the right thing the right way.

SCOM: Unix/Linux Shell Command Monitoring – Unique Requirement

When I was working with a customer recently, I came across a requirement to execute a shell command and, based on the results, set the monitor state for a target server. One of the best example solutions available on the web is this.


The unique part of this requirement, however, is that the computer where the shell command needs to be executed and the target server are not the same.

[Figure: 5-14-2014 17-56-03]

As illustrated above, the monitor is targeted at “Class A”, which has several instances including “Server A”. The shell command, however, needs to be executed on “Server B”; its results are then processed and used to set the state of “Monitor A”.


The “Microsoft.Unix.WSMan.Invoke.ProbeAction” probe, on which the Unix/Linux monitoring data sources in SCOM are built, has a parameter called “TargetSystem”.

In normal scenarios, the value would always be $Target/Property[Type="MicrosoftUnixLibrary!Microsoft.Unix.Computer"]/NetworkName$. Thus the shell command is executed on “Server A” and “Monitor A” gets its state from the results.

But the value of “TargetSystem” does not have to be the target server’s name. It can be changed to any server that has a SCOM Unix agent installed with a valid certificate for authentication. The shell command will then be executed on “Server B” rather than on the target server.

Additionally, you can pass the target server’s name to the shell command as a parameter if you have used the Promote option for the shell command.

The XML code is available as a PDF file here: Example.Unix.ShellCommand.Monitoring

Have great SCOMing!!!

SCOM: What’s wrong with my Unix agents? Why are they greyed out?

It is quite difficult to work with offline Unix agents, especially in a large monitoring environment. Though SCOM offers a native heartbeat monitor, it is hard to quickly determine whether the computer is actually down or something is wrong with the agent configuration.

A Unix agent may be down for various reasons: the SCX process is not running, a Run As account password was changed, the certificate was reset, or the computer itself is down. SCOM has the “UNIX/Linux Heartbeat Monitor”, “WS-Management Run As Account Health” and “WS-Management Certificate Health” monitors to cover each of these conditions and alert on offline agents. But it is tedious for a support engineer to handle multiple alerts for the same issue and correlate them to fix the agent, which can cost considerable time.

Would it not be easier to have only one alert in case of heartbeat failure, with the status of all the other monitors in the summary?

But wait, should we also include the ping status in the alert summary, so that the support engineer knows what to do first?

Yes, that’s what we are going to do now using PowerShell. The script below, which can be run on any management server, logs an event in the “Operations Manager” event log.

You can create a rule to look for these events and create an alert. The alert will indicate which agent is offline and give the details of the other monitors.


Import-Module OperationsManager

# All Unix/Linux computers that are currently not available
$mc = get-scclass -name Microsoft.Unix.Computer
$agents = get-scommonitoringobject -class $mc | where {$_.isavailable -ne 'True'}

foreach ($agent in $agents) {
    $maintmode = $agent.InMaintenanceMode

    # Ignore Servers in Maintenance
    if ($maintmode -eq $false) {
        $agentname = $agent.displayname
        $RespondsToPing = Test-Connection -ComputerName $agent.displayname -quiet

        # Walk the health state tree: Availability > Unix/Linux Heartbeat Monitor
        $sh = $agent.GetMonitoringStateHierarchy()
        $avail_mon = $sh.childnodes | where {$_.item.MonitorDisplayName -eq 'Availability'}
        $hb_mon = $avail_mon.childnodes | where {$_.item.MonitorDisplayName -eq 'Unix/Linux Heartbeat Monitor'}
        $hb_mon_state = $hb_mon.item.healthstate

        if ($hb_mon_state -ne "Success" -and $hb_mon_state -ne "Uninitialized") {
            # Collect the certificate and Run As account monitor states under Configuration
            $config_mon = $sh.childnodes | where {$_.item.MonitorDisplayName -eq 'Configuration'}
            $cert_mon = $config_mon.childnodes | where {$_.item.MonitorDisplayName -eq 'WS-Management Certificate Health'}
            $runas_mon = $config_mon.childnodes | where {$_.item.MonitorDisplayName -eq 'WS-Management Run As Account Health'}
            $cert_mon_state = $cert_mon.item.healthstate
            $runas_mon_state = $runas_mon.item.healthstate

            if ($RespondsToPing) {$pingable = "Pingable"}
            else {$pingable = "Not Pingable"}

            $status = "PING_STATUS: $pingable HEARTBEAT_STATUS: $hb_mon_state CERTIFICATE_STATUS: $cert_mon_state, USER_ACCOUNT_STATUS: $runas_mon_state"

            write-eventlog -LogName 'Operations Manager' -source 'Health Service Script' -id 1041 -entrytype Error -Category 0 -Message "UNIX SCOM agent on $agentname is not sending a heartbeat - $status"
        }
    }
}