Orphaned Monitoring Instance - SAM Template Deleted

March 16, 2018, 8:35 am

I'll start off by saying that I'm going to open a support case on this, but I wanted to put it out here as well.

I've hit a case where a SAM admin created a new template, built out the components for it, assigned it to ~100 nodes, and then deleted the template. I am now unable to remove the orphaned monitoring instances, as they don't show up anywhere in the UI that I can find except on the node page. When I attempt to select the Application on the Node page, I get an error message that says "Unexpected Website Error Object reference not set to an instance of an object." Does anyone have any ideas on how I can remove them? I have confirmed they do not show up in the application monitor list within SAM settings. I have also confirmed that I can't just remove them from the database, as that generates the error "Dynamic SQL generation is not supported against multiple base tables".

↧

Network Sonar Discovery - Exclude Devices

March 16, 2018, 9:58 am

We have multiple locations and configure each location to be on a weekly discovery schedule. Every job is configured to scan a dedicated subnet that we have defined for non-client machines. Our smaller sites use that same range for other devices like UPSs, network switches, etc. The issue we are having is that the auto-discovery scans everything within that subnet. SolarWinds tries to scan devices that do not have SNMP set up, and because of access limitations those devices send out unauthorized-access emails every time the scan runs. We added those devices to the Ignore list, but it looks like all that does is stop the devices from showing up in the import list. In some cases we are able to set up SNMP credentials, but there are cases where that is not possible.

Is there a way for SolarWinds to not scan IPs that we have defined during discovery?

↧

Monitoring and Alerting on Event ID - Every occurrence of Event ID without missed or duplicate alerts

August 20, 2015, 5:40 am

I'm probably making this more difficult than it is, and I'm not very confident in what I think may be the solution.

I have a Windows Event Log Monitor set up to monitor a specific event ID on a server. The monitor itself seems to work fine: from the Application Component Details page I can see every occurrence of the event listed in the Event Log Message Details pane. The component status is set to a Warning state "Based on Event Count" >= 1 for a single poll, and the polling frequency is 300 seconds.

The issue is the alerting. I am not sure what I should use for evaluation frequency and reset condition to ensure I trigger an alert for every instance of the event.

If I use the standard alert evaluation frequency of every 1 minute with the reset condition "trigger condition no longer true", then I miss alerts when multiple events occur, because the alert is always in a triggered state.

If I use the No Reset Condition option and trigger every 1 minute, I receive duplicates, I assume because the alert frequency is 1 minute and the polling of the monitor is 3 minutes.

This wouldn't be a big deal if this was an event that only occurred a couple of times per hour or day but it could occur multiple times per polling frequency.

It "seems" to work if I set the alert evaluation frequency to 4 or 5 minutes but I am not sure that is the best solution to ensure nothing is missed or duplicated.
Thanks
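The duplicate behaviour described above can be illustrated with a very simplified model. The sketch below is not how the Orion alert engine is implemented; it only assumes that each alert evaluation looks at the most recent poll result and fires whenever that result contains at least one event (the "no reset condition" case). The function name and parameters are hypothetical, and the 300-second poll interval is taken from the component setting mentioned earlier.

```python
def alerts_per_poll(poll_interval_s, eval_interval_s, polls_with_events, total_polls):
    """Count how many times each event-bearing poll result triggers an alert.

    Simplified model: the monitor produces one result per poll interval,
    and every alert evaluation fires if the latest result had events.
    """
    fired = {}
    duration_s = poll_interval_s * total_polls
    for t in range(0, duration_s, eval_interval_s):
        latest_poll = t // poll_interval_s  # index of the poll result visible at time t
        if latest_poll in polls_with_events:
            fired[latest_poll] = fired.get(latest_poll, 0) + 1
    return fired


if __name__ == "__main__":
    # One poll (index 1) contains events; evaluate every 60 s versus every 300 s.
    print(alerts_per_poll(300, 60, {1}, 3))
    print(alerts_per_poll(300, 300, {1}, 3))
```

In this model, evaluating more often than the monitor polls re-reads the same poll result and fires again, while matching the two intervals yields one alert per event-bearing poll. Several events that land inside a single poll still produce only one alert here, so the model does not address per-event alerting.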

↧

Cloud Fever, throwing all the SolarWinds into AWS

March 16, 2018, 12:28 pm

So after picking the amazing adatole's brain after his 3/13 LIVE WEBCAST: IF AN APPLICATION FAILS IN THE DATACENTER AND NO USERS ARE ON IT, WILL IT CUT A TICKET? presentation about migrating our SolarWinds environment to AWS, he recommended I let the THWACK community weigh in.

Here's my situation in a nutshell:

Currently, our Orion SQL DB is in a SQL cluster shared by other applications. The Orion DB is killing performance for the other DBs in that cluster. Based on current resource utilization and guidelines for future growth, I'm looking at getting the DB its own "server", whether virtual or physical, with at least 128GB of RAM. The SQL cluster has 64GB of shared RAM. We're not able to deploy a virtual server with that much RAM in our virtual environment.

So what about a physical server?

Well...

My company has caught the "Cloud Fever" and the only cure is more Cloud!

Our parent company, based in France, has pushed out an IT edict that all 26 of its international entities (North America, UK, China, etc.) must convert 50% of their data center based hardware and virtual servers into "the cloud" by 2020. Unfortunately, cost concerns and performance be ******, nobody is asking the important questions like "Why?"

So I'm being told that with this edict in play, any requests for a new hardware server would be instantly denied.

With this, I'm trying to make this work as best as possible given the situation. The slight advantage I have is that there are massive amounts of money being thrown at this cloud effort, so I can leverage that to make this as smooth as possible.

Another side issue is that our IT department is extremely siloed. My title is "Network Engineer" which means I'm a member of the "Network Team". However, it's 2018 and IT should all be on the same team. The server team is full of old school Microsoft fanboys and girls that have fought AWS tooth and nail (and not for logical reasons). We have a very developed and robust Orion environment with 3 very dedicated individuals maintaining it and end users actively using it across many teams including some outside of IT.

The server team uses a neglected instance of SCOM 2012 to monitor servers, AD, and databases, using mostly out-of-the-box alerts that are only sent to members of their team, and its web portal is only accessible by them.

I have graciously offered to take on the task of assisting them with integrating our servers and AD environment into SAM which would incur no cost as we are already licensed. I get immediate kickback with no logical reasons, almost like it's some sort of childish turf war for them. So, asking for any assistance with the Orion servers from them is a pain because I offered to help them.

Here's what my Orion environment looks like now:

NAM 3000 with ACM 250. We're using NPM, SAM, NTA, NCM, IPAM, VNQM, UDT, and WPM.

Main PE: polling 12893 Elements with a job weight of 5650.

APE in Canada whose local subnets aren't routed to the "Main" network hence the need for an APE: polling 5008 elements with a job weight of 1929.

APE in our SCADA industrial controls system DMZ (we're planning on rolling this into the MPE since we should be able to poll these nodes with "routing and firewall magic").

Our AWS environment is in the very earliest stages, only 1 test application has been migrated so far, so I have a lot of freedom to plan out how to monitor that.

Our AT&T managed MPLS cloud has a direct connect into our AWS instance, so that should help alleviate some latency issues with our remote location polling.

Some of the advice Leon offered includes the following:

Installing SW into AWS
The first and most important thing you need to ensure is that the timing between the primary poller and the database remains low – under 1500 milliseconds. If you have latency that is longer than that, you are going to experience errors and data corruption.
The second (and only slightly less important) thing is to ensure that your database is set up for the transaction volume – in on-prem terms, it needs to be RAID 10 or flash. Not RAID 5.
The third thing is that you will likely be monitoring your on-prem environment using an additional polling engine, unless you have fewer than 100 devices on-prem that you wish to monitor.

With all of that said, there is a guide to help you:

https://support.solarwinds.com/@api/deki/files/40251/SolarWinds_AmazonWebService_Deployment.pdf?revision=2

1) Put the primary poller and the db in the cloud so that your timing between them is as short as possible. The primary poller will have very little to monitor (at least right now) and That’s OK ™

2) Put an additional poller in the main site, and another APE in your secondary site. They cost nothing, so why not. They can be virtual. You can play with the hardware they’re assigned until you’ve salted to taste.

3) If you can, install the AWS-based instance of DPA (it’s in the Amazon store) and watch your SW database with it. You will have the ability to see how it’s truly performing and where any bottlenecks might crop up.

a. It’s also a great “advertisement” to your DBA team to show the capabilities of the tool. No I’m not trying to upsell you. It’s just a nice tool in your toolbox if you don’t have something else. And it’s natively cloud-based, so you can score some points from corporate.

4) Make sure you add your cloud credentials to SAM. Again, score some corp brownie points.

Paging jbiggley, at adatole's suggestion, to weigh in.

TL;DR I have to move my Orion environment to AWS because of corporate politics. Any advice is very appreciated.

So, what THWACKsters out there have installed/migrated Orion in the Cloud either by choice or by corporate politics gunpoint?

Edit for grammar DERP.

↧

SSL Certificate Expiration - SNI capable

October 9, 2015, 8:50 am

↧

Slack - Alert Integration - Node Memory

March 18, 2016, 11:37 am

↧

Hardware Health Monitoring Issue

March 10, 2018, 6:21 am

I have an issue with hardware health monitoring in my environment.

I have vCenter 6.0 and 3 clusters.

Two clusters contain IBM hosts and the other one contains HP hosts.

On the monitoring side I have NPM 12.01, SAM 6.3.0, VIM 7.0.0, WPM 2.2.1 and SRM 6.3.0.

I also have an additional poller.

vCenter and all ESXi servers are polled through the additional poller with SNMP and VMware polling.

I also added vCenter to VMAN.

Now the problem is that SolarWinds doesn't show any hardware health info for these ESXi servers.

I installed the OEM ISO version of ESXi on each server.

Hardware health info for the hosts shows successfully in the vSphere Web Client, and I also checked the states in the vCenter MOB. Everything is OK.

I also checked "All Settings ---> Manage Pollers ---> Hardware Health Sensors"; the scan result is "Not a match".

The additional poller version is also the same as the main SolarWinds server.

I am so confused.

Any help would be great for my situation.

Thanks

I also read these links before sending this post:

SAM Hardware Health Supports Lenovo ThinkServer

Lenovo x3650M5

Latest required software used to monitor hardware health - SolarWinds Worldwide, LLC. Help and Support

Hardware monitoring and VMware

Troubleshooting Guide for Hardware Health for ESX host Server Polled Through vCenter - SolarWinds Worldwide, LLC. Help a…

↧

Different between application status & component status

March 18, 2018, 9:06 pm

Hello, can any expert share with me what the difference is between application status and component status (Up, Warning)?

I'm not very sure what Warning means under the component status. Does Up mean it is already being monitored?

Remark: I retrieved these from the reports.

Please help as I am really new to Solarwinds. Many thanks!

↧

FUJITSU PRIMERGY SERVER

March 12, 2018, 6:33 am

Hi.

We have ~500 servers... HP and Dell are perfectly monitored and integrated in Orion using HP Insight Manager and Dell OpenManage, so all hardware, Raid Controller, Disks, Fans, Power Supply are monitored automatically.

We have some Fujitsu Primergy servers with the standard Fujitsu software... they are not "included" in Orion, the only possibility is UDT...

Is it possible that Orion will include Fujitsu servers too in the standard monitoring like HP and Dell?

Thanks.

↧

Sam 6.6/ Exchange 2013 / Powershell

March 19, 2018, 10:51 am

Howdy-

SAM 6.6 is still showing PowerShell 2.0 as a requirement in the documentation. I was under the impression that it would now support newer versions of PowerShell to make the Exchange insight piece work.

Our Exchange architect really wants the SAM Insight working, but not at the cost of running the lower versions of PowerShell. Any ideas?

Manually Configuring Exchange Server

↧

Do you participate in polls?

December 10, 2013, 12:04 pm

↧

Disable some components within Appinsight

March 19, 2018, 8:12 pm

Hello Solarwinds expert,

Are we able to disable individual components of either the SQL or Exchange AppInsight monitors? The reason for doing so is to limit the number of alerts we receive for SQL.

I have read that we can simply remove the warning and critical thresholds by editing the application, and we will then never be alerted or notified about these components.

Is there any other method to advise on the issue?

Thanks expert!

↧

Hardware Health Extensions

December 4, 2017, 5:38 am

What hardware platforms would you most like to see integrated with SAM to support capabilities such as hardware health?

↧

Understanding Java Remote JMX - Initial Setup/Config Overview

September 27, 2017, 8:53 am

Overview

I wanted to clear up some misunderstandings and give some guidance on monitoring Java web apps via remote JMX. However you're hosting the Java application (e.g. Tomcat, WebSphere, etc.), the setup follows the same foundation.

First, the Java web application can be accessed via a web browser (typically on port 8080), but the detailed performance info is accessed on a different port called remote JMX. Java applications do not have this remote JMX port enabled by default. You have to manually enable it. The "how" part can vary between Tomcat and WebSphere and can even change based on the version you're running of each. This is where the initial confusion can begin. Since I can't outline every way to configure it, I'll outline the basic concept and let you use some Google-fu to fill in the blanks.

For Tomcat deployments

We need to find the JAVA_OPTS startup setting. It is normally defined in whatever startup script java uses.

For Linux, it's typically a file called setenv.sh located in the bin folder of the application's install directory.
For Windows, it's typically a GUI app with a Java tab and a Java Options section.
- Start > All Programs > Apache Tomcat > Tomcat Configuration
- Or open tomcat7w.exe or tomcat8w.exe (depending on your version of Tomcat) via Command Prompt or PowerShell.
For WebSphere, it's found in the admin web interface for that application.

When you find it, you need to add the following 4 options. For Linux, they all need to be on the same line with a space separating them. For Windows, multi-line usually works.

-Dcom.sun.management.jmxremote

-Dcom.sun.management.jmxremote.port=8686

-Dcom.sun.management.jmxremote.authenticate=false

-Dcom.sun.management.jmxremote.ssl=false
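On Linux the four options have to end up on one space-separated line. Here is a minimal sketch of assembling that line, assuming the common convention of appending to any existing JAVA_OPTS value inside setenv.sh. The helper name build_java_opts_line is hypothetical, and the output should be checked against your own startup script before use.

```python
def build_java_opts_line(jmx_port=8686):
    """Return a single setenv.sh line carrying the four JMX options.

    Authentication and SSL are disabled to mirror the test-only
    settings listed above; they are not production values.
    """
    options = [
        "-Dcom.sun.management.jmxremote",
        f"-Dcom.sun.management.jmxremote.port={jmx_port}",
        "-Dcom.sun.management.jmxremote.authenticate=false",
        "-Dcom.sun.management.jmxremote.ssl=false",
    ]
    # Assumption: the script appends to whatever JAVA_OPTS already holds.
    return 'export JAVA_OPTS="$JAVA_OPTS ' + " ".join(options) + '"'


if __name__ == "__main__":
    print(build_java_opts_line())
```

Generating the line rather than typing it by hand mainly guards against a stray line break between options, which is the Linux pitfall mentioned above.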

For WebSphere/WebLogic deployments

First, we need to enable remote JMX on a global level. In the WebSphere administration, there is a setting called "Platform MBean Server" that needs to be enabled. Its location and exact name vary based on your version of WebSphere, but it is typically found in the Admin WebUI: Domain > Configuration > General > Advanced > Platform MBean Server Enabled Checkbox.

Once that is enabled, we need to add the following to each WebSphere instance we want to monitor. Find the JAVA_OPTS area of startup and add these lines.

-Dcom.sun.management.jmxremote

-Dcom.sun.management.jmxremote.port=8686

-Dcom.sun.management.jmxremote.authenticate=false

-Dcom.sun.management.jmxremote.ssl=false

-Djava.rmi.server.hostname=<websphere server ip>

*For the last setting, replace <websphere server ip> with the IP of your WebSphere server.

Generalities

The jmxremote.port= value can be any port you want that isn't already in use. The templates in SAM default to using port 8686.

*If you have multiple java apps running on the same server like I do, then come up with a standard port convention to make it easier.

*My java apps ran on ports 8040, 8050, 8060, 8070 and 8080. That means that I would have to add those JAVA_OPTS to EACH of those instances that I want to monitor. So each corresponding JMX port was 2040, 2050, 2060, 2070 and 2080.
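The port convention above can be written down as a small helper so every instance gets a predictable JMX port. This is only a sketch: the offset of 6000 is inferred from the example ports (8040 to 2040 and so on), and the function name jmx_port_for is hypothetical.

```python
def jmx_port_for(app_port, offset=6000):
    """Map an application port to its JMX port, e.g. 8040 -> 2040."""
    jmx_port = app_port - offset
    if not 1 <= jmx_port <= 65535:
        raise ValueError(f"derived JMX port {jmx_port} is out of range")
    return jmx_port


# Application ports taken from the example above.
APP_PORTS = [8040, 8050, 8060, 8070, 8080]

if __name__ == "__main__":
    for app_port in APP_PORTS:
        option = f"-Dcom.sun.management.jmxremote.port={jmx_port_for(app_port)}"
        print(f"app on {app_port}: {option}")
```

Any other fixed rule works equally well; the point is that the JMX port can be derived from the application port without a lookup table.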

The jmxremote.authenticate=false and jmxremote.ssl=false options are set to false only in this example and for testing purposes. With them enabled, there are a lot more variables that could break, and troubleshooting them is a pain. Setting those to true is recommended, but it involves a lot of extra config which really falls outside the scope of this article.

Testing

Port check

Once those settings above are in place it will require a service restart to take effect. If you have a dev server, great! If not, schedule a maintenance window.

First, we need to see if the operating system is using the port. Run the following commands via SSH or on Windows Command Prompt/PowerShell.

Linux: netstat -nlp | grep 8686

Windows: netstat -ano | findstr "8686"

If nothing is returned, then something isn't set up correctly with the Java opts; this is where Google will be your friend.
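The netstat commands above confirm that the port is listening on the Java host itself. As a complement, the sketch below checks whether the port is reachable over TCP from another machine, such as the Orion server, which also exercises any firewall in between. It assumes Python is available on that machine; the host and port in the example are placeholders, and the function name is hypothetical.

```python
import socket


def jmx_port_is_listening(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Placeholder target: replace with your Java server's IP and JMX port.
    print(jmx_port_is_listening("127.0.0.1", 8686))
```

A successful connection only shows that something accepts TCP connections on that port; jconsole is still needed to confirm it is actually the JMX endpoint.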

Java jconsole

Second, we want to test with Java JDK. Download and install it on your SolarWinds Orion server. The download link is here: Java SE Development Kit 8 - Downloads

Once installed, browse to the installation directory: C:\Program Files\Java\<jdk version number>\bin. Open the jconsole.exe app inside.

Choose Remote Process and enter the IP:Port and click connect.

Since we originally set jmxremote.ssl=false, we'll get this message; just continue anyway by clicking Insecure connection.

Success

If all of your hard work has paid off you should be presented with this screen.

The MBeans tab is where all of the details of the application are located and it's what SAM is polling metrics from. We'll dive into this a little more in another article.

Finally! On to assigning the SAM Java template. Continued at the following link.

Understanding Java Remote JMX - SAM Template Config

↧

Hardware monitoring HPE Proliant Gen10

November 2, 2017, 2:10 pm

We are currently using the HP Device Monitor Service for Agentless Server software solution to monitor Gen8 / Gen9 servers. For Windows server, we use HP's WBEM Providers / System Management Agents. With Gen10 servers, HPE no longer supports this solution. What is the proper way to monitor hardware now on Gen10 servers? We have the agent installed, but not getting the hardware information.

Currently using Orion Platform 2017.3.1 SP1, NPM 12.2, SAM 6.4.0.

↧

Slack Alert Integration - Overview

March 2, 2016, 7:30 pm

Overview

I've created some Powershell scripts that call into Slack's API and pass the necessary info from Orion. This works great to receive alerts to a specific Slack channel.

Prerequisites:

Need to install Powershell v3 or greater on your Solarwinds Orion server. We specifically use the Invoke-RestMethod cmdlet. You can download that here.
Need to create a team with Slack if you have not already done so.
- Need to setup an incoming Web Hook. Here is a link that talks about how that can be set up.
Download the scripts at links right below this section
- Place them in a directory on your Orion server
- Edit each script and fill out the necessary sections in the beginning. That information is most likely going to be exactly the same for each script.
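The linked scripts are PowerShell and use Invoke-RestMethod. For readers who only want to see the shape of the call, here is a rough Python sketch of posting a message to a Slack incoming webhook. It is not the author's script: the message wording, function names and example node name are hypothetical, and the webhook URL must be the one Slack generates for your team.

```python
import json
import urllib.request


def build_payload(node_name, status):
    """Build a minimal incoming-webhook payload with a plain text message."""
    return {"text": f"Orion alert: {node_name} is {status}"}


def post_to_slack(webhook_url, payload, timeout=10):
    """POST the payload as JSON to the webhook and return the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Only prints the payload; call post_to_slack() with your own webhook URL.
    print(json.dumps(build_payload("core-sw-01", "Down")))
```

In an Orion alert action, values such as the node name and status would come from the alert variables passed on the command line, as the individual script articles describe.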

Configuration

Below are links that outline different alert types in Orion, each with its own instructions and custom PowerShell script.

Slack - Alert Integration - Node

Slack - Alert Integration - Node CPU

Slack - Alert Integration - Node Memory

Slack - Alert Integration - Node Disk

Slack - Alert Integration - Node Response Time

Slack - Alert Integration - Node Packet Loss

Slack - Alert Integration - Interface

Slack - Alert Integration - Application

Slack - Alert Integration - Component

Screenshot of how this is formatted in Slack.

Please check back often to see if there have been any improvements or bug fixes to this script.

**Change Log**

2016-03-01 : Initial release
2016-03-18 : Added new scripts and refreshed already existing ones.
2016-05-06 : Major overhaul. Revamped the message to use Slack attachments for better formatting. Please also update your 'Network path to external program' in the alert action, as some variables have been changed/added.

If you find this useful feel free to rate this article.

↧


SAM Template for ISE

March 1, 2018, 5:06 am

Hello Community,

Has anyone thrown together a Cisco ISE template for tracking the processes and other key areas of ISE? I have tried adding some of the SSH templates and running "show application status ise"; however, they keep timing out at the prompt. Has anyone figured this out yet?

Thank you,

JAF

↧

Configure Monitoring for OS X Sierra machine

September 26, 2017, 11:37 pm

I need to configure monitoring for a Mac OS X Sierra (10.12.3) machine. I understand that for this I have to configure SNMP v2 on the Mac VM. I tried following the steps given in the SolarWinds article below, but it looks like hostconfig no longer exists in the newer versions of macOS. I am not finding much help with this. Can anyone please provide the detailed steps to configure SNMP v2 on Mac OS X Sierra (10.12.3) so that I can enable monitoring on this virtual machine?

https://support.solarwinds.com/Success_Center/Network_Performance_Monitor_(NPM)/Configure_Mac_OS_X_for_SNMP_Monitoring_with_Orion

↧

SSL Certificate Expiration Date template alert

March 20, 2018, 9:02 pm

Hi,

I'm feeling a bit slow at the moment, so be gentle.

I have taken the SSL Certificate Expiration Date template in SAM and assigned it to 4 nodes (2 servers and 2 externals by url).

I'm now trying to create an alert to trigger when the days to expiration is less than X.

I'm really struggling to work out what field/variable and/or how to test in the trigger.

I have been searching thwack and google without much luck.

Can someone please point me in the correct direction?

↧