'test failed with "unknown" status' - Windows PowerShell Monitor

July 26, 2017, 2:30 pm

I'm getting 'test failed with "unknown" status' on a Windows PowerShell Monitor such as the one below. This happens on multiple nodes, whether the host is local or remote. Orion does get the statistics back, as the screenshots show, so why doesn't it get the exit code? Or is something else at play?

Write-Host 'Message: test' 
Write-Host 'Statistic: 0'
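
For context, here is a minimal Python sketch of how a script monitor's exit code could map to a component status. It assumes the convention described in SolarWinds' script monitor documentation (0 = Up, 1 = Down, 2 = Warning, 3 = Critical, anything else = Unknown). The function name is hypothetical and this is not Orion's actual code. Under that assumption, a script that prints its statistic but never sets an explicit exit code could plausibly be reported as Unknown.

```python
# Assumed exit-code convention for SAM script monitors (not taken from the post).
EXIT_CODE_STATUS = {0: "Up", 1: "Down", 2: "Warning", 3: "Critical"}


def status_from_exit_code(exit_code):
    """Return the component status for an exit code; hypothetical helper."""
    # A missing or unrecognized exit code falls through to "Unknown".
    return EXIT_CODE_STATUS.get(exit_code, "Unknown")


print(status_from_exit_code(0))     # Up
print(status_from_exit_code(None))  # Unknown
```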

I did review the following and checked Configuring and Integrating PowerShell.pdf, but that didn't help.

(Running Orion Platform 2017.1.3 SP3, NPM 12.1, SAM 6.4.0.)

↧

Oracle DR days gap

April 20, 2018, 8:16 am

↧

Website configuration failed Access to the path SolarWinds.CloudMonitoring.Strings.dll is denied

July 5, 2017, 8:01 am

Anyone see this when running the configuration wizard?

Website configuration failed:

• Access to the path 'E:\Program Files (x86)\SolarWinds\Orion\CloudMonitoring\SolarWinds.CloudMonitoring.Strings.dll' is denied.

↧

Using SAM to monitor O365 event log

April 20, 2018, 11:19 am

Hi all,

I am hoping that some of you may have already done the heavy lifting and created a monitoring component that can sift through O365 event logs for AD Group changes. I have a script that can check for these changes on our on-prem DCs, but I haven't had the same success with the O365 monitor. I found a doc from Microsoft that says that you can use an API to search the event logs. I haven't started building this solution yet. Has anyone tried this?

Thanks in advance.

↧

SAM: Trying to understand why an alert didn't send an email

April 20, 2018, 11:22 am

Good afternoon all,

I must start by stating I am a complete SolarWinds noob, and am completely overwhelmed when attempting to look into this issue.

The problem: At 6:40am, a service on a server apparently wasn't running; however, no one was notified of the issue.

My questions:

When looking at the Application Summary for the specific service I'm attempting to investigate, is there a way in here that I can see the specific error that it is supposed to trigger if conditions are met?
Should I be looking at SolarWinds Orion, and the Alerting system as two separate products that are loosely connected?
For Advanced Monitoring, if the time is set between 3am and 2:30am, would that cause the alarm to not trigger?
For the applications, where is the list of components obtained for things that you can monitor?
Is there a way to see if an alert should've been triggered, and if so which alert specifically?

I apologize for my questions being all over the place. I'm investigating this at the same time as typing this out, so my questions are changing by the minute. Thanks for any and all help you can provide. If you have any questions, I will do the best I can to answer them.

Thanks!

↧

How to use new O365 Templates?

December 19, 2017, 8:03 am

As mentioned in the release notes: SAM 6.5 Release Notes - SolarWinds Worldwide, LLC. Help and Support

There are a bunch of new Office 365 templates, but I'm unsure of how to use them. Mainly, what "nodes" do you assign them to, and what O365 credentials do you need? I'm finding virtually no documentation on how to use these. Any help would be appreciated.

Thanks!

↧

Caught type error - malformed XML returned as Tomcat server response!

August 21, 2017, 4:26 am

Hi guys

I'm quite new to SAM and have been trying to attach the "Tomcat Server" Application Monitor to my Apache Tomcat/7.0.54 instance.

When I do and run a test, I get the error message "Caught type error - malformed XML returned as Tomcat server response!". I have checked the output of "http://myserver/manager/status/?XML=true" and can confirm that it is returning valid XML.
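
One way to double-check the "valid XML" claim is to parse the response body strictly rather than judging it by eye in a browser. Below is a minimal Python sketch, assuming the body has already been saved to a string. The sample documents are made up and are not real Tomcat status output.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Hypothetical sample bodies, not real Tomcat output.
good = "<status><jvm><memory free='1' total='2'/></jvm></status>"
bad = "<html><body>401 Unauthorized<br></body></html>"  # unclosed <br>

print(is_well_formed(good))
print(is_well_formed(bad))
```

If the monitor's credentials receive an HTML error page instead of the status document, a check like this would fail even though the URL works in a browser. That is only one possible cause.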

Please help!

↧

Alert show only two decimal places

August 1, 2014, 9:35 am

What do I need to do so that the alert shows only 2 decimal places?

For example:

Variable: Statistic data: ${StatisticData}

Result: Statistic data: 5,84696769714355

I need it to show only 2 decimal places.
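
The rounding itself is simple if the value can be processed before it reaches the alert text, for example inside a script monitor. Below is a minimal Python sketch, assuming the statistic arrives as a string with a comma decimal separator as in the example above. The function name is hypothetical and this is not an Orion alert-variable feature.

```python
def format_two_decimals(raw, decimal_sep=","):
    """Round a numeric string to 2 decimal places, keeping its separator."""
    value = float(raw.replace(decimal_sep, "."))
    return f"{value:.2f}".replace(".", decimal_sep)


print(format_two_decimals("5,84696769714355"))  # 5,85
```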

↧

SQL for a group of nodes.

April 20, 2018, 11:47 am

This is for subplots for a bunch of servers. Each pane of glass is a different site.

subplot = 'Prod Environment Cleveland' AND Volumes.Caption <>'cached memory' and Volumes.VolumeType='Fixed Disk' and VolumePercentUsed >= 90

subplot = 'Prod Environment Parma' AND Volumes.Caption <>'cached memory' and Volumes.VolumeType='Fixed Disk' and VolumePercentUsed >= 90

subplot = 'Test Environment - Canton' AND Volumes.Caption <>'cached memory' and Volumes.VolumeType='Fixed Disk' and VolumePercentUsed >= 95

subplot = 'Test Environment UCRC' and Volumes.Caption <>'cached memory' and Volumes.VolumeType = 'Fixed Disk' and VolumePercentUsed >=40

I cannot get anything to appear in the top one no matter what I change that last number to. There are nodes in that "subplot"; I copied the value from the custom properties of one of the nodes in there.

Why is it not displaying anything? (It doesn't indicate an error either.)

Is there a better way to write the SQL query?

↧

SAM - Elevating Severity Alerts

February 21, 2018, 5:43 am

Hey guys,

I've hit a bit of a wall and I'm coming up on a deadline for getting SAM set up to replace our current monitoring solution (Icinga). While I've been able to set up all of the basic monitors and alerts we had in the last environment, I'm struggling to get the email alerting to do what I expect, and I'm wondering if I'm just building all of our alerts completely backwards or if I'm missing something obvious.

I've set the trigger conditions so that we have thresholds and severity (as well as HTML emails) which send as an issue increases in severity, like so:

Informational CPU:

Trigger: 80 <= X < 90

Reset: X < 80

Warning CPU:

Trigger: 90 <= X < 95

Reset: X < 80

Critical CPU:

Trigger: 95 <= X

Reset: X < 80
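
The three trigger bands above can be sketched as a single classification, which makes it easier to see that they do not overlap and that anything below 80 is the shared reset condition. This is a minimal Python sketch. The function name and band labels are illustrative, not SAM settings.

```python
def cpu_severity(cpu_percent):
    """Map a CPU percentage to the alert band it falls in, or None."""
    if cpu_percent >= 95:
        return "Critical"
    if cpu_percent >= 90:
        return "Warning"
    if cpu_percent >= 80:
        return "Informational"
    return None  # below 80: the reset condition shared by all three alerts


for sample in (75, 80, 92, 95):
    print(sample, cpu_severity(sample))
```

Note that a value moving from 92 to 97 leaves the Warning band without ever meeting that alert's reset condition (X < 80).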

My reasoning is that I wanted the alert to clear (and another alert email to trigger) if the CPU moved into the next threshold, but I didn't want the reset (a Green "all clear" email) to fire unless the issue was resolved, not just moving from Warning to Critical. I also wanted to prevent the NOC view from seeing an Informational, Warning, and Critical alert for the same machine, as we've already had issues with the team ignoring a Warning as they also saw the Informational alert in the view.

As a note, I have tried removing the upper thresholds and only setting a reset on the lowest threshold. This worked, but the additional alerts in the NOC view confused our team. I've also tried adding the thresholds with only the lower reset trigger, but if we get too large a jump between polls (moving right from OK to Warning and back to OK), we never get the reset/all clear email. Lastly, I've considered adding an "OK" alert; however, this added clutter to both the Node/Object views (since they always had "triggered alerts") and added confusion for the NOC team when they went to the "All Alerts" view.

Am I coming at this backwards or is there a simple setting I'm missing? The NOC view works perfectly now, however the emails aren't behaving as I want (definitely a configuration issue on my side). Any advice or pointers would be greatly appreciated.

Thank you,

-JD

↧

AppInsight for IIS thresholds can't be changed

April 22, 2018, 8:37 am

I worked on adjusting the CPU and memory thresholds of the IIS monitor,

but I ran into a problem: for some reason the field for the number of consecutive polls is no longer editable, and it contains an unusable value which breaks our monitoring.

The threshold was edited to be "at least 4 consecutive polls" but was somehow saved as shown below.

I was hoping someone else has run into this or knows how it can be fixed.

I tried upgrading SAM to 6.6 and re-importing the default templates.

↧

Monitor Oracle ASM with Agent

April 18, 2018, 5:04 am

Is there any way to monitor Oracle ASM when the node is monitored with the SolarWinds Agent?

I also saw this article: Oracle Automatic Storage Management

But if there is any way to do this with just the agent, that would be great!

↧

Can we add audit event for muted, unmanaged, resumed alert status across the nodes?

April 22, 2018, 10:28 pm

Can we add audit event for muted, unmanaged, resumed alert status across the nodes?

↧

Exchange 2016 CU3 - Unknown status / Access Denied for one server

January 20, 2017, 4:05 am

Hello THWACK.

I have SolarWinds with SAM 6.3.0 and 3 Exchange 2016 servers set up in a DAG at CU3.

After updating to CU3, one of the servers is unable to connect to the "AppInsight for Exchange" counters correctly.

The other servers are able to connect and display information on all mounted DBs and users with no problem. All of the servers have the same service user credentials configured and all have an Agent installed.

No changes to the PowerShell configuration were made during the update.

No changes were made to the firewall configuration; the Windows firewall is off.

All of the machines reside on the same subnet.

When testing the connection to the server it displays:

"WinRM test was successful.

PowerShell Exchange web site testing failed with the following error:

Connecting to remote server (theipaddressishere) failed with the following error message : [ClientAccessServer=(theservernameishere),BackEndServer=,RequestId=2a248506-670e-406d-a15a-9759acf46d28,TimeStamp=20.1.2017 11:52:36] Access Denied For more information, see the about_Remote_Troubleshooting Help topic."

When trying to Configure the Server it displays:

"Remote Configuration Failed

Remote configuration was unsuccessful due to the following: "WinRM test was successful. PowerShell Exchange web site testing failed with the following error:Connecting to remote server (theipaddressishere) failed with the following error message : [ClientAccessServer=(theservernameishere),BackEndServer=,RequestId=dac2df62-ec05-4381-848b-d332a7a3e680,TimeStamp=20.1.2017 11:56:16] Access Denied For more information, see the about_Remote_Troubleshooting Help topic.""

I've checked the following things.

1. Checked that the IIS server certificate is set to "Secure Exchange" on all servers. Bindings are exactly the same for all servers.

2. Deleted the node and re-added it to SolarWinds.

3. Checked and added the service user credential to the PowerShell configuration as suggested in AppInsight for Exchange: Access is Denied - SolarWinds Worldwide, LLC. Help and Support

4. Checked the "ListeningOn" setting as suggested by AppInsight for Exchange: WinRM testing failed - SolarWinds Worldwide, LLC. Help and Support, with ListeningOn set to various IPs

Any ideas?

↧

Linux Agent

January 8, 2017, 8:49 am

Hello all,

I am interested in using the agent instead of NET-SNMP, and I've noticed a few differences in the information collected, such as the agent pulling the OS version for me. I've also noticed that on a node monitored by the Linux agent I can see disk queue length, but on my SNMP machine I cannot. I believe monitoring with the agent will continue collecting data if the node loses connection with my poller? I also read a post in which aLTeReGo said that the Linux agent's monitoring of memory utilization is far more accurate than NET-SNMP's. I think I have also read that you do not need to mess with OIDs when using the Linux agent? Assuming this is all true, I am interested in this. On a side note, it seems to use only a bit less than 30 MB of memory as well; has anyone observed more? I'm only testing it on one machine at the moment.

My big question is: is there a list of pros and cons of using the Linux agent vs. SNMP monitoring? This is something I would have to get the SysAdmins on board with for it to happen. Thank you!

↧

Alerting when a component monitor is down, but the node is up.

April 23, 2018, 4:01 am

I have been trying, with partial success, to generate an email alert when somebody adds a member to the local Administrators group on servers within our estate.

I have set up the Component Monitor to look for event ID 4732 in the Security logs and an alert to generate an email when that Monitor is in a Down state and it all works OK in terms of generating the alerts I want to see.

However, we also get alerts when servers go offline. I tried adding a third condition to the alert trigger:

Component - Component Name (Component Alerting Properties) - is equal to - Microsoft-Windows-Security-auditing-4732

And

Component - Status - is equal to - Down

And

Node - Status - is equal to - Up

But I still got an alert when a server was powered down this morning.

I'm struggling to see what else I can do. Any suggestions?

Thanks

↧

Create Custom Table

April 20, 2018, 1:52 pm

Hello All,

I am attempting to create a custom table to place on one of my views in Orion. I want the table to display all trunk ports and EtherChannel ports within our environment. I have already created a group that uses a dynamic query to gather all of those interfaces, although I am not certain that is how I should proceed. The table needs to display the following columns (from left to right): Node, Receive % Utilization, Transmit % Utilization, Receive Discards Today, and Transmit Discards Today.

It might also be nice to know how to best break these out by geographic locations if we wanted to.

Please let me know if I need to clarify further or if you have any questions.

Thanks!

↧

Multiple Statistic Data with unique alert variables?

April 16, 2018, 5:01 pm

I know it's possible to use the script monitor to retrieve multiple statistic data values and monitor those (a limit of 10). I also know you can name these values like this: "Message.Item" and "Statistic.Item". If we do this, Item becomes its own field within the SolarWinds component page. However, what variables do I need in order to use these values individually?

For example, I am monitoring a Linux service and want to return three things: the status (fairly normal, nothing fancy), the path of the service, and the name of the service. Then I want to use the path and name in the trigger actions to automatically triage some incidents.

The following variables all return the following:

${N=SwisEntity;M=ComponentAlert.MultiValueMessages} - returns all three values in one line

${N=SwisEntity;M=MultipleStatisticData.StringData} - returns only the last value

Is there any way I can return these three statistic messages individually? Or can I manipulate the variable with some sort of Python-like formatting to extract what I need?
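
If the raw script output is available to whatever handles the trigger action, the named pairs can be split apart outside Orion. Below is a minimal Python sketch, assuming output lines of the form `Message.Name: text` and `Statistic.Name: number` as described above. The sample names and values are made up, and this does not use any Orion alert variable.

```python
def parse_script_output(output):
    """Group 'Message.X: ...' and 'Statistic.X: ...' lines by their X name."""
    results = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")
        kind, dot, name = key.strip().partition(".")
        if not sep or not dot or kind not in ("Message", "Statistic"):
            continue  # ignore lines that are not named pairs
        results.setdefault(name, {})[kind] = value.strip()
    return results


# Hypothetical output from a script monitor checking one Linux service.
sample = """Message.Status: running
Statistic.Status: 0
Message.Path: /usr/sbin/sshd
Statistic.Path: 0
Message.Name: sshd
Statistic.Name: 0"""

parsed = parse_script_output(sample)
print(parsed["Path"]["Message"])
print(parsed["Name"]["Message"])
```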

↧

Issues with agents following upgrade to SAM 6.2.3?

February 3, 2016, 3:50 pm

Has anyone else had issues with agents following an upgrade to SAM 6.2.3?

One particular install I've been working on went from SAM 6.2.1 and NCM 7.4 to SAM 6.2.3 and NCM 7.4.1, then we put NPM 11.5.3 on top of that. The install seemed to go fine, no errors during the Config Wizard, and the web console loaded fine after each install. Things went downhill from there:

None of the agents are returning any data (working previously), approximately 623 agents.
The Orion Module Engine service is crashing every 30-45 minutes - potentially related to the above
Approximately 2,800 Cisco devices showing Hardware Health is "Unknown" even though Hardware Health polling for these nodes is disabled. This is confirmed via List Resources as well as Manage Pollers. Assume this is from putting NPM on top of an existing NCM/SAM install and enabling hardware health polling (along with VLAN polling, routing polling, etc).

I presume the agent version would be the same regardless of which module I upgraded above, as they were all released the same day.

Currently have a case open with SolarWinds support but thought I'd post this while waiting for an AE. Case # 934710.

We will give support a bit of time to analyse and hopefully come up with a fix/solution but we have a database backup and server snapshot we can roll back to if it doesn't look likely.

↧

How to know top 10 or top running processes/services in a windows server so that I can implement SAM component monitor on them?

April 23, 2018, 1:45 pm

Hi,

I am trying to implement component monitors or application monitors on the processes running on a set of servers. However, when I look at the Real-Time Process Explorer, each server has 60 to 80 processes running simultaneously, of which only 2-3 are using CPU. I cannot implement monitors on all 60-80 processes, as that is unnecessary and too time-consuming; at the same time, I do not want to monitor just 2 or 3 processes, as I feel that is too little monitoring.

I am confused as to

How do I derive the top 10 or 20 running processes among all of them?
Is there any option in SolarWinds which I can use to derive the top 10 running processes?
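
Ranking by resource use is straightforward once per-process CPU figures have been exported from somewhere. Below is a minimal Python sketch, assuming a list of (process name, CPU percent) samples. The data and function name are made up and nothing here calls a SolarWinds API.

```python
from collections import defaultdict


def top_processes(samples, n=10):
    """Return the n process names with the highest average CPU percent."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for name, cpu in samples:
        totals[name] += cpu
        counts[name] += 1
    averages = {name: totals[name] / counts[name] for name in totals}
    ranked = sorted(averages.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Hypothetical samples collected over several polls.
samples = [("sqlservr", 40.0), ("w3wp", 12.0), ("svchost", 1.0),
           ("sqlservr", 60.0), ("w3wp", 8.0), ("svchost", 3.0)]

print(top_processes(samples, n=2))
```

Averaging over several polls, rather than taking one snapshot, avoids picking a process that only happened to be busy at the moment of sampling.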

Any help will be highly appreciated.

↧