Channel: THWACK: All Content - Server & Application Monitor
Viewing all 12281 articles

Alerting is unreliable, for us, in latest Solarwinds


Alerting has been flaky since we first upgraded to the latest version, and it is still unreliable: it works for just a few days before failing. It had been rock solid until recently.

I originally raised this in case # 00118398 after upgrading to the latest version. I still have the problem despite having the latest hotfix.

Most other issues are resolved or getting there, but alerting silently dies after a while.

 

New case # 00134147 raised for this. The AdministrativeService logs show errors. These persist, but I'm not sure whether they are related. It seems the service cannot communicate with the other pollers / web console. I will restart those to try to clear the errors.

 

To resolve it, we restart Information Service v3 and Alerting Service v2, and that fixes it. I'm not sure if you need to do both, but definitely Information Service v3.

 

I get alerting from all sorts of other systems so I didn't notice for a day or so.

 

Since the service is still unreliable for us, I'm about to set up daily restarts of all the web services. Looking at the service dependencies, they don't seem to have any, so you should be able to restart them all in any order.

 

There is a possibility that nobody else, or very few people, have this problem. We have some alerts which previous versions of SolarWinds didn't upgrade correctly. You can see them as having "Complex Condition" turned on, and those need to be rewritten. But whether those are what is causing the service to enter a faulted state, I'm not sure.

 

Is anybody else having a problem with alerting going silent?

 

 


Including Statistic and Message variable values in your Alert


Is there a variable, which can be inserted into an alert, which encompasses the values created in an Application Monitor?

 

I have created a script which does an SNMPGET, checks the value therein, and if it is a match to a desired value, returns

 

Message.Next_Hop: 10.10.10.10  (not the real address)

Statistic.Next_Hop: 1   (A success - no alert)

 

if it fails,

 

Message.Next_Hop: 192.168.1.1 (again, not the real address -- BUT...it is the returned address currently poked in that SNMP OID. )

Statistic.Next_Hop: 0  (A failure - generates alert)
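
The output format described above can be sketched as follows. This is a minimal Python illustration, not the poster's actual script: the function name `format_next_hop_result` is hypothetical and the SNMP query itself is left out. Only the `Message.Next_Hop` / `Statistic.Next_Hop` line format comes from the post.

```python
def format_next_hop_result(observed: str, desired: str) -> str:
    """Build the two output lines for the Next_Hop check.

    Hypothetical helper: 'observed' would be the value returned by the
    SNMP GET, which is not shown here.
    """
    observed = observed.strip()
    # 1 means the OID still holds the desired address, 0 means it changed.
    statistic = 1 if observed == desired.strip() else 0
    return (
        f"Message.Next_Hop: {observed}\n"
        f"Statistic.Next_Hop: {statistic}"
    )


if __name__ == "__main__":
    # Placeholder addresses, as in the post.
    print(format_next_hop_result("192.168.1.1", "10.10.10.10"))
```

Because the message line always carries the address that was actually read, any alert variable that exposes the component message would also expose the replacement address.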

 

When the alert goes off, I want to be able to include the Message, so I can inform the alert recipient of what address has replaced the good address we usually desire.  Currently, I have this message:

 

The application ${N=SwisEntity;M=ApplicationAlert.ApplicationName} on ${N=SwisEntity;M=Node.Caption} (${N=SwisEntity;M=Node.IP_Address}) is currently in a state of "${N=SwisEntity;M=ApplicationAlert.ApplicationAvailability}". The following is a list of components in this application presently in distress.

 

${N=SwisEntity;M=ApplicationAlert.ComponentsWithProblemsFormatted}

 

...and I want to put in something like,

 

This alert indicates that the value of the address has changed from "10.10.10.10" (the desired address) to "192.168.1.1" (the address now inhabiting the OID).

 

Does anyone know if there is a variable which can be inserted in the alert body text to provide the Message or Statistic values?

AWS Monitoring - how to monitor auto scaling groups


My company is looking to enable auto scaling groups in AWS. Per our security team, we are not allowed to use only SNMP or WMI monitoring on instances in AWS (too many ports to open from our on-prem SolarWinds environment). Our only option for monitoring in AWS is to use the SolarWinds agent. When the automation team spins up a server using auto scaling groups, I do not see the "server name" in the agent listings in SolarWinds. I only see ip-xx-xxx-xxx-xx.domain.com, where xxx is the IP address of the instance. The server name is set as a tag in AWS.

 

My question to the forum is:

 

1.) How does your company handle auto scaling groups in AWS?

2.) Is there a way in SolarWinds to get the tags listed in AWS?

3.) Is there a way to automate populating custom properties using the API?

4.) Is there a way to automatically assign application monitoring using the API?

5.) What are your experiences with servers that get destroyed and rebuilt, and how do you monitor those servers using SolarWinds?
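
For questions 2 and 3, one possible approach is to map the agent hostname back to the instance's private IP and look the server name up in tag data exported from AWS. The sketch below covers only that parsing and lookup step. `ip_from_agent_hostname`, `name_for_agent`, and the `tags_by_ip` dictionary are hypothetical names; fetching the tags from AWS and writing custom properties through the SolarWinds API are not shown.

```python
import re
from typing import Dict, Optional

# Hostname form taken from the post: ip-xx-xxx-xxx-xx.domain.com
_HOSTNAME_RE = re.compile(r"^ip-(\d{1,3})-(\d{1,3})-(\d{1,3})-(\d{1,3})(?:\.|$)")


def ip_from_agent_hostname(hostname: str) -> Optional[str]:
    """Return the dotted IP encoded in an 'ip-a-b-c-d' hostname, or None."""
    match = _HOSTNAME_RE.match(hostname.lower())
    if match is None:
        return None
    return ".".join(match.groups())


def name_for_agent(hostname: str, tags_by_ip: Dict[str, Dict[str, str]]) -> str:
    """Look up the 'Name' tag for an agent, falling back to the hostname.

    tags_by_ip is assumed to map private IP -> tag dictionary and to have
    been built elsewhere from AWS data.
    """
    ip = ip_from_agent_hostname(hostname)
    if ip is None:
        return hostname
    return tags_by_ip.get(ip, {}).get("Name", hostname)


if __name__ == "__main__":
    tags = {"10.20.30.40": {"Name": "web-frontend-1"}}  # invented sample data
    print(name_for_agent("ip-10-20-30-40.domain.com", tags))
```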

 

Thanks,

Steve H

Best captain/leader ever!

Greatest Invention Ever!


There are countless important inventions. Only ten are allowed here, so...I know you lot are creative, and I did only play the basics so... (I voted for cuticle remover. Why? Because the cuticle is that piece of skin that starts at the base of your fingernail and you then tear back until it reaches your elbow. Yuck!) GAME ON!

Are VoIP, IP video, application sharing sessions or SIP trunks a new IT challenge?


As people try to be more productive, they use new ways to communicate and collaborate. VoIP is still one of the most important tools people use to talk to each other, but it's instant messaging, application webcasts, IP video, conference calls or VoIP via web-based clients like Lync that may cause headaches for IT.

 

Are these new technologies and communication channels, such as Lync, cloud webcasts (Citrix, GoTo Meeting) or VoIP as SaaS, a new challenge for IT? Is it easy to maintain and deliver high quality and availability, or is it IT's nightmare?

SAM triggering false alerts


We have two virtual SAM servers on VMware 5.5, each running Windows Server 2012 R2 with 6 CPUs and 12 GB RAM. For the past couple of weeks, out of the blue, we've been receiving false alerts, and rebooting the two nodes resolves the issue for only a couple of days.

In the event logs I see the following:

Event 1017: Source Perflib, disabled performance counter data collection from the asp.net_64_2.0.50727 service because the performance counter library for that service has blah blah

Event 1022: Source Perflib windows cannot open the 64-bit extensible counter DLL ASP.NET_64.2.0.50727 in a 32-bit environment.

Event 2003: source Perflib the configuration information of the performance library C:\windows\system32\inetsrv\w3ctrs.dll for the w3svc

 

 

In the system events I see a lot of DCOM errors as well:

event 10028: source distributedCOM, DCOM was unable to communicate with the computer x.x.x.x using any of the configured protocols; requested by the PID 25ac|C:\Program files x86\common files\solarwinds\jobengine.v2\swjobengineworker2.exe.

 

With that said, we're on v6.2.4 of Server & Application Monitor, and I've done the following:

DCOM Errors in System Event Logs related to Solarwinds - SolarWinds Worldwide, LLC. Help and Support

I've also repaired VMware Tools, as I was seeing issues with it. As this has only just started to occur, I don't want to be rebooting these servers all the time. If someone can help, I would appreciate it.

Trigger Report with a webrequest


We release our software out to servers on a weekly basis but the timing changes. I'd like to trigger the report from an API / webrequest. Has anyone figured out how to do this?


Passing an array as a script argument


Due to our product teams shuffling around the locations of where their various services run, I need to take an array of potential service names and pass them as an individual argument. Unfortunately I cannot make this work inside of Orion, and the script editor/text box isn't robust enough for me to find what's happening or why it isn't working. The only information I can get is it results in 0 services being found each time. I've searched and can't find how to correctly format the script argument field to take an array. What should the syntax be on the array?

 

Things that have not worked so far:

"Service1","Service2","Service3"

@(service1,service2,service3)

@("Service1","Service2","Service3")

@("service1", "service2", "service3")

forcibly casting my $services variable as an array
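
One workaround sometimes used when an argument field only passes plain strings is to send the names as a single delimited string and split it inside the script. Whether Orion's script argument field behaves this way is an assumption, not something confirmed here. The sketch below shows the idea in Python; `parse_service_list` is a hypothetical helper, and the same splitting logic would need to be written in the script's own language.

```python
from typing import List


def parse_service_list(raw: str) -> List[str]:
    """Split one delimited argument string into a list of service names.

    Accepts forms like '"Service1","Service2"' or '@("a", "b")' and
    tolerates surrounding spaces and quotes.
    """
    text = raw.strip()
    # Drop an optional @( ... ) wrapper if the whole string arrived literally.
    if text.startswith("@(") and text.endswith(")"):
        text = text[2:-1]
    names = []
    for part in text.split(","):
        name = part.strip().strip("\"'")
        if name:
            names.append(name)
    return names


if __name__ == "__main__":
    print(parse_service_list('@("service1", "service2", "service3")'))
```

Logging the length of the parsed list at the start of the script is a cheap way to see whether the argument arrived as one string or as several.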

Hardware Health Extensions


What hardware platforms would you most like to see integrated with SAM to support capabilities such as hardware health?

SQL Monitor Transaction Log Space %


Is there a way to monitor transaction log space as a percentage? I need SolarWinds to alert me when usage reaches 50%.
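
The threshold itself is simple arithmetic: percent used is used log space divided by total log size, times 100. A minimal sketch, assuming the used and total sizes (in MB) have already been collected by some monitor; the function names and sample numbers are illustrative only and do not describe a built-in SAM feature.

```python
def log_space_used_percent(used_mb: float, total_mb: float) -> float:
    """Return transaction log usage as a percentage of total log size."""
    if total_mb <= 0:
        raise ValueError("total_mb must be positive")
    return used_mb / total_mb * 100.0


def should_alert(used_mb: float, total_mb: float, threshold_percent: float = 50.0) -> bool:
    """True when usage is at or above the threshold (50% in the question)."""
    return log_space_used_percent(used_mb, total_mb) >= threshold_percent


if __name__ == "__main__":
    # Invented sample values: 512 MB used of a 1024 MB log.
    print(log_space_used_percent(512, 1024), should_alert(512, 1024))
```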

Orion SQL Filter examples under a specific resource help


NPM-12.2

SAM-6.6.0

 

 

I am trying to remove the "System Idle Process" entries with a SQL filter, using the Edit button on this resource, and it takes me to this link.

 

I see these processes because my WMI check is checking on them. I want to remove the 'System Idle Process' entries from this resource.

 

SolarWinds SQL Filter examples

SQL Filter Syntax Examples

 

 

 

 

Is there a list of Orion tables or values available to use with these SQL filters?  The examples they list aren't that great.

 

 

Can you help me with what my filter needs to look like here?

 

 

 

 

Thanks!!!

Where do I find logs for emails sent for an alert?


I'm trying to troubleshoot a handful of alerts that I think aren't sending emails, but I don't know where to look to confirm. Where does Orion log its mail sending actions?

Create Dependency based on the poller?


Hello All,

 

We have been monitoring systems and network devices using SolarWinds Orion for more than 7 years. Currently we have 4 SolarWinds pollers, which are used to pull information from different locations. I want to create a dependency based on the poller, so that if connectivity to a poller is lost from the main Orion server, Node Down alerts will not be triggered.

 

I have been trying to work with the SolarWinds support team, but the support has been poor and they don't want to get into the actual issue. I am waiting for replies from this forum.

SAM O365 Modules - Additional Installation needed


I am working on setting up an O365 dashboard to kick the tires a bit but am a bit confused at the component settings.  I have the license usage over time and under the component settings box I have this -

 

Which is a link to installing Powershell for O365.  So what else am I supposed to do so Solarwinds can get this information?  I seem to be missing part of this or instructions somewhere.

 

Thanks! - Dave


Power Control Unit not updating data


I have upgraded to SAM 6.6.1 on Orion platform 2018.2 HF3

I do not have NPM. Only SAM.

 

I have a custom SNMP monitor application that I've been using to monitor my UPS devices. I was really happy to see the new Power Control Unit Status in the upgrade. At least I was happy for a few minutes.

 

The PCU is showing initial data when it first discovers an APC UPS. After that it never updates the data.

 

I removed UPS from List Resources. Then went back in and added it again. Same thing. It displays accurate data on first discovery. Then it never updates that data. My custom SNMP application is still showing all current data.

 

Any suggestions?

How to use new O365 Templates?


As mentioned in the release notes:  SAM 6.5 Release Notes - SolarWinds Worldwide, LLC. Help and Support

 

There are a bunch of new Office 365 templates, but I'm unsure of how to use them. Mainly, what "nodes" do you assign them to, and what O365 credentials do you need? I'm finding virtually no documentation on how to use these. Any help would be appreciated.

Thanks!

SAM Alerting on SSL Cert expiry


Hi,

I am new to SolarWinds, as we are about to buy SAM.

One question I have been asked is whether we can monitor when SSL certificates expire.

I would want to alert when they are 1 month from expiry. Is this possible with the latest SAM?
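
Independently of how SAM implements it, the "1 month from expiry" check comes down to comparing the certificate's `notAfter` date with the current time. A minimal Python sketch, assuming the `notAfter` string is already available in the format Python's `ssl.SSLSocket.getpeercert()` returns; the function names and the 30-day default are illustrative, and fetching the certificate from the server is not shown.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Whole days between 'now' and the certificate's notAfter time.

    'now' must be timezone-aware (UTC); not_after looks like
    'Jan  5 09:34:43 2018 GMT'.
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days


def expires_within(not_after: str, now: datetime, days: int = 30) -> bool:
    """True when the certificate expires in 'days' days or fewer."""
    return days_until_expiry(not_after, now) <= days


if __name__ == "__main__":
    print(days_until_expiry("Jan  5 09:34:43 2018 GMT", datetime.now(timezone.utc)))
```

A negative result means the certificate has already expired, which the same threshold check also flags.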

Thanks in advance

Paul

Windows Update Monitoring - Days passed from last Windows Update randomly going up and down


Hi,

 

I have a problem with the "Days passed from last Windows Update" monitor in the Windows Update Monitoring template. It randomly goes down, one server at a time, affecting between 2 and 25 servers. I have several hundred other monitors that all work fine; only this one monitor is showing this behavior.

 

When I go to component status, I get the following error:

 

PowerShell script error. Scripting Error: Script does not contain the expected parameters or is improperly formatted. 'Statistic' missing.
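
The error text says the script output contained no `Statistic` line, which suggests that on the affected servers the script sometimes finishes without printing one, for example on an unhandled error path. That cause is a guess, not a confirmed diagnosis. As a troubleshooting aid, the sketch below checks captured script output for a statistic line. It is written in Python, `has_statistic_line` is a hypothetical helper, and the sample output strings are invented.

```python
import re

# Matches lines such as 'Statistic: 12' or 'Statistic.Next_Hop: 1'.
_STATISTIC_RE = re.compile(
    r"^\s*Statistic(\.\w+)?\s*:\s*-?\d+(\.\d+)?\s*$", re.MULTILINE
)


def has_statistic_line(script_output: str) -> bool:
    """True when the output has at least one numeric Statistic line."""
    return _STATISTIC_RE.search(script_output) is not None


if __name__ == "__main__":
    good = "Message: 12 days since last update\nStatistic: 12"
    bad = "Message: could not read update history"
    print(has_statistic_line(good), has_statistic_line(bad))
```

Running the template's script by hand on one of the affected servers and feeding its output through a check like this would show whether the statistic line is missing or merely malformed.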

 

Can you tell me how to troubleshoot and resolve the issue?

 

Thanks,

 

Mike

