Channel: THWACK: All Content - Server & Application Monitor


SAM Oracle template - table space, autoextend and file system space?

So our DBAs like the SAM Oracle templates and we've had a look at the components that calculate the table space free and used percentages. However, my DBAs inform me that most of the table spaces they set up have "autoextend" enabled and so really for the table space % free/used components to be of any use, they also need to take into account the free space on the file system where the data files for the table space sit (paraphrasing here, so excuse me if I've not worded that quite correctly!).

 

Questions then:

1. How can we make the Oracle table space components take autoextend and file system space into account in the calculation? Has anyone done this yet?

2. Yes, we already track the free space on file systems with NPM on the Oracle servers, but that's a separate component. Is there any way to combine the two?

 

B
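One way to picture the calculation the DBAs describe is sketched below. It is only an illustration under stated assumptions, not something the SAM Oracle template does: the per-file figures are assumed to come from Oracle's `DBA_DATA_FILES` view (`BYTES`, `MAXBYTES`, `AUTOEXTENSIBLE`), the free space inside the tablespace from `DBA_FREE_SPACE`, and all data files are assumed to sit on a single file system. The class and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class DataFile:
    bytes_allocated: int   # DBA_DATA_FILES.BYTES
    max_bytes: int         # DBA_DATA_FILES.MAXBYTES
    autoextensible: bool   # DBA_DATA_FILES.AUTOEXTENSIBLE = 'YES'


def effective_pct_free(files: Iterable[DataFile],
                       free_bytes_in_tablespace: int,
                       fs_free_bytes: int) -> float:
    """Percent free once autoextend headroom, capped by file system space, is counted."""
    files = list(files)
    allocated = sum(f.bytes_allocated for f in files)
    # How far the autoextend files are allowed to grow.
    headroom = sum(max(f.max_bytes - f.bytes_allocated, 0)
                   for f in files if f.autoextensible)
    # They can only grow as far as the file system has room (single file system assumed).
    growth = min(headroom, fs_free_bytes)
    total = allocated + growth
    if total == 0:
        return 0.0
    return 100.0 * (free_bytes_in_tablespace + growth) / total
```

With one 10 GB file that may autoextend to 30 GB, 1 GB free inside the tablespace and 5 GB free on the file system, this reports 40% free instead of the 10% that the allocated size alone would give.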

NPM and SAM to monitor hardware sensors on Oracle linux

hi all

 

Has anyone managed to pull back hardware info on an HP server running Oracle Linux? All we can see is CPU and RAM.

 

cheers

dan

MYSQL Replication Monitoring

Hi All,

 

Is there any way we can monitor MySQL master/slave database replication through SolarWinds?

 

Thanks

VBS Polling

Hello,

 

Is there any way to poll for information using a VB script? I would like to run a script that I have to find Microsoft Office keys on our machines. Is there any way to incorporate code into, say, a custom property or a chart?

All help is greatly appreciated.

 

Thanks,

Isaac

"Script file creation failed. The created file is corrupted."

While attempting to test a short Perl script in SAM 6.0, I'm getting "Script file creation failed. The created file is corrupted."

 

What does this mean? How do I troubleshoot this?

 

The script runs fine on the target by itself. The target is OSX Server 10.6.8. A very simple script runs fine from SAM as well, but once I try to port a script that already works stand-alone, I get the above error message. I tried googling for it, but nothing comes up.

 

When I watch a /tmp folder on the target, I do see a temp file being created, then disappearing quickly.

 

Thanks!

Disk alert for amount of space remaining, 10gb and 2gb?

I just posted a feature request about this because, as it stands today, this doesn't seem possible. The threshold value has to be entered in bytes, and there's a hard limit equivalent to 2 GB in bytes, so I can't create this alert. I need to start alerting on disks when they have less than 10 GB free, and then again at less than 5 GB free for critical. I could find nothing on how to do this without a lot of complexity. With disks being several TB in today's world, it seems wrong that bytes are still used for the monitoring value.
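For reference, the byte values involved can be worked out as below. This is only arithmetic: the assumption (not confirmed anywhere in this post) is that the 2 GB ceiling corresponds to the largest signed 32-bit integer, and the helper names are made up for illustration.

```python
GIB = 1024 ** 3            # bytes in one binary gigabyte
INT32_MAX = 2 ** 31 - 1    # assumed ceiling: largest signed 32-bit value


def gb_to_bytes(gigabytes: int) -> int:
    """Convert a threshold given in GB to the byte value the alert expects."""
    return gigabytes * GIB


def fits_in_int32(value: int) -> bool:
    """True if the byte value stays under the assumed 32-bit ceiling."""
    return value <= INT32_MAX


for label, gb in (("warning", 10), ("critical", 5)):
    value = gb_to_bytes(gb)
    print(f"{label}: {gb} GB = {value} bytes, fits in int32: {fits_in_int32(value)}")
```

Both thresholds (10737418240 and 5368709120 bytes) are larger than 2147483647, which matches the behaviour described above if the field really is a 32-bit value.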

LDAP Connection Monitor, varying return values

So we've deployed the LDAP Connection Monitor in SAM to about 100 or so DCs. All have LDAP working and they're in a whole bunch of locations globally. However, the results we're seeing are pretty varied. On one poll, the service will show as up and responds fine with a statistic value of 3 (indicating the LDAP version I believe). Then on the next poll, it'll often show as down - mostly with the error message below:

 

The return code is different than expected. Testing on node '10.1.2.3' failed with 'Down' status ('Down' might be different if script exits with a different exit code). Can not connect to LDAP Server at 10.1.2.3. (-2147016646). Error Code: -2147016646. Error Message: The server is not operational.

 

 

Yet on the following poll, it'll work again. Eg:

 

orion-ldap.jpg

 

This variability in results is undermining the AD team's confidence in Orion's ability to monitor LDAP correctly. So the question is: is the error code indicative of network connectivity issues, LDAP connection issues or an Orion issue? And how do we fix it?
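As a starting point for reading the error code quoted above, the sketch below splits the signed value into the standard Windows HRESULT fields. The function name is hypothetical; the bit layout (severity in bit 31, facility in bits 16 to 26, code in the low 16 bits) is the documented HRESULT format.

```python
def decode_hresult(value: int) -> dict:
    """Split a signed 32-bit HRESULT into its hex form, facility and code."""
    unsigned = value & 0xFFFFFFFF          # reinterpret the signed value as unsigned
    return {
        "hex": f"0x{unsigned:08X}",
        "failure": bool(unsigned >> 31),    # severity bit
        "facility": (unsigned >> 16) & 0x7FF,
        "code": unsigned & 0xFFFF,
    }


print(decode_hresult(-2147016646))
```

The value decodes to 0x8007203A: facility 7 (Win32) with code 8250, which is the Win32 error `ERROR_DS_SERVER_DOWN`, whose message text is "The server is not operational." That says the client could not reach or bind to the directory server on that attempt; on its own it does not say whether the cause was the network, the domain controller or the poller.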


File server share on a drive

We have a file server that will have an S: drive (which will be monitored), but the question came up: can SolarWinds monitor a user share within that drive for size and alert on it when it reaches a certain threshold?

AppInsight for SQL - SQL Agent Job Component

I am working with the AppInsight for SQL template and trying to adjust component thresholds to reflect our event management requirements. One item that is particularly frustrating is the SQL Agent Job component. No matter what I supply for the warning or critical state, it does not appear to follow the same rules as other components in its effect on the overall instance state.

 

For example, Buffer Cache Hit Ratio. If you leave the warning and critical thresholds empty (not configured) that component will always show green/up status regardless of current value collected. This in turn rolls up into the overall instance state contributing green/up status to the pool of other component monitors.

 

In the case of SQL Agent Job Info, however, if you leave the warning and critical threshold values empty, or set a very high number for both, a single failed job still puts the component into a critical state, and the instance is then marked critical as well.

 

Is there a way around this issue, minus disabling the SQL Agent Job component on the AppInsight template?

Monitoring Active Directory without Domain Admin account?

I was curious if anyone has been successful in using another account that is not a Domain Administrator to monitor their Domain Controllers with a service account. We have a policy, which is out of my hands to change, that domain administrator account passwords must be changed every 60 days - thus causing a lot of breakage in my environment when someone forgets to change it in Solarwinds.

 

 

Anyone have any luck with this that can offer some guidance?

Template export, modification and re-import - how to do it right?

Hi all,


We're using SAM to monitor SQL using the standard SQL DB template. We're creating a template per instance because of course each component in the template needs to point at the instance name that we want to monitor. So then we apply that template to the server where the instance exists. That's fine and we can copy the templates using the GUI and rename etc. However, we need to do this on an enterprise scale - doing it through the GUI isn't going to cut it. So I've tried to export the SQL template, rename the file and modify the template <Name></Name> field in the XML to match the new name of the file. I've also updated all the <SQLInstanceName> in each of the components to the appropriate instance name for this template. (Actually I've got a cool Powershell script to do this on a large scale for lots of server/instance pairs from a CSV file).

 

Problem is when I try to import the newly made templates again, Orion throws an error :

 

ApplicationTemplateImportFromXmlStream failed, check fault information

 

Digging into the error further I see:

 

Time: 05/30/2014 10:50:17.2351

Server: Microsoft-IIS/7.5

Pipeline: Integrated

User Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/6.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; .NET4.0C; .NET4.0E)

Error Instance: 440b5ba79ef140c4a4d78c4a63ded28f

User: *MYUSER*

URL: *URL REMOVED*

Referrer: *OTHER URL REMOVED*

Message: ApplicationTemplateImportFromXmlStream failed, check fault information.

ErrorSite: SolarWinds.APM.Common.BusinessLayerFactory.BusinessLayerExceptionHandler

ErrorType: SolarWinds.APM.Common.ApmBusinessLayerException

Stack:

at SolarWinds.APM.Common.BusinessLayerFactory.BusinessLayerExceptionHandler(Exception ex)

at SolarWinds.APM.Common.APMBusinessLayerProxy.ApplicationTemplateImportFromXmlStreamEx(Stream data)

at Orion_APM_Admin_ImportApplicationTemplate.submitButton_Click(Object sender, EventArgs e)

at System.Web.UI.WebControls.LinkButton.OnClick(EventArgs e)

at System.Web.UI.WebControls.LinkButton.RaisePostBackEvent(String eventArgument)

at System.Web.UI.WebControls.LinkButton.System.Web.UI.IPostBackEventHandler.RaisePostBackEvent(String eventArgument)

at System.Web.UI.Page.RaisePostBackEvent(IPostBackEventHandler sourceControl, String eventArgument)

at System.Web.UI.Page.RaisePostBackEvent(NameValueCollection postData)

at System.Web.UI.Page.ProcessRequestMain(Boolean includeStagesBeforeAsyncPoint, Boolean includeStagesAfterAsyncPoint)

 

 

Do I need to make some other modifications to the edited template before I can upload it? I see a template ID in the XML, but don't know if that's relevant. Has anyone ever done this? Maybe I need to log a support call?
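The rename step described in this post can be sketched roughly as follows. This is a minimal illustration, not the poster's PowerShell script: only the `<Name>` and `<SQLInstanceName>` element names come from the post, while the surrounding layout in the sample XML, the assumption that `<Name>` is a direct child of the root, and the function name are hypothetical. Real exported templates may use XML namespaces or other fields (such as the template ID mentioned above) that this does not handle, and it does not address the import error itself.

```python
import xml.etree.ElementTree as ET


def retarget_template(xml_text: str, new_name: str, instance_name: str) -> str:
    """Return a copy of the template XML with a new name and SQL instance."""
    root = ET.fromstring(xml_text)
    # Assumption: the template's own <Name> is a direct child of the root element.
    name_element = root.find("Name")
    if name_element is None:
        raise ValueError("no top-level <Name> element found")
    name_element.text = new_name
    # Point every component at the new instance.
    for element in root.iter("SQLInstanceName"):
        element.text = instance_name
    return ET.tostring(root, encoding="unicode")


# Hypothetical, heavily simplified template used only to exercise the function.
SAMPLE = (
    "<ApplicationTemplate><Name>SQL - OLD</Name><Components>"
    "<Component><SQLInstanceName>OLD</SQLInstanceName></Component>"
    "<Component><SQLInstanceName>OLD</SQLInstanceName></Component>"
    "</Components></ApplicationTemplate>"
)

print(retarget_template(SAMPLE, "SQL - PROD01", "PROD01"))
```

Parsing and re-serialising the document, rather than doing a text search and replace, at least guarantees the output is well-formed XML, which rules out one possible cause of a failed import.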

Users connected to Server and Services per User

Hey guys, I need some input on this. I'm a newbie, so please bear with me.

I need to create a template that shows how many users are connected to our terminal services servers and which applications they are using, but I can't find anything in SolarWinds on how to do it.

1. Can it be done?

2. Can someone provide me with some guidance?

 

thanks in advance..

Assign SAM template to groups?

(or "Is SAM teasing me *again*?!?")

 

On the SAM settings page, at the very bottom, there is a tantalizing sentence:

"Application monitors may be added to groups."

 

Yet, clicking the link "Manage Groups" takes you to the good old "manage groups" screen.

 

So what is up here - Can you, in fact, assign a template to a Group (not just a group, but a Group), or is this just another case of SAM playing with my emotions?

 

- Leon

SAM 6.1.1 Upgrade Broke UI

All services are started and all looks well, can't get around this one though....any ideas?

 

There was an error communicating with the Orion server

There was no endpoint listening at net.tcp://jarjarbinks:17777/orion/core/businesslayer that could accept the message. This is often caused by an incorrect address or SOAP action. See InnerException, if present, for more details.

 

Help me fix this
  1. Confirm that the SolarWinds Information Service is running. 
    • On your local SolarWinds server, click Start > Administrative Tools > Services.
    • Right-click SolarWinds Information Service V3, and then click Start.
  2. Confirm that the SolarWinds Orion Module Engine service is running. 
    • On your local SolarWinds server, click Start > Administrative Tools > Services.
    • Right-click SolarWinds Orion Module Engine, and then click Start.
  3. Examine the specific details of the error for environmental hints to help resolve your issue.
  4. Refresh the web console browser.
  5. If still unresolved, click the troubleshooting help links below.
  6. After completing any changes, repeat the same task that generated this error.

Universal Disk Free Space Monitoring (One Template Will Handle All Logical Disks + Exceptions And Overrides)

Dear All Thwack Members,


It is my pleasure to present this template to help you monitor free disk space across all your Windows servers. As a matter of fact, 80% of all our incidents are disk space related, and I hope this will help you handle that part in a simple and very flexible way.


What's in the tin?

  • Monitor all your disks on all servers (requires just 1 SAM license per server)
  • Set global threshold based on Free MBytes AND Free % levels. This works perfectly well for both large (TBytes) disks and small (MBytes) disks
  • Differentiate between warning and critical level
  • Set overrides on a per disk per server level (very granular approach)
  • Exclude disks you do not want monitored (either completely, or by setting up overrides)
  • Track usage as a graph
  • Have it on your dashboard as a green/yellow/red blob (which is not possible out-of-the-box with SAM volumes)


Screenshots:


Component

001.JPG

Multi stat chart

002.JPG

Script Arguments:

003.JPG

Global Statistic:

004.JPG

Individual Disk Statistic for global overrides (example for "J" drive)

005.JPG

Dashboard:

006.JPG

007.JPG


Benefits:

  • If you are short on licenses, this will help you monitor all disks with just 1 license per server
  • The biggest benefit for me (as my SAM is unlimited and the first point is not really applicable) is that I can have it as a group item on my dashboard. Very handy. Screenshot above
  • The ability to compare against both MBytes and percentage levels makes this a largely self-managing template. It just works for every single disk. I only have very few exceptions configured across 700+ volumes
  • When you add a new disk (quite common in the virtual world), monitoring is automatically enabled for it, without the need to discover the new disk in SAM (which I often forget)


Additional info:


User description notes (copy-paste from template, for those craving more info):

---------------------------------

This smart script will monitor free space on all fixed local disks on a Windows server.

 

* The global threshold, which is applied to all disks on a server by default, is based on free percentage AND free bytes (for example, a particular disk has to have less than 5% free AND less than 20GB free to fire a warning alert). This works perfectly well for large disks measured in TBytes and small disks measured in MBytes

* You can exclude a particular disk on a particular server from being monitored by the global threshold and set a custom threshold for this disk instead

* You can differentiate between critical/warning threshold levels.

>>> For the global threshold - this is achieved by incrementing the "Statistic" counter in 100s for any disk that breaches the critical threshold and in 1s for any disk that breaches the warning threshold. Note that a disk at critical level is also in breach of the warning level, so it contributes 101 on its own. For example: value 101 indicates that 1 disk has fallen into critical level; value 102 indicates 1 disk critical and 1 further disk at warning level only.

>>> For the custom threshold - you simply set values in the SAM template for the warning and critical levels of a particular disk
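The global threshold rule and the "Statistic" encoding described above can be sketched as follows. This is an illustration of the described logic only, not the template's actual script. The default numbers (critical below 2% and 5000 MB, warning below 5% and 20000 MB) are assumptions loosely based on the examples in this description, and all function names are hypothetical.

```python
def breaches(free_pct: float, free_mb: float, pct_limit: float, mb_limit: float) -> bool:
    """A disk breaches a level only if BOTH free % and free MB are below the limits."""
    return free_pct < pct_limit and free_mb < mb_limit


def global_statistic(disks, crit_pct=2, crit_mb=5000, warn_pct=5, warn_mb=20000) -> int:
    """Add 100 per disk in critical breach and 1 per disk in warning breach."""
    statistic = 0
    for free_pct, free_mb in disks:
        if breaches(free_pct, free_mb, crit_pct, crit_mb):
            statistic += 100
        if breaches(free_pct, free_mb, warn_pct, warn_mb):
            statistic += 1
    return statistic


def decode_statistic(statistic: int):
    """Return (critical disks, warning-only disks); assumes fewer than 100 disks."""
    critical = statistic // 100
    warning_breaches = statistic % 100
    return critical, warning_breaches - critical


# (free %, free MB) per disk: one critical, one warning-only, one healthy.
disks = [(1.0, 1000), (4.0, 15000), (50.0, 500000)]
stat = global_statistic(disks)
print(stat, decode_statistic(stat))
```

Requiring both conditions is what lets one rule cover very large and very small disks: a multi-TB volume with 3% free still has plenty of megabytes left and stays quiet, while a small disk with 10 GB free is not flagged until its percentage also drops.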

 

How to exclude disk / configure custom threshold:

When you need to override the global threshold for a particular disk, first exclude this disk from being monitored by the global threshold, and then configure a separate threshold for it, based on the free MBytes value, in the SAM template below.

For example: disk F needs to trigger an alert when it drops below 150GB. In this case you would override script arguments with "${IP} 2 5000 5 20000 A,B,F" and you would set critical threshold to "less than 150000" as a SAM threshold value. You can also set SAM warning threshold as 200000 to be notified in advance

 

Limitations (all with very limited impact on usability and flexibility):

- You can only define custom thresholds for the first 8 disks (C,D,E,F,G,H,I,J). Note that the global threshold will still apply to all logical fixed disks, regardless of how many there are (unless you exclude any of them as explained above). In practice this is a minor limitation: most servers will not have that many disks, and even if they do, you will rarely need to define custom thresholds for them, so the probability of this limiting your monitoring is very low

- When you exclude a particular disk, you can only set an "MBytes" threshold value (no percentage threshold here). This is also a minor issue, because by the time you exclude a particular disk you already know its exact size and can work out yourself at what level in MBytes you want an alert to come through.

---------------------------------


Enjoy, comment, like, rate


To Your Monitoring Success,

Alex


Hardware fault Email not sending correct data

Hi,

 

We're looking to make better use of the APM Hardware Monitoring and use the "Alert me when a component goes into warning or critical" however out of the box the default alert keeps sending out the same Device Details in the table.

 

The alert correctly shows the part that's failed but the Model Number (${APM_HardwareAlertData.Model}) and Serial Number (${APM_HardwareAlertData.ServiceTag}) are always for the same incorrect server no matter what node creates the alert.

 

Is there a way to correct this? The alert Property to Monitor is APM: Hardware Sensor and the Condition used is APM Hardware Sensor Status.

 

Many thanks

Jason

We are trying to find out if we can do end point monitoring solution for REST with SAM.

The metrics we need are:     

1. end point transaction response time     

2. average number of transactions per "interval" (we should be able to get metric information per minute/hour/day/month/etc.)

3. SQL transaction response time.     

4. ability to monitor different states of requests as well as the number of requests in the Node queue

5. http header information for the request without authentication information     

6. Ability to monitor live call stack (stack trace) for a running request   

7. Ability to monitor CPU and memory utilization per request.

 

This will be on a Windows platform using IIS and Node.js

Java Application Server monitoring via SNMP
