OK, so I've got a Linux server running a boatload of different processes (approx. 1200-1300). When I try to poll even one process (I tried with command-line filtering both on and off to see if it made a difference), the net-snmp process puts a serious load on the server's CPU (10-15%) and stays there indefinitely. I'm assuming the CPU load is that high because Orion APM is scanning the process table (HOST-RESOURCES-MIB::hrSWRunTable) for the particular process and has to scan through 1200-1300 entries.
That raises the question: which specific OIDs does Orion hit to check for a process? I tried excluding certain OIDs in snmpd.conf and narrowed it down to five: 1.3.6.1.2.1.25.4.2.1.1, 1.3.6.1.2.1.25.4.2.1.2, 1.3.6.1.2.1.25.4.2.1.3, 1.3.6.1.2.1.25.4.2.1.4, and 1.3.6.1.2.1.25.4.2.1.5. Excluding any of those five OIDs in snmpd.conf breaks the SNMP process monitor.
The problem is that I need to monitor about 30-40 processes on this server, and in doing that the net-snmp process eats up a ton of CPU and APM times out several times a day trying to poll. Am I out of luck due to the large number of processes running on the server, or is there a way around this?
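For anyone who wants to sanity-check my assumption, here's a rough Python sketch of the cost I have in mind. The column names are the standard hrSWRunEntry names from HOST-RESOURCES-MIB for those five OIDs. The idea that APM walks all five columns in full on every poll is only my guess, and `varbinds_per_poll` is a made-up helper for illustration, not anything from Orion or net-snmp.

```python
# The five OIDs that the process monitor seems to need, mapped to their
# standard HOST-RESOURCES-MIB column names under hrSWRunEntry.
HR_SW_RUN_COLUMNS = {
    "1.3.6.1.2.1.25.4.2.1.1": "hrSWRunIndex",
    "1.3.6.1.2.1.25.4.2.1.2": "hrSWRunName",
    "1.3.6.1.2.1.25.4.2.1.3": "hrSWRunID",
    "1.3.6.1.2.1.25.4.2.1.4": "hrSWRunPath",
    "1.3.6.1.2.1.25.4.2.1.5": "hrSWRunParameters",
}


def varbinds_per_poll(process_count, columns=HR_SW_RUN_COLUMNS):
    """Estimate how many values the agent returns for one poll.

    ASSUMPTION (not confirmed): the poller walks every listed column
    across every row of the process table, so the cost grows with the
    total number of running processes, not the number being monitored.
    """
    return process_count * len(columns)


if __name__ == "__main__":
    for count in (1200, 1300):
        print(count, "processes ->", varbinds_per_poll(count), "values per poll")
```

If that guess is right, the agent has to hand back several thousand values on every poll, no matter how few processes I actually care about.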
Here's some background info on the target server and my Orion setup:
Target server -
HP ProLiant BL680c G5 blade (in an HP c7000 enclosure)
4 x quad-core Intel Xeon MP, 2400 MHz
64 GB RAM
OS - Red Hat Enterprise Linux AS rel 4 (Nahant Upd 7)
Net-SNMP 5.1.2-13.el4_7.3
Orion servers -
NPM/APM server
Dell PowerEdge 2950
2 x dual-core Intel Xeon 5150, 2.66 GHz
16 GB RAM
OS - Windows Server 2003 Enterprise R2 SP2
Orion NPM SL2000 ver 10.1.2 SP1
Orion APM ALX ver 4.0.2 SP2
Orion DB server
Dell PowerEdge R710
2 x quad-core Intel Xeon E5620, 2.40 GHz
96 GB RAM
DB is on 6 x 146 GB 15k RPM 6 Gbps disks in a RAID 10 setup
OS - Windows Server 2008 R2 Enterprise 64-bit
MS SQL Server 2008 R2 64-bit Standard ed.
I've opened support case #239934, but I'm hoping somebody else has run into this problem and has some suggestions.
Thanks!