Polling time frequently higher than 5 minutes

crwood1
Posts: 4
Joined: Thu Jun 19, 2014 10:19 am

Polling time frequently higher than 5 minutes

Post by crwood1 » Thu Jun 19, 2014 10:33 am

Hello,

I'm having an issue where the Polling time graphs show that the Denika polling time is frequently taking longer than 5 minutes. It started to occur on Friday, June 13th and got worse over several days. This seems to be causing gaps in the bandwidth utilization graphs. I've tried rebooting the server and have removed 200 reports that were no longer required in Denika, but neither has done anything to alleviate the situation. Can someone please provide me with further troubleshooting steps?

Thanks.

seanh
Posts: 20
Joined: Thu Oct 17, 2013 8:50 am
Location: Kennebunk, ME

Re: Polling time frequently higher than 5 minutes

Post by seanh » Tue Jun 24, 2014 8:19 am

Hello Crwood1,
I know you had a brief conversation with Jake and we tried changing the SNMP retries and timeouts.

Another way we can work on the timeout is to let Denika use more system resources via the Reports/Poll File setting.

Could you open Menu -> Control Center -> Configuration Editor? Once it is open, click General under Denika. There you will find an option for Reports/Poll File.
From the settings description:

“By default, the polling engine automatically places 1000 reports per polling file. Providing a number here allows you to adjust the number of reports per file. Leaving it blank defaults to 1000 reports. Changing this is not recommended, unless directed by Plixer.”

A good rule of thumb is ~1 poll file per CPU thread. So if you have 4,000 reports and 8 CPU threads, that is 500 reports per poll file. We can try 500-600 here to utilize more resources.
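The arithmetic behind that rule of thumb can be sketched in a few lines of Python. This is only an illustration of the division described above; the function name `reports_per_poll_file` is made up for this example and is not part of Denika.

```python
import math


def reports_per_poll_file(total_reports: int, cpu_threads: int) -> int:
    """Rule of thumb from this post: roughly one poll file per CPU thread.

    Hypothetical helper for illustration only; Denika does not ship it.
    Rounds up so that the number of poll files never exceeds cpu_threads.
    """
    if total_reports <= 0 or cpu_threads <= 0:
        raise ValueError("total_reports and cpu_threads must be positive")
    return math.ceil(total_reports / cpu_threads)


# The example from the post: 4,000 reports spread over 8 CPU threads.
print(reports_per_poll_file(4000, 8))
```

The result is only a starting point for the Reports/Poll File setting. On a VM, the thread count to plug in would presumably be what is actually assigned to the guest rather than what the physical host has.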

How much CPU and RAM do you have? Another area to look at is disk I/O: are you running this in RAID 10? We have customers who run Dell R720s with 40 cores, 64 GB RAM, and 15k RAID 10 SAS drives, polling 25,000 Denika reports in 20 seconds.

Thanks,
Sean

crwood1
Posts: 4
Joined: Thu Jun 19, 2014 10:19 am

Re: Polling time frequently higher than 5 minutes

Post by crwood1 » Tue Jun 24, 2014 9:19 am

Hi Sean,

The server is a VM running on a Xeon X5650 2.67GHz processor. According to the specs on Intel's website, it's a 6-core processor with 12 threads. We are also running this server with 4GB of RAM. There are three HDDs set up on the VM: one for the OS, one for paging (5GB), and one for applications (Denika), which is 50GB.

We currently have about 4500 Reports. I recently removed about 200 Reports to try and clean it up.

Thanks.

Chris.

pauld
Posts: 222
Joined: Mon Jan 04, 2010 10:05 am
Location: Kennebunk, Maine

Re: Polling time frequently higher than 5 minutes

Post by pauld » Tue Jun 24, 2014 9:39 am

Hi Chris,

If Denika has been working well with polling ~4,500 reports for a while, I think we need to investigate what has changed. What was the date that Denika started to have problems?

You mentioned that Denika is running on a VM. Are the CPU specs you listed dedicated to Denika or shared with a number of other VMs? Were any additional VMs added to the host that Denika is running on that day? We see this all the time, where a VM no longer has the resources it needs because the host is overprovisioned.

Was antivirus added to the Denika VM? If so, please exclude the PlixerTools directory from active scanning.

Were additional devices discovered that pushed the Denika VM over the edge?

Did you see any improvement when changing the reports per poll file setting to 500-600?

On the Denika homepage in the bottom right, there's a graph that shows the amount of time a poll cycle is taking in Denika. Please post the daily, weekly, monthly, yearly graphs for poll time so we can get an idea of how the server has been performing historically.

Thanks,
Paul

crwood1
Posts: 4
Joined: Thu Jun 19, 2014 10:19 am

Re: Polling time frequently higher than 5 minutes

Post by crwood1 » Tue Jun 24, 2014 9:55 am

Hi Sean,

This issue started to occur on Friday, June 13th (go figure), but got worse over the weekend.

The VMs do share resources, and I know new VMs are definitely being added.

I haven't changed the reports per file yet. You gave an example of 500 per file with 8 CPU threads. Should I try 750 per file if the processor is supposed to be a 6 core? Or do you want me to just try 500-600?

Thanks.

Chris.

[Attached image: poll time graph]

pauld
Posts: 222
Joined: Mon Jan 04, 2010 10:05 am
Location: Kennebunk, Maine

Re: Polling time frequently higher than 5 minutes

Post by pauld » Tue Jun 24, 2014 10:31 am

Hi Chris,

Looking at the last 30 days poll time graph, there was something that changed during week 24 that spiked poll times significantly. It appears that Denika is no longer able to get access to all of the resources it needs.

Tweaking the number of reports per poll file may help some, but I don't think it's going to resolve this issue. You mentioned the host had a 6 core processor, but how many CPUs have been assigned to the Denika VM?

I need you to start investigating the changes that took place on the day of the spike. Here are some places to start investigating:

- Were any additional VMs added to the VM host that Denika is running on?

- Were the resources of any of the VMs increased that day?

- Are the data stores on the VM host local and dedicated or part of a NAS/SAN? If they're part of a NAS/SAN, how many applications/servers are using the NAS? Did some of the VMs using the NAS change, so that the NAS is now overprovisioned?

- Does the Denika VM have high disk queue lengths and write times?

- Was antivirus added to the Denika VM? If so, please exclude the PlixerTools directory from active scanning.

- Were a lot of additional devices/reports added in Denika?

Thanks,
Paul

crwood1
Posts: 4
Joined: Thu Jun 19, 2014 10:19 am

Re: Polling time frequently higher than 5 minutes

Post by crwood1 » Wed Jul 02, 2014 10:10 am

Hi Paul,

- Were any additional VMs added to the VM host that Denika is running on?
I just found out that our servers team has been working to migrate servers off the cluster that our Denika VM sits on to a new system. They've removed 25 so far, so that should answer several of the questions.

- Are the data stores on the VM host local and dedicated or part of a NAS/SAN? If they're part of a NAS/SAN, how many applications/servers are using the NAS? Did some of the VMs using the NAS change, so that the NAS is now overprovisioned?
Part of a SAN.

- Does the Denika VM have high disk queue lengths and write times?
I'll try to find this out.

- Was antivirus added to the Denika VM? If so, please exclude the PlixerTools directory from active scanning.
No.

pauld
Posts: 222
Joined: Mon Jan 04, 2010 10:05 am
Location: Kennebunk, Maine

Re: Polling time frequently higher than 5 minutes

Post by pauld » Wed Jul 02, 2014 10:41 am

Thanks for the update, Chris.

Since VMs were moved off of that host, this sounds like less of a CPU or memory issue, so I would recommend investigating disk I/O bottlenecks.

One way you can accomplish this is to migrate the Denika VM to a dedicated local datastore (off of the SAN) to rule out the SAN as a problem. We regularly see inconsistent disk performance when writing to a busy NAS/SAN.
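Before migrating anything, a rough way to see whether writes to the current datastore are slow or inconsistent is to time a handful of small synced writes from inside the VM. The sketch below uses only the Python standard library. It is not a Denika or Plixer tool, and the function names, sample count, and write size are arbitrary choices for illustration. A large spread between the median and the maximum would be consistent with contention on shared storage, but it does not prove it.

```python
import os
import statistics
import tempfile
import time


def sample_write_latency(directory, samples=20, size_bytes=1024 * 1024):
    """Time `samples` synced writes of `size_bytes` each into `directory`.

    Hypothetical helper for illustration. Returns latencies in seconds.
    """
    payload = os.urandom(size_bytes)
    latencies = []
    for _ in range(samples):
        fd, path = tempfile.mkstemp(dir=directory)
        try:
            start = time.perf_counter()
            with os.fdopen(fd, "wb") as handle:
                handle.write(payload)
                handle.flush()
                os.fsync(handle.fileno())  # push the data to the disk layer
            latencies.append(time.perf_counter() - start)
        finally:
            os.remove(path)  # clean up the temporary file
    return latencies


def summarize(latencies):
    """Return the min, median, and max latency in seconds."""
    return {
        "min": min(latencies),
        "median": statistics.median(latencies),
        "max": max(latencies),
    }


if __name__ == "__main__":
    # Assumption: replace the temp directory with a folder on the drive
    # that holds the Denika data to test that datastore specifically.
    print(summarize(sample_write_latency(tempfile.gettempdir())))
```

Running it a few times during a slow poll cycle and again after moving the VM to local storage would give a crude before/after comparison, alongside the disk queue length counters mentioned earlier.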

Let me know what you find.

Thanks,
Paul
