Tuesday, November 13, 2007

HTTPERR Connections_Refused

Symptom

An IIS 6.0 website stops responding to requests.  Attempting to connect to the website's port with telnet fails to establish a connection.
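The telnet check can be approximated in a script. This is only a minimal sketch: the function name `can_connect` and the default timeout are my own choices for illustration, and a connection that is accepted and then dropped straight away may still be reported as a success.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Roughly what "telnet host port" tells you: either the connection
    is established, or it is refused or times out.
    """
    try:
        # create_connection raises OSError (refused, timed out,
        # unreachable, name not resolved) when the connection fails.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `can_connect("your-web-server", 80)` (host name and port are placeholders) from another machine gives a quick yes/no answer before you start digging through logs.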

Regardless of whether you start at the "top" and work your way down the application stack, or start at the "bottom" and work your way up, you will eventually find the HTTPERR log files for http.sys, and they will lead you to the answer: search Google for "Connections_Refused" and the number one ranked result is David Wang's HOWTO: Diagnose IIS6 Failing to Accept Connections Due to CONNECTIONS_REFUSED.  The hint to start at the "bottom" rather than the "top" is that there is no application-level response at all: no HTTP 500, or anything else for that matter.

If this is the problem you're experiencing, stop now, read the article, and see if that's the end of your journey.  What follows is just my account of corroborating the information found in the article.

As I could never make a connection to the web service, I looked in the HTTPERR logs.  Sure enough, there were the telltale "Connections_Refused" entries, which led me directly to David's article.

Sample HTTPERR logs

[..snip..]

2007-11-13 20:01:44 - - - - - - - - - 3_Connections_Refused -
2007-11-13 20:01:49 - - - - - - - - - 4_Connections_Refused -
2007-11-13 20:01:54 - - - - - - - - - 3_Connections_Refused -
2007-11-13 20:01:59 - - - - - - - - - 4_Connections_Refused -
2007-11-13 20:02:04 - - - - - - - - - 3_Connections_Refused -
2007-11-13 20:02:09 - - - - - - - - - 4_Connections_Refused -

[..snip..]
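On a busy server it can help to pull these entries out of the HTTPERR log with a script. The sketch below is only an illustration: the function name `refused_entries` is made up, and it assumes that the number in front of `_Connections_Refused` is a count of refused connections for that log entry.

```python
import re

# Matches reason fields such as "3_Connections_Refused".
REFUSED = re.compile(r"^(\d+)_Connections_Refused$", re.IGNORECASE)


def refused_entries(lines):
    """Yield (timestamp, count) for each Connections_Refused entry."""
    for line in lines:
        fields = line.split()
        # Skip blank lines and "#" header lines.
        if len(fields) < 3 or line.startswith("#"):
            continue
        # The first two fields are date and time; scan the rest for the reason.
        for field in fields[2:]:
            match = REFUSED.match(field)
            if match:
                yield f"{fields[0]} {fields[1]}", int(match.group(1))
                break


sample = [
    "2007-11-13 20:01:44 - - - - - - - - - 3_Connections_Refused -",
    "2007-11-13 20:01:49 - - - - - - - - - 4_Connections_Refused -",
]

for timestamp, count in refused_entries(sample):
    print(timestamp, count)
```

To run it against a real log, pass an open file object instead of `sample`; any line without a matching reason field is simply ignored.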

 

Since I had never actually run out of non-paged pool memory before, I continued to follow the steps in the article to validate the diagnosis (I'm curious that way).  The first thing I noticed was that the server's non-paged pool usage was a little more than 109MB, as seen below.

 

 

Normally, 256MB of non-paged pool memory is available on an x86 Windows Server 2003 machine; however, we were running with the /3GB switch enabled in boot.ini, which halves the non-paged pool to 128MB.  There was no requirement for the server to run with the /3GB switch, so we'll remove it at the earliest opportunity, but why the high memory utilization?

Running poolmon -b showed the kernel memory allocations (paged and non-paged pool) sorted by bytes allocated, largest first.  It turned out that a single driver had allocated over 69MB of non-paged pool memory all to itself, as shown below.

 

Http.sys stops accepting new connections when available non-paged pool memory falls below 20MB.  Here, 128MB - 109MB = 19MB of available non-paged pool memory, if I read that correctly, which translates to "Connections_Refused".
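The arithmetic is simple enough to spell out. The sketch below uses only the figures from this post (the 256MB and 128MB limits, 109MB in use, the 20MB threshold); the function name `pool_headroom` is hypothetical and the check is my simplified reading of the behavior, not http.sys's actual logic.

```python
def pool_headroom(limit_mb: int, used_mb: int, threshold_mb: int = 20):
    """Return (available_mb, refusing) for a non-paged pool.

    refusing is True when available memory is below the threshold at
    which http.sys is described as refusing new connections.
    """
    available_mb = limit_mb - used_mb
    return available_mb, available_mb < threshold_mb


# With /3GB enabled: 128MB limit, 109MB in use.
print(pool_headroom(128, 109))  # (19, True)

# Without /3GB: 256MB limit, same 109MB in use.
print(pool_headroom(256, 109))  # (147, False)
```

By the same numbers, removing the /3GB switch alone would have left plenty of headroom, although the driver's allocation would still need fixing.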

This driver belonged to our virus scanning software, so I punted to our infrastructure group with a "What's up with this?" email containing basically the contents of this post.  They passed it to the vendor, who confirmed that yes, it is a problem, and yes, it's already been fixed.  As a matter of fact, the fix had been delivered to us the day before; we were just waiting for a regularly scheduled change window to apply it.


3 comments:

  1. Microsoft has a solution for your problem
    http://support.microsoft.com/kb/934878

    I had the same problem and it is fixed now.

  2. This was a great solution to my problem, which I could not figure out before this.

    I tried everything from restarting IIS and amending the Metabase.xml file to looking at DNS, and everything in between, and nothing worked until I tried the solution from Ruud above.

  3. This was the temporary solution to my problem; my website had gone down and Exchange web access was not working.

    I tried to restart IIS, tried to fix the Metabase XML file, tried to look at my DNS, tried to see if something was moved, and it was not until I looked at the httperr file that I drilled down into this solution.

    I went through all possible scenarios and was close to the point of desperation, but this solution opened my understanding that Windows is full of little secrets.

    Let's hope that I can figure out which driver or program is leaking on the server.

    Thank you Zach for the post and Ruud for the follow up...
