HTTP 503 error repeatedly causing website to crash on RHEL 5.10
Hello,
We have a ColdFusion 9 website on a server running RHEL 5.10 and Apache 2.2.3. The website accesses a PostgreSQL 8.4 database from just a few of its pages; the majority of the pages are static. We never had any trouble until last Tuesday, when the website suddenly started crashing and returning 503 errors. Restarting the ColdFusion and Apache services would bring it back up, but it would crash again in 1-3 hours.
At first we thought it was a ColdFusion issue, but there were no errors in any of the CF logs, while the httpd access_log showed a lot of 503 errors. We noticed that the majority of the 503 errors were on the 2 pages of the website that do the most database access. Discussion with the server hosting company suggested our Apache was not able to handle the number of requests coming in. This is a very busy time of year for our site, so those pages would be getting more hits than normal.
We did get some suggestions for tuning our ColdFusion server from Adobe support, but making those changes did not help. We then looked into how to tune Apache and came upon a suggestion to increase MaxClients.
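For anyone wanting to reproduce the per-page 503 count described above, here is a minimal sketch in Python. It assumes the access_log uses Apache's common or combined log format; the function name `count_status_by_path` and the sample log lines (including the page names) are made up for illustration and are not taken from our server.

```python
import re
from collections import Counter

# Matches the request line and status code in a common/combined log entry,
# e.g. ... "GET /page.cfm?x=1 HTTP/1.1" 503 323
LINE_RE = re.compile(r'"(?:[A-Z]+) (\S+)[^"]*" (\d{3}) ')


def count_status_by_path(lines, status="503"):
    """Count log entries with the given status, grouped by URL path."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group(2) == status:
            # Drop the query string so all hits on one page are grouped.
            path = match.group(1).split("?", 1)[0]
            counts[path] += 1
    return counts


# Hypothetical sample entries, only to show the expected input shape.
sample = [
    '10.0.0.1 - - [10/Jun/2014:10:00:01 -0400] "GET /search.cfm?q=a HTTP/1.1" 503 323',
    '10.0.0.2 - - [10/Jun/2014:10:00:02 -0400] "GET /index.html HTTP/1.1" 200 5120',
    '10.0.0.3 - - [10/Jun/2014:10:00:03 -0400] "GET /search.cfm?q=b HTTP/1.1" 503 323',
    '10.0.0.4 - - [10/Jun/2014:10:00:04 -0400] "POST /report.cfm HTTP/1.1" 503 323',
]

for path, hits in count_status_by_path(sample).most_common():
    print(path, hits)
```

On a real server the same function can be fed an open file handle instead of the sample list; on RHEL the log is typically at /var/log/httpd/access_log, though the location depends on the CustomLog directive.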
We had this in our httpd.conf:
StartServers 2
MaxClients 150
MinSpareThreads 25
MaxSpareThreads 75
ThreadsPerChild 25
MaxRequestsPerChild 0
So we changed it to this:
ServerLimit 16
StartServers 2
MaxClients 400
MinSpareThreads 25
MaxSpareThreads 75
ThreadsPerChild 25
MaxRequestsPerChild 0
Then we restarted the httpd service. This did not seem to help, so the next day we changed it again, to this:
ServerLimit 40
StartServers 2
MaxClients 1000
MinSpareThreads 25
MaxSpareThreads 75
ThreadsPerChild 25
MaxRequestsPerChild 0
and again restarted the httpd service. This seemed to reduce the rate of 503 errors somewhat, but the website was still crashing multiple times a day.
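As a sanity check on the three configurations above, the arithmetic behind them can be sketched in Python. This assumes the worker MPM is the one actually in use (these thread directives apply only to it), where MaxClients is the total thread count and is capped by ServerLimit multiplied by ThreadsPerChild. The first configuration does not set ServerLimit, so the sketch assumes the worker default of 16. The helper name `worker_capacity` is hypothetical.

```python
def worker_capacity(server_limit, threads_per_child, max_clients):
    """Return (child processes needed, thread ceiling, settings consistent?).

    Under the worker MPM, MaxClients counts threads, so the number of
    child processes is MaxClients / ThreadsPerChild, and MaxClients may
    not exceed ServerLimit * ThreadsPerChild.
    """
    ceiling = server_limit * threads_per_child
    processes = max_clients // threads_per_child
    consistent = (max_clients % threads_per_child == 0
                  and max_clients <= ceiling)
    return processes, ceiling, consistent


# (label, ServerLimit, ThreadsPerChild, MaxClients)
configs = [
    ("original", 16, 25, 150),   # ServerLimit not set; assumed default of 16
    ("first change", 16, 25, 400),
    ("second change", 40, 25, 1000),
]

for label, server_limit, threads_per_child, max_clients in configs:
    processes, ceiling, ok = worker_capacity(
        server_limit, threads_per_child, max_clients)
    print(f"{label}: {processes} processes, ceiling {ceiling}, consistent={ok}")
```

All three sets are internally consistent under that assumption, so the directives themselves are not contradicting each other. The sketch says nothing about whether the server has the memory or the backend capacity to serve that many simultaneous requests.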
Finally we took the 2 pages that were getting the most traffic (the ones that do the most database access) and moved them to another server, and replaced the scripts on this server with redirects to the other server. This resolved the issue for the time being; the website stopped crashing and we've had no more 503 errors. However, we'd like to be able to tune Apache to handle the load so we can keep those pages on the server where the website resides. Does anyone have any other httpd.conf settings to suggest? (BTW, we are planning to upgrade the server to RHEL 6.5, CF 10, and Postgres 9.2 but have to wait until September for funding.)
Many thanks,
Julie
Responses