**This post is more about an issue I saw with the setup in one of our environments than about implementation.** So here is what happened.
Issue: Accessing the URL (let's say http://www.example.com) lands on the static maintenance page hosted in S3, even though none of the nodes are offline.
In general, we have CloudFront configured with a static website in S3 as its origin, used for failover. For the primary, we have a public ELB in front of the web nodes, which reach the app nodes through a private ELB, and the app nodes connect to an RDS DB backend.
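As a rough mental model of the failover described above (and of the Route 53 records and health check covered in the checks below), here is a small sketch. It is a simplification, not Route 53's actual implementation: the class, the target names, and the threshold value of 3 are all assumptions for illustration, and it ignores that results from several health checker regions are aggregated.

```python
from dataclasses import dataclass


@dataclass
class HealthCheck:
    """Toy health check whose status flips only after N consecutive
    results that disagree with the current status."""
    failure_threshold: int = 3   # assumed value, not taken from the post
    healthy: bool = True
    _streak: int = 0             # consecutive results disagreeing with `healthy`

    def record(self, passed: bool) -> None:
        if passed == self.healthy:
            self._streak = 0     # result agrees with current status
            return
        self._streak += 1
        if self._streak >= self.failure_threshold:
            self.healthy = passed
            self._streak = 0


def failover_target(check: HealthCheck) -> str:
    """Primary record while the check is healthy, secondary otherwise."""
    # Both names are hypothetical labels for the two ALIAS targets.
    return "primary-elb" if check.healthy else "secondary-cloudfront"
```

With this model, two failed probes leave traffic on the primary; the third flips it to the maintenance page, which is the behaviour we were seeing from the browser.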
- Checked the “Route 53” configuration. We have a health check configured for the site (URL: https://www.example.com, HTTPS, Request interval, Failure threshold, Health checker regions) and, as expected, it is failing. That is why the site goes to the maintenance page when accessed from a browser.
- More on “Route 53”. Checking the hosted zone for the resource records: www.example.com is an ALIAS to the dualstack ELB, with Failover set to Primary and the health check attached. The Failover Secondary record is an ALIAS to CloudFront.
- ELB: available in two availability zones, listeners 80:80 and 443:80, cross-zone load balancing enabled. The important bit here is that we have enabled TCP port checks on port 80, and that check stayed up throughout the issue.
- No issues were seen on the instances behind the public ELB.
- We have an internal load balancer managing the app nodes in three availability zones, with TCP port checks on port 80 configured, and no issues there either.
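So in summary, all the port checks were OK: the ports were up, but the actual site was down. A TCP check only confirms that something accepts connections on the port, not that the application can answer a request.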
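The gap between "port is up" and "site is up" can be reproduced locally. The following is a minimal sketch, not our actual setup: it starts a throwaway HTTP server that always answers 503 (a stand-in for a web node with a broken backend) and runs both kinds of check against it. All function and class names are made up for the example.

```python
import http.client
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def tcp_port_check(host: str, port: int, timeout: float = 2.0) -> bool:
    """Pass if a TCP connection can be opened, like a TCP health check."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def http_status(host: str, port: int, path: str = "/", timeout: float = 2.0) -> int:
    """Return the status code of a HEAD request, similar to `curl -I`."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        return conn.getresponse().status
    finally:
        conn.close()


class Always503(BaseHTTPRequestHandler):
    """Stand-in for a web node that is listening but cannot serve."""

    def do_HEAD(self):
        self.send_response(503)
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the demo quiet


def demo() -> tuple:
    server = HTTPServer(("127.0.0.1", 0), Always503)  # port 0: pick a free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return tcp_port_check("127.0.0.1", port), http_status("127.0.0.1", port)
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` gives a passing TCP check alongside a 503 status, which is the same combination we had: the load balancers were happy while the site was not.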
Logging into one of the web nodes and running curl:
# curl -I http://localhost
This returned a 503 status, when it should have returned 200 OK.
So in the end we found that the app nodes were not getting the required information from the DB, so they could not return a result to the web nodes.
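The way we got there was by walking the stack from the edge inwards. A small sketch of that idea, under the assumption that each tier can be probed with some yes/no check: run every check in order and report the failing tier closest to the backend as the most likely culprit. This is a heuristic, and the tier names and stubbed results below are hypothetical, not captured from the incident.

```python
from typing import Callable, List, Optional, Tuple

Check = Callable[[], bool]


def deepest_failing_tier(tiers: List[Tuple[str, Check]]) -> Optional[str]:
    """Run every check, ordered from the edge to the backend.

    Returns the name of the failing tier closest to the backend,
    or None if every check passes.
    """
    culprit = None
    for name, check in tiers:
        try:
            ok = check()
        except Exception:
            ok = False  # a check that raises counts as a failure
        if not ok:
            culprit = name  # later (deeper) failures overwrite earlier ones
    return culprit


# Hypothetical stubs: in practice each lambda would be a real probe,
# e.g. an HTTP request to the tier or a trivial query against the DB.
incident = [
    ("web", lambda: False),
    ("app", lambda: False),
    ("db", lambda: False),
]
```

Here `deepest_failing_tier(incident)` points at `"db"`, while a list in which only the web check fails would point at `"web"`.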