As mentioned here:
MIN: This will be the minimum number of instances that can run in your auto scale group. If your scale down CloudWatch alarm is triggered, your auto scale group will never terminate instances below this number.
DESIRED: If you trip a CloudWatch alarm for a scale up event, then it will notify the auto scaler to change its desired to a specified higher amount and the auto scaler will start one or more instances to meet that number. If you trip a CloudWatch alarm to scale down, then it will change the auto scaler desired to a specified lower number and the auto scaler will terminate one or more instances to get to that number.
MAX: This will be the maximum number of instances that you can run in your auto scale group. If your scale up CloudWatch alarm stays triggered, your auto scale group will never create instances more than the maximum amount specified.
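To make the three definitions concrete, here is a minimal sketch of how a policy-driven change to the desired count stays inside the Min/Max range. The function name `clamp_desired` is hypothetical and this is a simplified model of the behaviour described above, not an AWS API call.

```python
def clamp_desired(requested: int, min_size: int, max_size: int) -> int:
    """Hypothetical helper: keep a requested desired count within [min_size, max_size]."""
    if min_size > max_size:
        raise ValueError("min_size must not exceed max_size")
    return max(min_size, min(requested, max_size))


# With Min=1 and Max=5, requests outside the range are pulled back inside it.
print(clamp_desired(0, 1, 5))  # 1 (never below Min)
print(clamp_desired(3, 1, 5))  # 3 (already in range)
print(clamp_desired(9, 1, 5))  # 5 (never above Max)
```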
Suppose an AutoScalingGroup config has Min=1, Desired=3, Max=5, and there is an alarm set on an AutoScalingPolicy which says "if CPU usage is below 50% for 10 consecutive minutes, remove 1 instance". The group will then keep reducing the instance count by 1 each time the alarm is triggered, until DesiredCount = MinCount.
Lesson learnt: set MinCount to a value greater than 0, or equal to DesiredCount. Otherwise, with MinCount=0, a sustained drop in CPU usage can scale the group all the way down to zero instances and bring the application down.
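The scale-in behaviour can be sketched with a small simulation. This is only an illustration of the arithmetic, assuming each alarm trigger removes one instance and the group never goes below its minimum; `scale_in_history` is a made-up name and nothing here talks to AWS.

```python
def scale_in_history(min_size: int, desired: int, alarm_triggers: int, step: int = 1) -> list:
    """Desired count after each scale-in alarm, never dropping below min_size."""
    history = [desired]
    for _ in range(alarm_triggers):
        # Each alarm removes `step` instances, but the group stops at min_size.
        desired = max(min_size, desired - step)
        history.append(desired)
    return history


# Min=1, Desired=3: the group settles at one instance and stays there.
print(scale_in_history(min_size=1, desired=3, alarm_triggers=4))  # [3, 2, 1, 1, 1]

# Min=0, Desired=3: the same alarm eventually removes every instance.
print(scale_in_history(min_size=0, desired=3, alarm_triggers=4))  # [3, 2, 1, 0, 0]
```

The second call is the situation to avoid: with a minimum of 0, a quiet period is enough to leave nothing running.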
The example below is what made me curious:
The suggestion from AWS was as follows:
We are always working to make our systems more responsive, but it is challenging to provision virtual servers automatically with a response time of a few seconds as your use case appears to require. Perhaps there is a workaround that responds more quickly or that is more resilient when requests begin to increase.
Have you observed whether the site performs better if you use a larger instance type or a larger number of instances in the steady state? That may be one method to be resilient to rapid increases in inbound requests. Although I recognize it may not be the most cost-effective, you may find this to be a quick fix.
Another approach may be to adjust your alarm to use a threshold or a metric that would reflect (or predict) your demand increase sooner. For example, you might see better performance if you set your alarm to add instances after you exceed 75 or 100 users. You may already be doing this. Aside from that, your use case may have another indicator that predicts a demand increase, for example a posting on your Facebook page may precede a significant request increase by several seconds or even a minute. Using CloudWatch custom metrics to monitor that value and then setting an alarm to Auto Scale on it may also be a potential solution.
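As a rough illustration of the threshold suggestion, the sketch below compares how early an alarm would first breach for two thresholds on the same traffic ramp. The user counts are invented sample data and `first_breach_minute` is a hypothetical helper; a real CloudWatch alarm also has a period and a number of evaluation periods, which this ignores.

```python
def first_breach_minute(users_per_minute, threshold):
    """Index of the first sample strictly above threshold, or None if never exceeded."""
    for minute, users in enumerate(users_per_minute):
        if users > threshold:
            return minute
    return None


# Invented ramp of concurrent users, one sample per minute.
ramp = [40, 60, 80, 110, 150, 200]

print(first_breach_minute(ramp, 75))   # 2
print(first_breach_minute(ramp, 100))  # 3
print(first_breach_minute(ramp, 250))  # None
```

On this sample ramp the lower threshold fires one minute earlier, which is the extra lead time the suggestion is after.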