How to make an HTTP call reach all instances behind an AWS load balancer?

One way I'd solve this problem is by:

  1. writing the data to an AWS S3 bucket
  2. triggering an AWS Lambda function automatically from the S3 write
  3. using the AWS SDK from the Lambda function to identify the instances attached to the ELB, e.g. using boto3 from Python or the AWS Java SDK
  4. calling /refresh on each individual instance from the Lambda (see the sketch after this list)
  5. ensuring that when a new instance is created (due to autoscaling or a deployment), it fetches the data from the S3 bucket during startup
  6. ensuring that the private subnets the instances are in allow traffic from the subnets attached to the Lambda
  7. ensuring that the security groups attached to the instances allow traffic from the security group attached to the Lambda
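Here is a minimal sketch of what such a Lambda function could look like in Python with boto3, assuming a classic ELB. The load balancer name, the application port, and the /refresh path are placeholders you'd replace with your own values, and the Lambda must be attached to the VPC with network access to the instances (steps 6 and 7).

```python
import boto3
import urllib.request

ELB_NAME = "my-load-balancer"   # hypothetical ELB name
REFRESH_PORT = 8080             # hypothetical application port

elb = boto3.client("elb")
ec2 = boto3.client("ec2")

def lambda_handler(event, context):
    # Find the instance IDs currently registered with the ELB.
    lb = elb.describe_load_balancers(LoadBalancerNames=[ELB_NAME])
    instance_ids = [i["InstanceId"]
                    for i in lb["LoadBalancerDescriptions"][0]["Instances"]]
    if not instance_ids:
        return

    # Resolve their private IP addresses.
    reservations = ec2.describe_instances(InstanceIds=instance_ids)["Reservations"]
    private_ips = [inst["PrivateIpAddress"]
                   for r in reservations
                   for inst in r["Instances"]]

    # Call /refresh on every instance individually, bypassing the ELB.
    for ip in private_ips:
        url = f"http://{ip}:{REFRESH_PORT}/refresh"
        with urllib.request.urlopen(url, timeout=10) as resp:
            print(f"{url} -> {resp.status}")
```

If you're using an Application Load Balancer instead of a classic ELB, the same idea applies but you'd list targets via the elbv2 client's describe_target_health call on the target group.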

The key wins of this solution are:

  • the process is fully automated from the instant the data is written to S3
  • it avoids data inconsistency due to autoscaling/deployment
  • it is simple to maintain (you don't have to hardcode instance IP addresses anywhere)
  • you don't have to expose instances outside the VPC
  • it is highly available (AWS ensures the Lambda is invoked on the S3 write; you don't need to worry about running a script on an instance and keeping that instance up and running)

Hope this is useful.


While this may not be possible given the constraints of your application and circumstances, it's worth noting that best-practice application architecture for instances running behind an AWS ELB (particularly if they are part of an AutoScalingGroup) is to ensure that the instances are not stateful.

The idea is to make it so that you can scale out by adding new instances, or scale in by removing instances, without compromising data integrity or performance.

One option would be to change the application to store the results of the reference data reload in an off-instance data store, such as a cache or database (e.g. ElastiCache or RDS), instead of in memory.

If the application were able to do that, you would only need to hit the refresh endpoint on a single server: it would reload the reference data, do whatever analysis and manipulation is required to store it efficiently in a fit-for-purpose form, write it to the data store, and all instances would then have access to the refreshed data via the shared store.
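As a rough illustration of that pattern, here is a sketch of a small Flask app where /refresh rebuilds the reference data and writes it to an ElastiCache Redis node, while normal requests read it back, so every instance serves the same copy. The Redis endpoint, the cache key, and the load_reference_data() helper are hypothetical placeholders, not part of your application.

```python
import json

import redis
from flask import Flask

app = Flask(__name__)
# Hypothetical ElastiCache Redis endpoint shared by all instances.
cache = redis.Redis(host="my-cache.abc123.use1.cache.amazonaws.com", port=6379)

REFERENCE_KEY = "reference-data"   # hypothetical cache key

def load_reference_data():
    # Placeholder for the expensive reload/analysis step described above.
    return {"version": 42, "rates": {"EUR": 1.08, "GBP": 1.27}}

@app.route("/refresh", methods=["POST"])
def refresh():
    # Only one instance needs to receive this call; the result is shared.
    data = load_reference_data()
    cache.set(REFERENCE_KEY, json.dumps(data))
    return "refreshed", 200

@app.route("/lookup")
def lookup():
    # Every instance reads the same refreshed copy from the shared store.
    data = json.loads(cache.get(REFERENCE_KEY))
    return data, 200
```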

While adding a round-trip to a data store does increase latency, it is often well worth it for the consistency of the application. Under your current model, if one server lags behind the others in refreshing the reference data and the ELB is not using sticky sessions, requests via the ELB will return inconsistent data depending on which server they are routed to.