What are the benefits of Hystrix over normal exception handling?

I think you are asking whether we could just implement the whole circuit breaker logic ourselves. You could. But why is it better to use something already proven, like Hystrix? I would say:

  1. The circuit breaker logic is already battle-tested.
  2. Metrics out of the box, such as the Hystrix dashboard.
  3. It defines a pattern for dealing with cascading failures across your interconnected services. That is, if one service goes down, you have already thought about what to do to keep serving requests from your own service.
  4. It helps developers change the way they think when writing code against external dependencies (design for failure), simply by making them ask "what if it fails?" Developers often don't do that; they assume the call will work.

I don't think there is any magic in Hystrix. It addresses a simple problem that developers don't usually think about.


Hystrix is used to stop cascading failures. Here is an example of what I mean. Let's pretend you have 3 components: 1) Frontend, 2) Backend A and 3) Backend B. Frontend talks to Backend A, and Backend A asks Backend B to do some sort of lookup.

The Frontend receives 50k requests per second, which means 50k requests per second go to Backend A and another 50k go to Backend B. If Backend B becomes unhealthy, every one of those requests holds a socket open between Backend A and Backend B, and another between the Frontend and Backend A, until it times out. All the servers involved in the transaction start to hang because the sockets are being held open. The sockets fill up really fast: at 50k a second with a 20 second timeout, that's 1 million open sockets between each pair of servers! Backend B timing out means requests to Backend A time out, which means requests to the Frontend time out too.

Hystrix (or the idea of circuit breaking) pretty much introduces a switch: when a server becomes unhealthy, there is a way to deal with the errors, such as stopping all further requests and giving a predefined response instantly. The sockets close straight away and no cascading failure occurs. This results in increased resilience and better fault tolerance.
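To make the "switch" idea concrete, here is a minimal sketch of a circuit breaker in plain Java. It is not Hystrix's implementation: the class name `SimpleCircuitBreaker`, the consecutive-failure threshold, and the injectable clock are all assumptions made for illustration (Hystrix itself decides based on an error rate over a rolling window).

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Illustrative circuit breaker (hypothetical class, not part of Hystrix). */
class SimpleCircuitBreaker {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int failureThreshold;   // consecutive failures before the circuit opens
    private final long sleepWindowMillis; // how long calls are rejected once it is open
    private final LongSupplier clock;     // time source in milliseconds, injectable for tests

    private State state = State.CLOSED;
    private int consecutiveFailures = 0;
    private long openedAt = 0L;

    SimpleCircuitBreaker(int failureThreshold, long sleepWindowMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.sleepWindowMillis = sleepWindowMillis;
        this.clock = clock;
    }

    synchronized State state() {
        return state;
    }

    /**
     * Runs primary unless the circuit is open; while open, the fallback is
     * returned immediately. Synchronized for simplicity, which serializes calls.
     */
    synchronized <T> T call(Supplier<T> primary, Supplier<T> fallback) {
        if (state == State.OPEN) {
            if (clock.getAsLong() - openedAt < sleepWindowMillis) {
                return fallback.get();   // fail fast: the dependency is not touched
            }
            state = State.HALF_OPEN;     // sleep window elapsed: allow one trial call
        }
        try {
            T result = primary.get();
            consecutiveFailures = 0;     // success closes the circuit again
            state = State.CLOSED;
            return result;
        } catch (RuntimeException e) {
            consecutiveFailures++;
            if (state == State.HALF_OPEN || consecutiveFailures >= failureThreshold) {
                state = State.OPEN;      // stop calling the unhealthy dependency
                openedAt = clock.getAsLong();
            }
            return fallback.get();
        }
    }
}
```

With something like `new SimpleCircuitBreaker(5, 5000, System::currentTimeMillis)`, the sixth call after five straight failures never reaches Backend B; the caller gets the predefined response at once, so no socket is held open waiting for a timeout.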


As you said, the call can simply be wrapped in a try-catch block, so why choose Hystrix or a similar library? From my experience:

  • It is an already well-tested library.
  • Ability to skip the originally intended call and go straight to the fallback. Note that if you only wrap the call in try-catch, there will still be an attempt to connect and send the command, which will eventually time out because the dependency is degraded. Knowing that the dependency is unhealthy before making the call lets you skip calls for some time (as configured) and save those resources.
  • Circuit breaking is based on a sliding time window.
  • Metrics and dashboards out of the box, which help you peek into your system and its dependent connections.
  • Implements the bulkhead pattern by using separate thread pools.
  • Lower maintenance cost.
  • Health check support. It provides a health check class that plugs into health monitoring APIs.
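The bulkhead point in the list above can be sketched with the JDK alone. This is only an illustration of the idea, assuming one small bounded thread pool per dependency; the `Bulkhead` class and its parameters are hypothetical and are not Hystrix's API.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Illustrative bulkhead (hypothetical class): one bounded thread pool per dependency. */
class Bulkhead {
    private final ExecutorService pool;
    private final long timeoutMillis;

    Bulkhead(int maxConcurrent, long timeoutMillis) {
        // SynchronousQueue has no capacity, so once maxConcurrent threads are busy,
        // further submissions are rejected instead of piling up in a queue.
        this.pool = new ThreadPoolExecutor(maxConcurrent, maxConcurrent,
                0L, TimeUnit.MILLISECONDS, new SynchronousQueue<>());
        this.timeoutMillis = timeoutMillis;
    }

    /** Runs the task on this dependency's own pool; returns the fallback on rejection, timeout, or error. */
    <T> T call(Callable<T> task, T fallback) {
        Future<T> future;
        try {
            future = pool.submit(task);
        } catch (RejectedExecutionException e) {
            return fallback;             // pool saturated: shed load immediately
        }
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true);         // interrupt the stuck worker thread
            return fallback;
        } catch (ExecutionException e) {
            return fallback;             // the task itself threw an exception
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return fallback;
        }
    }

    void shutdown() {
        pool.shutdownNow();
    }
}
```

Giving each downstream dependency its own `Bulkhead` instance means a hung Backend B can exhaust only its own handful of threads. Calls to other dependencies keep running on their own pools, which a plain try-catch around a blocking call cannot guarantee.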

The main difference is that Hystrix opens the circuit (an analogy to electrical circuits) when failures exceed a configured threshold and does not invoke the downstream service until some time has elapsed. This behaviour prevents an avalanche of cascading errors. It is like a smart traffic light that turns red and doesn't let you pass because it knows you would have an accident a little further on. After a configurable time, Hystrix lets a trial request through and, if it succeeds, closes the circuit again. You can see 'Circuit opened / closed' on the Hystrix dashboard:

[Image: Hystrix dashboard showing the circuit open/closed status]

Chris Richardson also explains it well in "Pattern: Circuit Breaker".