Ribbon load balancer with WebClient differs from the RestTemplate one (not properly balanced)

I built a simple POC, and with the default configuration everything works exactly the same with WebClient and RestTemplate.

Rest server code:

@SpringBootApplication
internal class RestServerApplication

fun main(args: Array<String>) {
    runApplication<RestServerApplication>(*args)
}

class BeansInitializer : ApplicationContextInitializer<GenericApplicationContext> {
    override fun initialize(context: GenericApplicationContext) {
        serverBeans().initialize(context)
    }
}

fun serverBeans() = beans {
    bean("serverRoutes") {
        PingRoutes(ref()).router()
    }
    bean<PingHandler>()
}

internal class PingRoutes(private val pingHandler: PingHandler) {
    fun router() = router {
        GET("/api/ping", pingHandler::ping)
    }
}

class PingHandler(private val env: Environment) {
    fun ping(serverRequest: ServerRequest): Mono<ServerResponse> {
        return Mono
            .fromCallable {
                // sleep added to simulate some work
                Thread.sleep(2000)
            }
            .subscribeOn(elastic())
            .flatMap {
                ServerResponse.ok()
                    .syncBody("pong-${env["HOSTNAME"]}-${env["server.port"]}")
            }
    }
}

In application.yaml add:

context.initializer.classes: com.lbpoc.server.BeansInitializer

Dependencies in gradle:

implementation('org.springframework.boot:spring-boot-starter-webflux')

Rest client code:

@SpringBootApplication
internal class RestClientApplication {
    @Bean
    @LoadBalanced
    fun webClientBuilder(): WebClient.Builder {
        return WebClient.builder()
    }

    @Bean
    @LoadBalanced
    fun restTemplate() = RestTemplateBuilder().build()
}

fun main(args: Array<String>) {
    runApplication<RestClientApplication>(*args)
}

class BeansInitializer : ApplicationContextInitializer<GenericApplicationContext> {
    override fun initialize(context: GenericApplicationContext) {
        clientBeans().initialize(context)
    }
}

fun clientBeans() = beans {
    bean("clientRoutes") {
        PingRoutes(ref()).router()
    }
    bean<PingHandlerWithWebClient>()
    bean<PingHandlerWithRestTemplate>()
}

internal class PingRoutes(private val pingHandlerWithWebClient: PingHandlerWithWebClient) {
    fun router() = org.springframework.web.reactive.function.server.router {
        GET("/api/ping", pingHandlerWithWebClient::ping)
    }
}

class PingHandlerWithWebClient(private val webClientBuilder: WebClient.Builder) {
    fun ping(serverRequest: ServerRequest) = webClientBuilder.build()
        .get()
        .uri("http://rest-server-poc/api/ping")
        .retrieve()
        .bodyToMono(String::class.java)
        .onErrorReturn(TimeoutException::class.java, "Read/write timeout")
        .flatMap {
            ServerResponse.ok().syncBody(it)
        }
}

class PingHandlerWithRestTemplate(private val restTemplate: RestTemplate) {
    fun ping(serverRequest: ServerRequest) = Mono.fromCallable {
        restTemplate.getForEntity("http://rest-server-poc/api/ping", String::class.java)
    }.flatMap {
        ServerResponse.ok().syncBody(it.body!!)
    }
}

In application.yaml add:

context.initializer.classes: com.lbpoc.client.BeansInitializer
spring:
  application:
    name: rest-client-poc-for-load-balancing
logging:
  level.org.springframework.cloud: DEBUG
  level.com.netflix.loadbalancer: DEBUG
rest-server-poc:
  ribbon:
    listOfServers: localhost:8081,localhost:8082

Dependencies in gradle:

implementation('org.springframework.boot:spring-boot-starter-webflux')
implementation('org.springframework.cloud:spring-cloud-starter-netflix-ribbon')

You can try it with two or more server instances; load balancing works exactly the same with WebClient and RestTemplate.
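One way to check the distribution is to tally the response bodies, since each server instance answers with its own pong-<hostname>-<port> string. A minimal sketch in plain Java follows; the class name PingTally and the sample bodies are hypothetical, and collecting the bodies (for example by calling the client's /api/ping in a loop) is left out.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper (not part of the POC): counts responses per server instance.
final class PingTally {

    // Each server instance returns one distinct body, so counting identical
    // bodies gives the number of requests each instance handled.
    static Map<String, Integer> tally(List<String> bodies) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String body : bodies) {
            counts.merge(body, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample bodies shaped like the POC server's responses; "host" stands in for $HOSTNAME.
        List<String> bodies = List.of(
                "pong-host-8081", "pong-host-8082",
                "pong-host-8081", "pong-host-8082");
        System.out.println(tally(bodies));
    }
}
```

With balanced traffic the counts per instance should be roughly equal for both the WebClient and the RestTemplate handler.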

By default Ribbon uses ZoneAwareLoadBalancer, and if you have only one zone, all server instances are registered in the "unknown" zone.

You might have a problem with WebClient keeping connections open. WebClient reuses the same connection for multiple requests; RestTemplate does not. If there is some kind of proxy between your client and server, this connection reuse might be the problem. To verify it, you can modify the WebClient bean like this and run your tests:

@Bean
@LoadBalanced
fun webClientBuilder(): WebClient.Builder {
    return WebClient.builder()
        .clientConnector(ReactorClientHttpConnector { options ->
            options
                .compression(true)
                .afterNettyContextInit { ctx ->
                    ctx.markPersistent(false)
                }
        })
}

Of course this is not a good solution for production, but it lets you check whether the problem is in the configuration of your client application or outside it, somewhere between your client and server. E.g. if you are using Kubernetes and register your services in service discovery using the server node IP address, then every call to such a service goes through the kube-proxy load balancer and is routed (round robin by default) to some pod for that service.


You have to change the Ribbon configuration to modify the load balancing behavior (please read below).

By default (as you have found yourself) the ZoneAwareLoadBalancer is used. The source code for ZoneAwareLoadBalancer says the following
(some of the mechanics described could result in the RPS pattern you see):

The key metric used to measure the zone condition is Average Active Requests, which is aggregated per rest client per zone. It is the total outstanding requests in a zone divided by number of available targeted instances (excluding circuit breaker tripped instances). This metric is very effective when timeout occurs slowly on a bad zone.

The LoadBalancer will calculate and examine zone stats of all available zones. If the Average Active Requests for any zone has reached a configured threshold, this zone will be dropped from the active server list. In case more than one zone has reached the threshold, the zone with the most active requests per server will be dropped. Once the worst zone is dropped, a zone will be chosen among the rest with the probability proportional to its number of instances.
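The mechanics quoted above can be illustrated with a simplified sketch in plain Java. This is not Ribbon's actual code: the class ZoneChoiceSketch, the Zone snapshot type, the threshold value and the rule of always keeping at least one zone are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Simplified illustration of the quoted zone logic; NOT Ribbon's implementation.
final class ZoneChoiceSketch {

    // Hypothetical per-zone snapshot: name, available instances, outstanding requests.
    static final class Zone {
        final String name;
        final int instances;
        final int activeRequests;

        Zone(String name, int instances, int activeRequests) {
            if (instances <= 0) {
                throw new IllegalArgumentException("a zone needs at least one instance");
            }
            this.name = name;
            this.instances = instances;
            this.activeRequests = activeRequests;
        }

        // Average Active Requests: outstanding requests divided by available instances.
        double averageActiveRequests() {
            return (double) activeRequests / instances;
        }
    }

    // If any zone has reached the threshold, drop the one with the most active
    // requests per server. Assumption of this sketch: never drop the last zone.
    static List<Zone> dropWorstZone(List<Zone> zones, double threshold) {
        Zone worst = null;
        for (Zone z : zones) {
            boolean overThreshold = z.averageActiveRequests() >= threshold;
            if (overThreshold
                    && (worst == null || z.averageActiveRequests() > worst.averageActiveRequests())) {
                worst = z;
            }
        }
        List<Zone> remaining = new ArrayList<>(zones);
        if (worst != null && remaining.size() > 1) {
            remaining.remove(worst);
        }
        return remaining;
    }

    // Pick a zone with probability proportional to its number of instances.
    static Zone chooseZone(List<Zone> zones, Random random) {
        int total = 0;
        for (Zone z : zones) {
            total += z.instances;
        }
        int ticket = random.nextInt(total);
        for (Zone z : zones) {
            ticket -= z.instances;
            if (ticket < 0) {
                return z;
            }
        }
        throw new IllegalStateException("unreachable: ticket is always below the instance total");
    }

    public static void main(String[] args) {
        List<Zone> zones = List.of(
                new Zone("a", 2, 10),  // 5.0 active requests per instance
                new Zone("b", 3, 3),   // 1.0
                new Zone("c", 1, 1));  // 1.0
        List<Zone> remaining = dropWorstZone(zones, 2.0); // 2.0 is an arbitrary example threshold
        Zone chosen = chooseZone(remaining, new Random());
        System.out.println("remaining zones: " + remaining.size() + ", chosen: " + chosen.name);
    }
}
```

In this sketch a zone that is briefly overloaded stops receiving traffic entirely until its average drops, which is the kind of behavior that could produce uneven per-instance RPS.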

If your traffic is being served by one zone (perhaps the same box?), you might run into some additional confusing situations.

Please also note that after the change the average RPS without LoadBalancerExchangeFilterFunction is the same as with it (on the graph all lines converge to the median line), so overall both load balancing strategies consume the same amount of available bandwidth, just in a different manner.

To modify your Ribbon client settings, try the following:

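import com.netflix.client.config.IClientConfig;
import com.netflix.loadbalancer.AvailabilityFilteringRule;
import com.netflix.loadbalancer.IPing;
import com.netflix.loadbalancer.IRule;
import com.netflix.loadbalancer.PingUrl;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Bean;

public class RibbonConfig {

  @Autowired
  IClientConfig ribbonClientConfig;

  @Bean
  public IPing ribbonPing(IClientConfig config) {
    return new PingUrl(); // default is a NoOpPing
  }

  @Bean
  public IRule ribbonRule(IClientConfig config) {
    return new AvailabilityFilteringRule(); // overrides the default ZoneAvoidanceRule
  }

}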

Then don't forget to register this configuration for your Ribbon client (the name is the ID of the client it applies to):

@SpringBootApplication
@RibbonClient(name = "app", configuration = RibbonConfig.class)
public class App {
  //...
}

Hope this helps!