Load Balancing

Let’s build on what we’ve put together so far. We have service instances registered with eureka. The eureka registry is hierarchical: it holds a list of registered applications and, for each application, a list of running instances along with their information (home page URL, and so on). In this lab we’re going to dig deeper into how clients can load-balance across these multiple instances.

The basic eureka API that we’re using, eurekaClient.getNextServerFromEureka(..), has simple round-robin logic: it increments an index (modulo the number of service instances) and returns the service instance at that index.
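To make the round-robin idea concrete, here is a minimal sketch in plain Java. It is not the eureka client’s actual implementation: the RoundRobinChooser class and the two example URLs are assumptions made purely for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch only; this is not the eureka client's implementation.
final class RoundRobinChooser {
  private final AtomicInteger counter = new AtomicInteger();

  // Returns the instance at the current index, then advances the index.
  // floorMod wraps the index around the list and stays non-negative on overflow.
  String next(List<String> instances) {
    if (instances.isEmpty()) {
      throw new IllegalArgumentException("no instances registered");
    }
    int index = Math.floorMod(counter.getAndIncrement(), instances.size());
    return instances.get(index);
  }

  public static void main(String[] args) {
    RoundRobinChooser chooser = new RoundRobinChooser();
    // Example values: two instances of the same application.
    List<String> instances = Arrays.asList("http://localhost:8081/", "http://localhost:8082/");
    for (int i = 0; i < 4; i++) {
      System.out.println(chooser.next(instances));
    }
  }
}
```

Each call hands back the next instance in the list, starting over once the end of the list is reached.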

Netflix offers a project named Ribbon that gives us more control and more options over load balancing. The dependency we added to our project in the last lab, spring-cloud-starter-eureka, already has a transitive dependency on Ribbon, so Ribbon is already on the classpath. It’s just a matter of using its load balancing API.

Gradle and Transitive Dependencies

With gradle, to produce a tree view of the transitive dependencies for a gradle configuration, invoke a command similar to this:

$ gradle dependencies --configuration compile

Another, more targeted command for obtaining insight into how a dependency is pulled into your classpath is dependencyInsight:

$ gradle dependencyInsight --dependency ribbon

Spring Cloud offers the LoadBalancerClient interface (implemented in Spring Cloud Netflix by RibbonLoadBalancerClient), which we can autowire into our Spring application and use in place of the existing EurekaClient. The difference is that it will use Ribbon for the load balancing.

It’s worth noting that Ribbon can be used as a standalone library. When used in this way, we must configure Ribbon with the list of service instances to load-balance across. Here however, we’re making use of a ribbon-to-eureka integration that uses the dynamic list of service instances from the eureka registry to configure Ribbon. So, there’s a compelling synergy here when we use eureka and ribbon in combination.
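The difference between a fixed list and a dynamic one can be sketched in a few lines of plain Java. This is only an illustration of the idea, under the assumption that the balancer asks a supplier for the current instances on every call. DynamicListBalancer and the in-memory "registry" list are hypothetical stand-ins, not Ribbon or eureka classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical types for illustration; these are not Ribbon or eureka classes.
final class DynamicListBalancer {
  private final Supplier<List<String>> serverList; // asked for the current instances on every call
  private final AtomicInteger counter = new AtomicInteger();

  DynamicListBalancer(Supplier<List<String>> serverList) {
    this.serverList = serverList;
  }

  String choose() {
    List<String> servers = serverList.get(); // the instances known right now
    if (servers.isEmpty()) {
      throw new IllegalStateException("no instances available");
    }
    return servers.get(Math.floorMod(counter.getAndIncrement(), servers.size()));
  }

  public static void main(String[] args) {
    List<String> registry = new CopyOnWriteArrayList<>(); // stands in for the eureka registry
    registry.add("localhost:8081");
    DynamicListBalancer balancer = new DynamicListBalancer(() -> new ArrayList<>(registry));

    System.out.println(balancer.choose());
    registry.add("localhost:8082"); // a second instance registers
    System.out.println(balancer.choose());
  }
}
```

Because the list is re-read on each call, an instance that registers later becomes a candidate without any reconfiguration of the balancer.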

1. Using the Ribbon API

⇒ Open the greeting application’s FortuneServiceClient class, and review the code. Your task is to replace the EurekaClient with an autowired instance of a LoadBalancerClient. It’s important to note that:

The LoadBalancerClient interface has a choose() method that one invokes instead of the previous getNextServerFromEureka(), and
This method does not return the eureka-specific InstanceInfo type as before. Instead it returns a Spring-defined interface named ServiceInstance which, in a sense, is a generalization of the same concept that is not specific to eureka.

Here’s a sample diff after the change has been applied:

+++ greeting-app/src/main/java/io/pivotal/training/greeting/FortuneServiceClient.java
@@ -1,10 +1,9 @@
 package io.pivotal.training.greeting;

-import com.netflix.appinfo.InstanceInfo;
-import com.netflix.discovery.EurekaClient;
 import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
 import lombok.extern.slf4j.Slf4j;
+import org.springframework.cloud.client.ServiceInstance;
+import org.springframework.cloud.client.loadbalancer.LoadBalancerClient;
 import org.springframework.stereotype.Component;
 import org.springframework.web.client.RestTemplate;

@@ -14,11 +13,11 @@ import java.util.Map;
 @Slf4j
 public class FortuneServiceClient {
   private RestTemplate restTemplate;
-  private EurekaClient eurekaClient;
+  private LoadBalancerClient loadBalancerClient;

-  public FortuneServiceClient(RestTemplate restTemplate, EurekaClient eurekaClient) {
+  public FortuneServiceClient(RestTemplate restTemplate, LoadBalancerClient loadBalancerClient) {
     this.restTemplate = restTemplate;
-    this.eurekaClient = eurekaClient;
+    this.loadBalancerClient = loadBalancerClient;
   }

   @HystrixCommand(fallbackMethod = "defaultFortune")
@@ -31,8 +30,8 @@ public class FortuneServiceClient {
   }

   private String lookupUrlFor(String appName) {
-    InstanceInfo instanceInfo = eurekaClient.getNextServerFromEureka(appName, false);
-    return instanceInfo.getHomePageUrl();
+    ServiceInstance instance = loadBalancerClient.choose(appName);
+    return String.format("http://%s:%s", instance.getHost(), instance.getPort()); // (1)
   }

   public String defaultFortune() {

(1) Construct a URL from the instance’s host name and port number
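As an aside, the same URL can be assembled with java.net.URI instead of String.format, which validates the parts as it goes. The sketch below is one possible alternative, not the lab’s solution. The InstanceUrls helper is hypothetical, and it assumes plain http, just like the diff above.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Sketch: build the base URL with java.net.URI rather than String.format.
final class InstanceUrls {
  static String baseUrl(String host, int port) {
    try {
      // Arguments: scheme, userInfo, host, port, path, query, fragment
      return new URI("http", null, host, port, "/", null, null).toString();
    } catch (URISyntaxException e) {
      throw new IllegalArgumentException("invalid host or port: " + host + ":" + port, e);
    }
  }

  public static void main(String[] args) {
    System.out.println(baseUrl("localhost", 8081)); // prints http://localhost:8081/
  }
}
```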

This code is a straightforward substitution of one delegate for another. The difference is that we’re now load-balancing with Ribbon. Let’s test it.

2. Testing load balancing behavior across two instances of fortune

Start the eureka server
Start the fortune service
Start a second instance of the fortune service on a different port, like this:
```
$ cd fortune-service
$ SERVER_PORT=8082 gradle bootRun
```

The above example inlines the setting of the environment variable SERVER_PORT before launching the application. It’s rather convenient, and works on macOS, Linux, and Windows machines running bash (e.g. Cygwin). If you’re using a plain Windows command prompt, the same is easily achieved with these commands:

$ cd fortune-service
$ set SERVER_PORT=8082
$ gradle bootRun

At this point, it’s worthwhile verifying that both instances of fortune-service are registered with eureka. Bring up the eureka dashboard at http://localhost:8761/.

The status column’s "UP (2)" means two instances of FORTUNE are up and running. Terrific. We can also visit each instance in a browser and make sure they’re serving up fortunes.

Finally, start up an instance of the greeting application. Now visit the greeting application at http://localhost:8080 several times while watching the log of each fortune service instance: the greeting requests are handled by the two instances in alternating fashion. Here is a "side-by-side" screenshot of the log output from both fortune instances, whose timestamps show each instance being called in turn:

2.1. Spring Cloud Contract Revisited

Re-run the test in FortuneServiceClientTests. It should now fail, because the test still mocks the EurekaClient, which FortuneServiceClient no longer uses.

⇒ Retrofit the test to mock the LoadBalancerClient instead of the EurekaClient. Here’s a diff summarizing the changes that make the test pass once more:

+++ greeting-app/src/test/java/io/pivotal/training/greeting/FortuneServiceClientTests.java
@@ -9,6 +9,8 @@ import org.mockito.Mock;
 import org.springframework.beans.factory.annotation.Autowired;
 import org.springframework.boot.test.context.SpringBootTest;
 import org.springframework.boot.test.mock.mockito.MockBean;
+import org.springframework.cloud.client.ServiceInstance;
+import org.springframework.cloud.client.loadbalancer.LoadBalancerClient;
 import org.springframework.cloud.contract.stubrunner.spring.AutoConfigureStubRunner;
 import org.springframework.test.context.junit4.SpringRunner;

@@ -26,16 +28,17 @@ public class FortuneServiceClientTests {

   @Autowired private FortuneServiceClient fortuneServiceClient;

-  @MockBean EurekaClient eurekaClient;
-  @Mock InstanceInfo instanceInfo;
+  @MockBean LoadBalancerClient loadBalancerClient;
+  @Mock ServiceInstance instance;

   private static final String ExpectedFortune = "a random fortune";

   @Before
   public void setup() {
     initMocks(FortuneServiceClientTests.class);
-    when(instanceInfo.getHomePageUrl()).thenReturn("http://localhost:8081/");
-    when(eurekaClient.getNextServerFromEureka(anyString(), anyBoolean())).thenReturn(instanceInfo);
+    when(instance.getHost()).thenReturn("localhost");
+    when(instance.getPort()).thenReturn(8081);
+    when(loadBalancerClient.choose(anyString())).thenReturn(instance);
   }

   @Test

3. Replacing LoadBalancerClient with a LoadBalanced RestTemplate

Spring Cloud provides an alternative, more elegant mechanism for making load-balanced REST API calls: a cross-cutting annotation named @LoadBalanced, which transparently routes RestTemplate calls through the Ribbon load balancer.

Let’s see how this works by refactoring our existing implementation:

In our application class, GreetingApplication, explicitly annotate our RestTemplate bean with the @LoadBalanced annotation, as shown in this diff:

+++ greeting-app/src/main/java/io/pivotal/training/GreetingApplication.java
@@ -4,6 +4,7 @@ import org.springframework.boot.SpringApplication;
 import org.springframework.boot.autoconfigure.SpringBootApplication;
 import org.springframework.cloud.client.circuitbreaker.EnableCircuitBreaker;
 import org.springframework.cloud.client.discovery.EnableDiscoveryClient;
+import org.springframework.cloud.client.loadbalancer.LoadBalanced;
 import org.springframework.context.annotation.Bean;
 import org.springframework.web.client.RestTemplate;

@@ -17,6 +18,7 @@ public class GreetingApplication {
   }

   @Bean
+  @LoadBalanced
   public RestTemplate restTemplate() {
     return new RestTemplate();
   }

In FortuneServiceClient we effectively undo all of our work, and pretend that our fortune instances are accessible at "http://fortune/". The code now looks very natural, and all of the lookup and load balancing concerns are encapsulated in the RestTemplate call:

+++ greeting-app/src/main/java/io/pivotal/training/greeting/FortuneServiceClient.java
@@ -2,8 +2,6 @@ package io.pivotal.training.greeting;

 import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
 import lombok.extern.slf4j.Slf4j;
-import org.springframework.cloud.client.ServiceInstance;
-import org.springframework.cloud.client.loadbalancer.LoadBalancerClient;
 import org.springframework.stereotype.Component;
 import org.springframework.web.client.RestTemplate;

@@ -13,27 +11,19 @@ import java.util.Map;
 @Slf4j
 public class FortuneServiceClient {
   private RestTemplate restTemplate;
-  private LoadBalancerClient loadBalancerClient;

-  public FortuneServiceClient(RestTemplate restTemplate, LoadBalancerClient loadBalancerClient) {
+  public FortuneServiceClient(RestTemplate restTemplate) {
     this.restTemplate = restTemplate;
-    this.loadBalancerClient = loadBalancerClient;
   }

   @HystrixCommand(fallbackMethod = "defaultFortune")
   public String getFortune() {
-    String baseUrl = lookupUrlFor("FORTUNE");
-    Map<String,String> result = restTemplate.getForObject(baseUrl, Map.class);
+    Map<String,String> result = restTemplate.getForObject("http://fortune/", Map.class);
     String fortune = result.get("fortune");
     log.info("received fortune '{}'", fortune);
     return fortune;
   }

-  private String lookupUrlFor(String appName) {
-    ServiceInstance instance = loadBalancerClient.choose(appName);
-    return String.format("http://%s:%s", instance.getHost(), instance.getPort());
-  }
-
   public String defaultFortune() {
     log.info("Default fortune used.");
     return "Your future is uncertain";

Of course, behind the scenes, Spring intercepts the request, parses the url, and treats its host name ("fortune") as a service id. It uses a LoadBalancerClient to ask Ribbon, which in turn has obtained the "server list" from eureka, to select a service instance. It then substitutes the actual coordinates (host and port) of the selected instance back into the url before making the http request.
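The url substitution just described can be sketched in plain Java. This is not Spring’s code: LogicalUrlResolver is a hypothetical helper, and the chooser function merely stands in for whatever the load balancer would select.

```java
import java.net.InetSocketAddress;
import java.net.URI;
import java.net.URISyntaxException;
import java.util.function.Function;

// Illustrative sketch; Spring's interceptor and Ribbon's classes are not used here.
final class LogicalUrlResolver {
  // Replaces the logical host (the service id) with the host and port of a chosen instance.
  static URI resolve(URI logical, Function<String, InetSocketAddress> chooser) {
    String serviceId = logical.getHost();                  // "fortune" in http://fortune/
    InetSocketAddress instance = chooser.apply(serviceId); // stands in for the load balancer's choice
    try {
      return new URI(logical.getScheme(), logical.getUserInfo(),
          instance.getHostString(), instance.getPort(),
          logical.getPath(), logical.getQuery(), logical.getFragment());
    } catch (URISyntaxException e) {
      throw new IllegalStateException("cannot rebuild " + logical, e);
    }
  }

  public static void main(String[] args) {
    // Hypothetical chooser that always selects the instance on port 8082.
    URI actual = resolve(URI.create("http://fortune/"),
        id -> InetSocketAddress.createUnresolved("localhost", 8082));
    System.out.println(actual);
  }
}
```

The scheme, path, and query of the original url are carried over unchanged. Only the host and port are swapped for those of the selected instance.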

Feel free to re-test that greeting is still functioning properly.

4. Configure Spring Cloud Contract to Stub Service Discovery

It turns out we can configure Spring Cloud Contract to automatically mock eureka. Spring Cloud Contract will arrange for eureka’s answer to "give me the url to the fortune service" to be the url of the WireMock stub it started for us.

To configure this behavior, instead of hard-coding the port our stub should listen on, we supply the stub runner with an "artifact-id to eureka-service-id" mapping. Here’s how this works:

Remove our own hard-coded mocking of eureka:

greeting-app/src/test/java/io/pivotal/training/greeting/FortuneServiceClientTests.java

@@ -1,46 +1,24 @@
 package io.pivotal.training.greeting;

-import com.netflix.appinfo.InstanceInfo;
-import com.netflix.discovery.EurekaClient;
-import org.junit.Before;
 import org.junit.Test;
 import org.junit.runner.RunWith;
-import org.mockito.Mock;
 import org.springframework.beans.factory.annotation.Autowired;
 import org.springframework.boot.test.context.SpringBootTest;
-import org.springframework.boot.test.mock.mockito.MockBean;
-import org.springframework.cloud.client.ServiceInstance;
-import org.springframework.cloud.client.loadbalancer.LoadBalancerClient;
 import org.springframework.cloud.contract.stubrunner.spring.AutoConfigureStubRunner;
 import org.springframework.test.context.junit4.SpringRunner;

 import static org.assertj.core.api.Assertions.assertThat;
-import static org.mockito.Matchers.anyBoolean;
-import static org.mockito.Matchers.anyString;
-import static org.mockito.Mockito.when;
-import static org.mockito.MockitoAnnotations.initMocks;
 import static org.springframework.boot.test.context.SpringBootTest.WebEnvironment.NONE;

 @RunWith(SpringRunner.class)
 @SpringBootTest(webEnvironment = NONE)
-@AutoConfigureStubRunner(workOffline = true, ids = "io.pivotal.training.springcloud:fortune-service:+:stubs:8081")
+@AutoConfigureStubRunner(workOffline = true, ids = "io.pivotal.training.springcloud:fortune-service:+:stubs")
 public class FortuneServiceClientTests {

   @Autowired private FortuneServiceClient fortuneServiceClient;

-  @MockBean LoadBalancerClient loadBalancerClient;
-  @Mock ServiceInstance instance;
-
   private static final String ExpectedFortune = "a random fortune";

-  @Before
-  public void setup() {
-    initMocks(FortuneServiceClientTests.class);
-    when(instance.getHost()).thenReturn("localhost");
-    when(instance.getPort()).thenReturn(8081);
-    when(loadBalancerClient.choose(anyString())).thenReturn(instance);
-  }
-
   @Test
   public void shouldReturnAFortune() {
     assertThat(fortuneServiceClient.getFortune()).isEqualTo(ExpectedFortune);

Create a new configuration file named application.yml under the test resources directory, configured as follows:
greeting-app/src/test/resources/application.yml
```
stubrunner:
  ids-to-service-ids:
    fortune-service: fortune
eureka:
  client:
    enabled: false
```

We’re basically saying: the stub with artifact id 'fortune-service' should map to the eureka service id 'fortune'.

It’s important to clarify that:

In Spring Cloud, the configuration property spring.application.name is used as the eureka service id, i.e. the key under which the service is registered with eureka.
The artifact id, on the other hand, is in a gradle project either explicitly specified in a file named settings.gradle or inferred from the gradle project’s directory name.

Re-run the test FortuneServiceClientTests and it should pass.

5. Customizing the Load Balancing Rule

The Ribbon API defines a simple interface, named IRule, that governs the strategy Ribbon uses to select a server instance. Out of the box, Ribbon provides a RandomRule, the RoundRobinRule (default), and a number of other rules, some of them specialized. We are also free to implement and use our own custom IRule.

In this section, we’re going to explore the WeightedResponseTimeRule, which is a strategy that favors service instances that respond quickly, by assigning them a greater weight.

The Spring Cloud reference guide contains a section that discusses how to customize Ribbon via simple configuration properties. Let’s do precisely that:

⇒ Open the greeting application’s application.yml file and configure the Ribbon client for the fortune service to use the rule class named WeightedResponseTimeRule, like this:

greeting-app/src/main/resources/application.yml

@@ -9,3 +9,6 @@ management:
 greeting:
   displayFortune: true

+fortune:
+  ribbon:
+    NFLoadBalancerRuleClassName: com.netflix.loadbalancer.WeightedResponseTimeRule

We see here that the name of the configuration property is NFLoadBalancerRuleClassName, and that the value is a fully qualified class name.

More interestingly, the "fully qualified" property name follows the convention <clientname>.ribbon.<propertyname>. The idea is that an application can have multiple Ribbon clients, each configured differently and calling out to different backing services. When using the ribbon-to-eureka integration, the client name corresponds to the service id that Ribbon is calling out to. In this case, the client name is fortune.
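To illustrate the naming convention, here is a small sketch in plain Java that composes the flattened property names for two clients. The RibbonPropertyKeys helper and the second client name, weather, are hypothetical and exist only to show that each client carries its own settings.

```java
import java.util.Properties;

// Sketch of the <clientname>.ribbon.<propertyname> naming convention.
final class RibbonPropertyKeys {
  static String key(String clientName, String propertyName) {
    return clientName + ".ribbon." + propertyName;
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    // The client configured in this lab.
    props.setProperty(key("fortune", "NFLoadBalancerRuleClassName"),
        "com.netflix.loadbalancer.WeightedResponseTimeRule");
    // A hypothetical second client, configured independently of the first.
    props.setProperty(key("weather", "NFLoadBalancerRuleClassName"),
        "com.netflix.loadbalancer.RandomRule");
    props.forEach((k, v) -> System.out.println(k + "=" + v));
  }
}
```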

So much for the configuration. Let’s turn our attention to how to simulate running two instances of the fortune service with differing response times.

⇒ In the fortune service’s FortuneController, expose an artificial delay that can be configured via an environment variable, as illustrated in the following diff:

fortune-service/src/main/java/io/pivotal/training/fortune/FortuneController.java

@@ -1,6 +1,7 @@
 package io.pivotal.training.fortune;

 import lombok.extern.slf4j.Slf4j;
+import org.springframework.beans.factory.annotation.Value;
 import org.springframework.web.bind.annotation.GetMapping;
 import org.springframework.web.bind.annotation.RestController;

@@ -13,6 +14,9 @@ public class FortuneController {

   private FortuneService fortuneService;

+  @Value("${delay.ms:0}")
+  private int delayMs = 0;
+
   public FortuneController(FortuneService fortuneService) {
     this.fortuneService = fortuneService;
   }
@@ -22,9 +26,21 @@ public class FortuneController {
     String fortune = fortuneService.getFortune();
     log.info("retrieving fortune '{}'", fortune);

+    artificialDelay();
+
     Map<String, String> map = new HashMap<>();
     map.put("fortune", fortune);
     return map;
   }

+  private void artificialDelay() {
+    if (delayMs <= 0) return;
+
+    try {
+      Thread.sleep(delayMs);
+    } catch (InterruptedException e) {
+      Thread.currentThread().interrupt(); // restore the interrupt flag instead of swallowing it
+    }
+  }
+
 }
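Note that the controller reads a property named delay.ms, while the commands in this lab set environment variables such as SERVER_PORT and DELAY_MS. Spring Boot takes care of matching the two. As a rough sketch of the naming convention only (the EnvNames helper is hypothetical and covers just the simple case of dots and letter case):

```java
import java.util.Locale;

// Sketch of the naming convention only; Spring Boot performs this mapping itself.
final class EnvNames {
  // delay.ms -> DELAY_MS: dots become underscores and letters are upper-cased.
  static String toEnvName(String propertyName) {
    return propertyName.replace('.', '_').toUpperCase(Locale.ROOT);
  }

  public static void main(String[] args) {
    System.out.println(toEnvName("delay.ms"));
    System.out.println(toEnvName("server.port"));
  }
}
```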

With this change in place, start up our system once more, as follows:

Start the eureka-server
Start the first fortune service instance with a 10ms delay, like this:
```
$ DELAY_MS=10 gradle bootRun
```
Start the second fortune service instance on a different port (8082), and with a 100ms delay:
```
$ SERVER_PORT=8082 DELAY_MS=100 gradle bootRun
```
Again, if you’re using a Windows command prompt, the above command has to be invoked slightly differently, like this:
```
$ set SERVER_PORT=8082
$ set DELAY_MS=100
$ gradle bootRun
```
Check the eureka dashboard and make sure that both service instances are registered with eureka
Finally, start up the greeting application

At this point we have two fortune service instances where one is configured to take longer to respond than the other.

Refresh the greeting page and watch Ribbon once more load-balance requests across the two fortune service instances.
Scan the greeting application’s logs on the console: you should begin to see log messages indicating that a background thread is running that computes response time statistics.
Within a short time, the greeting application will adapt: the faster fortune instance is given a greater weight because of its shorter response time, so it will be invoked more often than the slower instance.
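To get a feel for the proportions involved, here is a simplified sketch of response-time weighting in plain Java. It assumes that an instance’s weight is the sum of all average response times minus its own, and that an instance is then picked at random in proportion to its weight. The WeightedPick class is hypothetical, and Ribbon’s real rule additionally gathers its statistics on a background timer.

```java
import java.util.Random;

// Simplified sketch of response-time weighting; not Ribbon's actual implementation.
final class WeightedPick {
  // Assumed weighting: sum of all average response times minus the instance's own average.
  static double[] weights(double[] avgResponseMs) {
    double total = 0;
    for (double ms : avgResponseMs) total += ms;
    double[] w = new double[avgResponseMs.length];
    for (int i = 0; i < w.length; i++) w[i] = total - avgResponseMs[i];
    return w;
  }

  // Picks an index for a random number r in [0, 1) by walking the cumulative weights.
  static int pick(double[] avgResponseMs, double r) {
    double[] w = weights(avgResponseMs);
    double sum = 0;
    for (double x : w) sum += x;
    double target = r * sum;
    double cumulative = 0;
    for (int i = 0; i < w.length; i++) {
      cumulative += w[i];
      if (target < cumulative) return i;
    }
    return w.length - 1; // reached only if all weights are zero or on rounding at the upper edge
  }

  public static void main(String[] args) {
    double[] avg = {10, 100}; // the two artificial delays from this lab, in milliseconds
    Random random = new Random();
    int[] hits = new int[avg.length];
    for (int i = 0; i < 10_000; i++) hits[pick(avg, random.nextDouble())]++;
    System.out.println("fast=" + hits[0] + " slow=" + hits[1]);
  }
}
```

Under these assumptions the weights come out as 100 for the fast instance and 10 for the slow one, so the fast instance would receive roughly ten out of every eleven requests.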

It’s worth mentioning that Netflix introduces latency when testing its services via a component of its Simian Army named the Latency Monkey.

6. Congratulations

Once again, we’ve covered a lot of ground in this lab on Ribbon. Our system is becoming more sophisticated, but at the same time, our code remains simple and manageable, thanks to Spring Cloud’s LoadBalanced RestTemplate and Spring Cloud Contract’s automated stubbing of the discovery service.

Imagine at this point how the deployment of a system of microservices in the cloud is served by the simple concepts we’ve just covered: circuit breakers, service discovery, and client-side load-balancing. We end up with a system that’s more flexible, elastic, fault-tolerant, and with higher operational visibility.