May 24, 2017

Reactive Java Performance Comparison


This blog post covers the basic concepts of reactive programming and compares the performance of a standard Spring MVC web application, with Cassandra as the data layer, to that of a reactive implementation.

Reactive programming has become a popular topic in recent years, and with the upcoming Spring 5 release it is finally entering the mainstream. Its promise is better performance for I/O-heavy applications. A standard Spring MVC application typically runs on a servlet stack in a container such as Jetty or Tomcat. Each request is bound to a thread that passes through the servlet container into the user code, where it eventually fetches some data. Normally, the majority of these operations are I/O-bound. The main disadvantage of this model is that the allocated resources (threads) spend most of their time waiting for I/O operations to complete, and while they wait they cannot handle any further requests. With reactive programming, the application code no longer holds on to a thread while it waits; instead, it provides callbacks that are invoked when results (e.g., new data from the database) become available.
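The difference between the two models can be sketched with plain JDK classes. This is only an illustration, not Spring or Reactor code: `blockingQuery` and `asyncQuery` are hypothetical stand-ins for a database read, and the delay is simulated with a timer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class BlockingVsCallback {

    // Hypothetical stand-in for a slow database read: the calling thread
    // is parked for the whole wait and can do nothing else.
    static String blockingQuery(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "row";
    }

    // Non-blocking variant: returns at once. A timer completes the future
    // later, standing in for "data has arrived from the database".
    static CompletableFuture<String> asyncQuery(ScheduledExecutorService timer, long millis) {
        CompletableFuture<String> result = new CompletableFuture<>();
        timer.schedule(() -> { result.complete("row"); }, millis, TimeUnit.MILLISECONDS);
        return result;
    }

    // Starts all queries from one thread, attaches a callback to each,
    // and returns how many callbacks produced the expected value.
    static int handleAllAsync(int requests, long millis) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        try {
            List<CompletableFuture<String>> pending = new ArrayList<>();
            for (int i = 0; i < requests; i++) {
                pending.add(asyncQuery(timer, millis).thenApply(String::toUpperCase));
            }
            CompletableFuture.allOf(pending.toArray(new CompletableFuture[0])).join();
            return (int) pending.stream().filter(f -> "ROW".equals(f.join())).count();
        } finally {
            timer.shutdown();
        }
    }

    public static void main(String[] args) {
        // One thread, one request at a time
        System.out.println(blockingQuery(20));
        // 50 waits in flight at once, served by a single timer thread
        System.out.println(handleAllAsync(50, 20));
    }
}
```

In the blocking variant, 50 concurrent requests would need 50 threads that mostly sleep. In the callback variant, the waits overlap and no thread is held while the simulated I/O is pending.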



This, of course, requires a different style of coding. The usual imperative way of programming no longer works. Spring builds on the Reactor framework to support the "new way" of programming.

The two most important concepts are Mono and Flux. These two types wrap the results of all operations. A Mono emits zero or one result, while a Flux emits any number of elements, including a continuous stream.
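As a loose analogy only, the "at most one result" versus "many results" distinction also exists in the JDK itself (Java 9 or later). The sketch below does not use Reactor: `CompletableFuture` plays the role of a single eventual value, and `SubmissionPublisher` with a `Flow.Subscriber` plays the role of a stream of values. The class and method names are made up for this illustration, and unlike a Mono, a `CompletableFuture` starts its work eagerly.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;
import java.util.concurrent.TimeUnit;

class MonoFluxAnalogy {

    // Mono-like: a single value that becomes available later.
    static CompletableFuture<String> single() {
        return CompletableFuture.supplyAsync(() -> "one result");
    }

    // Flux-like: many values pushed to a subscriber, followed by a
    // completion signal. Returns the items in the order received.
    static List<String> collectAll(List<String> items) {
        List<String> received = new CopyOnWriteArrayList<>();
        CompletableFuture<Void> done = new CompletableFuture<>();
        try (SubmissionPublisher<String> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(new Flow.Subscriber<String>() {
                @Override public void onSubscribe(Flow.Subscription subscription) {
                    subscription.request(Long.MAX_VALUE); // no backpressure limit
                }
                @Override public void onNext(String item) { received.add(item); }
                @Override public void onError(Throwable error) { done.completeExceptionally(error); }
                @Override public void onComplete() { done.complete(null); }
            });
            items.forEach(publisher::submit);
        } // closing the publisher signals onComplete to the subscriber
        done.orTimeout(5, TimeUnit.SECONDS).join();
        return List.copyOf(received);
    }

    public static void main(String[] args) {
        System.out.println(single().join());
        System.out.println(collectAll(List.of("a", "b", "c")));
    }
}
```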

The code example below emits the given elements, applies some mapping transformations, and prints them to standard output:
     Flux.just("keyword 1", "keyword 2")  
         .map(s -> s.replace(" ", ""))  
         .map(String::toUpperCase)  
         .subscribe(System.out::println);  


The API looks quite similar to the Java Stream API. For our follow-up example, imagine that the stream of data is emitted by the data source and that standard output is the output stream of the HTTP response.
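For comparison, here is the same pipeline written with `java.util.stream`. The class and method names are chosen for this sketch only. The operators line up one to one, but a Stream is pulled synchronously by its terminal operation on the calling thread, whereas a Flux pushes elements to its subscriber as they become available.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class StreamEquivalent {

    // Same transformations as the Flux example: strip spaces, then upper-case.
    static List<String> normalize(String... keywords) {
        return Stream.of(keywords)
                .map(s -> s.replace(" ", ""))
                .map(String::toUpperCase)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Prints KEYWORD1 and KEYWORD2, one per line
        normalize("keyword 1", "keyword 2").forEach(System.out::println);
    }
}
```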

Spring MVC vs Spring WebFlux

This code snippet shows a Spring MVC REST controller. It has just one mapping, which returns all the data found in the repository. First it loads all the data, and then it writes the response:

 @RestController  
 public class TestController {  
   @Autowired  
   private TestDataRepository testDataRepository;  
   @GetMapping("/api/")  
   public Iterable<TestData> getTestData() {  
     return this.testDataRepository.findAll();  
   }  
 }  


Using WebFlux, the controller does the same as the controller above, except that it returns a Flux object. The framework subscribes to this object and can write data to the client as soon as it becomes available:
 @RestController  
 public class TestController {  
   @Autowired  
   private TestDataRepository testDataRepository;  
   @GetMapping("/api/")  
   public Flux<TestData> getTestData() {  
     return this.testDataRepository.findAll();  
   }  
 }  


With these two implementations in place, it is time to compare their performance.

Performance Comparison

Disclaimer: Performance testing itself is a complex topic. These performance tests should just give a first hint about which direction to pursue. The results for each use case and configuration will be different. Furthermore, they don't show absolute performance numbers, just performance figures relative to each other. All performance tests were done on the same setup and repeated a couple of times to get these results.

Our test use case is an application that provides a REST interface to load data from a Cassandra data store. There are 10,000 entries in the database. Each request loads all of these entries. This leads to enough I/O to show the advantage of the reactive implementation.

The code for the example applications can be found in our GitHub repository.

Please check the repository for the driver versions used here.

Gatling Test

The code snippet shows the Gatling test that was used for performance testing. It includes just one HTTP call to the REST interface. It ramps up 600 users, each sending one request, over one minute, which averages about 10 requests per second:

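  import scala.concurrent.duration._
  import io.gatling.core.Predef._
  import io.gatling.http.Predef._

  // The class name is an assumption; the original snippet omits the surrounding class.
  class RecordedSimulation extends Simulation {
    // HTTP settings shared by every request in the simulation
    val httpProtocol = http
      .baseURL("http://localhost:8080/api/")
      .inferHtmlResources()
      .acceptHeader("*/*")
      .acceptEncodingHeader("gzip, deflate, sdch")
      .acceptLanguageHeader("de-DE,de;q=0.8,en-US;q=0.6,en;q=0.4")
      .userAgentHeader("Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36")
    // Headers sent with the request
    val headers_0 = Map(
      "Content-Type" -> "application/json",
      "Accept" -> "application/json")
    // Defined but not used by the scenario below
    val headers_1 = Map("Pragma" -> "no-cache")
    // A single GET against the base URL
    val scn = scenario("RecordedSimulation")
      .exec(http("getData")
        .get("")
        .headers(headers_0))
    // Start 600 users, spread over one minute
    setUp(scn.inject(rampUsers(600) over (1 minutes)))
      .protocols(httpProtocol)
  }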

Plain Spring MVC + Cassandra

This test issues 600 requests over one minute. As we can see, the application is over its limit and handles only 6.1 requests per second.
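To put that number in context, a quick back-of-the-envelope calculation helps. The sketch below assumes that 6.1 requests per second is the sustained throughput for the whole run, which the report figure alone does not confirm; the class and method names are made up for this illustration.

```java
class LoadArithmetic {

    // Average arrival rate when `requests` are spread over `rampSeconds`.
    static double arrivalRate(int requests, double rampSeconds) {
        return requests / rampSeconds;
    }

    // Time needed to serve `requests` at a fixed throughput (assumed constant).
    static double secondsToServe(int requests, double throughputPerSecond) {
        return requests / throughputPerSecond;
    }

    public static void main(String[] args) {
        System.out.printf("arrival rate: %.1f req/s%n", arrivalRate(600, 60));
        System.out.printf("time to serve at 6.1 req/s: %.1f s%n", secondsToServe(600, 6.1));
    }
}
```

Under that assumption, requests arrive at about 10 per second but are served at 6.1 per second, so serving all 600 takes roughly 98 seconds instead of 60, and the backlog grows for as long as the ramp lasts.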


Spring WebFlux + Cassandra Reactive

In comparison, observe the new reactive implementation. The same test shows that it can handle 600 requests in one minute quite well. As we know, the threads spend most of their time waiting for data to arrive via I/O. Because these threads are no longer blocked, they remain free to handle new incoming requests, which gives the reactive implementation a clear performance advantage.



As shown in the figure below, we can get quite a lot more requests per second, but the response times will suffer as a result. If we do not limit the threads on Tomcat, we can see that the Cassandra connection pool is exhausted before the Tomcat thread pool is. This is exactly as we expected. All in all, this allows for a better use of the system's resources.



Summary

Reactive programming is not a new way of programming that solves all problems; rather, it is a different approach that fits certain scenarios better. For each use case, you must decide whether this approach fits your needs. To benefit from the new reactive functionality, the complete stack, including the database drivers, must be reactive. Even with frameworks like Spring 5 and Reactor, reactive programming increases the complexity of the application, so there has to be a justification, such as high performance requirements, for taking on this additional complexity. As this quick performance comparison has shown, reactive programming is worth considering when an application that performs a lot of I/O operations has to meet high performance demands.
