In our last article about MicroProfile Fault Tolerance, we explained the motivation for this project and the need to provide a few design patterns under the microservice-friendly MicroProfile spec, namely:

  • Bulkhead – isolate failures in part of the system.
  • Circuit breaker – offer a way to fail fast.
  • Retry – define criteria on when to retry.
  • Fallback – provide an alternative solution for a failed execution.

We also presented some of the libraries that implement this MicroProfile specification, including Geronimo Safeguard, the library used in TomEE 7.1.

Let's now dive a bit deeper into the spec and explain some use cases. The behavior is controlled with annotations placed at the class and method level and can be changed at runtime using the MicroProfile Configuration spec. Most attributes of each annotation have default values that you can adjust to your needs.
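
As a rough illustration of the configuration side: the Fault Tolerance specification describes property keys of the form <classname>/<methodname>/<annotation>/<parameter>, with class-wide and global variants. The plain-Java helper below only builds those keys, listed from most specific to most general. The names ConfigKeySketch, candidateKeys and com.example.MyService are hypothetical, and you should check the spec version you use for the exact rules.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds the config keys that could override one
// annotation parameter, assuming the naming convention of the spec.
final class ConfigKeySketch {

    static List<String> candidateKeys(String className, String methodName,
                                      String annotation, String parameter) {
        return Arrays.asList(
            // Applies to a single method.
            className + "/" + methodName + "/" + annotation + "/" + parameter,
            // Applies to the annotation at class level.
            className + "/" + annotation + "/" + parameter,
            // Applies globally.
            annotation + "/" + parameter);
    }
}
```

Under that convention, setting com.example.MyService/callSomeService/Retry/maxRetries=3 in a config source would override maxRetries on that one method.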

Timeout

If you annotate a method with @Timeout, its execution will be aborted once the defined threshold is exceeded. In this example, the threshold is 500ms.

import org.eclipse.microprofile.faulttolerance.Timeout;

@Timeout(500)
public String callSomeService() {...}
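
To get a feel for what a timeout does, here is a minimal plain-JDK sketch of the same idea. It is not how any MicroProfile implementation works internally. TimeoutSketch and callWithTimeout are hypothetical names, and the sketch throws IllegalStateException where the real annotation would raise its own timeout exception.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration of a 500ms-style timeout using only the JDK.
final class TimeoutSketch {

    static <T> T callWithTimeout(Callable<T> action, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<T> future = executor.submit(action);
        try {
            // Wait for the result, but no longer than the threshold.
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker thread
            throw new IllegalStateException("timed out after " + timeoutMillis + "ms", e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("call failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }
}
```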

Retry

This annotation lets you define a retry strategy for failures. In this example, the method is retried when an IOException is thrown.

import java.io.IOException;
import org.eclipse.microprofile.faulttolerance.Retry;

@Retry(retryOn = IOException.class)
public String callSomeService() {...}

You can also stack multiple annotations. Here, the method will be retried once if the execution times out after 500ms.

import org.eclipse.microprofile.faulttolerance.Timeout;
import org.eclipse.microprofile.faulttolerance.Retry;

@Timeout(500)
@Retry(maxRetries = 1)
public String callSomeService() {...}

You can define very fine-grained behaviors. In the next example, if an IOException occurs, there will be up to 2 retries, each after a delay of 200ms plus a random offset between -100ms and +100ms set by the jitter attribute (so between 100ms and 300ms). Spreading the retries this way helps to reduce peak loads.

@Retry(delay = 200, maxRetries = 2, jitter = 100, retryOn = IOException.class)
public String callSomeService() {...}
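
The jitter arithmetic can be sketched in plain Java as follows. This is only an illustration of the semantics described above, not the library's implementation. RetrySketch, nextDelayMillis and callWithRetry are hypothetical names, and for brevity the sketch retries on any RuntimeException rather than only on IOException.

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.function.Supplier;

// Hypothetical illustration of @Retry(delay = 200, maxRetries = 2, jitter = 100).
final class RetrySketch {

    // Delay before the next retry: base delay plus a random offset in [-jitter, +jitter].
    static long nextDelayMillis(long delay, long jitter) {
        long offset = jitter == 0
            ? 0
            : ThreadLocalRandom.current().nextLong(-jitter, jitter + 1);
        return Math.max(0, delay + offset);
    }

    // Calls the action once, then retries up to maxRetries times on failure.
    static <T> T callWithRetry(Supplier<T> action, int maxRetries, long delay, long jitter) {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must not be negative");
        }
        RuntimeException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxRetries) {
                    sleep(nextDelayMillis(delay, jitter));
                }
            }
        }
        throw last; // every attempt failed
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting to retry", e);
        }
    }
}
```

With delay = 200 and jitter = 100, every computed wait falls between 100ms and 300ms, so clients that failed at the same moment do not all retry at the same moment.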

Fallback

The @Fallback annotation provides an alternative execution path in case of failure, thus increasing the success rate of requests.

Here is how you can ask for a different method in the same class to be executed in case of any exception. The fallback method can be non-public:

import org.eclipse.microprofile.faulttolerance.Fallback;

@Fallback(fallbackMethod = "fallbackMethodInSameClass")
public String callSomeService() {...}
...
public String fallbackMethodInSameClass() {...}

In the next example, we combine different annotations. If callSomeService() takes longer than 500ms, it times out and is called again once; if it times out again, the fallback is executed instead.

The CallAppologyService class implements the FallbackHandler interface:

import org.eclipse.microprofile.faulttolerance.Timeout;
import org.eclipse.microprofile.faulttolerance.Retry;
import org.eclipse.microprofile.faulttolerance.Fallback;

@Timeout(500)
@Retry(maxRetries = 1)
@Fallback(CallAppologyService.class)
public String callSomeService() {...}
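
The order of events in this combination (attempt, one retry, then fallback) can be sketched in plain Java. This is an illustration under simplifying assumptions, not the library's code: RetryThenFallbackSketch is a hypothetical name, any RuntimeException stands in for the timeout, and the fallback is a simple Supplier rather than a FallbackHandler.

```java
import java.util.function.Supplier;

// Hypothetical illustration: retry first, fall back only when retries are used up.
final class RetryThenFallbackSketch {

    static <T> T call(Supplier<T> action, int maxRetries, Supplier<T> fallback) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return action.get();
            } catch (RuntimeException e) {
                // This attempt failed; loop again while retries remain.
            }
        }
        // The initial call and all retries failed, so use the alternative path.
        return fallback.get();
    }
}
```

With maxRetries = 1, a method that always fails is invoked twice before the fallback result is returned to the caller.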

Circuit Breaker

The failure of a service results in higher latency for clients due to timeouts. This can trigger a cascading effect and propagate the failure to other services. If we know a service has problems, a circuit breaker can be used to force calls to fail immediately and stop subsequent invocations of that service.

In this example, the circuit breaker starts closed (working normally) and evaluates a rolling window of the last 4 requests. If 75% of them fail (3 requests out of 4), the circuit opens and stays open for 1000ms, rejecting all requests during that time.

After the 1000ms delay, the circuit moves to half-open. At this point, trial calls probe the destination and, after 10 consecutive successes, the circuit returns to closed (normal operation).

import org.eclipse.microprofile.faulttolerance.CircuitBreaker;

@CircuitBreaker(requestVolumeThreshold = 4, failureRatio = 0.75, delay = 1000, successThreshold = 10)
public String callSomeService() {...}
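
The closed, open and half-open transitions can be sketched as a small state machine in plain Java. This is a simplified illustration, not the implementation used by any MicroProfile library. CircuitBreakerSketch and its members are hypothetical names, the clock is injected so the behavior can be checked without waiting, and an IllegalStateException stands in for CircuitBreakerOpenException.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical illustration of the circuit breaker states described above.
final class CircuitBreakerSketch {
    enum State { CLOSED, OPEN, HALF_OPEN }

    private final int requestVolumeThreshold;
    private final double failureRatio;
    private final long delayMillis;
    private final int successThreshold;
    private final LongSupplier clock; // current time in milliseconds

    private final Deque<Boolean> window = new ArrayDeque<>(); // true = failed call
    private State state = State.CLOSED;
    private long openedAt;
    private int halfOpenSuccesses;

    CircuitBreakerSketch(int requestVolumeThreshold, double failureRatio,
                         long delayMillis, int successThreshold, LongSupplier clock) {
        this.requestVolumeThreshold = requestVolumeThreshold;
        this.failureRatio = failureRatio;
        this.delayMillis = delayMillis;
        this.successThreshold = successThreshold;
        this.clock = clock;
    }

    // Returns the current state, moving from OPEN to HALF_OPEN once the delay has passed.
    synchronized State state() {
        if (state == State.OPEN && clock.getAsLong() - openedAt >= delayMillis) {
            state = State.HALF_OPEN;
            halfOpenSuccesses = 0;
        }
        return state;
    }

    <T> T call(Supplier<T> action) {
        if (state() == State.OPEN) {
            throw new IllegalStateException("circuit is open"); // fail fast
        }
        try {
            T result = action.get();
            record(false);
            return result;
        } catch (RuntimeException e) {
            record(true);
            throw e;
        }
    }

    private synchronized void record(boolean failed) {
        if (state == State.HALF_OPEN) {
            if (failed) {
                open(); // a failed trial call reopens the circuit
            } else if (++halfOpenSuccesses >= successThreshold) {
                state = State.CLOSED;
                window.clear();
            }
            return;
        }
        window.addLast(failed);
        if (window.size() > requestVolumeThreshold) {
            window.removeFirst(); // keep only the most recent requests
        }
        if (window.size() == requestVolumeThreshold) {
            long failures = window.stream().filter(f -> f).count();
            if ((double) failures / requestVolumeThreshold >= failureRatio) {
                open();
            }
        }
    }

    private void open() {
        state = State.OPEN;
        openedAt = clock.getAsLong();
        window.clear();
    }
}
```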

You can combine the circuit breaker with other patterns, like the retry or the timeout. This way you can control the failures that lead to an open circuit.

import org.eclipse.microprofile.faulttolerance.CircuitBreaker;
import org.eclipse.microprofile.faulttolerance.Retry;
import org.eclipse.microprofile.faulttolerance.Timeout;
import org.eclipse.microprofile.faulttolerance.exceptions.TimeoutException;

@CircuitBreaker(requestVolumeThreshold = 4, failureRatio = 0.75, delay = 1000, successThreshold = 10)
@Retry(retryOn = {RuntimeException.class, TimeoutException.class}, maxRetries = 7)
@Timeout(500)
public String callSomeService() {...}

A @Fallback can also be specified; it will be invoked when a CircuitBreakerOpenException is thrown.

Bulkhead

The @Bulkhead annotation also prevents the failure of a service from triggering a cascading effect on other services. It is only effective when the method is called concurrently from multiple contexts.

The bulkhead works by limiting the number of concurrent requests to the method. This can be achieved in 2 ways:

Semaphore style

There is a hard limit to the number of concurrent requests. In this example, once 5 requests are running in parallel, any extra calls will receive a BulkheadException.

import org.eclipse.microprofile.faulttolerance.Bulkhead;

@Bulkhead(5)
public String callSomeService() {...}
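
The semaphore style maps quite directly onto java.util.concurrent.Semaphore. The sketch below shows the idea in plain Java and is not the library's implementation. SemaphoreBulkheadSketch is a hypothetical name, and an IllegalStateException stands in for BulkheadException.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical illustration of a semaphore-style bulkhead.
final class SemaphoreBulkheadSketch {
    private final Semaphore permits;

    SemaphoreBulkheadSketch(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    <T> T call(Supplier<T> action) {
        // Do not wait for a free slot: reject immediately when the limit is reached.
        if (!permits.tryAcquire()) {
            throw new IllegalStateException("bulkhead is full");
        }
        try {
            return action.get();
        } finally {
            permits.release(); // free the slot whether the call succeeded or failed
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

Creating the sketch with new SemaphoreBulkheadSketch(5) would mirror @Bulkhead(5): a sixth call that arrives while five are still running is rejected instead of queued.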

Thread pool style

This will use a thread pool to hold the extra requests, up to a limit. In the example, 5 concurrent requests are allowed and 8 can be placed on a waiting queue. If the waiting queue is exhausted, the extra calls will receive a BulkheadException.

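import java.util.concurrent.Future;
import org.eclipse.microprofile.faulttolerance.Asynchronous;
import org.eclipse.microprofile.faulttolerance.Bulkhead;

@Asynchronous
@Bulkhead(value = 5, waitingTaskQueue = 8)
public Future<String> callSomeService() {...}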
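
A comparable effect can be obtained with a plain JDK thread pool, which may help to picture the two limits. This is a sketch under stated assumptions, not what the annotation does internally. ThreadPoolBulkheadSketch and newBulkheadExecutor are hypothetical names, and the JDK rejects extra tasks with RejectedExecutionException rather than BulkheadException.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration of a thread-pool-style bulkhead.
final class ThreadPoolBulkheadSketch {

    static ThreadPoolExecutor newBulkheadExecutor(int maxConcurrent, int waitingTaskQueue) {
        return new ThreadPoolExecutor(
            maxConcurrent,            // threads that run requests concurrently
            maxConcurrent,            // never grow beyond that limit
            0L, TimeUnit.MILLISECONDS,
            // Bounded waiting queue; when it is full, the default handler
            // rejects the task with RejectedExecutionException.
            new ArrayBlockingQueue<>(waitingTaskQueue));
    }
}
```

Calling newBulkheadExecutor(5, 8) would mirror the values above: 5 tasks run, 8 more wait, and the next submission is rejected.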

Asynchronous

The @Asynchronous annotation means that the method execution will happen in a separate thread. Any methods marked with this annotation must return one of:

java.util.concurrent.Future
java.util.concurrent.CompletionStage

There is an ongoing discussion around the @Asynchronous annotation and its implications when mixed with the other Fault Tolerance annotations. This behavior will be clarified in the upcoming 1.2 version.

Conclusion

Fault tolerance will improve the user experience and reduce support needs. If you are developing a distributed system with multiple services, these patterns help keep a failure in one service from cascading to the others.

For more details, you can take a look at this foundational post from Emily Jiang and at the MicroProfile Fault Tolerance documentation.

In the next installment, we will provide a tutorial so that you can learn how to implement Fault Tolerance in your microservices using TomEE.

Bruno Baptista

Bruno is a well-versed Java and Open Source technology developer and a Senior Software Engineer at Tomitribe. With over 10 years as an enterprise-level engineer, he has led QA and development teams and garnered skills in design and development processes.
brunobat_