jhalterman / failsafe

Tuesday, 14 June 2016, 03:14:07

Java
Simple, sophisticated failure handling

Failsafe

Introduction

Failsafe is a lightweight, zero-dependency library for handling failures. It was designed to be as easy to use as possible, with a concise API for handling everyday use cases and the flexibility to handle everything else. Failsafe features:

- Retries
  - Retry policies
  - Synchronous and asynchronous retries
- Circuit breakers
  - Configuration
- Execution context
- Event listeners
- Asynchronous API integration
- CompletableFuture and functional interface integration
- Execution tracking

Failsafe supports Java 6+, though the documentation uses lambdas for simplicity.

Setup

Add the latest Failsafe Maven dependency to your project.

Usage

Retries

One of the core Failsafe features is retries. To start, define a RetryPolicy that expresses when retries should be performed:

RetryPolicy retryPolicy = new RetryPolicy()
  .retryOn(ConnectException.class)
  .withDelay(1, TimeUnit.SECONDS)
  .withMaxRetries(3);

Then use your RetryPolicy to execute a Runnable or Callable with retries:

// Run with retries
Failsafe.with(retryPolicy).run(() -> connect());

// Get with retries
Connection connection = Failsafe.with(retryPolicy).get(() -> connect());

Java 6 / 7 is also supported:

Connection connection = Failsafe.with(retryPolicy).get(new Callable<Connection>() {
  public Connection call() {
    return connect();
  }
});
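To make the idea of "executing with retries" concrete, here is a minimal hand-rolled sketch that uses only the JDK. `SimpleRetry` and its parameters are hypothetical names invented for this illustration; this is not how Failsafe is implemented, and it ignores most of what a RetryPolicy can express.

```java
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Failsafe: a hand-rolled retry loop.
final class SimpleRetry {
  // Calls task until it succeeds or maxRetries retries have been used up.
  // With maxRetries = 3 the task is attempted at most 4 times.
  static <T> T get(Callable<T> task, int maxRetries, long delayMillis) throws Exception {
    int retries = 0;
    while (true) {
      try {
        return task.call();
      } catch (Exception e) {
        if (retries++ >= maxRetries)
          throw e; // retries exhausted: surface the last failure
        Thread.sleep(delayMillis); // fixed delay before the next attempt
      }
    }
  }
}
```

A call such as `SimpleRetry.get(() -> connect(), 3, 1000)` roughly mirrors the policy above, except that it retries on every exception rather than only on ConnectException.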

Retry Policies

Failsafe's retry policies provide flexibility in allowing you to express when retries should be performed.

A policy can allow retries on particular failures:

RetryPolicy retryPolicy = new RetryPolicy()
  .retryOn(ConnectException.class, SocketException.class)
  .retryOn(failure -> failure instanceof ConnectException);

And for particular results or conditions:

retryPolicy
  .retryWhen(null)
  .retryIf(result -> result == null);

It can add a fixed delay between retries:

retryPolicy.withDelay(1, TimeUnit.SECONDS);

Or a delay that backs off exponentially:

retryPolicy.withBackoff(1, 30, TimeUnit.SECONDS);
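As a rough illustration of exponential backoff, the sketch below computes the delay before each retry for an initial delay of 1 and a maximum of 30. It assumes each delay is double the previous one, capped at the maximum; the doubling factor and the `BackoffSketch` class are assumptions made for this example, not a description of Failsafe's internals.

```java
// Hypothetical sketch, not Failsafe's code: delays produced by exponential backoff.
final class BackoffSketch {
  // Delay before retry number `retry` (1-based): doubles each time, capped at maxDelay.
  static long delayFor(int retry, long initialDelay, long maxDelay) {
    long delay = initialDelay;
    for (int i = 1; i < retry && delay < maxDelay; i++)
      delay = Math.min(delay * 2, maxDelay);
    return delay;
  }

  public static void main(String[] args) {
    // Prints the delays for the first seven retries: 1, 2, 4, 8, 16, 30, 30
    for (int retry = 1; retry <= 7; retry++)
      System.out.println("retry " + retry + ": " + delayFor(retry, 1, 30) + "s");
  }
}
```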

It can add a max number of retries and a max retry duration:

retryPolicy
  .withMaxRetries(100)
  .withMaxDuration(5, TimeUnit.MINUTES);
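One way to picture how the two limits interact is as a single check that must pass before each retry. The sketch below is an assumption about the general idea, with hypothetical names (`RetryLimits`, `allowsRetry`); it is not taken from Failsafe.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: a retry is allowed only while both limits still hold.
final class RetryLimits {
  private final int maxRetries;
  private final long maxDurationNanos;

  RetryLimits(int maxRetries, long maxDuration, TimeUnit unit) {
    this.maxRetries = maxRetries;
    this.maxDurationNanos = unit.toNanos(maxDuration);
  }

  // True if another retry may start after `retriesSoFar` retries
  // and `elapsedNanos` of total elapsed time.
  boolean allowsRetry(int retriesSoFar, long elapsedNanos) {
    return retriesSoFar < maxRetries && elapsedNanos < maxDurationNanos;
  }
}
```

Whichever limit is reached first ends the retries.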

It can also specify which results, failures or conditions to abort retries on:

retryPolicy
  .abortWhen(true)
  .abortOn(NoRouteToHostException.class)
  .abortIf(result -> result == true);

These settings can, of course, be combined into a single policy.

Synchronous Retries

With a retry policy defined, we can perform a retryable synchronous execution:

// Run with retries
Failsafe.with(retryPolicy).run(this::connect);

// Get with retries
Connection connection = Failsafe.with(retryPolicy).get(this::connect);

Asynchronous Retries

Asynchronous executions can be performed and retried on a ScheduledExecutorService or custom Scheduler implementation, and return a FailsafeFuture. When the execution succeeds or the retry policy is exceeded, the future is completed and any listeners registered against it are called:

Failsafe.with(retryPolicy)
  .with(executor)
  .get(this::connect)
  .onSuccess(connection -> log.info("Connected to {}", connection))
  .onFailure((result, failure) -> log.error("Connection attempts failed", failure));
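For readers curious about what retrying on a scheduler involves, here is a minimal JDK-only sketch. It runs a task on a ScheduledExecutorService, reschedules it after a fixed delay when it fails, and completes a CompletableFuture at the end. `AsyncRetrySketch` and its method names are hypothetical, and the code is not Failsafe's implementation.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not Failsafe's implementation.
final class AsyncRetrySketch {
  // Runs task on the executor, rescheduling it after delayMillis on failure.
  // The returned future completes with the result or with the last failure.
  static <T> CompletableFuture<T> getAsync(Callable<T> task, ScheduledExecutorService executor,
      int maxRetries, long delayMillis) {
    CompletableFuture<T> future = new CompletableFuture<>();
    attempt(task, executor, maxRetries, delayMillis, future);
    return future;
  }

  private static <T> void attempt(Callable<T> task, ScheduledExecutorService executor,
      int retriesLeft, long delayMillis, CompletableFuture<T> future) {
    executor.execute(() -> {
      try {
        future.complete(task.call());
      } catch (Exception e) {
        if (retriesLeft == 0)
          future.completeExceptionally(e); // retries exhausted
        else
          executor.schedule(() -> attempt(task, executor, retriesLeft - 1, delayMillis, future),
              delayMillis, TimeUnit.MILLISECONDS);
      }
    });
  }
}
```

No thread is blocked while waiting between attempts, which is the main reason to retry on a scheduler instead of sleeping in a loop.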

Circuit Breakers

Circuit breakers help build systems that fail fast by temporarily disabling execution in order to prevent system overload. Creating a CircuitBreaker is straightforward:

CircuitBreaker breaker = new CircuitBreaker()
  .withFailureThreshold(3, 10)
  .withSuccessThreshold(5)
  .withDelay(1, TimeUnit.MINUTES);

We can then execute a Runnable or Callable with the breaker:

Failsafe.with(breaker).run(this::connect);

When a configured threshold of execution failures occurs on a circuit breaker, the circuit is opened and further execution requests fail with CircuitBreakerOpenException. After a delay, the circuit is half-opened and trial executions are attempted to determine whether the circuit should be closed or opened again. If the trial executions exceed a success threshold, the breaker is closed again and executions will proceed as normal.
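The closed, open and half-open cycle described above can be sketched as a small state machine. The version below is a simplified illustration with hypothetical names (`BreakerSketch`, an injectable millisecond clock). It counts only consecutive failures and successes and re-opens on the first failed trial, which is an assumption made for brevity and not a description of Failsafe's internals.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch of the closed/open/half-open cycle; not Failsafe's implementation.
final class BreakerSketch {
  enum State { CLOSED, OPEN, HALF_OPEN }

  private final int failureThreshold; // consecutive failures that open the circuit
  private final int successThreshold; // consecutive trial successes that close it
  private final long delayMillis;     // how long the circuit stays open
  private final LongSupplier clock;   // current time in millis, injectable for tests

  private State state = State.CLOSED;
  private int failures;
  private int successes;
  private long openedAt;

  BreakerSketch(int failureThreshold, int successThreshold, long delayMillis, LongSupplier clock) {
    this.failureThreshold = failureThreshold;
    this.successThreshold = successThreshold;
    this.delayMillis = delayMillis;
    this.clock = clock;
  }

  // Reports whether an execution may proceed; moves OPEN to HALF_OPEN once the delay has elapsed.
  synchronized boolean allowsExecution() {
    if (state == State.OPEN && clock.getAsLong() - openedAt >= delayMillis) {
      state = State.HALF_OPEN;
      successes = 0;
    }
    return state != State.OPEN;
  }

  synchronized void recordSuccess() {
    failures = 0;
    if (state == State.HALF_OPEN && ++successes >= successThreshold)
      state = State.CLOSED;
  }

  synchronized void recordFailure() {
    successes = 0;
    // A failed trial re-opens immediately; otherwise open once the threshold is reached.
    if (state == State.HALF_OPEN || ++failures >= failureThreshold) {
      state = State.OPEN;
      openedAt = clock.getAsLong();
      failures = 0;
    }
  }

  synchronized State state() {
    return state;
  }
}
```

In production code the clock would typically be `System::currentTimeMillis`; passing it in keeps the sketch easy to test.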

Circuit Breaker Configuration

Circuit breakers can be flexibly configured to express when the circuit should be opened or closed.

A circuit breaker can be configured to open when a number of consecutive executions have failed:

CircuitBreaker breaker = new CircuitBreaker()
  .withFailureThreshold(5);

Or when, for example, the last 3 out of 5 executions have failed:

breaker.withFailureThreshold(3, 5);
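A "3 out of the last 5" threshold can be pictured as a sliding window over recent results. The sketch below is one possible way to track it, using a hypothetical `RatioThreshold` class. It reports the threshold as met as soon as enough failures are in the window, even before the window is full; that choice is an assumption of this example and may differ from what Failsafe does.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch, not Failsafe's implementation.
final class RatioThreshold {
  private final int failures; // e.g. 3
  private final int window;   // e.g. 5
  private final Deque<Boolean> recent = new ArrayDeque<>(); // true = failed execution
  private int failed;         // number of failures currently inside the window

  RatioThreshold(int failures, int window) {
    this.failures = failures;
    this.window = window;
  }

  // Records one execution result and reports whether the threshold is now met.
  boolean record(boolean failure) {
    recent.addLast(failure);
    if (failure)
      failed++;
    // Drop the oldest result once the window overflows.
    if (recent.size() > window && recent.removeFirst())
      failed--;
    return failed >= failures;
  }
}
```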

Typically, a breaker is configured to delay before attempting to close again:

breaker.withDelay(1, TimeUnit.MINUTES);

The breaker can be configured to close again if a number of trial executions succeed, else it will re-open:

breaker.withSuccessThreshold(5);

The breaker can also be configured to close again if, for example, the last 3 out of 5 executions succeed, else it will re-open:

breaker.withSuccessThreshold(3, 5);

The breaker can be configured to only recognize certain results, exceptions or conditions as failures:

breaker
  .failWhen(true)
  .failOn(NoRouteToHostException.class)
  .failIf((result, failure) -> result == 500 || failure instanceof NoRouteToHostException);

And the breaker can be configured to recognize executions that exceed a certain timeout as failures:

breaker.withTimeout(10, TimeUnit.SECONDS);

With Retries

A CircuitBreaker can be used along with a RetryPolicy:

Failsafe.with(retryPolicy).with(breaker).get(this::connect);

Execution failures are first retried according to the RetryPolicy. If the policy is exceeded, the failure is then recorded by the CircuitBreaker.

Failing Together

A circuit breaker can and should be shared across code that accesses inter-dependent system components that fail together. This ensures that if the circuit is opened, executions against one component that rely on another component will not be allowed until the circuit is closed again.

Standalone Usage

A CircuitBreaker can also be manually operated in a standalone way:

if (breaker.allowsExecution()) {
  try {
    doSomething();
    breaker.recordSuccess();
  } catch (Exception e) {
    breaker.recordFailure(e);
  }
}

Execution Context

Failsafe can provide an ExecutionContext containing execution-related information such as the number of execution attempts as well as start and elapsed times:

Failsafe.with(retryPolicy).run(ctx -> {
  log.debug("Connection attempt #{}", ctx.getExecutions());
  connect();
});

Event Listeners

Failsafe supports execution and retry event listeners via the Listeners class:

Failsafe.with(retryPolicy)
  .with(new Listeners<Connection>()
    .onRetry((c, f, stats) -> log.warn("Failure #{}. Retrying.", stats.getExecutions()))
    .onFailure((cxn, failure) -> log.error("Connection attempts failed", failure))
    .onSuccess(cxn -> log.info("Connected to {}", cxn)))
  .get(this::connect);

Non-Java 8 users can extend the Listeners class and override individual event handlers:

Failsafe.with(retryPolicy)
  .with(new Listeners<Connection>() {
    public void onRetry(Connection cxn, Throwable failure, ExecutionStats stats) {
      log.warn("Failure #{}. Retrying.", stats.getExecutions());
    }
  }).get(new Callable<Connection>() {
    public Connection call() {
      return connect();
    }
  });

Asynchronous completion and failure listeners can be registered via FailsafeFuture:

Failsafe.with(retryPolicy)
  .with(executor)
  .get(this::connect)
  .onSuccess(connection -> log.info("Connected to {}", connection))
  .onFailure((result, failure) -> log.error("Connection attempts failed", failure));

And asynchronous retry and failed attempt listeners can be registered via AsyncListeners:

Failsafe.with(retryPolicy)
  .with(executor)
  .with(new AsyncListeners<Connection>()
    .onRetryAsync((result, failure) -> log.info("Retrying")))
  .get(this::connect);

CircuitBreaker-related event listeners can also be registered:

circuitBreaker.onOpen(() -> log.info("The circuit was opened"));

Asynchronous API Integration

Failsafe can be integrated with asynchronous code that reports completion via callbacks. The runAsync, getAsync and futureAsync methods provide an AsyncExecution reference that can be used to manually perform retries or completion inside asynchronous callbacks:

Failsafe.with(retryPolicy)
  .with(executor)
  .getAsync(execution -> service.connect().whenComplete((result, failure) -> {
    if (execution.complete(result, failure))
      log.info("Connected");
    else if (!execution.retry())
      log.error("Connection attempts failed", failure);
  }));

Failsafe can also perform asynchronous executions and retries on third-party schedulers via the Scheduler interface. See the Vert.x example for a more detailed implementation.

CompletableFuture Integration

Java 8 users can use Failsafe to retry CompletableFuture calls:

Failsafe.with(retryPolicy)
  .with(executor)
  .future(this::connectAsync)
  .thenApplyAsync(value -> value + "bar")
  .thenAccept(System.out::println);

Functional Interface Integration

Failsafe can be used to create retryable Java 8 functional interfaces:

Function<String, Connection> connect = address -> Failsafe.with(retryPolicy).get(() -> connect(address));

We can retry streams:

Failsafe.with(retryPolicy).run(() -> Stream.of("foo").map(value -> value + "bar"));

Individual Stream operations:

Stream.of("foo").map(value -> Failsafe.with(retryPolicy).get(() -> value + "bar"));

Or individual CompletableFuture stages:

CompletableFuture.supplyAsync(() -> Failsafe.with(retryPolicy).get(() -> "foo"))
  .thenApplyAsync(value -> Failsafe.with(retryPolicy).get(() -> value + "bar"));
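The pattern in these examples, wrapping a function so that every call is retried, can also be sketched with the JDK alone. `RetryingFunctions.retrying` below is a hypothetical helper written for this illustration. It retries on any RuntimeException with no delay and assumes `maxRetries` is zero or greater.

```java
import java.util.function.Function;

// Hypothetical helper, not part of Failsafe.
final class RetryingFunctions {
  // Wraps fn so that each call is attempted up to maxRetries + 1 times.
  // Assumes maxRetries >= 0; the last failure is rethrown when all attempts fail.
  static <T, R> Function<T, R> retrying(Function<T, R> fn, int maxRetries) {
    return input -> {
      RuntimeException last = null;
      for (int attempt = 0; attempt <= maxRetries; attempt++) {
        try {
          return fn.apply(input);
        } catch (RuntimeException e) {
          last = e; // remember the failure and try again
        }
      }
      throw last;
    };
  }
}
```

Because the result is an ordinary `Function`, it can be passed anywhere one is accepted, for example `Stream.of("foo").map(RetryingFunctions.retrying(fn, 3))`, where `fn` stands for any function of your own.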

Execution Tracking

In addition to automatically performing retries, Failsafe can be used to track executions for you, allowing you to manually retry as needed:

Execution execution = new Execution(retryPolicy);
while (!execution.isComplete()) {
  try {
    doSomething();
    execution.complete();
  } catch (ConnectException e) {
    execution.recordFailure(e);
  }
}
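To show what tracking executions amounts to in a loop like this, here is a deliberately small sketch of a tracker that only counts failures against a retry limit. `AttemptTracker` is a hypothetical class written for this illustration. Failsafe's Execution works from a full RetryPolicy, so treat this as an analogy, not as its implementation.

```java
// Hypothetical sketch of attempt tracking, far simpler than Failsafe's Execution.
final class AttemptTracker {
  private final int maxRetries;
  private int failures;
  private boolean complete;

  AttemptTracker(int maxRetries) {
    this.maxRetries = maxRetries;
  }

  // Marks the tracked work as finished successfully.
  void complete() {
    complete = true;
  }

  // Records a failed attempt; returns true if the caller may retry.
  boolean recordFailure(Throwable failure) {
    failures++;
    if (failures > maxRetries)
      complete = true; // retries exhausted, nothing more to attempt
    return !complete;
  }

  boolean isComplete() {
    return complete;
  }
}
```

With `maxRetries` set to 2, a loop of the shape shown above makes at most three attempts before `isComplete()` returns true.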

Execution tracking is also useful for integrating with APIs that have their own retry mechanism:

Execution execution = new Execution(retryPolicy);

// On failure
if (execution.canRetryOn(someFailure))
  service.scheduleRetry(execution.getWaitMillis(), TimeUnit.MILLISECONDS);

See the RxJava example for a more detailed implementation.

Example Integrations

Failsafe was designed to integrate nicely with existing libraries. Here are some example integrations:

Public API Integration

For library developers, Failsafe integrates nicely into public APIs, allowing your users to configure retry policies for different operations. One integration approach is to subclass the RetryPolicy class, then expose that as part of your API while the rest of Failsafe remains internal. Another approach is to use something like the Maven shade plugin to relocate Failsafe into your project's package structure as desired.

Docs

JavaDocs are available here.

Contribute

Failsafe is a volunteer effort. If you use it and you like it, you can help by spreading the word!