jhalterman / failsafe
- Tuesday, June 14, 2016 at 03:14:07
Java
Simple, sophisticated failure handling
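Failsafe is a lightweight, zero-dependency library for handling failures. It was designed to be as easy to use as possible, with a concise API for handling everyday use cases and the flexibility to handle everything else. Failsafe features retries, circuit breakers, execution context, event listeners, asynchronous API integration, CompletableFuture and functional interface integration, and execution tracking.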
Supports Java 6+, though the documentation uses lambdas for simplicity.
Add the latest Failsafe Maven dependency to your project.
One of the core Failsafe features is retries. To start, define a RetryPolicy that expresses when retries should be performed:
RetryPolicy retryPolicy = new RetryPolicy()
.retryOn(ConnectException.class)
.withDelay(1, TimeUnit.SECONDS)
.withMaxRetries(3);
Then use your RetryPolicy to execute a Runnable or Callable with retries:
// Run with retries
Failsafe.with(retryPolicy).run(() -> connect());
// Get with retries
Connection connection = Failsafe.with(retryPolicy).get(() -> connect());
Java 6 / 7 is also supported:
Connection connection = Failsafe.with(retryPolicy).get(new Callable<Connection>() {
public Connection call() {
return connect();
}
});
Failsafe's retry policies provide flexibility in allowing you to express when retries should be performed.
A policy can allow retries on particular failures:
RetryPolicy retryPolicy = new RetryPolicy()
.retryOn(ConnectException.class, SocketException.class)
.retryOn(failure -> failure instanceof ConnectException);
And for particular results or conditions:
retryPolicy
.retryWhen(null)
.retryIf(result -> result == null);
It can add a fixed delay between retries:
retryPolicy.withDelay(1, TimeUnit.SECONDS);
Or a delay that backs off exponentially:
retryPolicy.withBackoff(1, 30, TimeUnit.SECONDS);
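To make the backoff schedule concrete, here is a minimal standalone sketch of how delays that start at 1 second and are capped at 30 seconds could grow. It is not Failsafe's implementation: the class `BackoffSketch`, its methods, and the doubling factor are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Failsafe: computes an exponential backoff
// schedule, assuming each delay doubles until it reaches the cap.
class BackoffSketch {
  /** Returns the delay in seconds before the given retry (1-based). */
  static long delaySeconds(long initialDelay, long maxDelay, int retry) {
    long delay = initialDelay;
    for (int i = 1; i < retry && delay < maxDelay; i++) {
      delay = Math.min(delay * 2, maxDelay);
    }
    return Math.min(delay, maxDelay);
  }

  /** Returns the delays for the first {@code retries} retries. */
  static List<Long> schedule(long initialDelay, long maxDelay, int retries) {
    List<Long> delays = new ArrayList<Long>();
    for (int retry = 1; retry <= retries; retry++) {
      delays.add(delaySeconds(initialDelay, maxDelay, retry));
    }
    return delays;
  }

  public static void main(String[] args) {
    // Mirrors withBackoff(1, 30, TimeUnit.SECONDS) under the doubling assumption
    System.out.println(schedule(1, 30, 7)); // prints [1, 2, 4, 8, 16, 30, 30]
  }
}
```

Once the cap is reached, every further retry waits the maximum delay, which is why a max retry count or max duration is usually paired with backoff.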
It can add a max number of retries and a max retry duration:
retryPolicy
.withMaxRetries(100)
.withMaxDuration(5, TimeUnit.MINUTES);
It can also specify which results, failures or conditions to abort retries on:
retryPolicy
.abortWhen(true)
.abortOn(NoRouteToHostException.class)
.abortIf(result -> result == true);
And of course we can combine these things into a single policy.
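For comparison, the kind of logic a combined policy replaces can be sketched as a hand-written loop. This is only an illustration under stated assumptions, not how Failsafe works internally: `RetryLoopSketch`, `getWithRetries` and the `attempts` counter are hypothetical names, and the loop hard-codes one retryable failure, one abort failure, a fixed delay and a retry limit.

```java
import java.net.ConnectException;
import java.net.NoRouteToHostException;
import java.util.concurrent.Callable;

// Hypothetical illustration, not Failsafe's implementation: retries on
// ConnectException, aborts on NoRouteToHostException, and gives up once
// maxRetries retries have been used.
class RetryLoopSketch {
  static int attempts; // attempts made by the most recent call, for demonstration

  static <T> T getWithRetries(Callable<T> task, int maxRetries, long delayMillis) throws Exception {
    attempts = 0;
    while (true) {
      attempts++;
      try {
        return task.call();
      } catch (NoRouteToHostException e) {
        throw e; // abort condition: never retried
      } catch (ConnectException e) {
        if (attempts > maxRetries) throw e; // retries exhausted
        Thread.sleep(delayMillis); // fixed delay between attempts
      }
    }
  }

  public static void main(String[] args) throws Exception {
    int[] calls = {0};
    String result = getWithRetries(() -> {
      if (++calls[0] < 3) throw new ConnectException("refused");
      return "connected";
    }, 3, 10);
    System.out.println(result + " after " + attempts + " attempts"); // prints connected after 3 attempts
  }
}
```

Every new condition adds another branch to a loop like this, which is the bookkeeping a declarative policy is meant to absorb.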
With a retry policy defined, we can perform a retryable synchronous execution:
// Run with retries
Failsafe.with(retryPolicy).run(this::connect);
// Get with retries
Connection connection = Failsafe.with(retryPolicy).get(this::connect);
Asynchronous executions can be performed and retried on a ScheduledExecutorService or custom Scheduler implementation, and return a FailsafeFuture. When the execution succeeds or the retry policy is exceeded, the future is completed and any listeners registered against it are called:
Failsafe.with(retryPolicy)
.with(executor)
.get(this::connect)
.onSuccess(connection -> log.info("Connected to {}", connection))
.onFailure((result, failure) -> log.error("Connection attempts failed", failure));
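The idea of retrying on a ScheduledExecutorService and completing a future at the end can be sketched with the JDK alone. This is a simplified illustration, not Failsafe's implementation: `AsyncRetrySketch` and its methods are hypothetical, and a plain `CompletableFuture` stands in for FailsafeFuture.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration, not Failsafe's implementation: each attempt runs
// on the scheduler, and a failed attempt schedules the next one after a delay.
class AsyncRetrySketch {
  static <T> CompletableFuture<T> getAsync(ScheduledExecutorService scheduler,
      Callable<T> task, int maxRetries, long delayMillis) {
    CompletableFuture<T> future = new CompletableFuture<>();
    attempt(scheduler, task, maxRetries, delayMillis, future, 0);
    return future;
  }

  private static <T> void attempt(ScheduledExecutorService scheduler, Callable<T> task,
      int retriesLeft, long delayMillis, CompletableFuture<T> future, long initialDelay) {
    scheduler.schedule(() -> {
      try {
        future.complete(task.call()); // success completes the future
      } catch (Exception e) {
        if (retriesLeft > 0)
          attempt(scheduler, task, retriesLeft - 1, delayMillis, future, delayMillis);
        else
          future.completeExceptionally(e); // retries exceeded
      }
    }, initialDelay, TimeUnit.MILLISECONDS);
  }

  public static void main(String[] args) throws Exception {
    ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
    int[] calls = {0};
    CompletableFuture<String> future = getAsync(scheduler, () -> {
      if (++calls[0] < 3) throw new IllegalStateException("not ready");
      return "connected";
    }, 3, 10);
    // Listener runs once the execution succeeds or the retries are exceeded
    future.whenComplete((result, failure) ->
        System.out.println(result + " after " + calls[0] + " attempts"));
    future.get(5, TimeUnit.SECONDS);
    scheduler.shutdown();
  }
}
```

No thread is blocked while waiting between attempts, since the delay is handled by the scheduler rather than by sleeping.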
Circuit breakers let you build systems that fail fast by temporarily disabling execution in order to prevent system overload. Creating a CircuitBreaker is straightforward:
CircuitBreaker breaker = new CircuitBreaker()
.withFailureThreshold(3, 10)
.withSuccessThreshold(5)
.withDelay(1, TimeUnit.MINUTES);
We can then execute a Runnable or Callable with the breaker:
Failsafe.with(breaker).run(this::connect);
When a configured threshold of execution failures occurs on a circuit breaker, the circuit is opened and further execution requests fail with CircuitBreakerOpenException. After a delay, the circuit is half-opened and trial executions are attempted to determine whether the circuit should be closed or opened again. If the trial executions exceed a success threshold, the breaker is closed again and executions will proceed as normal.
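The closed, open and half-open transitions described above can be sketched as a small state machine. This is a simplified illustration, not Failsafe's CircuitBreaker: the class `BreakerSketch` is hypothetical, it counts only consecutive failures and successes, and it takes the current time as a parameter so that the transitions are easy to follow.

```java
// Hypothetical illustration, not Failsafe's implementation: a minimal
// closed -> open -> half-open -> closed state machine driven by
// consecutive failure and success counts.
class BreakerSketch {
  enum State { CLOSED, OPEN, HALF_OPEN }

  private final int failureThreshold;
  private final int successThreshold;
  private final long delayMillis;
  private State state = State.CLOSED;
  private int failures;
  private int successes;
  private long openedAt;

  BreakerSketch(int failureThreshold, int successThreshold, long delayMillis) {
    this.failureThreshold = failureThreshold;
    this.successThreshold = successThreshold;
    this.delayMillis = delayMillis;
  }

  /** Returns whether an execution may proceed at the given time. */
  synchronized boolean allowsExecution(long nowMillis) {
    if (state == State.OPEN && nowMillis - openedAt >= delayMillis) {
      state = State.HALF_OPEN; // delay elapsed: allow trial executions
      successes = 0;
    }
    return state != State.OPEN;
  }

  synchronized void recordSuccess() {
    failures = 0;
    if (state == State.HALF_OPEN && ++successes >= successThreshold)
      state = State.CLOSED; // enough trial executions succeeded
  }

  synchronized void recordFailure(long nowMillis) {
    if (state == State.HALF_OPEN || ++failures >= failureThreshold) {
      state = State.OPEN; // fail fast until the delay elapses
      openedAt = nowMillis;
      failures = 0;
    }
  }

  synchronized State state() { return state; }

  public static void main(String[] args) {
    BreakerSketch breaker = new BreakerSketch(3, 2, 1000);
    for (int i = 0; i < 3; i++) breaker.recordFailure(0);
    System.out.println(breaker.state());               // OPEN
    System.out.println(breaker.allowsExecution(500));  // false
    System.out.println(breaker.allowsExecution(1000)); // true
    breaker.recordSuccess();
    breaker.recordSuccess();
    System.out.println(breaker.state());               // CLOSED
  }
}
```

In this sketch a single failure while half-open reopens the circuit immediately and restarts the delay.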
Circuit breakers can be flexibly configured to express when the circuit should be opened or closed.
A circuit breaker can be configured to open when a number of consecutive executions have failed:
CircuitBreaker breaker = new CircuitBreaker()
.withFailureThreshold(5);
Or when, for example, the last 3 out of 5 executions have failed:
breaker.withFailureThreshold(3, 5);
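A ratio threshold such as "3 out of the last 5" implies tracking a sliding window of recent outcomes. The following is a minimal sketch of that bookkeeping, not Failsafe's implementation: `RatioThresholdSketch` is a hypothetical class, and it reports the threshold as met as soon as enough failures appear in the window, which may differ from how Failsafe treats a window that is not yet full.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration, not Failsafe's implementation: keeps the
// outcomes of the most recent executions and reports whether the number of
// failures among them has reached the threshold.
class RatioThresholdSketch {
  private final int failures;   // e.g. 3
  private final int executions; // e.g. 5
  private final Deque<Boolean> window = new ArrayDeque<Boolean>();

  RatioThresholdSketch(int failures, int executions) {
    this.failures = failures;
    this.executions = executions;
  }

  /** Records one outcome and returns true if the failure threshold is met. */
  boolean record(boolean failed) {
    window.addLast(failed);
    if (window.size() > executions) window.removeFirst(); // drop the oldest outcome
    int count = 0;
    for (boolean outcome : window) {
      if (outcome) count++;
    }
    return count >= failures;
  }

  public static void main(String[] args) {
    RatioThresholdSketch threshold = new RatioThresholdSketch(3, 5);
    boolean[] outcomes = {true, false, true, false, true}; // true means failed
    boolean met = false;
    for (boolean failed : outcomes) met = threshold.record(failed);
    System.out.println(met); // prints true: 3 of the last 5 failed
  }
}
```

Unlike a consecutive-failure count, a window like this still trips when failures are interleaved with occasional successes.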
Typically, a breaker is configured to delay before attempting to close again:
breaker.withDelay(1, TimeUnit.MINUTES);
The breaker can be configured to close again if a number of trial executions succeed, else it will re-open:
breaker.withSuccessThreshold(5);
The breaker can also be configured to close again if, for example, the last 3 out of 5 executions succeed, else it will re-open:
breaker.withSuccessThreshold(3, 5);
The breaker can be configured to only recognize certain results, exceptions or conditions as failures:
breaker
.failWhen(true)
.failOn(NoRouteToHostException.class)
.failIf((result, failure) -> result == 500 || failure instanceof NoRouteToHostException);
And the breaker can be configured to recognize executions that exceed a certain timeout as failures:
breaker.withTimeout(10, TimeUnit.SECONDS);
A CircuitBreaker can be used along with a RetryPolicy:
Failsafe.with(retryPolicy).with(breaker).get(this::connect);
Execution failures are first retried according to the RetryPolicy, then if the policy is exceeded the failure is recorded by the CircuitBreaker.
A circuit breaker can and should be shared across code that accesses inter-dependent system components that fail together. This ensures that if the circuit is opened, executions against one component that rely on another component will not be allowed until the circuit is closed again.
A CircuitBreaker can also be manually operated in a standalone way:
if (breaker.allowsExecution()) {
try {
doSomething();
breaker.recordSuccess();
} catch (Exception e) {
breaker.recordFailure(e);
}
}
Failsafe can provide an ExecutionContext containing execution related information such as the number of execution attempts as well as start and elapsed times:
Failsafe.with(retryPolicy).run(ctx -> {
log.debug("Connection attempt #{}", ctx.getExecutions());
connect();
});
Failsafe supports execution and retry event listeners via the Listeners class:
Failsafe.with(retryPolicy)
.with(new Listeners<Connection>()
.onRetry((c, f, stats) -> log.warn("Failure #{}. Retrying.", stats.getExecutions()))
.onFailure((cxn, failure) -> log.error("Connection attempts failed", failure))
.onSuccess(cxn -> log.info("Connected to {}", cxn)))
.get(this::connect);
Non-Java 8 users can extend the Listeners class and override individual event handlers:
Failsafe.with(retryPolicy)
.with(new Listeners<Connection>() {
public void onRetry(Connection cxn, Throwable failure, ExecutionStats stats) {
log.warn("Failure #{}. Retrying.", stats.getExecutions());
}
}).get(new Callable<Connection>() {
public Connection call() {
return connect();
}
});
Asynchronous completion and failure listeners can be registered via FailsafeFuture:
Failsafe.with(retryPolicy)
.with(executor)
.get(this::connect)
.onSuccess(connection -> log.info("Connected to {}", connection))
.onFailure((result, failure) -> log.error("Connection attempts failed", failure));
And asynchronous retry and failed attempt listeners can be registered via AsyncListeners:
Failsafe.with(retryPolicy)
.with(executor)
.with(new AsyncListeners<Connection>()
.onRetryAsync((result, failure) -> log.info("Retrying")))
.get(this::connect);
CircuitBreaker related event listeners can also be registered:
circuitBreaker.onOpen(() -> log.info("The circuit was opened"));
Failsafe can be integrated with asynchronous code that reports completion via callbacks. The runAsync, getAsync and futureAsync methods provide an AsyncExecution reference that can be used to manually perform retries or completion inside asynchronous callbacks:
Failsafe.with(retryPolicy)
.with(executor)
.getAsync(execution -> service.connect().whenComplete((result, failure) -> {
if (execution.complete(result, failure))
log.info("Connected");
else if (!execution.retry())
log.error("Connection attempts failed", failure);
}));
Failsafe can also perform asynchronous executions and retries on 3rd party schedulers via the Scheduler interface. See the Vert.x example for a more detailed implementation.
Java 8 users can use Failsafe to retry CompletableFuture calls:
Failsafe.with(retryPolicy)
.with(executor)
.future(this::connectAsync)
.thenApplyAsync(value -> value + "bar")
.thenAccept(System.out::println);
Failsafe can be used to create retryable Java 8 functional interfaces:
Function<String, Connection> connect = address -> Failsafe.with(retryPolicy).get(() -> connect(address));
We can retry streams:
Failsafe.with(retryPolicy).run(() -> Stream.of("foo").map(value -> value + "bar").forEach(System.out::println));
Individual Stream operations:
Stream.of("foo").map(value -> Failsafe.with(retryPolicy).get(() -> value + "bar")).forEach(System.out::println);
Or individual CompletableFuture stages:
CompletableFuture.supplyAsync(() -> Failsafe.with(retryPolicy).get(() -> "foo"))
.thenApplyAsync(value -> Failsafe.with(retryPolicy).get(() -> value + "bar"));
In addition to automatically performing retries, Failsafe can be used to track executions for you, allowing you to manually retry as needed:
Execution execution = new Execution(retryPolicy);
while (!execution.isComplete()) {
try {
doSomething();
execution.complete();
} catch (ConnectException e) {
execution.recordFailure(e);
}
}
Execution tracking is also useful for integrating with APIs that have their own retry mechanism:
Execution execution = new Execution(retryPolicy);
// On failure
if (execution.canRetryOn(someFailure))
service.scheduleRetry(execution.getWaitMillis(), TimeUnit.MILLISECONDS);
See the RxJava example for a more detailed implementation.
Failsafe was designed to integrate nicely with existing libraries. Here are some example integrations:
For library developers, Failsafe integrates nicely into public APIs, allowing your users to configure retry policies for different operations. One integration approach is to subclass the RetryPolicy class, then expose that as part of your API while the rest of Failsafe remains internal. Another approach is to use something like the Maven shade plugin to relocate Failsafe into your project's package structure as desired.
JavaDocs are available here.
Failsafe is a volunteer effort. If you use it and you like it, you can help by spreading the word!
Copyright 2015-2016 Jonathan Halterman - Released under the Apache 2.0 license.