spring-cloud-circuitbreaker

Spring Cloud Circuit Breaker - Quick Reference

Deep Knowledge: Use mcp__documentation__fetch_docs with technology: spring-cloud-circuitbreaker for comprehensive documentation.

Dependencies

xml
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-circuitbreaker-resilience4j</artifactId>
</dependency>
<!-- For reactive -->
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-circuitbreaker-reactor-resilience4j</artifactId>
</dependency>

Circuit Breaker States

    ┌─────────┐    failure threshold     ┌───────────┐
    │ CLOSED  │ ───────────────────────▶ │   OPEN    │
    │ (normal)│                          │(rejecting)│
    └─────────┘                          └───────────┘
         ▲                                  │     ▲
         │            wait duration expires │     │ failure
         │                                  ▼     │
         │                               ┌───────────┐
         └──────────── success ───────── │ HALF_OPEN │
                                         │ (testing) │
                                         └───────────┘
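
The transitions in the diagram can be written down as a small lookup function. The following is a minimal sketch in plain Java, not Resilience4j's implementation; the class name `BreakerStateSketch` and the `Event` names are hypothetical. It also includes the HALF_OPEN to OPEN transition that Resilience4j performs when the trial calls fail.

```java
/** Illustrative state machine for the three breaker states; all names are hypothetical. */
final class BreakerStateSketch {

    enum State { CLOSED, OPEN, HALF_OPEN }

    enum Event {
        FAILURE_THRESHOLD_EXCEEDED, // failure rate reached the configured threshold
        WAIT_DURATION_EXPIRED,      // the wait duration in the open state has elapsed
        TRIAL_SUCCEEDED,            // permitted half-open calls stayed below the threshold
        TRIAL_FAILED                // permitted half-open calls reached the threshold again
    }

    /** Returns the next state, or the current state when the event does not apply to it. */
    static State next(State current, Event event) {
        switch (current) {
            case CLOSED:
                return event == Event.FAILURE_THRESHOLD_EXCEEDED ? State.OPEN : current;
            case OPEN:
                return event == Event.WAIT_DURATION_EXPIRED ? State.HALF_OPEN : current;
            case HALF_OPEN:
                if (event == Event.TRIAL_SUCCEEDED) {
                    return State.CLOSED;
                }
                if (event == Event.TRIAL_FAILED) {
                    return State.OPEN;
                }
                return current;
            default:
                throw new IllegalStateException("unknown state " + current);
        }
    }
}
```

Events that do not apply to the current state are ignored in this sketch, so a late `TRIAL_SUCCEEDED` cannot close a breaker that is already OPEN.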

Basic Configuration

application.yml

yaml
resilience4j:
  circuitbreaker:
    configs:
      default:
        sliding-window-size: 10
        sliding-window-type: COUNT_BASED
        failure-rate-threshold: 50
        slow-call-rate-threshold: 100
        slow-call-duration-threshold: 2s
        permitted-number-of-calls-in-half-open-state: 3
        wait-duration-in-open-state: 10s
        automatic-transition-from-open-to-half-open-enabled: true
        record-exceptions:
          - java.io.IOException
          - java.net.SocketTimeoutException
        ignore-exceptions:
          - com.example.BusinessException

    instances:
      user-service:
        base-config: default
        failure-rate-threshold: 30
        wait-duration-in-open-state: 5s

      payment-service:
        base-config: default
        failure-rate-threshold: 20
        slow-call-duration-threshold: 1s

  retry:
    configs:
      default:
        max-attempts: 3
        wait-duration: 500ms
        retry-exceptions:
          - java.io.IOException
        ignore-exceptions:
          - com.example.BusinessException

    instances:
      user-service:
        base-config: default
        max-attempts: 5

  timelimiter:
    configs:
      default:
        timeout-duration: 3s
        cancel-running-future: true

    instances:
      user-service:
        timeout-duration: 5s

  bulkhead:
    configs:
      default:
        max-concurrent-calls: 25
        max-wait-duration: 0

    instances:
      user-service:
        max-concurrent-calls: 10

  ratelimiter:
    configs:
      default:
        limit-for-period: 100
        limit-refresh-period: 1s
        timeout-duration: 0

    instances:
      api-calls:
        limit-for-period: 50
        limit-refresh-period: 1s
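
To make `sliding-window-size: 10` and `failure-rate-threshold: 50` concrete: with a COUNT_BASED window, the breaker looks at the last 10 recorded calls and opens once at least 5 of them failed. The sketch below models that bookkeeping in plain Java. It is an illustration under simplifying assumptions (it waits for a full window and ignores slow calls and `minimum-number-of-calls`), and `CountWindowSketch` is a hypothetical name, not a Resilience4j class.

```java
/** Illustrative count-based failure-rate window; not Resilience4j's implementation. */
final class CountWindowSketch {
    private final boolean[] failed; // ring buffer holding the last N outcomes
    private int recorded;           // number of filled slots, at most N
    private int index;              // slot that the next outcome overwrites
    private int failures;           // failures currently inside the window

    CountWindowSketch(int size) {
        if (size <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        this.failed = new boolean[size];
    }

    /** Records one call outcome and evicts the oldest outcome once the window is full. */
    void record(boolean failure) {
        if (recorded == failed.length) {
            if (failed[index]) {
                failures--; // the evicted outcome was a failure
            }
        } else {
            recorded++;
        }
        failed[index] = failure;
        if (failure) {
            failures++;
        }
        index = (index + 1) % failed.length;
    }

    /** Failure rate in percent, or -1 while the window is not yet full. */
    double failureRate() {
        if (recorded < failed.length) {
            return -1.0;
        }
        return 100.0 * failures / failed.length;
    }

    /** True when the rate is known and is equal to or greater than the threshold. */
    boolean shouldOpen(double thresholdPercent) {
        double rate = failureRate();
        return rate >= 0 && rate >= thresholdPercent;
    }
}
```

Under the `user-service` override (`failure-rate-threshold: 30`), the same window would report `shouldOpen(30)` as true after 3 failures in the last 10 calls.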

Programmatic Usage

CircuitBreakerFactory

java
@Slf4j
@Service
@RequiredArgsConstructor
public class UserService {

    private final CircuitBreakerFactory circuitBreakerFactory;
    private final UserClient userClient;

    public User getUser(Long id) {
        CircuitBreaker circuitBreaker = circuitBreakerFactory.create("user-service");

        return circuitBreaker.run(
            () -> userClient.getUserById(id),
            throwable -> getDefaultUser(id, throwable)
        );
    }

    private User getDefaultUser(Long id, Throwable throwable) {
        log.warn("Fallback for user {}: {}", id, throwable.getMessage());
        return User.builder()
            .id(id)
            .name("Unknown")
            .status("FALLBACK")
            .build();
    }
}
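
The `run(toRun, fallback)` call above follows a simple contract: the supplier is tried first, and the fallback function receives the throwable when the supplier fails. The sketch below shows only that contract in plain Java. It is not the Spring Cloud implementation, which additionally records outcomes and rejects calls while the breaker is open; `RunWithFallbackSketch` is a hypothetical name.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/** Illustrative supplier-plus-fallback contract; it tracks no breaker state at all. */
final class RunWithFallbackSketch {

    /** Runs the supplier and maps any exception it throws to a fallback value. */
    static <T> T run(Supplier<T> toRun, Function<Throwable, T> fallback) {
        try {
            return toRun.get();
        } catch (Exception e) {
            // The fallback sees the original failure, so it can log it or branch on its type.
            return fallback.apply(e);
        }
    }
}
```

Because the fallback receives the throwable, it can tell a rejected call from a downstream error and return a different default for each.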

Reactive

java
@Service
@RequiredArgsConstructor
public class ReactiveUserService {

    private final ReactiveCircuitBreakerFactory circuitBreakerFactory;
    private final WebClient webClient;

    public Mono<User> getUser(Long id) {
        ReactiveCircuitBreaker circuitBreaker = circuitBreakerFactory.create("user-service");

        return circuitBreaker.run(
            webClient.get()
                .uri("/users/{id}", id)
                .retrieve()
                .bodyToMono(User.class),
            throwable -> Mono.just(User.fallback(id))
        );
    }
}

Annotation-Based

@CircuitBreaker

java
@Service
public class PaymentService {

    @CircuitBreaker(name = "payment-service", fallbackMethod = "paymentFallback")
    public PaymentResult processPayment(PaymentRequest request) {
        return paymentClient.process(request);
    }

    private PaymentResult paymentFallback(PaymentRequest request, Throwable t) {
        log.error("Payment failed for order {}: {}", request.getOrderId(), t.getMessage());
        return PaymentResult.builder()
            .status("PENDING")
            .message("Payment service unavailable, will retry later")
            .build();
    }
}

@Retry

java
@Service
public class NotificationService {

    @Retry(name = "notification-service", fallbackMethod = "notifyFallback")
    public void sendNotification(Notification notification) {
        notificationClient.send(notification);
    }

    private void notifyFallback(Notification notification, Throwable t) {
        log.warn("Failed to send notification, queueing for retry: {}", t.getMessage());
        retryQueue.add(notification);
    }
}
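
What a retry with backoff does at runtime can be sketched without the library: attempt the call, and on failure wait a delay that grows with each retry before trying again. The plain-Java sketch below assumes exponential growth with a cap; `BackoffRetrySketch` and its parameters are hypothetical names, not Resilience4j API. As with `max-attempts`, the attempt count here includes the initial call.

```java
import java.util.function.Supplier;

/** Illustrative retry loop with capped exponential backoff; not Resilience4j's implementation. */
final class BackoffRetrySketch {

    /** Delay before retry number `retry` (1-based): initial * multiplier^(retry - 1), capped at max. */
    static long delayMillis(long initialMillis, double multiplier, long maxMillis, int retry) {
        double raw = initialMillis * Math.pow(multiplier, retry - 1);
        return (long) Math.min(raw, (double) maxMillis);
    }

    /** Calls the task up to maxAttempts times and rethrows the last failure when all attempts fail. */
    static <T> T call(Supplier<T> task, int maxAttempts,
                      long initialMillis, double multiplier, long maxMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.get();
            } catch (RuntimeException e) {
                last = e;
                if (attempt < maxAttempts) {
                    sleep(delayMillis(initialMillis, multiplier, maxMillis, attempt));
                }
            }
        }
        throw last;
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt(); // keep the interrupt visible to callers
            throw new IllegalStateException("interrupted while backing off", ie);
        }
    }
}
```

With an initial delay of 500 ms, a multiplier of 2 and a cap of 5 s, the waits before successive retries are 500 ms, 1 s, 2 s, 4 s and then 5 s.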

@RateLimiter

java
@Service
public class ApiService {

    @RateLimiter(name = "api-calls", fallbackMethod = "rateLimitFallback")
    public ApiResponse callExternalApi(ApiRequest request) {
        return externalClient.call(request);
    }

    private ApiResponse rateLimitFallback(ApiRequest request, Throwable t) {
        throw new TooManyRequestsException("Rate limit exceeded");
    }
}
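
The `limit-for-period` and `limit-refresh-period` settings describe a permit budget that is refilled once per period. A rough plain-Java model of that idea follows. It is a simplified fixed-window sketch, not Resilience4j's algorithm, and `PeriodRateLimiterSketch` is a hypothetical name. Time is passed in explicitly so the behavior is easy to follow and to test.

```java
/** Illustrative fixed-window rate limiter; a simplification, not Resilience4j's implementation. */
final class PeriodRateLimiterSketch {
    private final int limitForPeriod;
    private final long refreshPeriodNanos;
    private long periodStart; // start of the current period, in nanoseconds
    private int used;         // permits handed out in the current period

    PeriodRateLimiterSketch(int limitForPeriod, long refreshPeriodNanos, long startNanos) {
        if (limitForPeriod < 1 || refreshPeriodNanos < 1) {
            throw new IllegalArgumentException("limit and period must be positive");
        }
        this.limitForPeriod = limitForPeriod;
        this.refreshPeriodNanos = refreshPeriodNanos;
        this.periodStart = startNanos;
    }

    /** Non-blocking acquire, comparable to a timeout-duration of 0: true if a permit was free. */
    synchronized boolean tryAcquire(long nowNanos) {
        long elapsed = nowNanos - periodStart;
        if (elapsed >= refreshPeriodNanos) {
            // Skip ahead to the period that contains nowNanos and refill the budget.
            periodStart += (elapsed / refreshPeriodNanos) * refreshPeriodNanos;
            used = 0;
        }
        if (used < limitForPeriod) {
            used++;
            return true;
        }
        return false;
    }
}
```

In production code the caller would pass `System.nanoTime()`. A `false` result corresponds to the situation in which the annotated method above ends up in `rateLimitFallback`.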

@Bulkhead

java
@Service
public class ReportService {

    @Bulkhead(name = "report-service", type = Bulkhead.Type.THREADPOOL)
    public CompletableFuture<Report> generateReport(ReportRequest request) {
        return CompletableFuture.supplyAsync(() -> reportGenerator.generate(request));
    }
}
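
The example above uses the thread-pool variant. The semaphore variant, which is what `max-concurrent-calls` and `max-wait-duration` configure, is simpler to picture: a fixed number of permits, and a call that finds none free is rejected. A minimal plain-Java sketch of that idea, using the hypothetical name `SemaphoreBulkheadSketch` and rejecting immediately as with `max-wait-duration: 0`:

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

/** Illustrative semaphore bulkhead; not Resilience4j's implementation. */
final class SemaphoreBulkheadSketch {
    private final Semaphore permits;

    SemaphoreBulkheadSketch(int maxConcurrentCalls) {
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    /** Runs the call if a permit is free, otherwise rejects it without waiting. */
    <T> T execute(Supplier<T> call) {
        if (!permits.tryAcquire()) {
            throw new IllegalStateException("bulkhead full");
        }
        try {
            return call.get();
        } finally {
            permits.release(); // always hand the permit back, even when the call throws
        }
    }

    int availablePermits() {
        return permits.availablePermits();
    }
}
```

Releasing in `finally` is the important design point: a call that throws must not leak its permit, or the bulkhead would slowly shrink to zero capacity.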

@TimeLimiter

java
@Service
public class SlowService {

    @TimeLimiter(name = "slow-service", fallbackMethod = "timeoutFallback")
    public CompletableFuture<Result> slowOperation() {
        return CompletableFuture.supplyAsync(() -> {
            // Potentially slow operation
            return performSlowOperation();
        });
    }

    private CompletableFuture<Result> timeoutFallback(Throwable t) {
        return CompletableFuture.completedFuture(Result.timeout());
    }
}
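
The effect of a time limit with a fallback can be approximated with the JDK alone. The sketch below uses `CompletableFuture.completeOnTimeout` (Java 9 or later) to substitute a fallback value when the future has not finished in time. It is an illustration of the behavior, not how the `@TimeLimiter` aspect is implemented, and `TimeoutFallbackSketch` is a hypothetical name.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

/** Illustrative timeout-with-fallback helper built on the JDK; not Resilience4j's implementation. */
final class TimeoutFallbackSketch {

    /**
     * Completes the given future with the fallback value if it is still pending after the timeout.
     * Note that this does not interrupt the work that was meant to complete the future.
     */
    static <T> CompletableFuture<T> withTimeout(CompletableFuture<T> future,
                                                long timeoutMillis, T fallback) {
        return future.completeOnTimeout(fallback, timeoutMillis, TimeUnit.MILLISECONDS);
    }
}
```

A future that finishes in time keeps its own result. Only a future that is still pending when the timeout fires receives the fallback value.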

Combined Annotations

java
@Service
public class ResilientService {

    @CircuitBreaker(name = "backend", fallbackMethod = "fallback")
    @Retry(name = "backend")
    @RateLimiter(name = "backend")
    @Bulkhead(name = "backend")
    @TimeLimiter(name = "backend")
    public CompletableFuture<Response> resilientCall(Request request) {
        return CompletableFuture.supplyAsync(() -> backendClient.call(request));
    }

    private CompletableFuture<Response> fallback(Request request, Throwable t) {
        log.error("All resilience measures failed: {}", t.getMessage());
        return CompletableFuture.completedFuture(Response.error());
    }
}
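
When several annotations are stacked, the order in which they wrap the method matters more than the order in which they are written. The Resilience4j Spring Boot documentation describes the default aspect order as Retry ( CircuitBreaker ( RateLimiter ( TimeLimiter ( Bulkhead ( Function ) ) ) ) ); verify this against the version in use. The plain-Java sketch below models that nesting with wrappers that record when each layer is entered and left. `DecorationOrderSketch` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Illustrative nesting of resilience layers; it only records the order in which layers run. */
final class DecorationOrderSketch {

    /** Wraps `inner` so that entering and leaving the named layer is appended to `trace`. */
    static <T> Supplier<T> layer(String name, List<String> trace, Supplier<T> inner) {
        return () -> {
            trace.add("enter " + name);
            try {
                return inner.get();
            } finally {
                trace.add("exit " + name);
            }
        };
    }

    /** Builds the assumed default nesting around a dummy call and returns the recorded trace. */
    static List<String> traceDefaultOrder() {
        List<String> trace = new ArrayList<>();
        Supplier<String> call = () -> {
            trace.add("call");
            return "response";
        };
        // Wrap from the innermost layer outwards, so that Retry ends up outermost.
        String[] innerToOuter = {"Bulkhead", "TimeLimiter", "RateLimiter", "CircuitBreaker", "Retry"};
        for (String name : innerToOuter) {
            call = layer(name, trace, call);
        }
        call.get();
        return trace;
    }
}
```

One consequence of this nesting, assuming the default order holds: a `fallbackMethod` on `@CircuitBreaker` turns a failure into a normal return value before the outer Retry layer sees it, so the retry would not fire. Placing the fallback on the outermost annotation avoids that.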

Customization

Custom CircuitBreaker Config

java
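// Named CircuitBreakerConfiguration so that it does not shadow Resilience4j's
// CircuitBreakerConfig, whose custom() builder is used below.
@Configuration
public class CircuitBreakerConfiguration {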

    @Bean
    public Customizer<Resilience4JCircuitBreakerFactory> defaultCustomizer() {
        return factory -> factory.configureDefault(id ->
            new Resilience4JConfigBuilder(id)
                .circuitBreakerConfig(CircuitBreakerConfig.custom()
                    .slidingWindowSize(10)
                    .failureRateThreshold(50)
                    .waitDurationInOpenState(Duration.ofSeconds(10))
                    .permittedNumberOfCallsInHalfOpenState(3)
                    .build())
                .timeLimiterConfig(TimeLimiterConfig.custom()
                    .timeoutDuration(Duration.ofSeconds(3))
                    .build())
                .build());
    }

    @Bean
    public Customizer<Resilience4JCircuitBreakerFactory> specificCustomizer() {
        return factory -> factory.configure(builder ->
            builder.circuitBreakerConfig(CircuitBreakerConfig.custom()
                .failureRateThreshold(25)
                .build()),
            "payment-service", "critical-service");
    }
}

Event Listeners

java
@Component
public class CircuitBreakerEventListener {

    @Autowired
    private CircuitBreakerRegistry circuitBreakerRegistry;

    @PostConstruct
    public void init() {
        circuitBreakerRegistry.getAllCircuitBreakers().forEach(cb -> {
            cb.getEventPublisher()
                .onStateTransition(event ->
                    log.info("Circuit breaker {} state changed: {} -> {}",
                        event.getCircuitBreakerName(),
                        event.getStateTransition().getFromState(),
                        event.getStateTransition().getToState()))
                .onFailureRateExceeded(event ->
                    log.warn("Circuit breaker {} failure rate exceeded: {}%",
                        event.getCircuitBreakerName(),
                        event.getFailureRate()))
                .onSlowCallRateExceeded(event ->
                    log.warn("Circuit breaker {} slow call rate exceeded: {}%",
                        event.getCircuitBreakerName(),
                        event.getSlowCallRate()));
        });
    }
}

Metrics and Monitoring

Actuator Endpoints

yaml
management:
  endpoints:
    web:
      exposure:
        include: health,circuitbreakers,circuitbreakerevents,retries,ratelimiters,bulkheads

  health:
    circuitbreakers:
      enabled: true
    ratelimiters:
      enabled: true

bash
# Circuit breaker status
GET /actuator/circuitbreakers

# Specific circuit breaker
GET /actuator/circuitbreakers/{name}

# Circuit breaker events
GET /actuator/circuitbreakerevents

# Health with details
GET /actuator/health

Metrics (Prometheus)

yaml
management:
  metrics:
    tags:
      application: ${spring.application.name}
    export:
      prometheus:
        enabled: true

Key metrics:
  • resilience4j_circuitbreaker_state
  • resilience4j_circuitbreaker_calls_total
  • resilience4j_circuitbreaker_failure_rate
  • resilience4j_retry_calls_total
  • resilience4j_ratelimiter_available_permissions
  • resilience4j_bulkhead_available_concurrent_calls

Spring Cloud Gateway Integration

yaml
spring:
  cloud:
    gateway:
      routes:
        - id: user-service
          uri: lb://USER-SERVICE
          predicates:
            - Path=/api/users/**
          filters:
            - name: CircuitBreaker
              args:
                name: userServiceCB
                fallbackUri: forward:/fallback/users
            - name: Retry
              args:
                retries: 3
                statuses: BAD_GATEWAY,SERVICE_UNAVAILABLE

Best Practices

| Do | Don't |
| --- | --- |
| Set appropriate thresholds | Use defaults in production |
| Implement meaningful fallbacks | Return errors in fallbacks |
| Monitor circuit breaker state | Ignore state transitions |
| Use bulkhead for isolation | Let one service exhaust resources |
| Configure retry with backoff | Retry indefinitely |

Production Checklist

  • Circuit breaker configured per service
  • Fallbacks return meaningful responses
  • Metrics exposed to monitoring
  • Alerts on state transitions
  • Proper timeout values
  • Retry with exponential backoff
  • Rate limiting for external APIs
  • Bulkhead for resource isolation
  • Health indicators enabled
  • Events logged for debugging

When NOT to Use This Skill

  • Internal errors - Fix the bug, don't circuit break
  • Hystrix - Deprecated, use Resilience4j
  • Database failures - Use connection pool, retries
  • Simple retries - Spring Retry may be sufficient

Anti-Patterns

| Anti-Pattern | Problem | Solution |
| --- | --- | --- |
| Circuit breaker for everything | Overhead, complexity | Only for external/flaky calls |
| No fallback defined | Empty responses | Provide meaningful fallback |
| Wrong thresholds | Circuit opens too early/late | Tune based on SLA |
| No monitoring | Can't debug issues | Enable actuator metrics |
| Ignoring slow calls | Timeouts not configured | Add timeout configuration |

Quick Troubleshooting

| Problem | Diagnostic | Fix |
| --- | --- | --- |
| Circuit always open | Check failure rate | Tune threshold, fix underlying issue |
| Fallback not called | Check exception types | Configure correct exception handling |
| Too many retries | Check retry config | Reduce max attempts |
| Rate limiter too strict | Check permissions/second | Increase limit |
| Bulkhead rejected | Check concurrent calls | Increase max concurrent |

Reference Documentation
