When a business flow spans multiple microservices, like “order → inventory reservation → payment,” you’ve probably had this thought at least once: “I’d love to slap on @Transactional, but since the services are separated, there’s nothing I can do.”

This article tackles that pain point with the Saga pattern, a practical solution that we’ll implement using Spring Boot + Kafka in both the Choreography and Orchestration styles. We’ll focus on the parts that tend to trip people up in real-world work, including compensating transaction design and ensuring idempotency.

Why @Transactional and 2PC Don’t Work

Within a single process, @Transactional does the job in one shot, but the moment you split services apart, the story changes. A @Transactional boundary covers only the resources managed by one service’s transaction manager (typically a single DB connection), so it can’t roll back another service’s DB or a message already handed to a broker. We cover transactional behavior within a single service in detail in “Spring Boot Transaction Management Guide.”

You might think, “Then can’t we just use 2PC (XA)?” but that’s not very realistic in modern microservice architectures. Many middleware components, starting with Kafka, don’t support XA, and even where XA is available, every participant holds its locks until the coordinator decides, which severely compromises availability and throughput.

The approach taken instead is a design premised on Eventual Consistency. Each service’s local transactions are chained together, and if something fails along the way, you “compensate” to undo. This is the basic idea behind the Saga pattern.
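
To make the idea concrete before bringing in Spring or Kafka, here is a minimal in-memory sketch. SagaStep and SimpleSaga are names invented for this illustration, and each Runnable stands in for a local transaction in some service; treat it as a model of the control flow, not as production code.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

/** One Saga step: a local transaction plus the compensation that undoes it. */
record SagaStep(String name, Runnable action, Runnable compensation) {}

/** Runs steps in order; on failure, compensates the completed steps in reverse order. */
final class SimpleSaga {
    private final List<SagaStep> steps;

    SimpleSaga(List<SagaStep> steps) {
        this.steps = List.copyOf(steps);
    }

    /** Returns true if every step succeeded, false if the Saga was rolled back. */
    boolean run() {
        Deque<SagaStep> completed = new ArrayDeque<>();
        for (SagaStep step : steps) {
            try {
                step.action().run();
                completed.push(step);
            } catch (RuntimeException e) {
                // Undo only what actually committed, newest first.
                while (!completed.isEmpty()) {
                    completed.pop().compensation().run();
                }
                return false;
            }
        }
        return true;
    }
}
```

Note that the failed step itself is not compensated: its local transaction never committed, so there is nothing to undo.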

Choreography vs. Orchestration

There are two major schools of Saga.

Choreography style has no central manager; each service subscribes to and publishes events autonomously to move things forward. OrderService publishes OrderCreated → InventoryService picks it up and reserves inventory → publishes InventoryReserved → PaymentService picks that up, and so on in a chain.

Orchestration style, on the other hand, has a central component called the orchestrator that calls each step in order: “next is inventory reservation, next is payment.” State is consolidated in one place, which makes the Saga easy to track, but the orchestrator can easily become a single point of failure.

Rough selection criteria are as follows:

  • 3–4 services with simple flow → Choreography
  • Many steps, or complex branching/conditionals → Orchestration
  • Observability and operational ease as top priority → Orchestration

Sample Scenario and Dependencies

From here on, we’ll use a 3-service setup of “order → inventory reservation → payment” as our subject. First, the Gradle dependencies.

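dependencies {
    implementation 'org.springframework.boot:spring-boot-starter-web'
    implementation 'org.springframework.boot:spring-boot-starter-data-jpa'
    implementation 'org.springframework.kafka:spring-kafka'
    implementation 'io.github.resilience4j:resilience4j-spring-boot3:2.2.0'
    // The Resilience4j starter expects AOP and Actuator to be on the classpath
    implementation 'org.springframework.boot:spring-boot-starter-aop'
    implementation 'org.springframework.boot:spring-boot-starter-actuator'
    // Lombok provides @RequiredArgsConstructor used in the samples
    compileOnly 'org.projectlombok:lombok'
    annotationProcessor 'org.projectlombok:lombok'
    runtimeOnly 'org.postgresql:postgresql'
}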

Events shared between services should include at minimum an “event ID,” “correlation ID,” and “payload.” The correlation ID (correlationId) is essential for cross-cutting tracing of all events later, and is issued as a unique ID across the entire Saga, separate from the order ID.

public record OrderCreated(
        String eventId,        // Unique ID per event (for duplicate detection)
        String correlationId,  // Tracking ID that stays the same across the entire Saga
        String orderId,        // Order ID as a business key
        String productId,
        int quantity,
        long amount
) {}

public record InventoryReserved(String eventId, String correlationId, String orderId) {}
public record InventoryReservationFailed(String eventId, String correlationId, String orderId, String reason) {}
public record PaymentCompleted(String eventId, String correlationId, String orderId) {}
public record PaymentFailed(String eventId, String correlationId, String orderId, String reason) {}

We’ll leave the detailed Kafka configuration itself to “How to Implement Kafka Producer/Consumer in Spring Boot.” If you want to use RabbitMQ, the basic structure is the same, and you can combine it with “How to Implement RabbitMQ in Spring Boot” to build a similar Saga.

Implementing the Choreography Style

First, OrderService. It saves the order to the DB and publishes OrderCreated. We assume correlationId is received from the caller’s CreateOrderCommand, and if absent, we issue a new one here.

@Service
@RequiredArgsConstructor
public class OrderService {
    private final OrderRepository orderRepository;
    private final KafkaTemplate<String, Object> kafkaTemplate;

    @Transactional
    public void createOrder(CreateOrderCommand cmd) {
        var order = orderRepository.save(Order.pending(cmd));
        // correlationId is a tracking ID that runs through the entire Saga.
        // Prefer the one passed from the request; if none, issue a new one here
        // (treated as separate from orderId).
        var correlationId = Optional.ofNullable(cmd.correlationId())
                .orElseGet(() -> UUID.randomUUID().toString());
        var event = new OrderCreated(
                UUID.randomUUID().toString(), // eventId
                correlationId,
                order.getId(),
                cmd.productId(),
                cmd.quantity(),
                cmd.amount()
        );
        kafkaTemplate.send("order.created", order.getId(), event);
    }
}

Note: The above is minimal code for illustration, calling kafkaTemplate.send directly inside @Transactional. This carries the dual-write problem, where “DB commit succeeded but Kafka send failed” (or vice versa), so it’s inappropriate for production. In real operations, write the event to an outbox table in the same DB first, and have a separate process forward it to Kafka—this is called the Transactional Outbox pattern.
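
The Transactional Outbox idea from the note above can be modeled roughly as follows. This is an in-memory sketch under stated assumptions: OrderStore, OutboxRow, and OutboxRelay are hypothetical names, the synchronized method stands in for a single DB transaction, and the Consumer stands in for whatever actually sends to Kafka.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** In-memory stand-in for a database that holds both the orders and an outbox table. */
final class OrderStore {
    record OutboxRow(long id, String topic, String key, String payload) {}

    private final Map<String, String> orders = new LinkedHashMap<>();
    private final List<OutboxRow> pending = new ArrayList<>();
    private long nextId = 1;

    /** Models ONE local transaction: the order row and the outbox row commit together. */
    synchronized void saveOrderWithEvent(String orderId, String payload) {
        orders.put(orderId, "PENDING");
        pending.add(new OutboxRow(nextId++, "order.created", orderId, payload));
    }

    synchronized List<OutboxRow> fetchPending() {
        return List.copyOf(pending);
    }

    synchronized void markPublished(long id) {
        pending.removeIf(row -> row.id() == id);
    }
}

/** Separate relay: forwards outbox rows to the broker, oldest first. */
final class OutboxRelay {
    private final OrderStore store;
    private final Consumer<OrderStore.OutboxRow> broker; // assumed wrapper around the real send

    OutboxRelay(OrderStore store, Consumer<OrderStore.OutboxRow> broker) {
        this.store = store;
        this.broker = broker;
    }

    /** Returns the number of rows forwarded in this polling pass. */
    int pollOnce() {
        int sent = 0;
        for (OrderStore.OutboxRow row : store.fetchPending()) {
            try {
                broker.accept(row);
            } catch (RuntimeException e) {
                break; // keep the row and stop, so ordering is preserved for the next pass
            }
            // Remove only after the send succeeded. A crash in between causes a re-send,
            // which is why consumers still have to be idempotent.
            store.markPublished(row.id());
            sent++;
        }
        return sent;
    }
}
```

The relay gives at-least-once delivery, not exactly-once, so it pairs naturally with the idempotency measures discussed later in the article.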

Next is InventoryService. It subscribes to OrderCreated, reserves inventory, and publishes a result event.

@Component
@RequiredArgsConstructor
public class InventoryEventListener {
    private final InventoryService inventoryService;
    private final KafkaTemplate<String, Object> kafkaTemplate;

    @KafkaListener(topics = "order.created", groupId = "inventory")
    public void handle(OrderCreated event) {
        try {
            inventoryService.reserve(event.eventId(), event.productId(), event.quantity());
            kafkaTemplate.send("inventory.reserved", event.orderId(),
                    new InventoryReserved(UUID.randomUUID().toString(), event.correlationId(), event.orderId()));
        } catch (OutOfStockException e) {
            kafkaTemplate.send("inventory.failed", event.orderId(),
                    new InventoryReservationFailed(UUID.randomUUID().toString(), event.correlationId(), event.orderId(), e.getMessage()));
        }
    }
}

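PaymentService works the same way, subscribing to inventory.reserved and publishing the payment result to payment.completed / payment.failed. OrderService subscribes to the failure topics and updates the order to CANCELLED. When the failure happened after inventory was already reserved (that is, on payment.failed), it also publishes an inventory-return event so that InventoryService can release the stock.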
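
As a rough sketch of that failure handling, the handler below distinguishes the two failure events. It is deliberately framework-free: the map and list stand in for the orders table and Kafka sends, and InventoryReleaseRequested and OrderFailureHandler are hypothetical names that do not appear elsewhere in this article.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Failure events with the same fields as the records defined earlier.
record InventoryReservationFailed(String eventId, String correlationId, String orderId, String reason) {}
record PaymentFailed(String eventId, String correlationId, String orderId, String reason) {}

/** Hypothetical compensation request that InventoryService would consume. */
record InventoryReleaseRequested(String correlationId, String orderId) {}

final class OrderFailureHandler {
    final Map<String, String> orderStatus = new HashMap<>(); // stand-in for the orders table
    final List<Object> published = new ArrayList<>();        // stand-in for Kafka sends

    /** Inventory was never reserved, so cancelling the order is enough. */
    void on(InventoryReservationFailed event) {
        orderStatus.put(event.orderId(), "CANCELLED");
    }

    /** Payment failed after a successful reservation, so the stock must be returned. */
    void on(PaymentFailed event) {
        orderStatus.put(event.orderId(), "CANCELLED");
        // Carry the correlationId forward so the compensation is traceable too.
        published.add(new InventoryReleaseRequested(event.correlationId(), event.orderId()));
    }
}
```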

Implementing the Orchestration Style

In the orchestrator approach, Saga state is managed in the DB.

@Entity
public class OrderSaga {
    @Id private String orderId;
    @Enumerated(EnumType.STRING)
    private SagaState state;
    private String lastError;

    public enum SagaState {
        STARTED, INVENTORY_RESERVED, PAYMENT_COMPLETED, COMPENSATING, FAILED, COMPLETED
    }

    // Note: getter/setter and state transition methods like markInventoryReserved() are omitted
}

The orchestrator looks at state and calls the next step. The example below wraps the entire method in @Transactional for clarity, but since it includes external REST calls, it becomes a long-running transaction that holds DB locks for too long. In real operations, the standard practice is to split transaction boundaries finely: “commit only the state save → external call → save next state.”

@Service
@RequiredArgsConstructor
public class OrderSagaOrchestrator {
    private final OrderSagaRepository repository;
    private final InventoryClient inventoryClient;
    private final PaymentClient paymentClient;

    @Transactional
    public void start(CreateOrderCommand cmd) {
        var saga = repository.save(OrderSaga.start(cmd.orderId()));
        try {
            inventoryClient.reserve(cmd.orderId(), cmd.productId(), cmd.quantity());
            saga.markInventoryReserved();

            paymentClient.charge(cmd.orderId(), cmd.amount());
            saga.markPaymentCompleted();

            saga.markCompleted();
        } catch (Exception e) {
            compensate(saga, e);
        }
    }

    private void compensate(OrderSaga saga, Exception e) {
        // Save the reached state at the point of failure before updating state to COMPENSATING.
        // Without this, state would be overwritten after markCompensating and compensation
        // judgment would break.
        var failedAt = saga.getState();
        saga.markCompensating(e.getMessage());

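        // Note: the ordinal comparison depends on the enum declaration order
        // (STARTED < INVENTORY_RESERVED < PAYMENT_COMPLETED); reordering SagaState breaks it.
        if (failedAt.ordinal() >= OrderSaga.SagaState.PAYMENT_COMPLETED.ordinal()) {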
            paymentClient.refund(saga.getOrderId());
        }
        if (failedAt.ordinal() >= OrderSaga.SagaState.INVENTORY_RESERVED.ordinal()) {
            inventoryClient.release(saga.getOrderId());
        }
        saga.markFailed();
    }
}
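
The finer transaction boundaries mentioned before the orchestrator code (“commit only the state save → external call → save next state”) might look like the following sketch. SagaStateLog and SteppedOrchestrator are hypothetical, in-memory stand-ins. In a Spring application, each saveAndCommit would typically be a short transaction of its own, for example a TransactionTemplate block or a @Transactional method on a separate bean.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for a repository whose save commits immediately in its own short transaction. */
final class SagaStateLog {
    final List<String> committedStates = new ArrayList<>();

    void saveAndCommit(String state) {
        committedStates.add(state);
    }

    String current() {
        return committedStates.get(committedStates.size() - 1);
    }
}

final class SteppedOrchestrator {
    private final SagaStateLog states;
    private final Runnable reserveInventory; // remote call, runs outside any DB transaction
    private final Runnable chargePayment;    // remote call, runs outside any DB transaction

    SteppedOrchestrator(SagaStateLog states, Runnable reserveInventory, Runnable chargePayment) {
        this.states = states;
        this.reserveInventory = reserveInventory;
        this.chargePayment = chargePayment;
    }

    void start() {
        states.saveAndCommit("STARTED");            // tx 1: commit and release locks
        reserveInventory.run();                     // no DB transaction is open here
        states.saveAndCommit("INVENTORY_RESERVED"); // tx 2
        chargePayment.run();                        // no DB transaction is open here
        states.saveAndCommit("PAYMENT_COMPLETED");  // tx 3
        states.saveAndCommit("COMPLETED");          // tx 4
    }
}
```

Because every state is committed before the next remote call, a failure or crash leaves the last reached state in the DB, and a recovery job can decide from it which compensations to run.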

A simple state machine is sufficient in many cases, but if transitions get complex, introducing Spring StateMachine is also an option.
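
One way to keep such a simple state machine honest is an explicit table of allowed transitions. The sketch below reuses the state names from OrderSaga, but the table itself and the SagaTransitions class are assumptions made for illustration; adjust the allowed moves to your own flow.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

enum SagaState {
    STARTED, INVENTORY_RESERVED, PAYMENT_COMPLETED, COMPENSATING, FAILED, COMPLETED
}

final class SagaTransitions {
    private static final Map<SagaState, Set<SagaState>> ALLOWED = new EnumMap<>(SagaState.class);

    static {
        ALLOWED.put(SagaState.STARTED, EnumSet.of(SagaState.INVENTORY_RESERVED, SagaState.COMPENSATING));
        ALLOWED.put(SagaState.INVENTORY_RESERVED, EnumSet.of(SagaState.PAYMENT_COMPLETED, SagaState.COMPENSATING));
        ALLOWED.put(SagaState.PAYMENT_COMPLETED, EnumSet.of(SagaState.COMPLETED, SagaState.COMPENSATING));
        ALLOWED.put(SagaState.COMPENSATING, EnumSet.of(SagaState.FAILED));
        // Terminal states: no way out.
        ALLOWED.put(SagaState.FAILED, EnumSet.noneOf(SagaState.class));
        ALLOWED.put(SagaState.COMPLETED, EnumSet.noneOf(SagaState.class));
    }

    static boolean canMove(SagaState from, SagaState to) {
        return ALLOWED.get(from).contains(to);
    }

    /** Returns the new state, or throws if the transition is not in the table. */
    static SagaState move(SagaState from, SagaState to) {
        if (!canMove(from, to)) {
            throw new IllegalStateException("Illegal transition " + from + " -> " + to);
        }
        return to;
    }
}
```

Methods like markInventoryReserved() could call move(...) internally, so a duplicate or late message cannot push a finished Saga back into an earlier state.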

Designing Compensating Transactions

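Compensation performs the “inverse operation” of the original step: an inventory return undoes an inventory reservation, a refund undoes a payment, a cancellation undoes an order confirmation, and so on. It is a new business operation that semantically undoes the earlier one, not a DB rollback, because the original local transaction has already committed.

There are three points to keep in mind when designing.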

  1. Make compensation idempotent too. It needs to withstand retries and duplicate delivery, so build it so the result doesn’t change no matter how many times the same compensation request comes in.
  2. Place irreversible operations later. This is the concept called a pivot transaction, drawing a boundary at irreversible operations like sending emails or physical shipping. Once a Saga has passed the pivot, it can only go forward; if it fails before that, you can roll back via compensation.
  3. Prepare for compensation failures. Compensation itself can fail, so always have a path to push to a dead letter queue and allow human intervention.

For example, an idempotent inventory-release compensation can be written like this.

@Service
@RequiredArgsConstructor
public class InventoryCompensationService {
    private final InventoryRepository inventoryRepository;
    private final ProcessedEventRepository processedRepository;

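    // The whole method runs in a single transaction, so the inventory increment
    // and the processed record are committed (or rolled back) together.
    // Assuming "release:" + reservationId is the primary key of ProcessedEvent, a concurrent
    // duplicate fails on insert and rolls back, so the stock is never incremented twice.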
    @Transactional
    public void release(String reservationId, String productId, int quantity) {
        if (processedRepository.existsById("release:" + reservationId)) {
            return;
        }
        inventoryRepository.increaseStock(productId, quantity);
        processedRepository.save(new ProcessedEvent("release:" + reservationId));
    }
}
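
For the third design point (preparing for compensation failures), a bounded retry followed by a dead-letter hand-off could be sketched like this. CompensationRunner and DeadLetter are hypothetical names, and the list stands in for a dead-letter queue. A real implementation would also wait between attempts, which is left out here for brevity.

```java
import java.util.ArrayList;
import java.util.List;

final class CompensationRunner {
    record DeadLetter(String sagaId, String step, String error) {}

    final List<DeadLetter> deadLetters = new ArrayList<>(); // stand-in for a dead-letter queue
    private final int maxAttempts;

    CompensationRunner(int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        this.maxAttempts = maxAttempts;
    }

    /** Returns true if the compensation eventually succeeded, false if it was dead-lettered. */
    boolean run(String sagaId, String step, Runnable compensation) {
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                compensation.run(); // must be idempotent, since it may run more than once
                return true;
            } catch (RuntimeException e) {
                last = e;
            }
        }
        // Out of attempts: park the request where a human (or a repair job) can pick it up.
        deadLetters.add(new DeadLetter(sagaId, step, String.valueOf(last.getMessage())));
        return false;
    }
}
```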

Idempotency, Order Guarantees, and Retries

As long as you use messaging, “the same event arrives twice” and “events arrive out of order” are unavoidable. Consumers have to be built to tolerate both.

For idempotency, the simplest approach is to guarantee it with a processed table keyed by event ID. Upon receiving, first check if it’s already processed, and if not, commit the main processing and the record in the same transaction.
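
A framework-free sketch of that processed-table idea follows. IdempotentHandler and StockEvent are hypothetical names, and the concurrent set stands in for a table with a unique key on the event ID. The sketch records the ID first and removes it again on failure, which mimics “main processing and record in the same transaction” without a real DB.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;
import java.util.function.Function;

/** Example event carrying the ID used for duplicate detection. */
record StockEvent(String eventId, int quantity) {}

/** Wraps a handler so that each event ID takes effect at most once. */
final class IdempotentHandler<E> {
    private final Set<String> processed = ConcurrentHashMap.newKeySet(); // stand-in for the processed table
    private final Function<E, String> eventIdOf;
    private final Consumer<E> delegate;

    IdempotentHandler(Function<E, String> eventIdOf, Consumer<E> delegate) {
        this.eventIdOf = eventIdOf;
        this.delegate = delegate;
    }

    /** Returns true if the event was processed now, false if it was a duplicate. */
    boolean handle(E event) {
        String id = eventIdOf.apply(event);
        // add() returns false when the ID is already present, like an INSERT hitting a unique key.
        if (!processed.add(id)) {
            return false;
        }
        try {
            delegate.accept(event);
            return true;
        } catch (RuntimeException e) {
            processed.remove(id); // mirrors the rollback, so a redelivery can retry
            throw e;
        }
    }
}
```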

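Order guarantees are handled with Kafka’s partition key. Kafka preserves order only within a single partition, and events with the same key (here, orderId) go to the same partition of a topic as long as the partition count stays the same, so a consumer sees one order’s events on that topic in the order they were written. There is no ordering guarantee across different keys or across different topics.

Also, when changing event schemas, avoid breaking changes; managing schema versions through a schema registry with Avro or JSON Schema is a safer strategy.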
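
To see why the same partition key keeps landing in the same partition, here is a toy model of key-based partitioning. It is illustrative only: Kafka’s default partitioner hashes the serialized key with murmur2, whereas this sketch uses a plain array hash, and KeyPartitioning is a hypothetical class.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class KeyPartitioning {
    /** Toy partitioner: a deterministic hash of the key bytes, modulo the partition count. */
    static int partitionFor(String key, int numPartitions) {
        if (numPartitions < 1) {
            throw new IllegalArgumentException("numPartitions must be >= 1");
        }
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        // floorMod keeps the result non-negative even for negative hashes.
        return Math.floorMod(hash, numPartitions);
    }
}
```

The result depends on both the key and the partition count, which is why adding partitions to a live topic can send later events for the same orderId to a different partition than earlier ones.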

For transient failures, combine with Resilience4j’s Retry. See “How to Implement Resilience4j Circuit Breaker in Spring Boot” for details.

@Retry(name = "paymentClient")
@CircuitBreaker(name = "paymentClient")
public void charge(String orderId, long amount) {
    paymentApi.charge(new ChargeRequest(orderId, amount));
}

The minimal corresponding application.yml looks like this.

resilience4j:
  retry:
    instances:
      paymentClient:
        max-attempts: 3
        wait-duration: 500ms
        retry-exceptions:
          - java.io.IOException
          - org.springframework.web.client.ResourceAccessException
  circuitbreaker:
    instances:
      paymentClient:
        sliding-window-size: 20
        failure-rate-threshold: 50

That said, retries with side effects, like “re-charging something already paid,” are dangerous, so it’s important to judge retry eligibility per operation.
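
One common way to make a charge safe to retry is to send an idempotency key with it, so the receiving side can recognize a repeat. The sketch below models the receiving side only. FakePaymentGateway is a hypothetical stand-in, and whether your actual payment provider supports such a key is an assumption you need to check.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Stand-in for a payment provider that deduplicates requests by idempotency key. */
final class FakePaymentGateway {
    record Receipt(String chargeId, long amount) {}

    private final Map<String, Receipt> receiptsByKey = new ConcurrentHashMap<>();
    private final AtomicInteger executed = new AtomicInteger();

    /** Executes the charge once per key; a retry with the same key gets the stored receipt back. */
    Receipt charge(String idempotencyKey, long amount) {
        return receiptsByKey.computeIfAbsent(idempotencyKey,
                key -> new Receipt("ch-" + executed.incrementAndGet(), amount));
    }

    int executedCharges() {
        return executed.get();
    }
}
```

On the calling side, the key has to be stable across retries, for example derived from the orderId rather than generated anew on each attempt.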

How to Ensure Observability

Saga’s biggest weakness is that “processing is scattered across multiple services, making it hard to trace.” The countermeasures are pretty well established.

  • Attach correlationId to all events and enable cross-cutting tracing with Micrometer Tracing
  • Put correlationId into the MDC when logging, so that all logs for one Saga can be found with a single search in Loki, Elasticsearch, etc.
  • Persist Saga state in the DB so progress can be viewed from operational dashboards
  • Route events whose processing keeps failing to a dead-letter topic (for example with Spring Kafka’s DeadLetterPublishingRecoverer) and fire alerts

With Orchestration style, the state DB itself becomes the visualization target, but even with Choreography style, having a separate “Saga state table” makes operations much easier.

Operational Anti-Patterns

Here are some failure patterns commonly seen in the field.

  • Going to production without implementing compensation
  • No correlation ID, so logs can’t be traced when failures occur
  • Breaking changes to event schemas that break consumers
  • Saga state isn’t persisted anywhere, so progress is lost on restart
  • “Just retry everything” approach that ends up executing operations with side effects multiple times

All of these are extremely painful to fix later, so they’re things you want to nail down at the initial design stage.

Summary

The Saga pattern is a practical answer to the problem of not being able to use ACID transactions across microservices.

Remember the broad selection criterion: Choreography style for simple flows, Orchestration style when there are many steps and you want to emphasize observability. On top of that, as long as you factor in compensating transaction design, idempotency, and tracking via correlation ID from the start, you won’t go far wrong.

For related technologies, reading the articles on Kafka, RabbitMQ, Resilience4j, and Transaction Management together should round out the building blocks that Saga assumes.