This post is part of a series on NServiceBus. In the previous articles, we looked at fundamental concepts for messaging and how to get started with NServiceBus. In this article, we will begin to take a look at sagas, or long-running processes that maintain state, and how they are implemented in NServiceBus. All the code for this series can be found in the NServiceBusTutorial repository on GitHub.
What is a Saga?
A saga is a long-lived process that can span multiple messages and transactions. Often, this can include synchronizing integrations from an orchestrating service to many other services while keeping track of failures and coordinating error handling. An instance of a saga is used to maintain the state of a business workflow and coordinate the interactions between multiple services using messages. Sagas are a foundational component in Event-Driven Architecture and a powerful tool for building complex workflows in distributed systems. NServiceBus provides a built-in implementation of sagas that makes it easy to create sagas in your applications.
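To make the idea of "a process that maintains state across messages" concrete, here is a minimal, framework-free sketch in Python. It is not NServiceBus code: the names `SagaState`, `SagaStore`, and `handle` are hypothetical and exist only for illustration, and the in-memory dictionary stands in for whatever persistence a real system would use. The point is that each message carries a correlation id, and that id is used to load the same piece of state every time.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SagaState:
    """State for one running workflow (hypothetical shape)."""
    correlation_id: str
    seen_messages: List[str] = field(default_factory=list)
    completed: bool = False


class SagaStore:
    """In-memory stand-in for saga persistence (assumption: real systems use a database)."""

    def __init__(self) -> None:
        self._instances: Dict[str, SagaState] = {}

    def load_or_create(self, correlation_id: str) -> SagaState:
        # The first message for an id creates the state; later ones reuse it.
        return self._instances.setdefault(correlation_id, SagaState(correlation_id))

    def complete(self, correlation_id: str) -> None:
        # A finished workflow no longer needs its state.
        self._instances.pop(correlation_id, None)

    def __len__(self) -> int:
        return len(self._instances)


def handle(store: SagaStore, correlation_id: str, message: str,
           is_final: bool = False) -> SagaState:
    """Route a message to the saga instance that shares its correlation id."""
    state = store.load_or_create(correlation_id)
    state.seen_messages.append(message)
    if is_final:
        state.completed = True
        store.complete(correlation_id)
    return state
```

Two messages with the same correlation id end up in the same `SagaState`, while messages with different ids never see each other's state. That isolation is what lets many instances of the same workflow run side by side.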
Simple Saga Example
Continuing the domain we have used in previous articles, let's consider a simple scenario that "the business" has requested we implement. Once a contributor is created in our system, we need to send them a welcome notification and have them confirm their phone number. If they respond within 24 hours, they can be marked as verified. If the contributor fails to confirm their phone number within 24 hours, the system should mark the contributor as not verified. We will use a saga to manage this process.
In the next blog post, we will look at the code for this saga. For now, let's consider the steps that the saga will need to take on both the successful and the failed verification paths.
First, our Web application receives a request to create a new contributor. The Web application sends a message to the Worker endpoint to create the contributor and responds to the requestor with an accepted payload. The Worker endpoint creates the contributor and publishes an event announcing that the contributor has been created.
This is where our saga begins. The saga handles the ContributorCreated event and does two things as it starts our workflow. First, it registers a timeout that it will handle in 24 hours if the saga has not been marked as complete beforehand. Then, within the same handler, it sends a command to the Worker endpoint to begin verification. The Worker endpoint processing this command is responsible for sending an SMS notification to the contributor's phone number.
This is where our process diverges into two separate outcomes. If the contributor confirms their phone number within 24 hours, the Web application will mark the contributor as verified and publish an event announcing that the contributor has been verified. The saga will handle this final event and mark itself as complete. But what if our contributor does not confirm their phone number within 24 hours?
If the contributor doesn't confirm their phone number within 24 hours, the saga will process the timeout registered when it began. This timeout handler will mark the saga as completed and send a command to the Worker endpoint to mark the contributor as not verified. The Worker endpoint will process this command and update the contributor's status in the database. At the end of the process, the contributor's verification status has been updated to reflect the outcome of the verification and the saga has completed.
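Before we get to the real implementation in the next post, the two paths described above can be modeled in a few lines of plain Python. This is only a sketch under stated assumptions, not NServiceBus code. Apart from ContributorCreated, every name here is hypothetical, including the class, its methods, and the command names `VerifyContributor` and `MarkContributorAsNotVerified`. Time is passed in explicitly so the 24-hour window can be simulated.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

VERIFICATION_WINDOW = timedelta(hours=24)


@dataclass
class VerificationSagaState:
    contributor_id: int
    timeout_at: datetime
    completed: bool = False


class ContributorVerificationSaga:
    """Framework-free model of the workflow; all names are hypothetical."""

    def __init__(self) -> None:
        self.instances: Dict[int, VerificationSagaState] = {}
        # Commands sent to the Worker endpoint, as (command name, contributor id).
        self.sent_commands: List[Tuple[str, int]] = []

    def on_contributor_created(self, contributor_id: int, now: datetime) -> None:
        # Start of the saga: register the 24-hour timeout, then ask the
        # Worker to begin verification (the Worker sends the SMS).
        self.instances[contributor_id] = VerificationSagaState(
            contributor_id, timeout_at=now + VERIFICATION_WINDOW
        )
        self.sent_commands.append(("VerifyContributor", contributor_id))

    def on_contributor_verified(self, contributor_id: int) -> None:
        # Success path: the contributor confirmed in time, so just complete.
        state = self.instances.get(contributor_id)
        if state is not None:
            state.completed = True

    def on_timeout(self, contributor_id: int, now: datetime) -> None:
        # Failure path: only act if the saga is still open and the window has passed.
        state = self.instances.get(contributor_id)
        if state is None or state.completed or now < state.timeout_at:
            return
        state.completed = True
        self.sent_commands.append(("MarkContributorAsNotVerified", contributor_id))
```

Both paths end with the saga marked as complete. Because the timeout handler checks for completion first, a timeout that arrives for an already verified contributor does nothing, which mirrors the "if it is not marked as complete beforehand" condition in the description above.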
A Note on Alternative Approaches
If you've been focusing on thinking architecturally or practicing your architecture skills, you may be considering other options for implementing this process. You might think that a batch job could be scheduled to run every 24 hours to check for unverified contributors. This is a valid approach, but it has some drawbacks. First, it is not real-time. A job that runs once every 24 hours only acts on a contributor at its next run, which can be long after that contributor's own 24-hour window has closed, so if the business requires real-time verification, this approach will not work. Second, it is not scalable. As the number of contributors grows, the batch job will take longer to run, and the system will become less responsive. Finally, it is not fault-tolerant. If the batch job fails, the system will not be able to mark contributors as not verified until the job runs successfully again. Sagas are a better approach here because they are real-time, scalable, and fault-tolerant. If moving away from batch jobs sounds like something you'd like to learn more about, I encourage you to check out this article entitled Death to the Batch Job.
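The "not real-time" drawback is easy to quantify with a little date arithmetic. The sketch below assumes a batch job on a fixed 24-hour schedule that marks a contributor as not verified on its first run at or after that contributor's deadline. The function names and the midnight schedule in the usage note are assumptions made for illustration.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=24)          # time the contributor has to confirm
BATCH_INTERVAL = timedelta(hours=24)  # how often the batch job runs


def first_batch_run_at_or_after(moment: datetime, first_run: datetime) -> datetime:
    """Return the first scheduled batch run that is not earlier than `moment`."""
    if moment <= first_run:
        return first_run
    elapsed = moment - first_run
    runs_needed = -(-elapsed // BATCH_INTERVAL)  # ceiling division on timedeltas
    return first_run + runs_needed * BATCH_INTERVAL


def delay_past_deadline(created_at: datetime, first_run: datetime) -> timedelta:
    """How long after the deadline the batch job gets around to this contributor."""
    deadline = created_at + WINDOW
    return first_batch_run_at_or_after(deadline, first_run) - deadline
```

With a job that runs at midnight, `delay_past_deadline(datetime(2024, 1, 1, 0, 1), datetime(2024, 1, 1))` comes out to 23 hours and 59 minutes: a contributor created one minute after a run waits almost a full extra day before being marked as not verified. A timeout registered per contributor is due at that contributor's own deadline instead.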
Up Next
In the following blog post, we will dive into how to implement this saga using NServiceBus. Stay tuned!