Introduction
In this post I’ll walk you through how to leverage AWS Step Functions to implement the Saga pattern.
Put simply, the Saga pattern is a failure management pattern that gives us the means to establish semantic consistency in our distributed applications by providing a compensating transaction for every transaction that involves more than one collaborating service or function.
For our use case, imagine we have a workflow that goes as follows:
- The user books a hotel
- If that succeeds, we want to book a flight
- If booking a flight succeeds, we want to book a rental
- If booking a rental succeeds, we consider the flow a success.
As you may have guessed, this is the happy scenario, where everything went right (shockingly …).
However, if any of the steps fails, we want to undo the changes introduced by the failed step, and undo all the prior steps, if any.
What if the hotel booking step fails? How do we proceed? What if the hotel booking step passes but booking a flight fails? We need to be able to revert the changes.
Example:
- User books a hotel successfully
- Booking the flight failed
- Cancel the flight (assuming the failure occurred after we saved the flight record in the database)
- Cancel the hotel record
- Fail the machine
AWS Step Functions can help us here, since we can implement these functionalities as steps (or tasks). Step Functions can orchestrate all these transitions easily.
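Before bringing in Step Functions, the compensation flow above can be sketched in plain JavaScript. This is only an illustration, not code from the repository: `runSaga`, `makeStep`, and the `book`/`cancel` step shape are hypothetical names. It follows the same assumption as the example list, namely that a failing step may already have saved its record, so the failed step is cancelled too.

```javascript
// Hypothetical in-memory saga runner (illustration only, not from the repository).
// Each step has a `book` action and a compensating `cancel` action.
async function runSaga(steps, input) {
  const started = [];
  for (const step of steps) {
    // Record the step before running it: a failing step may already have
    // written its record, so it has to be cancelled as well.
    started.push(step);
    try {
      await step.book(input);
    } catch (error) {
      // Compensate in reverse order: the failed step first, then the prior ones.
      for (const done of [...started].reverse()) {
        await done.cancel(input);
      }
      return { status: "FAILED", failedStep: step.name };
    }
  }
  return { status: "SUCCEEDED" };
}

// Usage sketch: the flight booking fails, so the flight and the hotel are cancelled.
const log = [];
const makeStep = (name, shouldFail) => ({
  name,
  book: async () => {
    log.push(`book ${name}`);
    if (shouldFail) throw new Error(`${name} failed`);
  },
  cancel: async () => {
    log.push(`cancel ${name}`);
  },
});

const demo = runSaga(
  [makeStep("hotel", false), makeStep("flight", true), makeStep("rental", false)],
  { confirmation_id: "2201" }
);
demo.then((result) => console.log(result, log));
```

The rental step never runs in this scenario, which is why it is not cancelled either.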
Deploying The Resources
You can find the code repository here.
Please refer to this section to deploy the resources.
For the full list of the resources deployed, check out this table.
DynamoDB Tables
In our example, we’re deploying 3 DynamoDB tables:
- BookHotel
- BookFlight
- BookRental
The following is the code responsible for creating the BookHotel table:
module "book_hotel_ddb" {
  source          = "./modules/dynamodb"
  table_name      = var.book_hotel_ddb_name
  billing_mode    = var.billing_mode
  read_capacity   = var.read_capacity
  write_capacity  = var.write_capacity
  hash_key        = var.hash_key
  hash_key_type   = var.hash_key_type
  additional_tags = var.book_hotel_ddb_additional_tags
}
Lambda Functions
We will be relying on 6 Lambda functions to implement our example:
- BookHotel
- BookFlight
- BookRental
- CancelHotel
- CancelFlight
- CancelRental
The functions are pretty simple and straightforward.
BookHotel Function
exports.handler = async (event) => {
  ...
  const {
    confirmation_id,
    checkin_date,
    checkout_date
  } = event
  ...
  // Save the booking; any DynamoDB failure surfaces as a generic error
  try {
    await ddb.putItem(params).promise();
    console.log('Success')
  } catch (error) {
    console.log('Error: ', error)
    throw new Error("Unexpected Error")
  }
  // Simulate a booking failure after the record has been saved
  if (confirmation_id.startsWith("11")) {
    throw new BookHotelError("Expected Error")
  }
  // Pass the input through so the next state receives it
  return {
    confirmation_id,
    checkin_date,
    checkout_date
  };
};
For the full code, please check out the index.js file.
As you can see, the function expects an input of the following format:
- confirmation_id
- checkin_date
- checkout_date
The function will create an item in the BookHotel table and return the input as its output.
To trigger an error, you can pass a confirmation_id that starts with ’11’; this will throw a custom error that the step function will catch.
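The handler throws a BookHotelError whose definition is not shown in the excerpt. A minimal sketch of what such a custom error could look like follows; the class body and the `checkConfirmationId` helper are assumptions for illustration, not the repository's code. The `name` property is set explicitly because, as far as I know, that is the error name a Node.js Lambda function reports and that a Catch block's ErrorEquals list is compared against.

```javascript
// Hypothetical definition of the custom error (the real one lives in index.js).
class BookHotelError extends Error {
  constructor(message) {
    super(message);
    // Set the name explicitly so the error is reported as "BookHotelError".
    this.name = "BookHotelError";
  }
}

// Hypothetical helper that mirrors the prefix check in the handler.
function checkConfirmationId(confirmationId) {
  if (confirmationId.startsWith("11")) {
    throw new BookHotelError("Expected Error");
  }
  return confirmationId;
}
```

Calling `checkConfirmationId("1101")` throws, while any id that does not start with 11 is returned unchanged.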
CancelHotel Function
const AWS = require("aws-sdk")
const ddb = new AWS.DynamoDB({ apiVersion: '2012-08-10' });
const TABLE_NAME = process.env.TABLE_NAME
exports.handler = async (event) => {
  // The booking is keyed by the confirmation_id passed through the state machine
  var params = {
    TableName: TABLE_NAME,
    Key: {
      'id': { S: event.confirmation_id }
    }
  };
  try {
    await ddb.deleteItem(params).promise();
    console.log('Success')
    return {
      statusCode: 201,
      body: "Cancel Hotel Success",
    };
  } catch (error) {
    console.log('Error: ', error)
    throw new Error("ServerError")
  }
};
This function simply deletes the item that was created by the BookHotel function, using the confirmation_id as the key.
We could have checked whether the item was created. But to keep it simple, I’m assuming that the failure of the booking functions always happens after the records have been created in the tables.
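If you did want that check, one option is a conditional delete. The sketch below is an illustration under assumptions, not the repository's code: `buildCancelParams` and `cancelIfExists` are hypothetical helpers, and `ddb` is expected to behave like the AWS SDK v2 DynamoDB client used above, rejecting with an error whose `code` is `ConditionalCheckFailedException` when the condition does not hold.

```javascript
// Hypothetical helper: delete parameters that only succeed if the item exists.
function buildCancelParams(tableName, confirmationId) {
  return {
    TableName: tableName,
    Key: { id: { S: confirmationId } },
    // DynamoDB rejects the delete when no item with this key exists.
    ConditionExpression: "attribute_exists(#id)",
    ExpressionAttributeNames: { "#id": "id" },
  };
}

// Hypothetical helper: treat "nothing to cancel" as a successful no-op.
async function cancelIfExists(ddb, params) {
  try {
    await ddb.deleteItem(params).promise();
    return { cancelled: true };
  } catch (error) {
    if (error.code === "ConditionalCheckFailedException") {
      return { cancelled: false };
    }
    // Anything else is unexpected; rethrow so the caller can deal with it.
    throw error;
  }
}
```

With this shape, a cancel step that runs before the booking record was written would finish quietly instead of failing the whole compensation chain.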
💡 NOTE: The same logic goes for all the other Book and Cancel functions.
Reservation Step Function
# Step Function
module "step_function" {
  source = "terraform-aws-modules/step-functions/aws"
  name   = "Reservation"
  definition = templatefile("${path.module}/state-machine/reservation.asl.json", {
    BOOK_HOTEL_FUNCTION_ARN    = module.book_hotel_lambda.function_arn,
    CANCEL_HOTEL_FUNCTION_ARN  = module.cancel_hotel_lambda.function_arn,
    BOOK_FLIGHT_FUNCTION_ARN   = module.book_flight_lambda.function_arn,
    CANCEL_FLIGHT_FUNCTION_ARN = module.cancel_flight_lambda.function_arn,
    BOOK_RENTAL_LAMBDA_ARN     = module.book_rental_lambda.function_arn,
    CANCEL_RENTAL_LAMBDA_ARN   = module.cancel_rental_lambda.function_arn
  })
  service_integrations = {
    lambda = {
      lambda = [
        module.book_hotel_lambda.function_arn,
        module.book_flight_lambda.function_arn,
        module.book_rental_lambda.function_arn,
        module.cancel_hotel_lambda.function_arn,
        module.cancel_flight_lambda.function_arn,
        module.cancel_rental_lambda.function_arn,
      ]
    }
  }
  type = "STANDARD"
}
This is the code that creates the step function. I’m relying on a Terraform module to create it.
This piece of code will create a step function with the reservation.asl.json file as its definition. In service_integrations, we’re giving the step function permission to invoke the Lambda functions (since these functions are all part of the step function workflow).
Below is the full diagram for the step function:
The reservation.asl.json file is written in the Amazon States Language.
If you open the file, you’ll find "StartAt": "BookHotel" on the second line. This tells Step Functions to start at the BookHotel state.
Happy Scenario
"BookHotel": {
  "Type": "Task",
  "Resource": "${BOOK_HOTEL_FUNCTION_ARN}",
  "TimeoutSeconds": 10,
  "Retry": [
    {
      "ErrorEquals": [
        "States.Timeout",
        "Lambda.ServiceException",
        "Lambda.AWSLambdaException",
        "Lambda.SdkClientException"
      ],
      "IntervalSeconds": 2,
      "MaxAttempts": 3,
      "BackoffRate": 1.5
    }
  ],
  "Catch": [
    {
      "ErrorEquals": [
        "BookHotelError"
      ],
      "ResultPath": "$.error-info",
      "Next": "CancelHotel"
    }
  ],
  "Next": "BookFlight"
},
The BookHotel state is a Task, with a “Resource” that will be resolved to the BookHotel Lambda function via Terraform.
As you might have noticed, I’m using a Retry block, so the step function will retry the BookHotel function up to 3 times (after the first attempt) if the error equals any of the following:
- “States.Timeout”
- “Lambda.ServiceException”
- “Lambda.AWSLambdaException”
- “Lambda.SdkClientException”
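To make the retry timing concrete: the first retry waits IntervalSeconds, and each later wait is multiplied by BackoffRate. The small helper below works this out for the values in the Retry block. `retryDelays` is a hypothetical name, and the sketch ignores optional Retry settings such as a maximum delay or jitter.

```javascript
// Hypothetical helper: seconds waited before each retry attempt.
function retryDelays({ IntervalSeconds, MaxAttempts, BackoffRate }) {
  const delays = [];
  for (let attempt = 0; attempt < MaxAttempts; attempt++) {
    // Wait before retry n (0-based) = IntervalSeconds * BackoffRate^n
    delays.push(IntervalSeconds * Math.pow(BackoffRate, attempt));
  }
  return delays;
}

const delays = retryDelays({ IntervalSeconds: 2, MaxAttempts: 3, BackoffRate: 1.5 });
console.log(delays); // [ 2, 3, 4.5 ]
```

So with these settings a task that keeps failing with a retryable error is attempted four times in total, with waits of 2, 3 and 4.5 seconds in between.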
You can ignore the “Catch” block for now; we’ll get back to it in the unhappy scenarios section.
After the BookHotel task is done, the step function will transition to BookFlight, as specified in the “Next” field.
"BookFlight": {
  "Type": "Task",
  "Resource": "${BOOK_FLIGHT_FUNCTION_ARN}",
  "TimeoutSeconds": 10,
  "Retry": [
    {
      "ErrorEquals": [
        "States.Timeout",
        "Lambda.ServiceException",
        "Lambda.AWSLambdaException",
        "Lambda.SdkClientException"
      ],
      "IntervalSeconds": 2,
      "MaxAttempts": 3,
      "BackoffRate": 1.5
    }
  ],
  "Catch": [
    {
      "ErrorEquals": [
        "BookFlightError"
      ],
      "ResultPath": "$.error-info",
      "Next": "CancelFlight"
    }
  ],
  "Next": "BookRental"
},
The BookFlight state follows the same pattern: we retry invoking the BookFlight function if we hit any of the errors specified in the Retry block. If no error is thrown, the step function will transition to the BookRental state.
"BookRental": {
  "Type": "Task",
  "Resource": "${BOOK_RENTAL_LAMBDA_ARN}",
  "TimeoutSeconds": 10,
  "Retry": [
    {
      "ErrorEquals": [
        "States.Timeout",
        "Lambda.ServiceException",
        "Lambda.AWSLambdaException",
        "Lambda.SdkClientException"
      ],
      "IntervalSeconds": 2,
      "MaxAttempts": 3,
      "BackoffRate": 1.5
    }
  ],
  "Catch": [
    {
      "ErrorEquals": [
        "BookRentalError"
      ],
      "ResultPath": "$.error-info",
      "Next": "CancelRental"
    }
  ],
  "Next": "ReservationSucceeded"
},
The BookRental state follows the same pattern. Again, we retry invoking the BookRental function if we hit any of the errors specified in the Retry block. If no error is thrown, the step function will transition to the ReservationSucceeded state.
"ReservationSucceeded": {
  "Type": "Succeed"
},
ReservationSucceeded is a state of type Succeed.
In this case it terminates the state machine successfully.
Unhappy Scenarios
Oh no, BookHotel failed
As you recall, in the BookHotel state I included a Catch block. In the BookHotel function, if the confirmation_id starts with 11, a custom error of type BookHotelError will be thrown. The Catch block will catch it and transition to the state named in its “Next” field, which is CancelHotel in this case.
"CancelHotel": {
  "Type": "Task",
  "Resource": "${CANCEL_HOTEL_FUNCTION_ARN}",
  "ResultPath": "$.output.cancel-hotel",
  "TimeoutSeconds": 10,
  "Retry": [
    {
      "ErrorEquals": [
        "States.Timeout",
        "Lambda.ServiceException",
        "Lambda.AWSLambdaException",
        "Lambda.SdkClientException"
      ],
      "IntervalSeconds": 2,
      "MaxAttempts": 3,
      "BackoffRate": 1.5
    }
  ],
  "Next": "ReservationFailed"
},
CancelHotel is a “Task” as well, and has a Retry block to retry invoking the function in case of an unexpected error. The “Next” field instructs the step function to transition to the “ReservationFailed” state.
"ReservationFailed": {
  "Type": "Fail"
}
The “ReservationFailed” state is of type Fail; it will terminate the machine and mark it as “Failed”.
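One detail worth spelling out is the "ResultPath": "$.error-info" in the BookHotel Catch block. Instead of replacing the state input with the error, Step Functions adds the error details under that key, so CancelHotel still receives the confirmation_id it needs. A rough sketch of the resulting input follows; `applyCatch` is a hypothetical helper that only handles a top-level key, the dates and the Cause text are made-up sample values, and Error and Cause are the two fields of the error output.

```javascript
// Hypothetical helper: merge the caught error into the original state input.
function applyCatch(stateInput, key, errorName, cause) {
  return { ...stateInput, [key]: { Error: errorName, Cause: cause } };
}

// Sample input (made-up values) for an execution that fails in BookHotel.
const bookHotelInput = {
  confirmation_id: "1101",
  checkin_date: "2024-01-10",
  checkout_date: "2024-01-12",
};

const cancelHotelInput = applyCatch(
  bookHotelInput,
  "error-info",
  "BookHotelError",
  "sample cause text"
);
console.log(cancelHotelInput.confirmation_id); // 1101
```

The Cancel states use ResultPath in the same spirit ("$.output.cancel-hotel" and so on), which keeps the original input available to the next state in the chain.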
BookFlight is failing
We can instruct the BookFlight Lambda function to throw an error by passing a confirmation_id that starts with 22.
The BookFlight task has a Catch block that will catch the BookFlightError and instruct the step function to transition to the CancelFlight state.
"CancelFlight": {
  "Type": "Task",
  "Resource": "${CANCEL_FLIGHT_FUNCTION_ARN}",
  "ResultPath": "$.output.cancel-flight",
  "TimeoutSeconds": 10,
  "Retry": [
    {
      "ErrorEquals": [
        "States.Timeout",
        "Lambda.ServiceException",
        "Lambda.AWSLambdaException",
        "Lambda.SdkClientException"
      ],
      "IntervalSeconds": 2,
      "MaxAttempts": 3,
      "BackoffRate": 1.5
    }
  ],
  "Next": "CancelHotel"
},
Similar to CancelHotel, the CancelFlight state will trigger the CancelFlight Lambda function to undo the changes. Then it will instruct the step function to go to the next step, CancelHotel. As we saw earlier, CancelHotel will undo the changes introduced by BookHotel and then transition to ReservationFailed to terminate the machine.
BookRental is failing
The BookRental Lambda function will throw the BookRentalError error if the confirmation_id starts with 33.
This error will be caught by the Catch block in the BookRental task, which will instruct the step function to go to the CancelRental state.
"CancelRental": {
  "Type": "Task",
  "Resource": "${CANCEL_RENTAL_LAMBDA_ARN}",
  "ResultPath": "$.output.cancel-rental",
  "TimeoutSeconds": 10,
  "Retry": [
    {
      "ErrorEquals": [
        "States.Timeout",
        "Lambda.ServiceException",
        "Lambda.AWSLambdaException",
        "Lambda.SdkClientException"
      ],
      "IntervalSeconds": 2,
      "MaxAttempts": 3,
      "BackoffRate": 1.5
    }
  ],
  "Next": "CancelFlight"
},
Similar to CancelFlight, the CancelRental state will trigger the CancelRental Lambda function to undo the changes. Then it will instruct the step function to go to the next step, CancelFlight. After cancelling the flight, CancelFlight has a “Next” field that instructs the step function to transition to the CancelHotel state, which will undo the hotel booking and transition to the ReservationFailed state to terminate the machine.
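To see the happy path and the three unhappy paths side by side, the transitions can be traced with a small simulation. This is a simplified model of the state machine, not an Amazon States Language interpreter: the `states` table only keeps each state's Next target and Catch target, and `tracePath` is a hypothetical helper that takes the name of the booking state that should fail.

```javascript
// Simplified transition table: `next` is the Next field, `onError` the Catch target.
const states = {
  BookHotel: { next: "BookFlight", onError: "CancelHotel" },
  BookFlight: { next: "BookRental", onError: "CancelFlight" },
  BookRental: { next: "ReservationSucceeded", onError: "CancelRental" },
  CancelRental: { next: "CancelFlight" },
  CancelFlight: { next: "CancelHotel" },
  CancelHotel: { next: "ReservationFailed" },
  ReservationSucceeded: {},
  ReservationFailed: {},
};

// Hypothetical helper: list the states visited when `failingState` throws its custom error.
function tracePath(failingState) {
  const path = [];
  let current = "BookHotel";
  while (current) {
    path.push(current);
    const state = states[current];
    // Terminal states have no `next`, which ends the loop.
    current = current === failingState ? state.onError : state.next;
  }
  return path;
}

console.log(tracePath(null).join(" -> "));
console.log(tracePath("BookHotel").join(" -> "));
console.log(tracePath("BookFlight").join(" -> "));
console.log(tracePath("BookRental").join(" -> "));
```

Each failure enters the cancel chain at its own step and walks back towards CancelHotel, so the later a booking fails, the longer the compensation path.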
Conclusion
In this post, we saw how we can leverage AWS Step Functions to orchestrate and implement a failure management strategy to establish semantic consistency in our distributed reservation application.
I hope you found this article useful. Thanks for reading … 🤓