A serverless solution to schedule and monitor DynamoDB on-demand backups can be built with AWS Lambda and Amazon EventBridge, which triggers the backup on an hourly or daily schedule.
Problem Statement:-
- We want to keep a daily backup of our DynamoDB tables, but retain only the backups from the last 3 days.
- We want to monitor the backup job for each selected table and raise an alert when a job fails.
Approach Towards Solution:-
Let’s say you want to take a backup of your DynamoDB tables every day. A simple way to achieve this is to use an Amazon EventBridge rule to trigger an AWS Lambda function daily.
In this scenario, your Lambda function contains the code required to call the DynamoDB CreateBackup API operation. Setting this up requires configuring an IAM role, creating an EventBridge rule, and creating a Lambda function. For monitoring, we can use another Lambda function that checks the status of each backup job and raises an alert on failure. Below is the high-level architecture of this approach, which we will discuss in detail.
The following are the steps to implement the structure explained above.
1. Create an Amazon EventBridge rule that will trigger the scheduled-backup Lambda. You can name it DynamoDBBackupScheduleRule or anything else of your choice. The following attributes will be used when creating this rule.
- RuleName: DynamoDBBackupScheduleRule
- EventBusName: default (you can create a custom event bus for specific use cases)
- State: ENABLED (used to enable and disable the rule)
- ScheduleExpression: rate(1 day) -> creates a daily backup of your database tables.
- Target: DynamoDBBackupScheduleLambda (ARN) -> we create this in the next step.
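The rule above can be sketched with boto3 as follows. This is a minimal sketch, assuming the names from the list (the Lambda ARN and the target Id are placeholders); the parameter-building helper is separated from the API calls so it can be inspected without AWS credentials.

```python
RULE_NAME = "DynamoDBBackupScheduleRule"
SCHEDULE = "rate(1 day)"  # daily trigger

def build_rule_params(lambda_arn):
    """Return the put_rule / put_targets parameters for the schedule rule."""
    rule = {
        "Name": RULE_NAME,
        "ScheduleExpression": SCHEDULE,
        "State": "ENABLED",
        "EventBusName": "default",
    }
    targets = {
        "Rule": RULE_NAME,
        "Targets": [{"Id": "backup-schedule-lambda", "Arn": lambda_arn}],
    }
    return rule, targets

def create_rule(lambda_arn):
    import boto3  # imported here so the helper above stays testable offline
    events = boto3.client("events")
    rule, targets = build_rule_params(lambda_arn)
    events.put_rule(**rule)
    events.put_targets(**targets)
```

Note that the Lambda function also needs a resource-based permission allowing EventBridge to invoke it, which the console adds automatically but the API does not.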
2. Create a Lambda function that the DynamoDBBackupScheduleRule created above will trigger. The function performs the following operations:-
- Create a backup job for each required table, because backups are created per table in DynamoDB.
- Send a message to an SQS (Simple Queue Service) queue, DynamoDBBackupMonitoringQueue, containing the backup-job ARN of each table.
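A minimal sketch of this schedule Lambda's handler is shown below. The table names, queue URL, and message group id are assumptions for illustration, not part of the original design.

```python
import json

TABLES = ["Orders", "Customers"]  # placeholder: the tables you want backed up
QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/DynamoDBBackupMonitoringQueue.fifo"

def build_message(backup_arns):
    """Serialize the backup-job ARNs for the monitoring queue."""
    return json.dumps({"backup_arns": backup_arns})

def handler(event, context):
    import boto3  # local import keeps build_message testable offline
    dynamodb = boto3.client("dynamodb")
    sqs = boto3.client("sqs")

    backup_arns = []
    for table in TABLES:
        # Backups are per table, so one CreateBackup call per table.
        resp = dynamodb.create_backup(
            TableName=table,
            BackupName=f"{table}-daily",
        )
        backup_arns.append(resp["BackupDetails"]["BackupArn"])

    sqs.send_message(
        QueueUrl=QUEUE_URL,
        MessageBody=build_message(backup_arns),
        MessageGroupId="dynamodb-backups",  # required for FIFO queues
    )
```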
Now comes the trickiest, and the most time- and money-saving, step. You may be wondering why we are using SQS in this scenario. Let me explain.
Use of SQS in the Solution:-
Suppose you have a large dataset for which the backup process can take hours or even a day. You can't keep your Lambda running until the backup completes just to report the result, because the long execution time would quickly drain your budget.
Instead, with the help of SQS, we trigger another Lambda (the monitoring Lambda) after a configurable delay of up to 15 minutes. It checks whether each backup operation has failed or succeeded, instead of waiting for those long-running processes to finish. With that clear, let's move to the third step.
3. We will create a queue, DynamoDBBackupMonitoringQueue, with a message delay of 300 seconds. Because of this delay, a message received by the queue becomes available for polling only after 300 seconds.
The following attributes will be used when creating the queue.
- QueueName: DynamoDBBackupMonitoringQueue.fifo (FIFO queue names must end with ".fifo")
- ContentBasedDeduplication: true
- DeduplicationScope: messageGroup (used to prevent duplicate copies of the same message)
- DelaySeconds: 300
- FifoQueue: true
- MessageRetentionPeriod: 3660 (a value between 60 seconds (1 minute) and 1,209,600 seconds (14 days)).
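The queue above can be created with boto3 roughly as follows; this is a sketch of the attributes listed, not a definitive setup (all SQS attribute values are passed as strings).

```python
# Attributes for the FIFO monitoring queue, mirroring the list above.
QUEUE_ATTRIBUTES = {
    "FifoQueue": "true",
    "ContentBasedDeduplication": "true",
    "DeduplicationScope": "messageGroup",
    "DelaySeconds": "300",             # messages become pollable after 5 minutes
    "MessageRetentionPeriod": "3660",  # kept for just over an hour
}

def create_monitoring_queue():
    import boto3  # local import keeps the attribute dict testable offline
    sqs = boto3.client("sqs")
    resp = sqs.create_queue(
        QueueName="DynamoDBBackupMonitoringQueue.fifo",
        Attributes=QUEUE_ATTRIBUTES,
    )
    return resp["QueueUrl"]
```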
4. Create a Lambda function that will poll the messages from the queue created above. You can name this function DynamoDBBackupMonitoringLambda. The jobs of this function are as follows.
- For each job ARN in the received SQS message, get the status of the backup job and act accordingly.
- Keep sending messages to the queue, and polling them from it, until every backup operation has either failed or succeeded.
- Once all tables have been backed up, delete any backup data older than the previous 3 days.
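The final cleanup job can be sketched with the DynamoDB ListBackups API, which accepts a TimeRangeUpperBound filter. The 3-day retention value comes from the problem statement; everything else here is illustrative.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=3)  # keep only the last 3 days of backups

def cutoff(now=None):
    """Backups created before this instant should be deleted."""
    now = now or datetime.now(timezone.utc)
    return now - RETENTION

def delete_old_backups():
    import boto3  # local import keeps cutoff() testable offline
    dynamodb = boto3.client("dynamodb")
    resp = dynamodb.list_backups(
        TimeRangeUpperBound=cutoff(),
        BackupType="USER",  # only touch the on-demand backups we created
    )
    for summary in resp.get("BackupSummaries", []):
        dynamodb.delete_backup(BackupArn=summary["BackupArn"])
```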
## Pseudo code for the operation of the DynamoDBBackupMonitoringLambda function
for each job ARN in the received ARNs (list):
    if the job status is AVAILABLE, remove it from the list
    else if the job status is DELETED, remove it from the list and report the job failure
if the received ARNs (list) is now empty, terminate the monitoring process
else send the message back to DynamoDBBackupMonitoringQueue with the updated list of backup ARNs
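The pseudocode above can be sketched as a Lambda handler like this. The queue URL is a placeholder, and the decision logic is pulled out into a pure helper (partition_arns) so it can be tested without AWS access.

```python
import json

QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/DynamoDBBackupMonitoringQueue.fifo"

def partition_arns(statuses):
    """Split {arn: status} into (still pending, failed) ARN lists.

    AVAILABLE jobs are dropped, DELETED jobs are reported as failures,
    and anything else (e.g. CREATING) stays pending."""
    pending, failed = [], []
    for arn, status in statuses.items():
        if status == "AVAILABLE":
            continue
        elif status == "DELETED":
            failed.append(arn)
        else:
            pending.append(arn)
    return pending, failed

def handler(event, context):
    import boto3  # local import keeps partition_arns testable offline
    dynamodb = boto3.client("dynamodb")
    sqs = boto3.client("sqs")

    for record in event["Records"]:
        arns = json.loads(record["body"])["backup_arns"]
        statuses = {
            arn: dynamodb.describe_backup(BackupArn=arn)
                 ["BackupDescription"]["BackupDetails"]["BackupStatus"]
            for arn in arns
        }
        pending, failed = partition_arns(statuses)
        for arn in failed:
            print(f"Backup job failed: {arn}")  # raise a real alert here
        if pending:
            # Re-queue the unfinished jobs; the queue's 300-second delay
            # gives them more time before the next check.
            sqs.send_message(
                QueueUrl=QUEUE_URL,
                MessageBody=json.dumps({"backup_arns": pending}),
                MessageGroupId="dynamodb-backups",
            )
        # If pending is empty, no message is re-queued and monitoring ends.
```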
That’s it! We have successfully implemented a serverless solution to schedule and monitor DynamoDB backups.
Conclusion:-
A serverless architecture provides a highly flexible and scalable solution for scheduling and monitoring DynamoDB on-demand backups. By utilizing AWS services such as Lambda, EventBridge, CloudWatch, and SQS, you can create a fully automated solution that eliminates the need for manual intervention and provides real-time visibility into the backup process.