Synchronize Data from NFS Server to Amazon S3 using AWS DataSync
Introduction
AWS DataSync is a service that transfers files between storage systems, including NFS (Network File System) shares and Amazon S3 (Simple Storage Service). This runbook outlines the steps to transfer files from an NFS server to an S3 bucket using AWS DataSync.
It is a high-level overview; additional configuration and tuning may be required depending on your use case and requirements.
Prerequisites
- AWS Account
- NFS Server
- S3 Bucket
- EC2 Instance (for DataSync Agent)
Step 1: Set up the DataSync Agent
Launch EC2 Instance
- Launch an EC2 instance in your AWS account. This instance will host the DataSync Agent.
- Ensure that the instance has internet access through a VPC with an internet gateway.
Install the DataSync Agent
- Connect to the EC2 instance and download the DataSync Agent installation script:
wget
- Run the installation script:
sudo bash install_datasync_agent.sh
- Follow the prompts to complete the installation.
Activate the DataSync Agent
- After installation, go to the AWS DataSync Console.
- Navigate to “Agents” and click on “Create agent”.
- Enter the IP address of the EC2 instance under “Agent Address” so that the console can retrieve the activation key. The agent must be reachable from your browser on port 80 for this step.
- Follow the remaining prompts to complete the activation.
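The console handles the activation-key exchange for you. If you would rather script it, the following is a minimal sketch, assuming the agent answers HTTP on port 80 and uses a public service endpoint. The query-string format reflects the DataSync documentation as we understand it and should be checked against the current docs; `activation_url` and `register_agent` are illustrative names, and `datasync_client` is assumed to be a `boto3.client("datasync")` object.

```python
from urllib.parse import urlencode


def activation_url(agent_ip: str, region: str) -> str:
    """Build the URL that asks a running agent for its activation key."""
    query = urlencode({"gatewayType": "SYNC", "activationRegion": region})
    # "no_redirect" is a bare flag, so it is appended without a value.
    return f"http://{agent_ip}/?{query}&no_redirect"


def register_agent(datasync_client, activation_key: str, name: str) -> str:
    """Register the agent with DataSync and return its ARN."""
    response = datasync_client.create_agent(
        ActivationKey=activation_key,
        AgentName=name,
    )
    return response["AgentArn"]


# Typical use (requires network access and AWS credentials):
#   import boto3, urllib.request
#   key = urllib.request.urlopen(activation_url(ip, region)).read().decode().strip()
#   agent_arn = register_agent(boto3.client("datasync"), key, "nfs-agent")
```

The agent ARN returned here is what the NFS location in Step 2 refers to.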
Step 2: Configure Source Location (NFS)
Create Source Location
- In the AWS DataSync Console, navigate to “Locations”.
- Click on “Create location” and select “Network file system (NFS)” as the location type.
- Select the DataSync Agent created in Step 1.
- Enter the domain name or IP address of the NFS server.
- Enter the mount path: either a path exported by the NFS server or a subdirectory of an exported path.
- Complete the location creation process.
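The same source location can be created through the API. This is a minimal sketch, assuming `datasync_client` is a `boto3.client("datasync")` object; the helper names and the example values are hypothetical. The request fields mirror the console form: server, mount path, and agent.

```python
def nfs_location_params(server: str, mount_path: str, agent_arn: str) -> dict:
    """Keyword arguments for the DataSync create_location_nfs call."""
    if not mount_path.startswith("/"):
        raise ValueError("mount path must be absolute, e.g. /exports/data")
    return {
        "ServerHostname": server,
        "Subdirectory": mount_path,
        "OnPremConfig": {"AgentArns": [agent_arn]},
        # AUTOMATIC lets DataSync negotiate the NFS version with the server.
        "MountOptions": {"Version": "AUTOMATIC"},
    }


def create_nfs_location(datasync_client, params: dict) -> str:
    """Create the NFS location and return its ARN."""
    return datasync_client.create_location_nfs(**params)["LocationArn"]
```

Building the parameters separately from the API call makes it easy to review or log the request before anything is created.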
Step 3: Configure Destination Location (S3)
Create Destination Location
- In the AWS DataSync Console, navigate to “Locations”.
- Click on “Create location” and select “Amazon S3” as the location type.
- Select the S3 bucket where you want to transfer the files.
- Enter the folder prefix in the bucket under which the transferred files will be stored.
- Choose the IAM role used to access the selected S3 bucket, or click “Autogenerate” to have DataSync create an IAM role with the permissions required to access the bucket.
- Configure the necessary settings and complete the location creation process.
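For a scripted setup, the destination location and the role's trust relationship might look like the sketch below. It assumes a `boto3.client("datasync")` object is passed in; the helper names are hypothetical, and the leading slash on the prefix follows the form used in AWS CLI examples, so treat it as an assumption to verify. If you create the role yourself instead of using “Autogenerate”, it must trust the `datasync.amazonaws.com` service principal and also needs a permissions policy for the bucket, which is not shown here.

```python
def s3_location_params(bucket: str, role_arn: str, prefix: str = "") -> dict:
    """Keyword arguments for the DataSync create_location_s3 call."""
    params = {
        "S3BucketArn": f"arn:aws:s3:::{bucket}",
        "S3Config": {"BucketAccessRoleArn": role_arn},
    }
    cleaned = prefix.strip("/")
    if cleaned:
        # Only send a folder prefix when one was actually given.
        params["Subdirectory"] = f"/{cleaned}"
    return params


def datasync_trust_policy() -> dict:
    """Trust policy that lets the DataSync service assume the bucket role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "datasync.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def create_s3_location(datasync_client, params: dict) -> str:
    """Create the S3 location and return its ARN."""
    return datasync_client.create_location_s3(**params)["LocationArn"]
```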
Step 4: Create and Start a Task
Create a Task
- In the AWS DataSync Console, navigate to “Tasks”.
- Click on “Create task”.
- Select the source location (NFS) and destination location (S3) created in the previous steps.
- Configure the task settings, including options, filters, and schedule as per your requirements.
- Review the configuration and create the task.
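The task settings mentioned above (options, filters, schedule) map onto fields of the `create_task` API call. The following is a minimal sketch with hypothetical helper names and example option values; pick the options that suit your transfer. `datasync_client` is assumed to be a `boto3.client("datasync")` object.

```python
def task_params(source_arn, destination_arn, name,
                exclude_patterns=(), schedule=None) -> dict:
    """Keyword arguments for the DataSync create_task call."""
    params = {
        "SourceLocationArn": source_arn,
        "DestinationLocationArn": destination_arn,
        "Name": name,
        "Options": {
            # Verify only the files this run transferred.
            "VerifyMode": "ONLY_FILES_TRANSFERRED",
            # Copy only data that differs between source and destination.
            "TransferMode": "CHANGED",
        },
    }
    if exclude_patterns:
        # DataSync expects a single filter rule with patterns joined by "|".
        params["Excludes"] = [
            {"FilterType": "SIMPLE_PATTERN", "Value": "|".join(exclude_patterns)}
        ]
    if schedule:
        # For example "cron(0 2 * * ? *)" for a nightly run.
        params["Schedule"] = {"ScheduleExpression": schedule}
    return params


def create_task(datasync_client, params: dict) -> str:
    """Create the task and return its ARN."""
    return datasync_client.create_task(**params)["TaskArn"]
```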
Start the Task
- Once the task is created, select the task and click on “Start”.
- Monitor the progress in the AWS DataSync Console.
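Clicking “Start” corresponds to a single API call. This is a small sketch, assuming `datasync_client` is a `boto3.client("datasync")` object; `start_task` is an illustrative name.

```python
def start_task(datasync_client, task_arn: str) -> str:
    """Start one execution of an existing task and return the execution ARN."""
    response = datasync_client.start_task_execution(TaskArn=task_arn)
    return response["TaskExecutionArn"]
```

Each run of a task gets its own execution ARN, which is what you use to look up that run's status.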
Step 5: Monitor and Verify
Monitor Task Execution
- Go to the “Tasks” section in the AWS DataSync Console.
- Select your task and monitor the “Status” and “Progress” of the task execution.
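The status and progress shown in the console can also be polled from a script. The sketch below assumes `datasync_client` is a `boto3.client("datasync")` object and treats `SUCCESS` and `ERROR` as the final states; the function name and the 30-second interval are arbitrary choices.

```python
import time

TERMINAL_STATUSES = {"SUCCESS", "ERROR"}


def wait_for_execution(datasync_client, execution_arn: str,
                       poll_seconds: float = 30.0, sleep=time.sleep) -> dict:
    """Poll a task execution until it finishes and return its final description."""
    while True:
        info = datasync_client.describe_task_execution(
            TaskExecutionArn=execution_arn
        )
        status = info["Status"]
        files = info.get("FilesTransferred", 0)
        size = info.get("BytesTransferred", 0)
        print(f"{status}: {files} files, {size} bytes transferred")
        if status in TERMINAL_STATUSES:
            return info
        sleep(poll_seconds)
```

The `sleep` parameter exists only so the wait can be replaced in tests; in normal use leave it at its default.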
Verify Data Transfer
- Once the task is completed, verify the transferred files in the destination S3 bucket.
- Ensure that all files from the NFS source location are accurately transferred to the S3 bucket.
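One way to spot-check the result is to compare file paths on the NFS share with object keys in the bucket. This is a rough sketch under two assumptions: the NFS export is mounted locally at `nfs_root`, and each file lands at the destination prefix plus its path relative to the export. The function names are hypothetical, `s3_client` is assumed to be a `boto3.client("s3")` object, and the check covers names only, not file contents.

```python
import os


def list_keys(s3_client, bucket: str, prefix: str = ""):
    """Yield every object key under a prefix, following pagination."""
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            yield obj["Key"]


def missing_in_s3(nfs_root: str, s3_keys, prefix: str = "") -> list:
    """Return the expected keys for local files that have no object in S3."""
    prefix = prefix.strip("/")
    uploaded = set(s3_keys)
    missing = []
    for directory, _subdirs, files in os.walk(nfs_root):
        for filename in files:
            relative = os.path.relpath(os.path.join(directory, filename), nfs_root)
            relative = relative.replace(os.sep, "/")
            key = f"{prefix}/{relative}" if prefix else relative
            if key not in uploaded:
                missing.append(key)
    return sorted(missing)
```

An empty list from `missing_in_s3(root, list_keys(s3, bucket, prefix), prefix)` means every file name on the share has a matching key.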
Step 6: Clean Up (Optional)
Delete Task
If the task is no longer needed, you can delete it from the “Tasks” section in the AWS DataSync Console.
Delete Locations
Optionally, delete the source and destination locations from the “Locations” section in the AWS DataSync Console.
Terminate EC2 Instance
If the DataSync Agent is not needed for future tasks, consider terminating the EC2 instance to avoid incurring additional costs.
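The cleanup steps can be scripted in the same order. This is a minimal sketch, assuming `datasync_client` is a `boto3.client("datasync")` object; `clean_up` is an illustrative name. It deletes the task first so that no task is left pointing at a deleted location, and removes the agent registration only if an agent ARN is passed.

```python
def clean_up(datasync_client, task_arn: str, location_arns, agent_arn=None) -> None:
    """Delete a task, then its locations, then optionally the agent registration."""
    datasync_client.delete_task(TaskArn=task_arn)
    for location_arn in location_arns:
        datasync_client.delete_location(LocationArn=location_arn)
    if agent_arn:
        datasync_client.delete_agent(AgentArn=agent_arn)
```

Deleting the agent in DataSync does not stop the EC2 instance; terminate it separately, for example with the EC2 client's `terminate_instances(InstanceIds=[...])` call.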
Conclusion
In summary, AWS DataSync offers an efficient way to migrate files from NFS to S3 while maintaining data integrity and security. This runbook covered setting up the DataSync Agent, configuring the source and destination locations, creating and starting a task, and monitoring its execution. Its integration with other AWS services makes it a useful tool for a range of data transfer requirements. For a well-managed transfer process, monitor each run, verify the transferred data, and clean up resources you no longer need.
Notes
- Ensure that the necessary security groups and network ACLs are configured to allow traffic between the EC2 instance (DataSync Agent) and the NFS server.
- Regularly monitor the AWS DataSync Console for any issues or errors during the task execution.
- Consider enabling logging and monitoring using Amazon CloudWatch for enhanced visibility and troubleshooting.
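As an illustration of the first note, the sketch below opens the NFS port from the agent to the server. It assumes the NFS server runs in the same VPC behind its own security group, that the server speaks NFSv4 on TCP port 2049 (NFSv3 setups may need additional ports), and that `ec2_client` is a `boto3.client("ec2")` object. The helper names and the rule description are made up for the example.

```python
NFS_PORT = 2049


def nfs_ingress_rule(agent_security_group_id: str) -> dict:
    """Ingress permission that admits NFS traffic from the agent's security group."""
    return {
        "IpProtocol": "tcp",
        "FromPort": NFS_PORT,
        "ToPort": NFS_PORT,
        "UserIdGroupPairs": [
            {
                "GroupId": agent_security_group_id,
                "Description": "DataSync agent to NFS server",
            }
        ],
    }


def allow_agent_to_nfs(ec2_client, nfs_security_group_id: str,
                       agent_security_group_id: str) -> None:
    """Add the NFS ingress rule to the NFS server's security group."""
    ec2_client.authorize_security_group_ingress(
        GroupId=nfs_security_group_id,
        IpPermissions=[nfs_ingress_rule(agent_security_group_id)],
    )
```

Referencing the agent's security group rather than an IP range keeps the rule valid if the agent instance is replaced.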