Synchronize Data from NFS Server to Amazon S3 using AWS DataSync

Introduction

AWS DataSync is a service that transfers files between storage systems, including NFS (Network File System) servers and Amazon S3 (Simple Storage Service). Below is a runbook that outlines the steps to transfer files from an NFS server to an S3 bucket using AWS DataSync.

This runbook is a high-level overview. Additional configuration and tuning may be required for your specific use case and requirements.

Prerequisites

  • AWS Account
  • NFS Server
  • S3 Bucket
  • EC2 Instance (for DataSync Agent)

Step 1: Set up the DataSync Agent

Launch EC2 Instance

  • AWS distributes the DataSync Agent for EC2 as an Amazon Machine Image (AMI), not as a script that is installed on an existing instance. Look up the current agent AMI ID for your Region with the AWS CLI, replacing <region> with your Region code:

      aws ssm get-parameter --name /aws/service/datasync/ami --region <region>

  • Launch an EC2 instance in your AWS account from that AMI. This instance will host the DataSync Agent.
  • Ensure that the instance has internet access through a VPC with an internet gateway.

Install the DataSync Agent

  • The agent software is preinstalled on the AMI, so no separate download or installation step is needed once the instance is running.

Activate the DataSync Agent

  • Once the agent instance is running, go to the AWS DataSync Console.
  • Navigate to “Agents” and click on “Create agent”.
  • Enter the IP address of the EC2 instance under “Agent Address” to get the activation key. Your browser must be able to reach the agent on port 80 for this step.
  • Follow the remaining steps to complete the activation process.
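
The activation step above can also be scripted once you have an activation key. The following is a minimal sketch, assuming `client` behaves like a boto3 DataSync client (`boto3.client("datasync")`); the helper names, the agent name, and the key shown in the usage note are hypothetical.

```python
def build_create_agent_request(activation_key, agent_name):
    """Build keyword arguments for the DataSync CreateAgent API call."""
    key = activation_key.strip()
    if not key:
        raise ValueError("activation_key must not be empty")
    return {"ActivationKey": key, "AgentName": agent_name}


def register_agent(client, activation_key, agent_name):
    """Register the agent with DataSync and return its ARN.

    Assumption: `client` is a boto3 DataSync client created in the
    Region where the agent should be activated.
    """
    response = client.create_agent(
        **build_create_agent_request(activation_key, agent_name)
    )
    return response["AgentArn"]
```

A call might look like `register_agent(boto3.client("datasync"), "ABCDE-12345-FGHIJ-67890-KLMNO", "nfs-to-s3-agent")`. The returned agent ARN is what the NFS location in Step 2 refers to.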

Step 2: Configure Source Location (NFS)

Create Source Location

  • In the AWS DataSync Console, navigate to “Locations”.
  • Click on “Create location” and select “Network file system (NFS)” as the location type.
  • Select the DataSync Agent created in Step 1.
  • Enter the “Domain name or IP address” of the NFS server.
  • Enter the “Mount path”: a path exported by the NFS server, or a subdirectory of an exported path.
  • Complete the location creation process.
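
For readers who prefer the API to the console, the same source location can be described as a `create_location_nfs` request. This is a sketch under assumptions: the helper name is hypothetical, and the hostname, mount path, and agent ARN in the usage note are placeholders.

```python
def build_nfs_location_request(server_hostname, mount_path, agent_arns,
                               nfs_version="AUTOMATIC"):
    """Build keyword arguments for the DataSync CreateLocationNfs call.

    mount_path is the exported path (or a subdirectory of it).
    nfs_version is passed through as MountOptions.Version; AUTOMATIC
    lets DataSync negotiate the NFS version with the server.
    """
    if not mount_path.startswith("/"):
        raise ValueError("mount_path must be absolute, e.g. /exports/data")
    if not agent_arns:
        raise ValueError("at least one agent ARN is required")
    return {
        "ServerHostname": server_hostname,
        "Subdirectory": mount_path,
        "OnPremConfig": {"AgentArns": list(agent_arns)},
        "MountOptions": {"Version": nfs_version},
    }
```

With a boto3 DataSync client, `client.create_location_nfs(**build_nfs_location_request("nfs.example.internal", "/exports/data", [agent_arn]))` returns a response whose `LocationArn` identifies the new source location.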

Step 3: Configure Destination Location (S3)

Create Destination Location

  • In the AWS DataSync Console, navigate to “Locations”.
  • Click on “Create location” and select “Amazon S3” as the location type.
  • Select the S3 bucket where you want to transfer the files.
  • Optionally, enter a folder (prefix) in the bucket under which DataSync will write the transferred files.
  • Choose the IAM role used to access the selected S3 bucket or click “Autogenerate” for DataSync to automatically create an IAM role with the permissions required to access the S3 bucket.
  • Configure the necessary settings and complete the location creation process.
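
If you create the IAM role yourself instead of using “Autogenerate”, the role must trust the DataSync service. The sketch below builds that trust policy and a `create_location_s3` request. The helper names are hypothetical, and the bucket ARN format assumes the standard `aws` partition.

```python
import json


def build_datasync_trust_policy():
    """Return a trust policy (JSON string) that lets DataSync assume the role."""
    return json.dumps({
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"Service": "datasync.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }],
    })


def build_s3_location_request(bucket_name, role_arn, prefix="",
                              storage_class="STANDARD"):
    """Build keyword arguments for the DataSync CreateLocationS3 call."""
    request = {
        # Assumption: standard "aws" partition (not GovCloud or China).
        "S3BucketArn": f"arn:aws:s3:::{bucket_name}",
        "S3Config": {"BucketAccessRoleArn": role_arn},
        "S3StorageClass": storage_class,
    }
    cleaned = prefix.strip("/")
    if cleaned:
        # Only send a subdirectory when a folder prefix was chosen.
        request["Subdirectory"] = f"/{cleaned}"
    return request
```

The role also needs a permissions policy that grants access to the bucket itself; the trust policy only controls who may assume the role.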

Step 4: Create and Start a Task

Create a Task

  • In the AWS DataSync Console, navigate to “Tasks”.
  • Click on “Create task”.
  • Select the source location (NFS) and destination location (S3) created in the previous steps.
  • Configure the task settings, including options, filters, and schedule as per your requirements.
  • Review the configuration and create the task.
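
The options, filters, and schedule mentioned above map onto fields of the `create_task` request. The following sketch shows one plausible combination; the helper name and the chosen option values are illustrative assumptions, not recommendations from the original runbook.

```python
def build_task_request(source_location_arn, destination_location_arn, name,
                       exclude_patterns=(), schedule_expression=None):
    """Build keyword arguments for the DataSync CreateTask call."""
    request = {
        "SourceLocationArn": source_location_arn,
        "DestinationLocationArn": destination_location_arn,
        "Name": name,
        "Options": {
            # Verify only the files that this execution transferred.
            "VerifyMode": "ONLY_FILES_TRANSFERRED",
            # Overwrite destination objects when the source has changed.
            "OverwriteMode": "ALWAYS",
            # Copy only data that differs between source and destination.
            "TransferMode": "CHANGED",
            # Keep destination objects whose source files were deleted.
            "PreserveDeletedFiles": "PRESERVE",
        },
    }
    if exclude_patterns:
        # DataSync expects exclude patterns as one "|"-delimited string.
        request["Excludes"] = [{
            "FilterType": "SIMPLE_PATTERN",
            "Value": "|".join(exclude_patterns),
        }]
    if schedule_expression:
        request["Schedule"] = {"ScheduleExpression": schedule_expression}
    return request
```

For example, `build_task_request(src_arn, dst_arn, "nfs-to-s3", exclude_patterns=["*.tmp", "/logs"], schedule_expression="cron(0 2 * * ? *)")` describes a nightly task that skips temporary files and the `/logs` directory. Passing the result to `client.create_task(**request)` returns the `TaskArn`.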

Start the Task

  • Once the task is created, select the task and click on “Start”.
  • Monitor the progress in the AWS DataSync Console.

Step 5: Monitor and Verify

Monitor Task Execution

  • Go to the “Tasks” section in the AWS DataSync Console.
  • Select your task and monitor the “Status” and “Progress” of the task execution.
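
Starting a task and watching its status can also be done from a script. This is a minimal polling sketch, assuming `client` behaves like a boto3 DataSync client; the function name and the polling interval are hypothetical choices.

```python
import time

# Statuses after which a task execution no longer changes.
TERMINAL_STATUSES = {"SUCCESS", "ERROR"}


def run_task_and_wait(client, task_arn, poll_seconds=30, sleep=time.sleep):
    """Start a task execution and poll until it succeeds or fails.

    Returns the final DescribeTaskExecution response as a dict.
    """
    execution_arn = client.start_task_execution(TaskArn=task_arn)["TaskExecutionArn"]
    while True:
        details = client.describe_task_execution(TaskExecutionArn=execution_arn)
        status = details["Status"]
        print(f"{status}: {details.get('BytesTransferred', 0)} bytes transferred")
        if status in TERMINAL_STATUSES:
            return details
        sleep(poll_seconds)
```

The `sleep` parameter exists only so the wait can be replaced in tests. In normal use, `run_task_and_wait(boto3.client("datasync"), task_arn)` is enough, and a final status of `ERROR` means the execution details should be inspected before retrying.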

Verify Data Transfer

  • Once the task is completed, verify the transferred files in the destination S3 bucket.
  • Ensure that all files from the NFS source location are accurately transferred to the S3 bucket.
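
One way to spot-check the result is to compare the file names under the NFS mount with the object keys in the bucket. The sketch below assumes the NFS export is mounted locally and that you already have the list of S3 keys; the function names are hypothetical. It compares names only, not file contents.

```python
import os


def expected_keys(local_root, prefix=""):
    """Return the S3 keys expected for every file under local_root."""
    cleaned = prefix.strip("/")
    keys = set()
    for dirpath, _dirnames, filenames in os.walk(local_root):
        for name in filenames:
            relative = os.path.relpath(os.path.join(dirpath, name), local_root)
            relative = relative.replace(os.sep, "/")
            keys.add(f"{cleaned}/{relative}" if cleaned else relative)
    return keys


def find_missing_keys(local_root, s3_keys, prefix=""):
    """Return the expected keys that are absent from s3_keys, sorted."""
    return sorted(expected_keys(local_root, prefix) - set(s3_keys))
```

The `s3_keys` argument could be collected with the boto3 S3 `list_objects_v2` paginator by reading the `Key` of each entry in every page's `Contents`. An empty result from `find_missing_keys` means every source file has a matching object name in the bucket.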

Step 6: Clean Up (Optional)

Delete Task

If the task is no longer needed, you can delete it from the “Tasks” section in the AWS DataSync Console.

Delete Locations

Optionally, delete the source and destination locations from the “Locations” section in the AWS DataSync Console.

Terminate EC2 Instance

If the DataSync Agent is not needed for future tasks, consider terminating the EC2 instance to avoid incurring additional costs.
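
The cleanup steps above can be scripted in the same order. This is a sketch, assuming `datasync` and `ec2` behave like the corresponding boto3 clients; the function name is hypothetical, and all ARNs and the instance ID are supplied by the caller.

```python
def clean_up(datasync, ec2, task_arn, location_arns, agent_arn, instance_id):
    """Delete the task, its locations, and the agent, then stop paying for EC2.

    The task is deleted first because it references the locations,
    and the NFS location in turn references the agent.
    """
    datasync.delete_task(TaskArn=task_arn)
    for location_arn in location_arns:
        datasync.delete_location(LocationArn=location_arn)
    datasync.delete_agent(AgentArn=agent_arn)
    # Terminating the instance is irreversible; skip it if the agent is reused.
    ec2.terminate_instances(InstanceIds=[instance_id])
```

Deleting these resources does not remove any data from the S3 bucket or the NFS server.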

Conclusion

In summary, AWS DataSync offers an efficient way to migrate files from NFS to S3. This runbook covered setting up the DataSync Agent, configuring the source and destination locations, creating and starting a task, and then monitoring, verifying, and cleaning up. DataSync checks data integrity during transfer and integrates with other AWS services such as IAM and CloudWatch, which makes it a practical tool for a range of data transfer requirements. Regular monitoring, verification, and cleanup keep the transfer process well managed.

Notes

  • Ensure that the necessary security groups and network ACLs are configured to allow traffic between the EC2 instance (DataSync Agent) and the NFS server.
  • Regularly monitor the AWS DataSync Console for any issues or errors during the task execution.
  • Consider enabling logging and monitoring using Amazon CloudWatch for enhanced visibility and troubleshooting.
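
CloudWatch logging for a task is controlled by a log group ARN and a log level. The sketch below builds an `update_task` request for an existing task. The helper name is hypothetical, and it is an assumption that sending only `LogLevel` in `Options` leaves the task's other options unchanged, so check the task settings after applying it.

```python
VALID_LOG_LEVELS = ("OFF", "BASIC", "TRANSFER")


def build_logging_update(task_arn, log_group_arn, log_level="BASIC"):
    """Build keyword arguments for a DataSync UpdateTask call that enables logging.

    BASIC logs errors only; TRANSFER also logs each transferred file.
    """
    if log_level not in VALID_LOG_LEVELS:
        raise ValueError(f"log_level must be one of {VALID_LOG_LEVELS}")
    return {
        "TaskArn": task_arn,
        "CloudWatchLogGroupArn": log_group_arn,
        "Options": {"LogLevel": log_level},
    }
```

With a boto3 DataSync client this would be applied as `client.update_task(**build_logging_update(task_arn, log_group_arn, "TRANSFER"))`. The log group also needs a resource policy that allows DataSync to write to it.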

Mueez Khan

Mueez Khan is a dynamic and skilled Managed Services (MS) Associate Engineer with a Master's in Computer Engineering from the American University of Sharjah, UAE with a focus on cloud datacentre energy optimization. He brings hands-on experience from Bespin Global MEA, specializing in architecting robust infrastructure services on AWS. Alongside his proficiency in Python, Java, and Linux, Mueez holds certifications from AWS, highlighting his commitment to excellence in cloud technologies. With a keen interest in emerging trends, he also possesses valuable experience in AI and certifications, reflecting his dedication to advancing innovative solutions in the realm of artificial intelligence and cloud computing.