Move S3 Objects faster without any hurdles

If you have ever tried to copy or move millions of objects from one bucket to another, you know it is not as simple as it sounds. Below are the options most people think of first.

  1. Initiating the copy/move through the AWS console: This operation must not be interrupted at any cost. If your organization enforces automatic sign-out after idle time, every timeout or failure forces you to re-trigger the operation manually, adding even more time to complete it.

Considering the issues above, it is worth looking into the following options.

  1. S3 Batch Operations: S3 Batch Operations is a feature introduced by AWS to perform large-scale batch operations on S3 objects across buckets. A batch operation expects a manifest (an input CSV file containing bucket and object-key details) as its input and executes the given operation on those objects. The core component of Batch Operations is the job, which holds the details of the S3 bucket and the objects on which the operation needs to be performed. A job runs a single operation on every object listed in the manifest file. Supported operations include: 1) PUT copy object, 2) PUT object tagging, 3) PUT object ACL, 4) Initiate S3 Glacier restore, and 5) Invoke AWS Lambda function.
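As a sketch of what such a manifest looks like: in the simple CSV format, each row is `bucket,key` with no header line. The script below (bucket name and object keys are hypothetical placeholders; in practice you would generate the rows from an S3 Inventory report or a bucket listing) writes a minimal manifest file:

```python
import csv

# Hypothetical source bucket and object keys -- in a real job these
# would come from an S3 Inventory report or a ListObjectsV2 listing.
SOURCE_BUCKET = "source-bucket"
keys = [
    "source-folder/hourly_table/2021/01/01/00/data.log",
    "source-folder/hourly_table/2021/01/01/01/data.log",
]

# Write one "bucket,key" row per object, with no header line,
# which is the shape Batch Operations expects for a CSV manifest.
with open("manifest.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for key in keys:
        writer.writerow([SOURCE_BUCKET, key])
```

You then upload the manifest to S3 and point the Batch Operations job at it (via the console or `aws s3control create-job`), along with the operation to run on each listed object.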
2. S3DistCp: S3DistCp is an extension of Apache DistCp that runs as a step on an Amazon EMR cluster and copies large amounts of data between S3 locations (or between S3 and HDFS) in parallel. The flags below, passed to the `s3-dist-cp` command, point it at the source and target prefixes and restrict the copy to objects matching a pattern:

```
--src s3://source-bucket/source-folder/hourly_table
--dest s3://target-bucket/target-folder/hourly_table
--srcPattern .*\.log
```

3. AWS CLI sync command:

  • The `aws s3 sync` command keeps source and target locations in sync: it recursively copies files from the source to the destination, and by default transfers only objects that are missing or newer at the destination.
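A typical invocation looks like the sketch below (bucket names and prefixes are placeholders). The `--dryrun` flag previews the transfer without copying anything, and `--exclude`/`--include` filters narrow the sync to a subset of objects:

```shell
# Preview what would be copied -- no changes are made.
aws s3 sync s3://source-bucket/source-folder/ s3://target-bucket/target-folder/ --dryrun

# Run the actual sync, limited to .log objects:
# exclude everything first, then re-include the pattern we want.
aws s3 sync s3://source-bucket/source-folder/ s3://target-bucket/target-folder/ \
    --exclude "*" --include "*.log"
```

Because sync only transfers missing or newer objects, re-running the same command after a timeout or failure picks up where it left off instead of copying everything again.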

Thank you for reading this post. Please leave a comment if you have any questions or feedback.


