FairBid data transfer between S3 and GCS¶
Imported from Confluence
Content may be outdated. Verify before following any procedures. Last updated: June 2023
Description¶
The FairBid project requires data replication between AWS and GCP object storage buckets in both directions, following a specific schedule. Because neither cloud vendor provides a tool for moving data out of its own platform, we have to use two services side by side: AWS DataSync copies data from GCP to AWS, and GCP Storage Transfer Service copies data from AWS to GCP.
Info
It is worth mentioning that other projects have already used a different solution: flexify.io, a third-party tool that is cloud-agnostic but does not provide built-in scheduling. The vendor told us that scheduling can be set up on their side, but that would leave us with no transparent way to manage the automation, so we do not consider it a production-grade solution.
Data transfer from AWS (S3) to GCP (GCS)¶
As mentioned earlier, GCP's Storage Transfer Service is used for this direction.
Info
FROM: 003250186609 → fairbid-analytic://druid_lookups/*
TO: ss-shared-ent-data-prod → gcs-core-services-agp-fairbid-silver-regional-useast1-prod://druid_lookups/
The job runs under the agp-fairbid-prod-7i project.
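As a rough illustration, the transfer described above could be expressed as a request body for the Storage Transfer Service `transferJobs.create` REST call. This is a minimal sketch, not the job that is actually deployed: the helper name `build_transfer_job`, the description text, the start date, the daily repeat interval, and the credential placeholders are all assumptions, since this page does not record the real schedule or job settings. It also assumes that `agp-fairbid-prod-7i` is the project ID.

```python
import json
from datetime import date


def build_transfer_job(project_id, s3_bucket, s3_prefix, gcs_bucket, gcs_prefix,
                       access_key_id, secret_access_key, start):
    """Build a transferJobs.create request body for an S3 -> GCS transfer.

    Hypothetical helper: it only assembles a dict and does not call any API.
    """
    return {
        "description": "S3 to GCS replication (illustrative)",
        "projectId": project_id,
        "status": "ENABLED",
        "transferSpec": {
            # Source: the S3 bucket, read with the AWS user's access keys.
            "awsS3DataSource": {
                "bucketName": s3_bucket,
                "path": s3_prefix,
                "awsAccessKey": {
                    "accessKeyId": access_key_id,
                    "secretAccessKey": secret_access_key,
                },
            },
            # Destination: the GCS bucket, which must already exist.
            "gcsDataSink": {"bucketName": gcs_bucket, "path": gcs_prefix},
        },
        "schedule": {
            "scheduleStartDate": {
                "year": start.year, "month": start.month, "day": start.day,
            },
            # Assumed cadence: once every 24 hours.
            "repeatInterval": "86400s",
        },
    }


job = build_transfer_job(
    project_id="agp-fairbid-prod-7i",
    s3_bucket="fairbid-analytic",
    s3_prefix="druid_lookups/",
    gcs_bucket="gcs-core-services-agp-fairbid-silver-regional-useast1-prod",
    gcs_prefix="druid_lookups/",
    access_key_id="<AWS_ACCESS_KEY_ID>",          # placeholder, never commit real keys
    secret_access_key="<AWS_SECRET_ACCESS_KEY>",  # placeholder
    start=date(2023, 6, 1),                       # assumed start date
)
print(json.dumps(job, indent=2))
```

Keeping the job definition as data makes it easy to review or diff before it is submitted through the console, the REST API, or a client library.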

Configuration-wise, the only prerequisite on the AWS side is a user with read/list access to the source bucket (arn:aws:iam::003250186609:user/sa-fairbid-migration-tmp); its access keys are used when setting up the transfer job. On the GCP side, only the destination bucket needs to exist.
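The read/list access mentioned above could look roughly like the IAM policy below. The policy actually attached to the user is not recorded on this page, so this is an assumed minimal version; the helper name `read_only_policy` and the choice to scope object reads to the `druid_lookups/` prefix are illustrative.

```python
import json


def read_only_policy(bucket, prefix):
    """Return an IAM policy document granting list/read access to one prefix.

    Hypothetical helper; the statements are an assumed minimal permission set.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Bucket-level actions needed to enumerate the source objects.
                "Effect": "Allow",
                "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                # Object-level read access, limited to the transferred prefix.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
        ],
    }


policy = read_only_policy("fairbid-analytic", "druid_lookups/")
print(json.dumps(policy, indent=2))
```

Scoping `s3:GetObject` to the prefix keeps the migration user from reading unrelated objects in the same bucket.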
Data transfer from GCP (GCS) to AWS (S3)¶
For this direction we used AWS DataSync.
Info
FROM: ss-shared-ent-data-prod → gcs-core-services-agp-fairbid-bronze-regional-useast1-prod://sdk-events/valid_events_creation_ts/*
TO: 003250186609 → fairbid-sdk-events://valid_events_creation_ts/
Configuration-wise, on the GCP side you need a service account with an HMAC key (ss-shared-ent-data-prod-fn → Cloud Storage → Settings → INTEROPERABILITY → Access keys for service accounts → CREATE A KEY).
Keep in mind that an HMAC key can be created only for a service account that belongs to the same GCP project.¶
On the AWS side, a DataSync agent was launched as an EC2 instance and registered using the service's built-in tooling. A transfer task was then created (the read/write IAM role was added automatically by the service).
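For orientation, the DataSync setup described above can be sketched as three parameter sets: an object storage location pointing at GCS through its S3-compatible endpoint, an S3 destination location, and the task that links them. This is a sketch under stated assumptions rather than the deployed configuration. The helper names, the ARN placeholders, the task name, and the cron expression are hypothetical; the dictionary keys follow the boto3 DataSync calls `create_location_object_storage`, `create_location_s3`, and `create_task`.

```python
def gcs_source_location(bucket, subdirectory, hmac_access_id, hmac_secret, agent_arn):
    """Parameters for datasync.create_location_object_storage (GCS as source)."""
    return {
        "ServerHostname": "storage.googleapis.com",  # GCS S3-compatible endpoint
        "ServerProtocol": "HTTPS",
        "ServerPort": 443,
        "BucketName": bucket,
        "Subdirectory": subdirectory,
        "AccessKey": hmac_access_id,  # HMAC access ID of the GCP service account
        "SecretKey": hmac_secret,     # HMAC secret of the GCP service account
        "AgentArns": [agent_arn],     # the DataSync agent running on EC2
    }


def s3_destination_location(bucket, subdirectory, role_arn):
    """Parameters for datasync.create_location_s3 (S3 as destination)."""
    return {
        "S3BucketArn": f"arn:aws:s3:::{bucket}",
        "Subdirectory": subdirectory,
        "S3Config": {"BucketAccessRoleArn": role_arn},
    }


def transfer_task(source_arn, destination_arn, name, schedule_expression):
    """Parameters for datasync.create_task linking both locations."""
    return {
        "SourceLocationArn": source_arn,
        "DestinationLocationArn": destination_arn,
        "Name": name,
        "Schedule": {"ScheduleExpression": schedule_expression},
    }


source = gcs_source_location(
    bucket="gcs-core-services-agp-fairbid-bronze-regional-useast1-prod",
    subdirectory="/sdk-events/valid_events_creation_ts/",
    hmac_access_id="<HMAC_ACCESS_ID>",  # placeholder
    hmac_secret="<HMAC_SECRET>",        # placeholder
    agent_arn="<DATASYNC_AGENT_ARN>",   # placeholder
)
destination = s3_destination_location(
    bucket="fairbid-sdk-events",
    subdirectory="/valid_events_creation_ts/",
    role_arn="<BUCKET_ACCESS_ROLE_ARN>",  # placeholder for the auto-created role
)
task = transfer_task(
    source_arn="<SOURCE_LOCATION_ARN>",            # returned when the source is created
    destination_arn="<DESTINATION_LOCATION_ARN>",  # returned when the destination is created
    name="fairbid-gcs-to-s3",                      # hypothetical task name
    schedule_expression="cron(0 3 * * ? *)",       # assumed: daily at 03:00 UTC
)

# With boto3 installed and credentials configured, these could be applied as:
#   client = boto3.client("datasync")
#   client.create_location_object_storage(**source)
#   client.create_location_s3(**destination)
#   client.create_task(**task)
```

The HMAC key pair from the GCP service account is what lets DataSync treat the GCS bucket as a generic S3-compatible object store.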

Useful links¶
DEVOPSBLN-3584
Migrating Google Cloud Storage to Amazon S3 using AWS DataSync