Aerospike Backup and Restore¶
Imported from Confluence
Content may be outdated. Verify before following any procedures. Last updated: October 2024
This article explains how to create an Aerospike backup and restore it.
Create and restore PROD cluster¶
1. Adjust and run Terragrunt code
Change the cluster name in the terragrunt code:
Change the service account (SA) name
Check or add your current SSH key
Info
To create a new cluster in the same project, copy the whole prod/aerospike-userdata folder and rename the copy to aerospike-userdata2
Review and change other parameters if needed.
After review, run terragrunt plan and terragrunt apply.
➜ growth-iac git:(master) cd terraform/configs/prod/aerospike-userdata/us-east1
➜ us-east1 git:(master) ls
inventory.yml terragrunt.hcl
➜ us-east1 git:(master) terragrunt plan --terragrunt-source-update
...
➜ us-east1 git:(master) terragrunt apply
2. Update the terragrunt Cloud DNS code with the new cluster IPs, or create a new set of records for each node.
For example, change the cluster name from 01 to 02:
records = [
dependency.aerospike-userdata.outputs.instances_ip.vm-aerospike-growth-userdata-prod-useast1-02-0,
]...
Info
If you create a new set of records, also change the host field for each node!
If you need metrics scraping, the Prometheus configuration also needs to be updated.
3. Modify and run Ansible code
Change the name of the cluster in the Ansible vars file or create and source a new vars file.
Update inventory file with new DNS records.
Use the new inventory.yml file created by Terraform and run the production playbook.
Info
On macOS, install GNU-compatible tar: brew install gnu-tar
ansible-playbook -v -i ../terraform/configs/prod/aerospike-userdata/us-east1/inventory.yml \
playbook-vm-aerospike-growth-userdata-prod-useast1-01.yml --private-key /Users/andriyshamray/.ssh/google_compute_engine
4. Get the list of disk snapshots
Use the gcloud command to generate a full list of snapshots sorted by source disk, and grep by timestamp (specify the latest available date):
gcloud compute snapshots list --project agp-growth-prod-d1 --filter="sourceDisk~vm-aerospike-growth-userdata-prod-useast1-01" --sort-by=SRC_DISK | grep 2024060919581 | awk '{print $1}'
or list only the latest one per disk:
gcloud compute snapshots list --project agp-growth-prod-d1 --filter="sourceDisk~vm-aerospike-growth-userdata-prod-useast1-01" --sort-by=SRC_DISK,~creationTimestamp|uniq -f2|cut -d ' ' -f1
5. Add snapshots and run Terragrunt
Shut down the VMs of the newly created cluster from the Google console or using gcloud.
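The shutdown can be scripted with gcloud; the instance names, node count, and zone below are assumptions following the naming scheme used in this article (prod nodes are spread across zones, so adjust the zone per node):

```shell
# Stop each node of the new cluster before swapping in the restored disks.
# Instance names, node count, and zone are illustrative assumptions.
for i in 0 1 2; do
  gcloud compute instances stop "vm-aerospike-growth-userdata-prod-useast1-02-$i" \
    --zone "us-east1-b" \
    --project "agp-growth-prod-d1"
done
```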
Update the terragrunt.hcl file with two new variables, snapshot_names_enabled and snapshot_names:
snapshot_names_enabled = true
...attached_disks_parameters = [
{
...
snapshot_names = [
"vm-aerospike-bkp-gr-us-east1-b-20240609195810-jifl2tfu",
"vm-aerospike-bkp-gr-us-east1-b-20240609195810-5w4822r1",
...
"vm-aerospike-bkp-gr-us-east1-d-20240609195810-gd8l5171"
]
}
]
Run terragrunt apply twice: the first run creates the disks, and the second attaches each disk to its node.
6. Update the firewall rule to allow internal node communication for the new cluster - firewall
7. Start a cluster and validate its health.
Start VMs from console or using gcloud cli:
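Starting can be scripted with gcloud, mirroring the shutdown step (instance names, node count, and zone are assumptions based on the naming used above):

```shell
# Start each node of the new cluster.
# Instance names, node count, and zone are illustrative assumptions.
for i in 0 1 2; do
  gcloud compute instances start "vm-aerospike-growth-userdata-prod-useast1-02-$i" \
    --zone "us-east1-b" \
    --project "agp-growth-prod-d1"
done
```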
Connect via SSH to one of the VMs and validate the cluster startup. The service can take ~30 minutes to come up, depending on cluster size.
gcloud compute ssh --zone "us-east1-b" "vm-aerospike-growth-userdata-prod-useast1-01-0" --tunnel-through-iap --project "agp-growth-prod-d1"
To check status and connect to the cluster, run:
To check cluster health:
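A typical way to do both with the standard Aerospike tooling (asadm and asinfo ship with the aerospike-tools package; run these on a cluster node):

```shell
# Open an interactive admin shell connected to the local node
asadm

# Or non-interactively: summary of nodes, namespaces, and ongoing migrations
asadm -e "info"

# Quick liveness check; returns "ok" when the node is up
asinfo -v status
```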
You should see non-zero Rx/Tx values, since the cluster was restored from backup. Full data synchronization can take up to 2 days, but clients can already connect to the cluster.
Official docs
Full restore to a new cluster in DEV¶
1. Create a new cluster using Terraform code
For the POC, a new terraform folder was created at:
/growth-iac/terraform/configs/dev/aerospike-userdata-bkp
Before running, please review and adjust the terragrunt.hcl file. As of now, it will create a 3-node cluster. Also, new DNS records were created for *aerospike-bkp-userdata*.
Gitlab MR: appgrowthplatform (Gitlab)
2. Run ansible-playbook to set up a clean cluster
For DEV, a new Ansible playbook file was created: playbook-vm-aerospike-bkp-growth-userdata-dev-useast1-01.yml
3. Run gcloud commands to create snapshots and attach the new disks.
#list disk names
for i in $(gcloud compute instances list --filter="name~vm-aerospike-growth-userdata-prod-useast1-*" \
--project agp-growth-prod-d1 --format="value(disks[].deviceName)" | tr ';' '\n')
do
echo $i | grep shadow
done
##Output example
vm-aerospike-growth-userdata-prod-useast1-01-0-shadow-0
vm-aerospike-growth-userdata-prod-useast1-01-0-shadow-1
vm-aerospike-growth-userdata-prod-useast1-01-0-shadow-2
vm-aerospike-growth-userdata-prod-useast1-01-0-shadow-3
vm-aerospike-growth-userdata-prod-useast1-01-0-shadow-4
vm-aerospike-growth-userdata-prod-useast1-01-0-shadow-5
vm-aerospike-growth-userdata-prod-useast1-01-0-shadow-6
vm-aerospike-growth-userdata-prod-useast1-01-0-shadow-7
#create snapshot
for i in {0..7}
do
gcloud compute snapshots create vm-aerospike-growth-userdata-prod-useast1-01-1-shadow-"$i"snapshot \
--source-disk https://www.googleapis.com/compute/v1/projects/agp-growth-prod-d1/zones/us-east1-c/disks/vm-aerospike-growth-userdata-prod-useast1-01-1-shadow-$i \
--project agp-growth-dev-fm
done
#create disk from snapshot
#!/bin/bash
n='0'
z='us-east1-b'
for i in {0..7}
do
gcloud compute disks create "vm-aerospike-bkp-growth-userdata-dev-useast1-01-$n-shadow-$i" \
--zone=$z \
--source-snapshot=vm-aerospike-growth-userdata-prod-useast1-01-"$n"-shadow-"$i"snapshot \
--project=agp-growth-dev-fm
done
#attach disk
n='0'
z='us-east1-b'
for i in {0..7}
do
gcloud compute instances attach-disk "vm-aerospike-bkp-growth-userdata-dev-useast1-01-$n" \
--disk "vm-aerospike-bkp-growth-userdata-dev-useast1-01-$n-shadow-$i" \
--device-name="vm-aerospike-bkp-growth-userdata-dev-useast1-01-$n-shadow-$i" \
--project=agp-growth-dev-fm \
--zone=$z
done
On-demand backup¶
1. Backup creation
For an on-demand backup, we can adjust the snapshot-policy schedule or create snapshots using the gcloud command line.
Terraform location for the snapshot policy:
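For a one-off snapshot via gcloud, a sketch (the snapshot name is made up here, and the disk name and zone follow the DEV naming used above; all are assumptions to adjust per disk and node):

```shell
# Create an on-demand snapshot of a single shadow disk.
# Snapshot name, disk name, and zone are illustrative assumptions;
# repeat for each shadow disk on each node.
gcloud compute snapshots create "manual-userdata-shadow-0-$(date +%Y%m%d%H%M%S)" \
  --source-disk "vm-aerospike-bkp-growth-userdata-dev-useast1-01-0-shadow-0" \
  --source-disk-zone "us-east1-b" \
  --project "agp-growth-dev-fm"
```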
2. Restore backup
As of now, the only tested way to restore a backup is creating and attaching disks using the gcloud CLI.
In the future, we will work on doing this via terraform code as well.
#list snapshots
gcloud compute snapshots list --project $project --filter="sourceDisk~vm-aerospike-bkp-growth-userdata-dev-useast1-01" --sort-by=SRC_DISK | grep 20240602195811
Here is an example of a bash script that can be used for the full process.
project='agp-growth-dev-fm'
clustername='vm-aerospike-bkp-growth-userdata-dev-useast1-01'
# zone matches the dev cluster zone used earlier in this article
zone='us-east1-b'
DATE='20240530195810'
for i in {0..7}
do
for n in {0..2}
do
# find the snapshot for this node/disk pair taken at $DATE
# (command substitution is used because `| read` runs in a subshell)
snap=$(gcloud compute snapshots list --project $project | grep \
$clustername-$n-shadow-$i | grep $DATE | awk '{print $1}')
disk=$clustername-$n-shadow-new-$i
gcloud compute disks create $disk \
--source-snapshot=$snap --zone=$zone --project=$project
gcloud compute instances attach-disk $clustername-$n --disk \
$disk --device-name=$disk --zone=$zone --project=$project
done
done