Skip to content

ACP-EDGE Postgres DB Logical Replica Creation

Imported from Confluence

Content may be outdated. Verify before following any procedures. View original | Last updated: April 2022

Summary

This article repeats steps mentioned in Tickets - Devopsbln 2591, taken to create a logical replica of ACP-EDGE project's main db utilizing a Postgres engine in AWS RDS. The replicated database is referred to below as MASTER and the resulting logical copy as REPLICA.

Steps

  1. (MASTER) Update track_commit_timestamp and wal_level settings of the MASTER via rds.logical_replication and track_commit_timestamp parameters of AWS RDS Parameter Group.

Note

To apply the changes the db has to be rebooted; hence you might want to consult with dev teams regarding any possible issues caused by downtime. It is strongly recommended to use the "reboot-with-failover" functionality if possible (for multi-regional db in AWS RDS).

 ~/ psql postgresql://acpedge:********@aws-production-acp-edge.c7bparwcbb52.eu-west-1.rds.amazonaws.com/ofw_production
psql (13.2, server 11.13)
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)
Type "help" for help.

ofw_production=> select name,setting from pg_settings where name in (
ofw_production(>         'wal_level',
ofw_production(>         'track_commit_timestamp',
ofw_production(>         'max_worker_processes',
ofw_production(>         'max_replication_slots',
ofw_production(>         'max_wal_senders') ;
          name          | setting
------------------------+---------
 max_replication_slots  | 10
 max_wal_senders        | 10
 max_worker_processes   | 8
 track_commit_timestamp | on
 wal_level              | logical
(5 rows)
  1. (MASTER) Create a dedicated user for logical replication
ofw_production=> CREATE USER logical_replication_user WITH password '***';
CREATE ROLE
ofw_production=> GRANT rds_replication TO logical_replication_user;
GRANT ROLE
ofw_production=> show search_path ;
   search_path
-----------------
 "$user", public
(1 row)

ofw_production=> GRANT SELECT ON ALL TABLES IN SCHEMA public TO logical_replication_user;
GRANT
  1. (MASTER) Create a publication on the MASTER
ofw_production=> CREATE PUBLICATION pglogical_rep01 FOR ALL TABLES;
CREATE PUBLICATION
ofw_production=> SELECT pg_create_logical_replication_slot('pglogical_rep01', 'pgoutput');
 pg_create_logical_replication_slot
------------------------------------
 (pglogical_rep01,1187/8E161038)
(1 row)
  1. (MASTER) Create a snapshot of the MASTER database
 ~/ aws rds create-db-snapshot --db-instance-identifier aws-production-acp-edge --db-snapshot-identifier pre-logical-replication | cat
  1. (REPLICA) Create a REPLICA from that snapshot (we used Terraform for this step)

Note

Ensure that the track_commit_timestamp parameter is also set to "on" on the replica.

 ~/repos/aws-infrastructure-code/terraform/ [master] bundle exec rake 'terraform:plan_and_apply[acp_edge_deps,production-eu-west-1]'

rds_logical_replica.tf (Github)

  1. (REPLICA) Get the latest LSN position from MASTER — fetch it from a line containing "redo done" (in this example: 1187/8E8534D0)
 ~/ aws rds describe-db-log-files --db-instance-identifier aws-production-acp-edge-logical-replica | tail -10
            "LogFileName": "error/postgresql.log.2022-04-26-12",
            "LastWritten": 1650977966404,
            "Size": 8219
        },
        {
            "LogFileName": "error/postgresql.log.2022-04-26-13",
            "LastWritten": 1650978960608,
            "Size": 4049
        }
    ]
 ~/ aws rds download-db-log-file-portion --db-instance-identifier aws-production-acp-edge-logical-replica --log-file-name "error/postgresql.log.2022-04-26-12" | grep -oE "redo done.*"
redo done at 1187/8E8534D0\n2022-04-26 12:55:48 UTC::@:[7954]:LOG:  last completed transaction was at log time 2022-04-26 12:16:09.080016+00\n2022-04-26 12:55:48 UTC::@:[7954]:LOG:  checkpoint starting: end-of-recovery immediate\n2022-04-26 12:55:49 UTC::@:[7954]:LOG:  checkpoint complete: wrote 1567 buffers (0.2%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.487 s, sync=0.224 s, total=0.725 s; sync files=44, longest=0.199 s, average=0.006 s; distance=94207 kB, estimate=94207 kB\n2022-04-26 12:55:49 UTC::@:[7875]:LOG:  database system is ready to accept connections\n2022-04-26 12:59:26 UTC::@:[7875]:LOG:  received SIGHUP, reloading configuration files\n2022-04-26 12:59:26 UTC::@:[7875]:LOG:  parameter \"track_commit_timestamp\" cannot be changed without restarting the server\n2022-04-26 12:59:26 UTC::@:[7875]:LOG:  configuration file \"/rdsdbdata/config/postgresql.conf\" contains errors; unaffected changes were applied\n"
  1. (REPLICA) Create a subscription
 ~/ psql postgresql://acpedge:********@aws-production-acp-edge-logical-replica.c7bparwcbb52.eu-west-1.rds.amazonaws.com/ofw_production
psql (13.2, server 11.13)
SSL connection (protocol: TLSv1.2, cipher: ECDHE-RSA-AES256-GCM-SHA384, bits: 256, compression: off)
Type "help" for help.

ofw_production=> CREATE SUBSCRIPTION pglogical_sub01 CONNECTION 'host=aws-production-acp-edge.c7bparwcbb52.eu-west-1.rds.amazonaws.com port=5432 dbname=ofw_production user=logical_replication_user password=********' PUBLICATION pglogical_rep01
ofw_production-> WITH (
ofw_production(>   copy_data = false,
ofw_production(>   create_slot = false,
ofw_production(>   enabled = false,
ofw_production(>   connect = true,
ofw_production(>   slot_name = 'pglogical_rep01'
ofw_production(> );
CREATE SUBSCRIPTION
  1. (REPLICA) Get subscription id
ofw_production=> SELECT 'pg_'||oid::text AS "external_id"
ofw_production-> FROM pg_subscription
ofw_production-> WHERE subname = 'pglogical_sub01';
 external_id
-------------
 pg_27084319
(1 row)
  1. (REPLICA) Advance the subscription id to the LSN position from step 6
ofw_production=> SELECT pg_replication_origin_advance('pg_27084319', '1187/8E8534D0');
 pg_replication_origin_advance
-------------------------------

(1 row)
  1. (REPLICA) Enable the subscription
ofw_production=> ALTER SUBSCRIPTION pglogical_sub01 ENABLE;
ALTER SUBSCRIPTION
  1. (MASTER) Validate (values should be equal)
ofw_production=> select confirmed_flush_lsn, pg_current_wal_lsn() from pg_replication_slots;
 confirmed_flush_lsn | pg_current_wal_lsn
---------------------+--------------------
 118D/B3C46AA8       | 118D/B3C46AA8
(1 row)