Switchover steps:
Step 1. Do a clean shutdown of the Primary [5432] (-m fast or smart).
[postgres@localhost:/~]$ /opt/Postgresql/9.3/bin/pg_ctl -D /opt/Postgresql/9.3/data stop -mf
waiting for server to shut down.... done
server stopped
Step 2. Check the sync status and recovery status of the Standby [5433] before promoting it:
[postgres@localhost:/opt/Postgresql/9.3~]$ psql -p 5433 -c 'select pg_last_xlog_receive_location() "receive_location", pg_last_xlog_replay_location() "replay_location", pg_is_in_recovery() "recovery_status";'
 receive_location | replay_location | recovery_status
------------------+-----------------+-----------------
 2/9F000A20       | 2/9F000A20      | t
(1 row)
The Standby is in complete sync: the receive and replay locations match. At this stage it is safe to promote it to Primary.
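The comparison in Step 2 can also be scripted instead of checked by eye. Here is a minimal sketch in POSIX shell, assuming 64-bit shell arithmetic; lsn_to_int and lsn_lag are illustrative helper names, not PostgreSQL tools. It treats an xlog location such as 2/9F000A20 as two hexadecimal halves and prints how many bytes the replay location is behind the receive location.

```shell
# Hypothetical helpers (not shipped with PostgreSQL): compare two xlog
# locations of the form "HI/LO", where both parts are hexadecimal.

# Convert "HI/LO" to a single integer: HI * 2^32 + LO.
lsn_to_int() {
    lsn_hi=${1%%/*}
    lsn_lo=${1##*/}
    echo $(( (0x$lsn_hi << 32) + 0x$lsn_lo ))
}

# Print the byte gap between a receive location ($1) and a replay location ($2).
lsn_lag() {
    echo $(( $(lsn_to_int "$1") - $(lsn_to_int "$2") ))
}

# Values taken from the psql output in Step 2; a gap of 0 means fully replayed.
lsn_lag 2/9F000A20 2/9F000A20
```

A result of 0 corresponds to the "complete sync" state described above; anything larger means the Standby still has WAL to replay.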
Step 3. Open the Standby as the new Primary with pg_ctl promote or by creating a trigger file.
[postgres@localhost:/opt/Postgresql/9.3~]$ grep trigger_file data_slave/recovery.conf
trigger_file='/tmp/primary_down.txt'
[postgres@localhost:/opt/Postgresql/9.3~]$ touch /tmp/primary_down.txt
[postgres@localhost:/opt/Postgresql/9.3~]$ psql -p 5433 -c "select pg_is_in_recovery();"
 pg_is_in_recovery
-------------------
 f
(1 row)

In Logs:
2014-12-29 00:16:04 PST-26344--[host=]LOG:  trigger file found: /tmp/primary_down.txt
2014-12-29 00:16:04 PST-26344--[host=]LOG:  redo done at 2/A0000028
2014-12-29 00:16:04 PST-26344--[host=]LOG:  selected new timeline ID: 14
2014-12-29 00:16:04 PST-26344--[host=]LOG:  restored log file "0000000D.history" from archive
2014-12-29 00:16:04 PST-26344--[host=]LOG:  archive recovery complete
2014-12-29 00:16:04 PST-26342--[host=]LOG:  database system is ready to accept connections
2014-12-29 00:16:04 PST-31874--[host=]LOG:  autovacuum launcher started
The Standby has been promoted to master and has switched to a new timeline, which you can see in the logs.
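The timeline numbers in these logs are easier to follow once you remember that history file names are hexadecimal: 0000000D.history belongs to timeline 13, and the newly selected timeline 14 is written as 0000000E.history. A small sketch in POSIX shell to convert between the two; both function names are illustrative helpers, not PostgreSQL tools.

```shell
# Hypothetical helper: print the decimal timeline ID encoded in a
# timeline history file name such as "0000000D.history".
timeline_of_history() {
    tl_hex=${1%.history}      # strip the ".history" suffix
    echo $(( 0x$tl_hex ))     # the remaining eight characters are hexadecimal
}

# Hypothetical helper: print the history file name for a decimal timeline ID.
history_of_timeline() {
    printf '%08X.history\n' "$1"
}

timeline_of_history 0000000D.history   # prints 13
history_of_timeline 14                 # prints 0000000E.history
```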
Step 4. Restart the old Primary as a standby and let it follow the new timeline by setting "recovery_target_timeline='latest'" in the $PGDATA/recovery.conf file.
[postgres@localhost:/opt/Postgresql/9.3~]$ cat data/recovery.conf
recovery_target_timeline='latest'
standby_mode=on
primary_conninfo='host=localhost port=5433 user=postgres'
restore_command='cp /opt/Postgresql/9.3/archives93/%f %p'
trigger_file='/tmp/primary_131_down.txt'
[postgres@localhost:/opt/Postgresql/9.3~]$ /opt/Postgresql/9.3/bin/pg_ctl -D /opt/Postgresql/9.3/data start
server starting
If you go through recovery.conf, it is clear that the old Primary, now started as the new Standby, connects to port 5433 and points to the common WAL archive location.
In Logs:
2014-12-29 00:21:17 PST-32315--[host=]LOG:  database system was shut down at 2014-12-29 00:12:23 PST
2014-12-29 00:21:17 PST-32315--[host=]LOG:  restored log file "0000000E.history" from archive
2014-12-29 00:21:17 PST-32315--[host=]LOG:  entering standby mode
2014-12-29 00:21:17 PST-32315--[host=]LOG:  restored log file "0000000D00000002000000A0" from archive
2014-12-29 00:21:17 PST-32315--[host=]LOG:  restored log file "0000000D.history" from archive
2014-12-29 00:21:17 PST-32315--[host=]LOG:  consistent recovery state reached at 2/A0000090
2014-12-29 00:21:17 PST-32315--[host=]LOG:  record with zero length at 2/A0000090
2014-12-29 00:21:17 PST-32310--[host=]LOG:  database system is ready to accept read only connections
2014-12-29 00:21:17 PST-32325--[host=]LOG:  started streaming WAL from primary at 2/A0000000 on timeline 14
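The restored file name 0000000D00000002000000A0 and the streaming start point 2/A0000000 are related: the first eight hex digits of a WAL segment name are the timeline, and the remaining sixteen identify where the segment starts. A rough sketch of the decoding in POSIX shell, assuming the default 16 MB WAL segment size; wal_segment_info is an illustrative helper name, not a PostgreSQL tool.

```shell
# Hypothetical helper: decode a 24-character WAL segment file name into
# its timeline ID and starting xlog location.
# Assumption: default 16 MB (16777216-byte) WAL segments.
wal_segment_info() {
    tl_hex=$(printf '%s' "$1" | cut -c1-8)      # timeline ID
    log_hex=$(printf '%s' "$1" | cut -c9-16)    # high half of the location
    seg_hex=$(printf '%s' "$1" | cut -c17-24)   # segment number within it
    printf 'timeline=%d start=%X/%X\n' \
        "$(( 0x$tl_hex ))" "$(( 0x$log_hex ))" "$(( 0x$seg_hex * 16777216 ))"
}

# Segment restored from the archive in the log above.
wal_segment_info 0000000D00000002000000A0   # prints timeline=13 start=2/A0000000
```

So the segment restored from the archive was written on timeline 13 and starts at 2/A0000000, the same location from which streaming then begins on timeline 14.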
Step 5. Verify the new Standby status.
[postgres@localhost:/opt/Postgresql/9.3~]$ psql -p 5432 -c "select pg_is_in_recovery();"
 pg_is_in_recovery
-------------------
 t
(1 row)
Cool, without any re-setup we have brought back the old Primary as the new Standby.
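The recovery.conf files of the two nodes differ only in the port they connect to and the trigger file they watch, so the file can be generated rather than edited by hand. A minimal sketch in POSIX shell, assuming the paths and user from this setup; make_recovery_conf is an illustrative helper name.

```shell
# Hypothetical helper: print a recovery.conf like the one used in Step 4.
#   $1 = port of the server to follow
#   $2 = trigger file path for this standby
make_recovery_conf() {
    cat <<EOF
recovery_target_timeline='latest'
standby_mode=on
primary_conninfo='host=localhost port=$1 user=postgres'
restore_command='cp /opt/Postgresql/9.3/archives93/%f %p'
trigger_file='$2'
EOF
}

# Old Primary (data directory "data") following the new Primary on port 5433.
make_recovery_conf 5433 /tmp/primary_131_down.txt
```

Redirecting the output to $PGDATA/recovery.conf before starting the server gives the same file as in Step 4.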
Switchback steps:
Step 1. Do a clean shutdown of the new Primary [5433]:
[postgres@localhost:/opt/~]$ /opt/Postgresql/9.3/bin/pg_ctl -D /opt/Postgresql/9.3/data_slave stop -mf
waiting for server to shut down.... done
server stopped
Step 2. Check the sync status of the new Standby [5432] before promoting.
Step 3. Open the new Standby [5432] as Primary by creating the trigger file or with pg_ctl promote.
[postgres@localhost:/opt/Postgresql/9.3~]$ touch /tmp/primary_131_down.txt
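The promotion steps mention pg_ctl promote as an alternative to the trigger file, but only the touch is shown. A minimal sketch of the pg_ctl route in POSIX shell, assuming the install path used throughout this post; promote_standby and the PG_CTL variable are illustrative names.

```shell
# Path to pg_ctl; defaults to the install location used in this post
# and can be overridden through the environment.
PG_CTL=${PG_CTL:-/opt/Postgresql/9.3/bin/pg_ctl}

# Hypothetical helper: promote the standby whose data directory is $1.
promote_standby() {
    "$PG_CTL" -D "$1" promote
}
```

For the switchback this would be called as promote_standby /opt/Postgresql/9.3/data, after which pg_is_in_recovery() on port 5432 should return f.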
Step 4. Restart the stopped new Primary [5433] as the new Standby.
[postgres@localhost:/opt/Postgresql/9.3~]$ more data_slave/recovery.conf
recovery_target_timeline='latest'
standby_mode=on
primary_conninfo='host=localhost port=5432 user=postgres'
restore_command='cp /opt/Postgresql/9.3/archives93/%f %p'
trigger_file='/tmp/primary_down.txt'
[postgres@localhost:/opt/Postgresql/9.3~]$ /opt/Postgresql/9.3/bin/pg_ctl -D /opt/Postgresql/9.3/data_slave start
server starting
You can verify this in the logs of the new Standby.
In logs:
[postgres@localhost:/opt/Postgresql/9.3/data_slave/pg_log~]$ more postgresql-2014-12-29_003655.log
2014-12-29 00:36:55 PST-919--[host=]LOG:  database system was shut down at 2014-12-29 00:34:01 PST
2014-12-29 00:36:55 PST-919--[host=]LOG:  restored log file "0000000F.history" from archive
2014-12-29 00:36:55 PST-919--[host=]LOG:  entering standby mode
2014-12-29 00:36:55 PST-919--[host=]LOG:  restored log file "0000000F.history" from archive
2014-12-29 00:36:55 PST-919--[host=]LOG:  restored log file "0000000E00000002000000A1" from archive
2014-12-29 00:36:55 PST-919--[host=]LOG:  restored log file "0000000E.history" from archive
2014-12-29 00:36:55 PST-919--[host=]LOG:  consistent recovery state reached at 2/A1000090
2014-12-29 00:36:55 PST-919--[host=]LOG:  record with zero length at 2/A1000090
2014-12-29 00:36:55 PST-914--[host=]LOG:  database system is ready to accept read only connections
2014-12-29 00:36:55 PST-929--[host=]LOG:  started streaming WAL from primary at 2/A1000000 on timeline 15
2014-12-29 00:36:56 PST-919--[host=]LOG:  redo starts at 2/A1000090
Very nice, in very little time we have switched the duties of the Primary and Standby servers. You can also see the timeline ID increment in the logs with each promotion.
Like the others, all my posts are part of knowledge sharing; any comments or corrections are most welcome. :)
--Raghav