[postgres@db1 .ssh]$ ssh-copy-id -i id_rsa.pub db2
The authenticity of host 'db2 (192.168.100.38)' can't be established.
RSA key fingerprint is 70:02:66:0a:f5:3a:62:52:55:a2:98:b1:1c:d7:6c:73.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'db2,192.168.100.38' (RSA) to the list of known hosts.
postgres@db2's password:
Now try logging into the machine,with "ssh 'db2'",and check in:
.ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
验证:
[postgres@db1 .ssh]$ ssh db2 date
Thu Dec 19 01:25:48 EST 2013
【当然也可通过说明文档中指出的使用walmgr3 �ssh-keygen 命令做免密码登录认证】
2.5.3 配置并启动主库
[root@db1 ~]# cd /opt/pg93/
[root@db1 pg93]# mkdir data
[root@db1 pg93]# chown postgres:postgres data/
[root@db1 pg93]# su - postgres
[postgres@db1 ~]$ initdb -D /opt/pg93/data/
[postgres@db1 data]$ vi postgresql.conf
listen_addresses = '*'
port = 5432
logging_collector = on
log_filename = 'postgresql.log'
log_connections = on
log_disconnections = on
[postgres@masterdata]$ pg_ctl start
2.5.4 配置walmgr3
主端配置文件:
[postgres@db1 pg93]$ cat wal-master.ini
[walmgr]
job_name = wal-master
logfile = /opt/pg93/data/log/%(job_name)s.log
pidfile = /opt/pg93/data/pid/%(job_name)s.pid
use_skylog = 0
master_db = port=5432 host=db1 dbname=template1
master_data = /opt/pg93/data
master_config = /opt/pg93/data/postgresql.conf
master_bin = /opt/pg93/bin
# set this only if you can afford database restarts during setup and stop.
# master_restart_cmd = pg_ctlcluster 9.1 main restart
slave = db2
slave_config = /opt/pg93/wal-slave.ini
walmgr_data = db2:/opt/pg93/slave_walmanager/backup
completed_wals = %(walmgr_data)s/logs.complete
partial_wals = %(walmgr_data)s/logs.partial
full_backup = %(walmgr_data)s/data.master
config_backup = %(walmgr_data)s/config.backup
# syncdaemon update frequency
loop_delay = 10.0
# use record based shipping available since 8.2
use_xlog_functions = 0
# pass -z to rsync,useful on low bandwidth links
compression = 0
# keep symlinks for pg_xlog and pg_log
keep_symlinks = 1
# tell walmgr to set wal_level to hot_standby during setup
hot_standby = 1
# periodic sync
#command_interval = 600
#periodic_command = /var/lib/postgresql/walshipping/periodic.sh
备端配置文件:
[postgres@db2pg93]$ cat wal-slave.ini
job_name = wal-standby
slave_data = /opt/pg93/data
slave_bin = /opt/pg93/bin
slave_stop_cmd = /opt/pg93/bin/pg_ctl -D /opt/pg93/data stop
slave_start_cmd = /opt/pg93/bin/pg_ctl -D /opt/pg93/data start
#slave_config_dir = /tmp/test_slave
slave_config_dir = /opt/pg93/data
walmgr_data = /opt/pg93/slave_walmanager/backup
backup_datadir = no
keep_backups = 0
# archive_command =
# primary database connect string for hot standby -- enabling
# this will cause the slave to be started in hot standby mode.
primary_conninfo = host=db1 user=postgres port=5432
配置复制:
[postgres@db1 pg93]$ walmgr3 wal-master.ini setup
2013-12-20 02:05:19,289 32538 INFO Configuring WAL archiving
LOG: received SIGHUP,reloading configuration files
LOG: parameter "archive_command" changed to "/opt/skytools/bin/walmgr3 /opt/pg93/wal-master.ini xarchive %p %f"
此时配置文件postgresql.conf中如下参数发生变化:
wal_level = 'hot_standby'
archive_mode = 'on'
archive_command = '/opt/skytools/bin/walmgr3 /opt/pg93/data/wal-master.ini xarchive %p %f'
并且,在db2中生成了备份目录:
[postgres@db2pg93]$ ll slave_walmanager/backup/
total 16
drwxrwxr-x 2 postgres postgres 4096 Dec 19 22:04 config.backup
drwxrwxr-x 2 postgres postgres 4096 Dec 19 22:04 data.master
drwxrwxr-x 2 postgres postgres 4096 Dec 19 22:04 logs.complete
drwxrwxr-x 2 postgres postgres 4096 Dec 19 22:04 logs.partial
修改max_wal_senders参数:
max_wal_senders = 3
在pg_hba.conf中加入以下两条:
[postgres@db1 data]$ vi pg_hba.conf
host postgres postgres 192.168.100.0/24 trust
host replication postgres 192.168.100.0/24 trust
执行基础备份:
[postgres@db1 pg93]$ walmgr3 wal-master.ini backup
2013-12-20 02:05:58,339 28752 INFO Backup lock obtained.
sql: select pg_start_backup('FullBackup'); [port=5432 host=db1 dbname=template1]
2013-12-20 02:05:59,015 32550 INFO {count: 1,duration: 0.518224000931}
2013-12-20 02:06:00,098 32547 INFO Checking tablespaces
sql: select pg_stop_backup(); [port=5432 host=db1 dbname=template1]
2013-12-20 02:06:01,313 32564 INFO {count: 1,duration: 0.562467098236}
2013-12-20 02:06:02,876 28878 INFO Backup lock released.
Backup过程会在walmgr_data中生成基础备份数据,如下:
[postgres@db2pg93]$ ls slave_walmanager/backup/
config.backup/ data.master/ logs.complete/ logs.partial/
config.backup data.master logs.complete logs.partial
[postgres@db2pg93]$ ls slave_walmanager/backup/data.master/
PG_VERSION backup_label.old global pg_clog pg_notify pg_snapshots pg_stat_tmp pg_twophase pid
backup_label base log pg_multixact pg_serial pg_stat pg_subtrans pg_xlog recovery.conf
在备端执行恢复:
[postgres@db2pg93]$ walmgr3 wal-slave.ini restore
2013-12-20 02:09:38,190 28909 WARNING backup_datadir is disabled,deleting old data dir
server starting
[postgres@db2pg93]$ LOG: database system was interrupted; last known up at 2013-12-20 02:05:58 EST
LOG: entering standby mode
LOG: restored log file "00000003000000000000002B" from archive
LOG: redo starts at 0/2B000028
LOG: consistent recovery state reached at 0/2B0000F0
LOG: fetching timeline history file for timeline 3 from primary server
LOG: started streaming WAL from primary at 0/2C000000 on timeline 3
[postgres@db2pg93]$ cat data/recovery.conf
restore_command = '/opt/skytools/bin/walmgr3 /opt/pg93/wal-slave.ini xrestore %f "%p" %r'
standby_mode = 'on'
trigger_file = '/opt/pg93/slave_walmanager/backup/logs.complete/STOP'
primary_conninfo = 'host=db1 user=postgres port=5432'
archive_cleanup_command = '/opt/pg93/bin/pg_archivecleanup /opt/pg93/slave_walmanager/backup/logs.complete %r'
【至此,流复制已经实现。可将db1与db2的角色互换重复以上过程进行配置,便可实现db1与db2之间互相切换。】
2.5.5 验证
主端:
[postgres@db1 pg93]$ ps -ef | grep post
postgres 605 1 0 02:48 pts/1 00:00:00 /opt/pg93/bin/postgres
postgres 607 605 0 02:48 ? 00:00:00 postgres: checkpointer process
postgres 608 605 0 02:48 ? 00:00:00 postgres: writer process
postgres 609 605 0 02:48 ? 00:00:00 postgres: wal writer process
postgres 610 605 0 02:48 ? 00:00:00 postgres: autovacuum launcher process
postgres 611 605 0 02:48 ? 00:00:00 postgres: archiver process last was 000000030000000000000030.00000028.backup
postgres 612 605 0 02:48 ? 00:00:00 postgres: stats collector process
postgres 660 605 0 02:50 ? 00:00:00 postgres:wal sender processpostgres 192.168.100.38(56960) streaming 0/310000C8
[postgres@db1 pg93]$ createdb pgbench
[postgres@db1 pg93]$ pgbench -i -s 1 pgbench
NOTICE: table "pgbench_history" does not exist,SimSun;vertical-align:baseline;">NOTICE: table "pgbench_tellers" does not exist,SimSun;vertical-align:baseline;">NOTICE: table "pgbench_accounts" does not exist,SimSun;vertical-align:baseline;">NOTICE: table "pgbench_branches" does not exist,SimSun;vertical-align:baseline;">creating tables...
100000 of 100000 tuples (100%) done (elapsed 0.20 s,remaining 0.00 s).
vacuum...
set primary keys...
done.
[postgres@db1 pg93]$ psql
psql (9.3.2)
Type "help" for help.
postgres=# select pg_current_xlog_location();
pg_current_xlog_location
--------------------------
0/31CB94B0
(1 row)
备端:
[postgres@db2pg93]$ ps -ef | grep post
root 24384 24063 0 Dec19 pts/1 00:00:00 su - postgres
postgres 24385 24384 0 Dec19 pts/1 00:00:00 -bash
postgres 29539 1 0 02:50 pts/1 00:00:00 /opt/pg93/bin/postgres -D /opt/pg93/data
postgres 29540 29539 0 02:50 ? 00:00:00 postgres: startup process recovering 000000030000000000000031
postgres 29544 29539 0 02:50 ? 00:00:00 postgres: checkpointer process
postgres 29545 29539 0 02:50 ? 00:00:00 postgres: writer process
postgres 29546 29539 0 02:50 ? 00:00:00 postgres: stats collector process
postgres 29548 29539 0 02:50 ? 00:00:00 postgres:wal receiver process streaming 0/310000C8
[postgres@db2pg93]$ psql pgbench
pgbench=# \d+
Listof relations
Schema | Name | Type | Owner | Size | Description
--------+------------------+-------+----------+---------+-------------
public | pgbench_accounts | table | postgres | 13 MB |
public | pgbench_branches | table | postgres | 16 kB |
public | pgbench_history | table | postgres | 0 bytes |
public | pgbench_tellers | table | postgres | 16 kB |
(4 rows)
pgbench=# select pg_last_xlog_receive_location();
pg_last_xlog_receive_location
-------------------------------
pgbench=# select pg_last_xlog_replay_location();
pg_last_xlog_replay_location
------------------------------
(1 row)
三、管理篇
【接2.2节环境进行以下实验】
3.1、变更拓扑关系
3.1.1 change-provider
【改变一个节点的provider】
当前拓扑关系如下:
[postgres@londiste1 londiste3]$ londiste3 db1.ini status
Queue: replika Local node: node1
node1 (root)
| Tables:1/0/0
| Lag:15s,Tick: 439
+--:node2 (branch)
| | Tables:1/0/0
| | Lag:15s,SimSun;vertical-align:baseline;"> | +--:node4 (branch)
+--:node3 (branch)
+--:node5 (branch)
Tables:1/0/0
Lag:15s,SimSun;font-size:14px;font-weight:bold;vertical-align:baseline;">将node4的provider更改为node3,如下:
[postgres@londiste1 londiste3]$ londiste3 db4.ini change-provider --provider=node3
2013-09-21 10:14:40,353 25458 INFO [node4] Consumer londiste_db4 tagged as paused
2013-09-21 10:14:41,371 25458 INFO Consumer 'londiste_db4' on node 'node4' paused
2013-09-21 10:14:42,436 25458 INFO Consumer 'londiste_db4' on node 'node4' resumed
| Lag:1s,Tick: 448
+--:node4 (branch)
Lag:1s,SimSun;font-size:14px;vertical-align:baseline;">{node3已顺利接管node4}
将node5更改为node4的子节点,如下:
[postgres@londiste1 londiste3]$ londiste3 db5.ini change-provider --provider=node4
2013-09-21 10:16:11,661 25488 INFO [node5] Consumer londiste_db5 tagged as paused
2013-09-21 10:16:12,677 25488 INFO Consumer 'londiste_db5' on node 'node5' paused
2013-09-21 10:16:13,743 25488 INFO Consumer 'londiste_db5' on node 'node5' resumed
| Lag:3s,Tick: 451
Lag:3s,SimSun;font-size:14px;vertical-align:baseline;">{node5已成为node4的子节点}
将node4更改为node2的子节点,如下:
[postgres@londiste1 londiste3]$ londiste3 db4.ini change-provider --provider=node2
2013-09-21 10:17:20,633 25517 INFO [node4] Consumer londiste_db4 tagged as paused
2013-09-21 10:17:21,647 25517 INFO Consumer 'londiste_db4' on node 'node4' paused
2013-09-21 10:17:22,713 25517 INFO Consumer 'londiste_db4' on node 'node4' resumed
| Lag:45s,Tick: 452
Lag:45s,SimSun;font-size:14px;vertical-align:baseline;">{发现node4会携带其子节点受node2接管}
3.1.2 takeover
【使A节点直接接管B节点,接上小节】
使node3接管node5,如下:
[postgres@londiste1 londiste3]$ londiste3 db3.ini takeover node5
2013-09-21 10:19:58,929 25566 INFO old: node5
2013-09-21 10:19:59,008 25566 INFO Waiting for worker to accept
2013-09-21 10:20:00,013 25566 INFO Consumer 'londiste_db5' on node 'node5' paused
2013-09-21 10:20:01,077 25566 INFO Consumer 'londiste_db5' on node 'node5' resumed
| Lag:2s,Tick: 459
| | Lag:2s,SimSun;vertical-align:baseline;"> Lag:2s,SimSun;font-size:14px;vertical-align:baseline;">{已顺利接管}