GreenPlum作为一个很好的开源的数据仓库,一直想体验一把,最近装了一个.关于安装,网上资料介绍也多了不少,至少越来越多,不过真正装一把才知道还是有各种坑的,不是那么一帆风顺的.
Greenplum在线安装文档 http://gpdb.docs.pivotal.io/500/install_guide/prep_os_install_gpdb.html#topic8
整个文档的确是相当长,而且还需要一点E文和Linux.我也没有完全按照上面说的来操作.下面记录下基本的操作和遇到的一些问题.
1,调整系统参数,具体的不贴了,文档上面也有,Centos6.5和7上面的操作还是有些区别的
vi /etc/sysctl.conf
vi /etc/security/limits.conf
2,防火墙和HOSTS配置,这些操作看似简单,但是最容易出错,一定要注意检查.
3,下载,解压,安装,具体步骤省略......
gpadmin用户下
vi ~/.bash_profile
source /usr/local/greenplum-db/greenplum_path.sh
export MASTER_DATA_DIRECTORY=/data/master/gpseg-1
或者 source /usr/local/greenplum-db/greenplum_path.sh
指定hostfile_exkeys文件,里面包括所有的主机,这个文件放的位置可以随意,只要下面执行的能找到文件或者参数里面把路径加上就可以
gpseginstall -f hostfile_exkeys
确认安装情况
gpssh -f hostfile_exkeys -e ls -l $GPHOME
gpssh-exkeys -f hostfile_exkeys
MASTER上创建数据存储目录
mkdir /data/master
chown gpadmin /data/master
Segment上创建数据存储目录
gpssh -f hostfile_gpssh_segonly -e 'mkdir /data/primary'
gpssh -f hostfile_gpssh_segonly -e 'mkdir /data/mirror'
gpssh -f hostfile_gpssh_segonly -e 'chown gpadmin /data/primary'
gpssh -f hostfile_gpssh_segonly -e 'chown gpadmin /data/mirror'
cd /home/gpadmin/gpconfigs (没有就创建目录)
vigpinitsystem_config
ARRAY_NAME="EMC Greenplum DW"
SEG_PREFIX=gpseg
PORT_BASE=40000
declare -a DATA_DIRECTORY=(/data/primary)
MASTER_HOSTNAME=vm11
MASTER_DIRECTORY=/data/master
MASTER_PORT=5432
TRUSTED_SHELL=ssh
CHECK_POINT_SEGMENTS=8
ENCODING=UNICODEdeclare -a MIRROR_DATA_DIRECTORY=(/data/mirror)
DATABASE_NAME=gpexmp
MACHINE_LIST_FILE=/home/gpadmin/seg_hosts
我把所有的注释都去掉了,这里面几个是一定要配置的,需要注意!
5,开始初始化GP.
gpinitsystem -c /home/gpadmin/gpconfigs/gpinitsystem_config 执行之后的日志是这样的:
[gpadmin@vm11 ~]$ gpinitsystem -c /home/gpadmin/gpconfigs/gpinitsystem_config
20171006:06:17:43:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Checking configuration parameters,please wait...
20171006:06:17:43:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Reading Greenplum configuration file /home/gpadmin/gpconfigs/gpinitsystem_config
20171006:06:17:43:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Locale has not been set in /home/gpadmin/gpconfigs/gpinitsystem_config,will set to default value
20171006:06:17:44:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Locale set to en_US.utf8
20171006:06:17:44:109203 gpinitsystem:vm11:gpadmin-[INFO]:-MASTER_MAX_CONNECT not set,will set to default value 250
20171006:06:17:44:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Checking configuration parameters,Completed
20171006:06:17:44:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Commencing multi-home checks,please wait...
...
20171006:06:17:45:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Configuring build for standard array
20171006:06:17:45:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Commencing multi-home checks,Completed
20171006:06:17:45:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Building primary segment instance array,please wait...
...
20171006:06:17:47:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Checking Master host
20171006:06:17:47:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Checking new segment hosts,please wait...
..20171006:06:17:52:109203 gpinitsystem:vm11:gpadmin-[WARN]:-----------------------------------------------------------
20171006:06:17:52:109203 gpinitsystem:vm11:gpadmin-[WARN]:-Host vm22 is assigned as localhost in /etc/hosts
20171006:06:17:52:109203 gpinitsystem:vm11:gpadmin-[WARN]:-This will cause segment->master communication failures
20171006:06:17:52:109203 gpinitsystem:vm11:gpadmin-[WARN]:-Remove vm22 from local host line in /etc/hosts
20171006:06:17:52:109203 gpinitsystem:vm11:gpadmin-[WARN]:-----------------------------------------------------------
.
20171006:06:17:54:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Checking new segment hosts,Completed
20171006:06:17:54:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Greenplum Database Creation Parameters
20171006:06:17:54:109203 gpinitsystem:vm11:gpadmin-[INFO]:---------------------------------------
20171006:06:17:54:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Master Configuration
20171006:06:17:54:109203 gpinitsystem:vm11:gpadmin-[INFO]:---------------------------------------
20171006:06:17:54:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Master instance name = EMC Greenplum DW
20171006:06:17:54:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Master hostname = vm11
20171006:06:17:54:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Master port = 5432
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Master instance dir = /data/master/gpseg-1
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Master LOCALE = en_US.utf8
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Greenplum segment prefix = gpseg
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Master Database = gpexmp
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Master connections = 250
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Master buffers = 128000kB
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Segment connections = 750
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Segment buffers = 128000kB
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Checkpoint segments = 8
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Encoding = UNICODE
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Postgres param file = Off
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Initdb to be used = /usr/local/greenplum-db/./bin/initdb
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-GP_LIBRARY_PATH is = /usr/local/greenplum-db/./lib
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Ulimit check = Passed
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Array host connect type = Single hostname per node
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Master IP address [1] = 192.168.122.1
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Master IP address [2] = 192.168.37.128
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Master IP address [3] = fe80::20c:29ff:feb1:4c90
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Standby Master = Not Configured
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Primary segment # = 1
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Total Database segments = 3
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Trusted shell = ssh
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Number segment hosts = 3
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Mirroring config = OFF
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:----------------------------------------
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Greenplum Primary Segment Configuration
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:----------------------------------------
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-vm11 /data/primary/gpseg0 40000 2 0
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-vm22 /data/primary/gpseg1 40000 3 1
20171006:06:17:55:109203 gpinitsystem:vm11:gpadmin-[INFO]:-vm33 /data/primary/gpseg2 40000 4 2
Continue with Greenplum creation Yy/Nn>
Y
20171006:06:18:22:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Building the Master instance database,please wait...
20171006:06:18:30:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Starting the Master in admin mode
20171006:06:18:56:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Commencing parallel build of primary segment instances
20171006:06:18:57:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Spawning parallel processes batch [1],please wait...
...
20171006:06:18:57:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Waiting for parallel processes batch [1],please wait...
............................
20171006:06:19:26:109203 gpinitsystem:vm11:gpadmin-[INFO]:------------------------------------------------
20171006:06:19:26:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Parallel process exit status
20171006:06:19:26:109203 gpinitsystem:vm11:gpadmin-[INFO]:------------------------------------------------
20171006:06:19:26:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Total processes marked as completed = 3
20171006:06:19:26:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Total processes marked as killed = 0
20171006:06:19:26:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Total processes marked as Failed = 0
20171006:06:19:26:109203 gpinitsystem:vm11:gpadmin-[INFO]:------------------------------------------------
20171006:06:19:27:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Deleting distributed backout files
20171006:06:19:27:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Removing back out file
20171006:06:19:27:109203 gpinitsystem:vm11:gpadmin-[INFO]:-No errors generated from parallel processes
20171006:06:19:27:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Restarting the Greenplum instance in production mode
20171006:06:19:27:122973 gpstop:vm11:gpadmin-[INFO]:-Starting gpstop with args: -a -l /home/gpadmin/gpAdminLogs -i -m -d /data/master/gpseg-1
20171006:06:19:27:122973 gpstop:vm11:gpadmin-[INFO]:-Gathering information and validating the environment...
20171006:06:19:27:122973 gpstop:vm11:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20171006:06:19:27:122973 gpstop:vm11:gpadmin-[INFO]:-Obtaining Segment details from master...
20171006:06:19:27:122973 gpstop:vm11:gpadmin-[INFO]:-Greenplum Version: 'postgres (Greenplum Database) 4.3.16.1 build 1'
20171006:06:19:27:122973 gpstop:vm11:gpadmin-[INFO]:-There are 0 connections to the database
20171006:06:19:27:122973 gpstop:vm11:gpadmin-[INFO]:-Commencing Master instance shutdown with mode='immediate'
20171006:06:19:27:122973 gpstop:vm11:gpadmin-[INFO]:-Master host=vm11
20171006:06:19:27:122973 gpstop:vm11:gpadmin-[INFO]:-Commencing Master instance shutdown with mode=immediate
20171006:06:19:27:122973 gpstop:vm11:gpadmin-[INFO]:-Master segment instance directory=/data/master/gpseg-1
20171006:06:19:28:122973 gpstop:vm11:gpadmin-[INFO]:-Attempting forceful termination of any leftover master process
20171006:06:19:28:122973 gpstop:vm11:gpadmin-[INFO]:-Terminating processes for segment /data/master/gpseg-1
20171006:06:19:29:123060 gpstart:vm11:gpadmin-[INFO]:-Starting gpstart with args: -a -l /home/gpadmin/gpAdminLogs -d /data/master/gpseg-1
20171006:06:19:29:123060 gpstart:vm11:gpadmin-[INFO]:-Gathering information and validating the environment...
20171006:06:19:29:123060 gpstart:vm11:gpadmin-[INFO]:-Greenplum Binary Version: 'postgres (Greenplum Database) 4.3.16.1 build 1'
20171006:06:19:29:123060 gpstart:vm11:gpadmin-[INFO]:-Greenplum Catalog Version: '201310150'
20171006:06:19:29:123060 gpstart:vm11:gpadmin-[INFO]:-Starting Master instance in admin mode
20171006:06:19:30:123060 gpstart:vm11:gpadmin-[INFO]:-Obtaining Greenplum Master catalog information
20171006:06:19:30:123060 gpstart:vm11:gpadmin-[INFO]:-Obtaining Segment details from master...
20171006:06:19:30:123060 gpstart:vm11:gpadmin-[INFO]:-Setting new master era
20171006:06:19:30:123060 gpstart:vm11:gpadmin-[INFO]:-Master Started...
20171006:06:19:30:123060 gpstart:vm11:gpadmin-[INFO]:-Shutting down master
20171006:06:19:31:123060 gpstart:vm11:gpadmin-[INFO]:-Commencing parallel segment instance startup,please wait...
........
20171006:06:19:39:123060 gpstart:vm11:gpadmin-[INFO]:-Process results...
20171006:06:19:39:123060 gpstart:vm11:gpadmin-[INFO]:-----------------------------------------------------
20171006:06:19:39:123060 gpstart:vm11:gpadmin-[INFO]:- Successful segment starts = 3
20171006:06:19:39:123060 gpstart:vm11:gpadmin-[INFO]:- Failed segment starts = 0
20171006:06:19:39:123060 gpstart:vm11:gpadmin-[INFO]:- Skipped segment starts (segments are marked down in configuration) = 0
20171006:06:19:39:123060 gpstart:vm11:gpadmin-[INFO]:-----------------------------------------------------
20171006:06:19:39:123060 gpstart:vm11:gpadmin-[INFO]:-
20171006:06:19:39:123060 gpstart:vm11:gpadmin-[INFO]:-Successfully started 3 of 3 segment instances
20171006:06:19:39:123060 gpstart:vm11:gpadmin-[INFO]:-----------------------------------------------------
20171006:06:19:39:123060 gpstart:vm11:gpadmin-[INFO]:-Starting Master instance vm11 directory /data/master/gpseg-1
20171006:06:19:51:123060 gpstart:vm11:gpadmin-[INFO]:-Command pg_ctl reports Master vm11 instance active
20171006:06:19:53:123060 gpstart:vm11:gpadmin-[INFO]:-No standby master configured. skipping...
20171006:06:19:53:123060 gpstart:vm11:gpadmin-[INFO]:-Database successfully started
20171006:06:19:58:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Completed restart of Greenplum instance in production mode
20171006:06:19:58:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Loading gp_toolkit...
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Scanning utility log file for any warning messages
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[WARN]:-*******************************************************
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[WARN]:-Scan of log file indicates that some warnings or errors
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[WARN]:-were generated during the array creation
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Please review contents of log file
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-/home/gpadmin/gpAdminLogs/gpinitsystem_20171006.log
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-To determine level of criticality
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-These messages could be from a prevIoUs run of the utility
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-that was called today!
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[WARN]:-*******************************************************
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Greenplum Database instance successfully created
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-------------------------------------------------------
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-To complete the environment configuration,please
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-update gpadmin .bashrc file with the following
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-1. Ensure that the greenplum_path.sh file is sourced
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:-2. Add "export MASTER_DATA_DIRECTORY=/data/master/gpseg-1"
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:- to access the Greenplum scripts for this instance:
20171006:06:20:09:109203 gpinitsystem:vm11:gpadmin-[INFO]:- or,use -d /data/master/gpseg-1 option for the Greenplum scripts
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:- Example gpstate -d /data/master/gpseg-1
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Script log file = /home/gpadmin/gpAdminLogs/gpinitsystem_20171006.log
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:-To remove instance,run gpdeletesystem utility
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:-To initialize a Standby Master Segment for this Greenplum instance
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Review options for gpinitstandby
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:-------------------------------------------------------
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:-The Master /data/master/gpseg-1/pg_hba.conf post gpinitsystem
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:-has been configured to allow all hosts within this new
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:-array to intercommunicate. Any hosts external to this
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:-new array must be explicitly added to this file
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:-Refer to the Greenplum Admin support guide which is
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:-located in the /usr/local/greenplum-db/./docs directory
20171006:06:20:10:109203 gpinitsystem:vm11:gpadmin-[INFO]:-------------------------------------------------------
出错过好几次,总算搞定了. gpstop出错,目录存在,系统已经在运行...等等.
ps -ef | grep gpadmin
[gpadmin@vm11 ~]$ kill -9 101840
[gpadmin@vm11 ~]$ rm /tmp/.s.PGsql.5432.lock
rm -rf /data/master/gpseg-1
rm -rf /data/primary/gpseg1
rm -rf /data/primary/gpseg0
rm -rf /data/primary/gpseg1
rm -rf /data/primary/gpseg2
6,运行sql
[gpadmin@vm11 conf]$ psql -p 5432 -d gpexmp
psql (8.2.15)
Type "help" for help.
create table t1(id int primary key,cn int,name varchar(40)) distributed by (id);
insert into t1 select generate_series(1,100),generate_series(1,100);
select gp_segment_id,count(1) from t1 group by 1;
gp_segment_id | count
---------------+-------
2 | 37
1 | 32
0 | 31
看到效果了,成功了