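Adding Nodes to an Oracle 11g RAC Cluster
Note: this article walks through adding two nodes to a RAC cluster in a single pass. If you only need to add one node, the steps are essentially the same and can be followed as written.
I. Background
1. Environment overview
Existing environment: a two-node RAC cluster (node one and node two).
Target environment: two additional cluster nodes (node three and node four).
2. Existing environment configuration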
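Item                           Node one server   Node two server
Hostname                       hostrac1          hostrac2
Operating system               centos7.9_x64     centos7.9_x64
Cluster and database software  Oracle 11.2.0.4   Oracle 11.2.0.4
Public network                 192.168.11.22     192.168.11.23
Private network                10.10.10.22       10.10.10.23
Virtual IP (VIP)               192.168.11.24     192.168.11.25
SCAN IP                        192.168.11.26
Note: both nodes of the cluster share a single SCAN IP address.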
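3. Configuration of the new target nodes
Item                           Node three server  Node four server
Hostname                       hostrac3           hostrac4
Operating system               centos7.9_x64      centos7.9_x64
Cluster and database software  Oracle 11.2.0.4    Oracle 11.2.0.4
Public network                 192.168.11.27      192.168.11.28
Private network                10.10.10.27        10.10.10.28
Virtual IP (VIP)               192.168.11.29      192.168.11.30
SCAN IP                        192.168.11.26
Note: the new nodes use the same SCAN IP address as the existing nodes; the whole cluster shares it.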
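4. For the installation of the existing RAC cluster, see my article 《Oracle 11g RAC集群安装_linux7》.
5. Minimum requirements for the two new node servers (memory and root filesystem): at least 4 GB of memory. The root filesystem holds the GI cluster software and the Oracle software directories, so 50 GB or more is recommended.
II. Preparation
Note: steps 3-4 and steps 6-17 must be performed on both new node servers. For details, see my article 《Oracle 11g RAC集群安装_linux7》.
1. Configure the hostnames
Node three server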
hostnamectl set-hostname hostrac3
Node four server
hostnamectl set-hostname hostrac4
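2. Configure /etc/hosts
Run this on both new node servers and on the two existing cluster nodes. The existing nodes already contain the hostrac1/hostrac2 entries, so append only the hostrac3/hostrac4 lines there to avoid duplicates.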
cat >> /etc/hosts <<EOF
10.10.10.22 hostrac1-priv
10.10.10.23 hostrac2-priv
192.168.11.22 hostrac1
192.168.11.23 hostrac2
192.168.11.24 hostrac1-vip
192.168.11.25 hostrac2-vip
192.168.11.26 hostrac-scan
10.10.10.27 hostrac3-priv
10.10.10.28 hostrac4-priv
192.168.11.27 hostrac3
192.168.11.28 hostrac4
192.168.11.29 hostrac3-vip
192.168.11.30 hostrac4-vip
EOF
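Because the heredoc above appends every line unconditionally, running it on a node that already lists hostrac1 and hostrac2 would duplicate those entries. The following is a minimal sketch of an idempotent alternative, not part of the original procedure. The function names add_host_entry and add_new_node_entries and the HOSTS_FILE variable are hypothetical; the addresses are the ones from the tables above.

```shell
#!/bin/sh
# Sketch: append "IP hostname" pairs to a hosts file only when the hostname is missing.
# HOSTS_FILE defaults to a scratch file; set it to /etc/hosts (as root) for real use.
HOSTS_FILE="${HOSTS_FILE:-./hosts.test}"

add_host_entry() {
    # $1 = IP address, $2 = hostname
    # Look for the hostname as a whole field on any non-comment line.
    if awk -v h="$2" '
        /^[[:space:]]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == h) found = 1 }
        END { exit (found ? 0 : 1) }' "$HOSTS_FILE" 2>/dev/null; then
        echo "skip: $2 already in $HOSTS_FILE"
    else
        printf '%s %s\n' "$1" "$2" >> "$HOSTS_FILE"
        echo "added: $1 $2"
    fi
}

add_new_node_entries() {
    # Entries for node three and node four, as listed in the article.
    add_host_entry 10.10.10.27   hostrac3-priv
    add_host_entry 10.10.10.28   hostrac4-priv
    add_host_entry 192.168.11.27 hostrac3
    add_host_entry 192.168.11.28 hostrac4
    add_host_entry 192.168.11.29 hostrac3-vip
    add_host_entry 192.168.11.30 hostrac4-vip
}
```

On an existing node, setting HOSTS_FILE=/etc/hosts and calling add_new_node_entries as root should add only the six new lines, and running it a second time should change nothing.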
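3. Create the installation users for the cluster GI and the Oracle database
4. Create the directories
5. Configure SSH user equivalence for the oracle and grid users
Notes:
1) While the cluster software is being installed, keep the SSH port at the default 22. Once SSH equivalence is configured and the cluster software is installed, the port can be changed to another value, for example 60022.
2) SSH equivalence is already configured in the existing cluster, so only the new nodes need to be added, as follows
SSH equivalence for the grid user
Run on node three (hostrac3)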
su - grid
mkdir /home/grid/.ssh
cd /home/grid/.ssh
ssh-keygen -t rsa
scp grid@hostrac1:/home/grid/.ssh/authorized_keys /home/grid/.ssh/
cat id_rsa.pub >> authorized_keys
scp authorized_keys grid@hostrac1:/home/grid/.ssh/authorized_keys
scp authorized_keys grid@hostrac2:/home/grid/.ssh/authorized_keys
Run on node four (hostrac4)
su - grid
mkdir /home/grid/.ssh
cd /home/grid/.ssh
ssh-keygen -t rsa
scp grid@hostrac1:/home/grid/.ssh/authorized_keys /home/grid/.ssh/
cat id_rsa.pub >> authorized_keys
scp authorized_keys grid@hostrac1:/home/grid/.ssh/authorized_keys
scp authorized_keys grid@hostrac2:/home/grid/.ssh/authorized_keys
scp authorized_keys grid@hostrac3:/home/grid/.ssh/authorized_keys
SSH equivalence for the oracle user
Run on node three (hostrac3)
su - oracle
mkdir /home/oracle/.ssh
cd /home/oracle/.ssh
ssh-keygen -t rsa
scp oracle@hostrac1:/home/oracle/.ssh/authorized_keys /home/oracle/.ssh/
cat id_rsa.pub >> authorized_keys
scp authorized_keys oracle@hostrac1:/home/oracle/.ssh/authorized_keys
scp authorized_keys oracle@hostrac2:/home/oracle/.ssh/authorized_keys
Run on node four (hostrac4)
su - oracle
mkdir /home/oracle/.ssh
cd /home/oracle/.ssh
ssh-keygen -t rsa
scp oracle@hostrac1:/home/oracle/.ssh/authorized_keys /home/oracle/.ssh/
cat id_rsa.pub >> authorized_keys
scp authorized_keys oracle@hostrac1:/home/oracle/.ssh/authorized_keys
scp authorized_keys oracle@hostrac2:/home/oracle/.ssh/authorized_keys
scp authorized_keys oracle@hostrac3:/home/oracle/.ssh/authorized_keys
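Equivalence test (must be run and verified on all four RAC servers)
Note: once SSH equivalence is in place, test it from every machine, including each host connecting to itself and connections over the private network.
Test method (must be run on every cluster server)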
su - oracle
cat > sshceshi.sh <<EOF
ssh 192.168.11.22 hostname
ssh 192.168.11.23 hostname
ssh 10.10.10.22 hostname
ssh 10.10.10.23 hostname
ssh hostrac1 hostname
ssh hostrac2 hostname
ssh hostrac1-priv hostname
ssh hostrac2-priv hostname
ssh 192.168.11.27 hostname
ssh 192.168.11.28 hostname
ssh 10.10.10.27 hostname
ssh 10.10.10.28 hostname
ssh hostrac3 hostname
ssh hostrac4 hostname
ssh hostrac3-priv hostname
ssh hostrac4-priv hostname
EOF
sh sshceshi.sh
su - grid
cat > sshceshi.sh <<EOF
ssh 192.168.11.22 hostname
ssh 192.168.11.23 hostname
ssh 10.10.10.22 hostname
ssh 10.10.10.23 hostname
ssh hostrac1 hostname
ssh hostrac2 hostname
ssh hostrac1-priv hostname
ssh hostrac2-priv hostname
ssh 192.168.11.27 hostname
ssh 192.168.11.28 hostname
ssh 10.10.10.27 hostname
ssh 10.10.10.28 hostname
ssh hostrac3 hostname
ssh hostrac4 hostname
ssh hostrac3-priv hostname
ssh hostrac4-priv hostname
EOF
sh sshceshi.sh
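The test scripts above list all sixteen targets by hand and rely on someone reading the output. As a rough sketch of the same check with an explicit pass/fail result, the script below generates the target list from the node numbers and uses ssh in BatchMode so that a missing key fails instead of prompting for a password. The function names ip_suffix, build_targets and check_equivalence and the NODES variable are hypothetical; the address scheme is taken from the tables in this article. BatchMode also refuses unknown host keys instead of prompting, so run the interactive sshceshi.sh once first to accept them.

```shell
#!/bin/sh
# Sketch: build the list of SSH targets for every node and test each one non-interactively.
NODES="${NODES:-1 2 3 4}"

ip_suffix() {
    # hostrac1..hostrac4 use the host octets 22, 23, 27 and 28 respectively.
    case "$1" in
        1) echo 22 ;;
        2) echo 23 ;;
        3) echo 27 ;;
        4) echo 28 ;;
        *) return 1 ;;
    esac
}

build_targets() {
    # Print public IP, private IP, hostname and -priv name for each node.
    for n in $NODES; do
        s=$(ip_suffix "$n") || continue
        echo "192.168.11.$s"
        echo "10.10.10.$s"
        echo "hostrac$n"
        echo "hostrac$n-priv"
    done
}

check_equivalence() {
    # Returns non-zero if any target cannot be reached without a prompt.
    failed=0
    for t in $(build_targets); do
        if ssh -n -o BatchMode=yes -o ConnectTimeout=5 "$t" hostname >/dev/null 2>&1; then
            echo "OK   $t"
        else
            echo "FAIL $t"
            failed=1
        fi
    done
    return "$failed"
}
```

Calling check_equivalence once as the oracle user and once as the grid user on each of the four servers covers the same ground as the two hand-written scripts.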
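6. Configure the shared disks (map the shared devices to ASM disks)
7. Configure NTP time synchronization
8. Install the operating system dependency packages
9. Disable the firewall and SELinux
10. Stop and disable some services on Linux 7
11. Optional: set Linux to multi-user mode, that is, text-only mode without a graphical interface
12. Disable transparent huge pages and NUMA on Linux 7
13. Configure the environment variables for the grid and oracle users
oracle user: ORACLE_SID=orcl3 on node three / ORACLE_SID=orcl4 on node four
grid user: ORACLE_SID=+ASM3 on node three / ORACLE_SID=+ASM4 on node four
14. Modify the /etc/profile configuration file
15. Modify the kernel parameters, operating system user limits, and security policy
16. Optional: if memory exceeds 32 GB, configuring huge pages is recommended; see my earlier article 《Oracle11g 安装完毕后的配置优化步骤》
Run on both new node servers
17. Reboot for the changes to take effect
18. Back up the database in the existing environment (optional, but strongly recommended)
19. Back up the cluster configuration files (OCR/OLR); strongly recommended
Run on node one: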
su - grid
which ocrconfig
/u01/app/crs/bin/ocrconfig //path of the ocrconfig executable
Run the backups as the root user
/u01/app/crs/bin/ocrconfig -manualbackup
/u01/app/crs/bin/ocrconfig -local -manualbackup
Verify the backups
/u01/app/crs/bin/ocrconfig -showbackup
/u01/app/crs/bin/ocrconfig -local -showbackup
20. Back up the configuration files under /etc/oracle
Run on both existing cluster nodes
tar -zcvf /etc/oracle.tar.gz /etc/oracle
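21. Back up the GI cluster software and the DB database software
Run on both existing cluster nodes
1) Back up the ORACLE_HOME directory of the Oracle database software
Method one:
Run tar as the root user; in testing, the file attributes were preserved as they were
Archive command
cd /u01/app/oracle/product/11.2.0/
tar -czvf db_1.tar.zip ./db_1
Extract command
cd /u01/app/oracle/product/11.2.0/
tar -xzvf db_1.tar.zip .
Method two:
Use cp, again as the root user
cp -a copies recursively and preserves the original file attributes (timestamps, permissions, ownership)
cp -a is equivalent to cp -dR --preserve=all (roughly cp -r -p -d): -R copies recursively, --preserve keeps the file attributes, and -d copies a symbolic link as a symbolic link
cd /u01/app/oracle/product/11.2.0/
cp -a db_1/ db_1_$(date +%F)/
Restore commands (if the restore happens on a later day, replace $(date +%F) with the date in the backup directory's name):
mv db_1/ db_1_old/
mv db_1_$(date +%F)/ db_1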
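2) Back up the ORACLE_HOME directory of the GI cluster software
Method one:
Run tar as the root user; in testing, the file attributes were preserved as they were
Archive command
cd /u01/app/
tar -czvf crs.tar.zip ./crs
Extract command
cd /u01/app/
tar -xzvf crs.tar.zip .
Method two:
Use cp, again as the root user
cp -a copies recursively and preserves the original file attributes (timestamps, permissions, ownership)
cp -a is equivalent to cp -dR --preserve=all (roughly cp -r -p -d): -R copies recursively, --preserve keeps the file attributes, and -d copies a symbolic link as a symbolic link
cd /u01/app/
cp -a crs/ crs_$(date +%F)/
Restore commands (if the restore happens on a later day, replace $(date +%F) with the date in the backup directory's name):
mv crs/ crs_old/
mv crs_$(date +%F)/ crs
Note: another way to take a single combined backup is tar -czvf grid_home_$(date +%F).tar.gz -C /u01/app . (run it from a directory outside /u01/app so that the archive is not written into the tree being archived)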
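The cp-based restore commands above re-evaluate $(date +%F), so they only find the backup when run on the same day it was taken. A minimal sketch of one way around this is shown below: the backup step prints the directory name it used, and the restore step takes that name as an argument. The function names backup_home and restore_home are hypothetical, and this is an illustration of the idea rather than the author's procedure.

```shell
#!/bin/sh
# Sketch: dated cp -a backup of a software home, with a restore that takes the
# backup path explicitly instead of recomputing the date.

backup_home() {
    # $1 = directory to back up (for example /u01/app/crs); prints the backup path.
    src=${1%/}
    dest="${src}_$(date +%F)"
    if [ -e "$dest" ]; then
        echo "backup target $dest already exists" >&2
        return 1
    fi
    # -a keeps ownership, permissions, timestamps and symbolic links (run as root).
    cp -a "$src" "$dest" && echo "$dest"
}

restore_home() {
    # $1 = live directory, $2 = backup path printed earlier by backup_home.
    src=${1%/}
    [ -d "$2" ] || { echo "no such backup: $2" >&2; return 1; }
    # Keep the current home as <name>_old, then move the backup into place.
    mv "$src" "${src}_old" && mv "$2" "$src"
}
```

For example, bk=$(backup_home /u01/app/crs) records the backup path in bk, and restore_home /u01/app/crs "$bk" moves it back later, whatever the date is by then.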
22. Use cluvfy to verify that the prerequisites for adding nodes are met
Run on node one as the grid user
su - grid
cluvfy comp peer -n hostrac1,hostrac2,hostrac3,hostrac4 -verbose
cluvfy stage -pre nodeadd -n hostrac3,hostrac4 -verbose
Failed checks like the following can be ignored
1)
Check: Package existence for "pdksh"
Node Name Available Required Status
------------ ------------------------ ------------------------ ----------
hostrac3 missing pdksh-5.2.14 failed
hostrac4 missing pdksh-5.2.14 failed
Result: Package existence check failed for "pdksh"
2)
PRVF-5507 : NTP daemon or service is not running on any node but NTP configuration file exists on the following node(s):
hostrac3,hostrac4
Result: Clock synchronization check using Network Time Protocol(NTP) failed
3)
Checking the file "/etc/resolv.conf" to make sure only one of domain and search entries is defined
File "/etc/resolv.conf" does not have both domain and search entries defined
Checking if domain entry in file "/etc/resolv.conf" is consistent across the nodes...
domain entry in file "/etc/resolv.conf" is consistent across nodes
Checking if search entry in file "/etc/resolv.conf" is consistent across the nodes...
search entry in file "/etc/resolv.conf" is consistent across nodes
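Note: if the pre-check fails, rerun it with the -fixup option, which generates fixup scripts for the items that can be corrected automatically (run them as root when prompted)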
cluvfy stage -pre nodeadd -n hostrac3,hostrac4 -fixup -verbose
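III. Adding the nodes
Note: patches already applied to the GI and Oracle database software in the existing cluster are carried over automatically, because addNode.sh copies the software home from the existing node to the new nodes
1. Add the GI cluster nodes
# Run as the grid user to add the nodes to Grid Infrastructure
Run on node one (grid user)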
su - grid
export IGNORE_PREADDNODE_CHECKS=Y
$ORACLE_HOME/oui/bin/addNode.sh -silent "CLUSTER_NEW_NODES={hostrac3,hostrac4}" "CLUSTER_NEW_VIRTUAL_HOSTNAMES={hostrac3-vip,hostrac4-vip}" "CLUSTER_NEW_PRIVATE_NODE_NAMES={hostrac3-priv,hostrac4-priv}"
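At the end, the installer prompts you to run two scripts, orainstRoot.sh and root.sh, as root on the new nodes (node three and node four). They are run the same way as during the RAC installation
Run the following as root on the new nodes (node three and node four)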
/u01/app/oraInventory/orainstRoot.sh
/u01/app/crs/root.sh
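Note: because the installed patches are carried over automatically, the root.sh failure commonly seen with 11g on Linux 7 should not occur here. If it does, see the summary at the end of my article 《Oracle 11g RAC集群安装_linux7》 and resolve it by manually adding the ohas startup service to systemd.
2. Check the cluster status (can be run on every node)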
crsctl stat res -t
[grid@hostrac4 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRS.dg
ONLINE ONLINE hostrac1
ONLINE ONLINE hostrac2
ONLINE ONLINE hostrac3
ONLINE ONLINE hostrac4
ora.DATA.dg
ONLINE ONLINE hostrac1
ONLINE ONLINE hostrac2
ONLINE ONLINE hostrac3
ONLINE ONLINE hostrac4
ora.LISTENER.lsnr
ONLINE ONLINE hostrac1
ONLINE ONLINE hostrac2
ONLINE ONLINE hostrac3
ONLINE ONLINE hostrac4
ora.asm
ONLINE ONLINE hostrac1 Started
ONLINE ONLINE hostrac2 Started
ONLINE ONLINE hostrac3 Started
ONLINE ONLINE hostrac4 Started
ora.gsd
OFFLINE OFFLINE hostrac1
OFFLINE OFFLINE hostrac2
OFFLINE OFFLINE hostrac3
OFFLINE OFFLINE hostrac4
ora.net1.network
ONLINE ONLINE hostrac1
ONLINE ONLINE hostrac2
ONLINE ONLINE hostrac3
ONLINE ONLINE hostrac4
ora.ons
ONLINE ONLINE hostrac1
ONLINE ONLINE hostrac2
ONLINE ONLINE hostrac3
ONLINE ONLINE hostrac4
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE hostrac2
ora.cvu
1 ONLINE ONLINE hostrac1
ora.hostrac1.vip
1 ONLINE ONLINE hostrac1
ora.hostrac2.vip
1 ONLINE ONLINE hostrac2
ora.hostrac3.vip
1 ONLINE ONLINE hostrac3
ora.hostrac4.vip
1 ONLINE ONLINE hostrac4
ora.oc4j
1 ONLINE ONLINE hostrac1
ora.orcl.db
1 ONLINE ONLINE hostrac1 Open
2 ONLINE ONLINE hostrac2 Open
ora.scan1.vip
1 ONLINE ONLINE hostrac2
3. Add the DB nodes (run on node one)
Run as the oracle user to add the nodes to the Oracle RAC database software
su - oracle
export IGNORE_PREADDNODE_CHECKS=Y
$ORACLE_HOME/oui/bin/addNode.sh -silent "CLUSTER_NEW_NODES={hostrac3,hostrac4}"
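At the end, the installer prompts you to run the root.sh script as root on the new nodes (node three and node four)
Run on the new nodes (node three and node four)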
/u01/app/oracle/product/11.2.0/db_1/root.sh
4. Add the database instances (run on node one)
Check the database configuration as the oracle user (run on node one)
su - oracle
srvctl config database -d orcl|grep "Database instances"
Database instances: orcl1,orcl2
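Before adding the instances: Database instances: orcl1,orcl2
Note that dbca -silent -addInstance can add only one instance to the cluster per invocation, so it is run twice, as follows
First run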
dbca -silent -addInstance -nodeList hostrac3 -gdbName orcl -instanceName orcl3 -sysDBAUserName sys -sysDBAPassword oracle
Second run
dbca -silent -addInstance -nodeList hostrac4 -gdbName orcl -instanceName orcl4 -sysDBAUserName sys -sysDBAPassword oracle
Verify after adding the instances:
srvctl config database -d orcl|grep "Database instances"
Database instances: orcl1,orcl2,orcl3,orcl4
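The verification above is a visual check of one line of output. As a small sketch of how it could be scripted, the function below reads the srvctl config database output from stdin and reports any expected instance that is missing. The name check_instances is hypothetical, and the only format assumed is the "Database instances:" line shown above.

```shell
#!/bin/sh
# Sketch: confirm that the expected instances appear in the
# "Database instances:" line of `srvctl config database -d <db>` output (stdin).

check_instances() {
    # Arguments: instance names that must be present.
    line=$(grep '^Database instances:' | head -n 1)
    # Strip the label and turn the comma-separated list into words.
    list=$(printf '%s\n' "$line" | sed 's/^Database instances:[[:space:]]*//' | tr ',' ' ')
    missing=""
    for want in "$@"; do
        found=0
        for have in $list; do
            if [ "$have" = "$want" ]; then
                found=1
            fi
        done
        if [ "$found" -eq 0 ]; then
            missing="$missing $want"
        fi
    done
    if [ -n "$missing" ]; then
        echo "missing instances:$missing"
        return 1
    fi
    echo "all expected instances are registered"
}
```

On node one this would be used as srvctl config database -d orcl | check_instances orcl3 orcl4, with a non-zero exit status if either new instance is not registered.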
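5. Add undo and redo for the two new database instances (only needed when the instances are added manually; if DBCA is used, as in step 4, it creates the undo tablespaces and redo logs and sets the initialization parameters automatically)
Note: in a production environment, size and configure redo and undo according to the actual workload
1) Add the redo log threads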
ALTER DATABASE ADD LOGFILE THREAD 3 GROUP 5 ('+DATA/orcl/redo05.log') SIZE 50M;
ALTER DATABASE ADD LOGFILE THREAD 3 GROUP 6 ('+DATA/orcl/redo06.log') SIZE 50M;
ALTER DATABASE ADD LOGFILE THREAD 4 GROUP 7 ('+DATA/orcl/redo07.log') SIZE 50M;
ALTER DATABASE ADD LOGFILE THREAD 4 GROUP 8 ('+DATA/orcl/redo08.log') SIZE 50M;
2) Enable the redo log threads
ALTER DATABASE ENABLE THREAD 3;
ALTER DATABASE ENABLE THREAD 4;
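These commands enable threads 3 and 4, which the two new instances can use as soon as they start
3) Create the undo tablespaces
CREATE UNDO TABLESPACE UNDOTBS3 DATAFILE '+DATA/orcl/datafile/undotbs03.dbf' SIZE 50M AUTOEXTEND ON;
CREATE UNDO TABLESPACE UNDOTBS4 DATAFILE '+DATA/orcl/datafile/undotbs04.dbf' SIZE 50M AUTOEXTEND ON;
4) Configure the new instances to use the newly created undo tablespaces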
ALTER SYSTEM SET undo_tablespace='UNDOTBS3' SCOPE=SPFILE SID='orcl3';
ALTER SYSTEM SET undo_tablespace='UNDOTBS4' SCOPE=SPFILE SID='orcl4';
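The statements in this step follow a regular pattern per instance number, and hand-copying them makes it easy to reuse a group number or a datafile name. The sketch below prints the statements for one instance so that each instance gets distinct names. The function name gen_instance_sql is hypothetical, and the numbering is an assumption taken from the example above (two redo groups per thread, numbered 2n-1 and 2n, 50M files, database orcl in disk group +DATA); adjust it to the groups that actually exist in your database.

```shell
#!/bin/sh
# Sketch: print the redo/undo statements for a new instance number n.
# Assumes the layout used in this article: groups 2n-1 and 2n belong to thread n.

gen_instance_sql() {
    n=$1
    # Two redo log groups for this thread.
    for g in $((2 * n - 1)) $((2 * n)); do
        printf "ALTER DATABASE ADD LOGFILE THREAD %d GROUP %d ('+DATA/orcl/redo%02d.log') SIZE 50M;\n" "$n" "$g" "$g"
    done
    # Enable the thread, create the undo tablespace, and bind it to the instance.
    printf "ALTER DATABASE ENABLE THREAD %d;\n" "$n"
    printf "CREATE UNDO TABLESPACE UNDOTBS%d DATAFILE '+DATA/orcl/datafile/undotbs%02d.dbf' SIZE 50M AUTOEXTEND ON;\n" "$n" "$n"
    printf "ALTER SYSTEM SET undo_tablespace='UNDOTBS%d' SCOPE=SPFILE SID='orcl%d';\n" "$n" "$n"
}
```

Running gen_instance_sql 3 and gen_instance_sql 4 and reviewing the output before pasting it into sqlplus keeps the two instances' file names and group numbers from colliding.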
6. Start the two new database instances
Start each new instance with srvctl
srvctl start instance -d orcl -i orcl3
srvctl start instance -d orcl -i orcl4
Note: if circumstances allow, consider restarting the entire database once
srvctl stop database -d orcl -o immediate
srvctl start database -d orcl
7. Final verification of the cluster and the database
[grid@hostrac3 ~]$ crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.CRS.dg
ONLINE ONLINE hostrac1
ONLINE ONLINE hostrac2
ONLINE ONLINE hostrac3
ONLINE ONLINE hostrac4
ora.DATA.dg
ONLINE ONLINE hostrac1
ONLINE ONLINE hostrac2
ONLINE ONLINE hostrac3
ONLINE ONLINE hostrac4
ora.LISTENER.lsnr
ONLINE ONLINE hostrac1
ONLINE ONLINE hostrac2
ONLINE ONLINE hostrac3
ONLINE ONLINE hostrac4
ora.asm
ONLINE ONLINE hostrac1 Started
ONLINE ONLINE hostrac2 Started
ONLINE ONLINE hostrac3 Started
ONLINE ONLINE hostrac4 Started
ora.gsd
OFFLINE OFFLINE hostrac1
OFFLINE OFFLINE hostrac2
OFFLINE OFFLINE hostrac3
OFFLINE OFFLINE hostrac4
ora.net1.network
ONLINE ONLINE hostrac1
ONLINE ONLINE hostrac2
ONLINE ONLINE hostrac3
ONLINE ONLINE hostrac4
ora.ons
ONLINE ONLINE hostrac1
ONLINE ONLINE hostrac2
ONLINE ONLINE hostrac3
ONLINE ONLINE hostrac4
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE hostrac2
ora.cvu
1 ONLINE ONLINE hostrac3
ora.hostrac1.vip
1 ONLINE ONLINE hostrac1
ora.hostrac2.vip
1 ONLINE ONLINE hostrac2
ora.hostrac3.vip
1 ONLINE ONLINE hostrac3
ora.hostrac4.vip
1 ONLINE ONLINE hostrac4
ora.oc4j
1 ONLINE ONLINE hostrac2
ora.orcl.db
1 ONLINE ONLINE hostrac1 Open
2 ONLINE ONLINE hostrac2 Open
3 ONLINE ONLINE hostrac3 Open
4 ONLINE ONLINE hostrac4 Open
ora.scan1.vip
1 ONLINE ONLINE hostrac2
This completes the procedure; the cluster is in a healthy state
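The final check above amounts to confirming that all four instances under ora.orcl.db are ONLINE and Open. As a rough sketch of automating that reading, the function below counts such lines in crsctl stat res -t output supplied on stdin. The name count_open_instances is hypothetical, and the parsing assumes the 11.2 output layout shown above, where resource names start at the beginning of a line.

```shell
#!/bin/sh
# Sketch: count instances shown as ONLINE/ONLINE with state detail "Open"
# under one resource in `crsctl stat res -t` output read from stdin.

count_open_instances() {
    # $1 = resource name, for example ora.orcl.db
    awk -v res="$1" '
        # The wanted resource name starts its block.
        $1 == res { in_res = 1; next }
        # Any other resource name ends the block.
        in_res && /^ora\./ { in_res = 0 }
        # Inside the block, count lines whose target and state are ONLINE and whose last field is Open.
        in_res && $2 == "ONLINE" && $3 == "ONLINE" && $NF == "Open" { n++ }
        END { print n + 0 }'
}
```

A usage along the lines of crsctl stat res -t | count_open_instances ora.orcl.db should print 4 once both new instances are up.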
IV. Summary
1. If starting the database reports the following errors
[oracle@hostrac1 ~]$ srvctl start database -d orcl
PRCR-1079 : Failed to start resource ora.orcl.db
CRS-5017: The resource action "ora.orcl.db start" encountered the following error:
ORA-01619: thread 3 is mounted by another instance
. For details refer to "(:CLSN00107:)" in "/u01/app/crs/log/hostrac3/agent/crsd/oraagent_oracle/oraagent_oracle.log".
CRS-2674: Start of 'ora.orcl.db' on 'hostrac3' failed
CRS-2632: There are no more servers to try to place resource 'ora.orcl.db' on that would satisfy its placement policy
Starting the instance from sqlplus reports the following
[oracle@hostrac4 trace]$ sqlplus "/as sysdba"
SQL> startup
ORA-00304: requested INSTANCE_NUMBER is busy
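The most likely cause is that the thread and instance numbers of the new instances are mixed up; set them explicitly through the parameters, as follows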
alter system set instance_number=3 sid='orcl3' scope=spfile;
alter system set instance_number=4 sid='orcl4' scope=spfile;
alter system set thread=3 sid='orcl3' scope=spfile;
alter system set thread=4 sid='orcl4' scope=spfile;