In a customer's RAC environment, after one node was restarted, the other node began reporting IPC send timeout messages.
The detailed errors were:
Wed May 2 22:07:00 2012
IPC Send timeout detected.Sender: ospid 20808
Receiver: inst 1 binc 1718095761 ospid 16263
Wed May 2 22:07:02 2012
IPC Send timeout detected.Sender: ospid 6677
Receiver: inst 1 binc 1718095761 ospid 16263
Wed May 2 22:07:09 2012
IPC Send timeout detected.Sender: ospid 16758
Receiver: inst 1 binc 1718096035 ospid 16261
Wed May 2 22:07:13 2012
IPC Send timeout detected.Sender: ospid 8947
Receiver: inst 1 binc 1718095761 ospid 16263
Wed May 2 22:07:13 2012
IPC Send timeout detected.Sender: ospid 6583
Receiver: inst 1 binc 1718095761 ospid 16263
Wed May 2 22:07:31 2012
IPC Send timeout to 0.0 inc 24 for msg type 12 from opid 132
Wed May 2 22:07:31 2012
IPC Send timeout detected.Sender: ospid 17068
Receiver: inst 1 binc 1718095761 ospid 16263
Wed May 2 22:07:34 2012
Communications reconfiguration: tbinstance_number 1
Wed May 2 22:07:34 2012
IPC Send timeout to 0.0 inc 24 for msg type 12 from opid 154
Wed May 2 22:07:45 2012
IPC Send timeout to 0.0 inc 24 for msg type 12 from opid 64
Wed May 2 22:07:45 2012
IPC Send timeout to 0.0 inc 24 for msg type 12 from opid 95
Wed May 2 22:07:54 2012
IPC Send timeout detected.Sender: ospid 21078
Receiver: inst 1 binc 1718095761 ospid 16263
Wed May 2 22:07:59 2012
IPC Send timeout to 0.0 inc 24 for msg type 12 from opid 24
Wed May 2 22:08:04 2012
Trace dumping is performing id=[cdmp_20120502220729]
Wed May 2 22:08:24 2012
IPC Send timeout to 0.0 inc 24 for msg type 12 from opid 146
Wed May 2 22:08:36 2012
Trace dumping is performing id=[cdmp_20120502220805]
Wed May 2 22:08:38 2012
Trace dumping is performing id=[cdmp_20120502220805]
Wed May 2 22:10:55 2012
Evicting instance 1 from cluster
Wed May 2 22:11:32 2012
Waiting for instances to leave:
1
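These messages are not normal. A search of MOS (My Oracle Support) showed that this is a bug; the problem is described in: 'IPC Send Timeout Detected' errors between QMON Processes after RAC reconfiguration [ID 458912.1].
For the current 10.2.0.4 environment, the patch for Bug 6200820 needs to be applied, while for 10.2.0.3 Patch 6326889 is required.
MOS lists quite a few similar IPC timeout problems. Most of them affect 10.2.0.4 and most are fixed in 10.2.0.5, so if this problem occurs frequently, upgrading to 10.2.0.5 is also a good option.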
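
Before matching a case like this against a MOS note, it can help to see which receiver process the timeouts point at. The following is only a rough Python sketch, not an Oracle tool: the function name count_receivers and the regular expression are my own assumptions based on the message layout in the excerpt above, and the sample is just the first three entries of that excerpt.

```python
import re
from collections import Counter

# Assumed layout, taken from the alert-log excerpt above:
#   IPC Send timeout detected.Sender: ospid <pid>
#   Receiver: inst <n> binc <n> ospid <pid>
RECEIVER_RE = re.compile(r"Receiver: inst (\d+) binc (\d+) ospid (\d+)")


def count_receivers(lines):
    """Count 'IPC Send timeout detected' entries per (instance, receiver ospid)."""
    counts = Counter()
    pending = False  # True right after a "timeout detected" line
    for raw in lines:
        line = raw.strip()
        if line.startswith("IPC Send timeout detected"):
            pending = True
            continue
        match = RECEIVER_RE.match(line)
        if pending and match:
            inst, ospid = int(match.group(1)), int(match.group(3))
            counts[(inst, ospid)] += 1
        pending = False
    return counts


# First three entries from the excerpt above, used as sample input.
sample = """\
Wed May 2 22:07:00 2012
IPC Send timeout detected.Sender: ospid 20808
Receiver: inst 1 binc 1718095761 ospid 16263
Wed May 2 22:07:02 2012
IPC Send timeout detected.Sender: ospid 6677
Receiver: inst 1 binc 1718095761 ospid 16263
Wed May 2 22:07:09 2012
IPC Send timeout detected.Sender: ospid 16758
Receiver: inst 1 binc 1718096035 ospid 16261
"""

if __name__ == "__main__":
    for (inst, ospid), n in count_receivers(sample.splitlines()).most_common():
        print(f"inst {inst} ospid {ospid}: {n} timeout(s)")
```

In practice the lines would come from the instance's alert log rather than from an inline string. The receiver ospid that dominates the counts can then be looked up in the operating system's process list on the receiving node to check whether it is a QMON process, as the MOS note describes.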