[Ocfs-users] OCFS error
Phany Leblanc
pleblanc at techlinkentertainment.com
Fri Jun 8 08:20:15 PDT 2007
Hi,
Before I describe the problem, here is my environment. This is a
development environment built over a year and a half ago:
* 2-node Oracle RAC cluster
* Built on two Dell PowerEdge 1850 servers (Intel(R) Xeon(TM) CPU 3.20GHz)
* OS is RHEL3 U3
* Kernel 2.4.21-20.ELsmp
* Both servers are attached to an EMC AX100 SAN via Fibre Channel
* HBA: QLogic QLA6312 PCI to Fibre Channel Host Adapter
* Oracle Version: Oracle Database 10g Enterprise Edition Release 10.1.0.4.0
* OCFS packages: ocfs-tools-1.0.10-1, ocfs-2.4.21-EL-smp-1.0.13-1,
  ocfs-support-1.1.5-1
* Two OCFS volumes, mounted at /u01 and /u02
* The servers were last rebooted 17 days ago
Okay, hope that is sufficient info for starters... now here is the
problem we experienced:
I arrived at work this morning and discovered some errors in node2's
Oracle alert log. They occurred in the middle of the night: the
archiver complained that it couldn't archive a log to the flash recovery
area on /u02:
...(snip)...
Jun 8 01:10:49 2007
Private_strands 0 at log switch
Thread 2 advanced to log sequence 30694
Current log# 3 seq# 30694 mem# 0: /u01/oradata/rgcsd/redo03a.log
Current log# 3 seq# 30694 mem# 1: /u01/oradata/rgcsd/redo03b.log
Fri Jun 8 01:10:49 2007
ARC0: Evaluating archive thread 2 sequence 30693
Fri Jun 8 01:10:50 2007
Errors in file /usr/app/oracle/admin/rgcsd/bdump/rgcsd2_arc0_2036.trc:
ORA-01264: Unable to create archived log file name
ORA-19800: Unable to initialize Oracle Managed Destination
Linux Error: 13: Permission denied
Fri Jun 8 01:10:50 2007
Errors in file /usr/app/oracle/admin/rgcsd/bdump/rgcsd2_arc0_2036.trc:
ORA-16032: parameter LOG_ARCHIVE_DEST_10 destination string cannot be translated
ORA-01264: Unable to create archived log file name
ORA-19800: Unable to initialize Oracle Managed Destination
Linux Error: 13: Permission denied
ARC0: Archiving not possible: No primary destinations
ARC0: Failed to archive thread 2 sequence 30693 (16032)
ARCH: Archival stopped, error occurred. Will continue retrying
...(snip)...
The archiver kept retrying until it succeeded about 6 minutes later (at
01:16:38), and since then everything has seemingly functioned normally.
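For reference, the length of the outage window can be worked out from the two timestamps. This is a minimal sketch; it assumes the 01:16:38 recovery time falls on the same date (Jun 8 2007) as the first failure logged at 01:10:50, which the alert log excerpt does not show explicitly.

```python
from datetime import datetime

# Timestamp layout used in the alert log excerpt, e.g. "Jun 8 01:10:50 2007".
FMT = "%b %d %H:%M:%S %Y"

# First archiver failure, taken from the alert log excerpt.
failed = datetime.strptime("Jun 8 01:10:50 2007", FMT)

# Recovery time from the post; the date is an assumption (same night).
recovered = datetime.strptime("Jun 8 01:16:38 2007", FMT)

elapsed = recovered - failed
print(f"archiver was stalled for {elapsed.total_seconds():.0f} seconds")
```

The same subtraction works for any pair of alert-log timestamps in this layout.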
I checked node2's /var/log/messages logfile and found the following
errors, logged at the same time the archiver had its problems:
...(snip)...
Jun 8 01:10:50 rac4 kernel: (2036) ERROR: status = -17, Common/ocfsgendirnode.c, 1535
Jun 8 01:10:50 rac4 kernel: (2036) ERROR: status = -17, Common/ocfsgencreate.c, 1625
Jun 8 01:10:50 rac4 kernel: (2036) ERROR: status = -17, Common/ocfsgencreate.c, 1821
Jun 8 01:10:50 rac4 kernel: (2036) ERROR: status = -17, Linux/ocfsmain.c, 2193
Jun 8 01:10:50 rac4 kernel: (2036) ERROR: status = -17, Linux/ocfsmain.c, 2493
...(snip)...
(I checked node1's /var/log/messages logfile and these ocfs errors do
NOT appear on that node)
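For anyone triaging similar messages, here is a minimal sketch that pulls the fields out of the kernel lines quoted above and looks the status up in the standard errno table. Two assumptions are baked in and are not confirmed anywhere in this post: that the number in parentheses is the PID of the calling process (it matches the 2036 in the trace file name rgcsd2_arc0_2036.trc), and that the OCFS status is a negated Linux errno. The function name parse_ocfs_errors is made up for illustration. The pattern tolerates a line break after the status value.

```python
import errno
import os
import re

# Matches the OCFS kernel messages quoted above; \s* also spans a line break.
OCFS_ERR = re.compile(
    r"kernel: \((?P<pid>\d+)\) ERROR: status = (?P<status>-?\d+),\s*"
    r"(?P<src>\S+), (?P<line>\d+)"
)

def parse_ocfs_errors(text):
    """Return one dict per OCFS error message found in text."""
    records = []
    for m in OCFS_ERR.finditer(text):
        status = int(m.group("status"))
        code = abs(status)  # assumption: status is a negated errno value
        records.append({
            "pid": int(m.group("pid")),  # assumption: value in () is a PID
            "status": status,
            "errno_name": errno.errorcode.get(code, "UNKNOWN"),
            "errno_text": os.strerror(code),
            "source": m.group("src"),
            "line": int(m.group("line")),
        })
    return records

# Two of the messages from node2's /var/log/messages, copied from the post.
sample = """\
Jun 8 01:10:50 rac4 kernel: (2036) ERROR: status = -17,
Common/ocfsgendirnode.c, 1535
Jun 8 01:10:50 rac4 kernel: (2036) ERROR: status = -17,
Linux/ocfsmain.c, 2493
"""

for rec in parse_ocfs_errors(sample):
    print(rec["pid"], rec["errno_name"], rec["source"], rec["line"])
```

If the errno assumption holds, status -17 would correspond to EEXIST, while the archiver itself reported Linux error 13 (EACCES, "Permission denied"), so the two logs may be describing different stages of the same failed file creation.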
I checked the event log for the EMC AX100 SAN but there is nothing reported.
Has this happened to anyone else? Anyone have any ideas what the root
cause of this hiccup could be? This is the first time this has happened
to us.
Thanks for your input,
Phany