[Ocfs-users] OCFS file system used as archived redo destination is corrupted
Pei Ku
pku at autotradecenter.com
Fri Feb 11 13:32:39 CST 2005
We started using an OCFS file system about 4 months ago as the shared archived redo destination for the 4-node RAC instances (HP DL380, MSA1000, RH AS 2.1). Last night we started seeing some weird behavior, and my guess is that the inode directory in the file system is getting corrupted. I've always had a bad feeling about OCFS not being very robust at handling constant file creation and deletion (which is what happens when you use it for archived redo logs).
ocfs-2.4.9-e-smp-1.0.12-1 is what we are using in production.
For now, we set up an archived redo dest on a local ext3 FS on each node and made that dest the mandatory dest; we changed the OCFS dest to an optional one. The reason we made the OCFS archived redo dest the primary dest a few months ago is that we are planning to migrate to RMAN-based backup (as opposed to the current hot backup scheme); it's easier (required?) to manage RAC archived redo logs with RMAN if the archived redo logs reside in a shared file system.
Below are some diagnostics:
$ ls -l rdo_1_21810.arc*
-rw-r----- 1 oracle dba 397312 Feb 10 22:30 rdo_1_21810.arc
-rw-r----- 1 oracle dba 397312 Feb 10 22:30 rdo_1_21810.arc
(they have the same inode, btw -- I had done a 'ls -li' earlier but the output had rolled off the screen)
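A duplicate listing like this can also be checked programmatically rather than by eye. The following is a minimal sketch, assuming Python is available on the node; `duplicate_names` and `scan_directory` are hypothetical helper names, not part of any OCFS or Oracle tool. It relies on `os.listdir` passing along directory entries as the file system returns them, so a name the file system reports twice should show up twice.

```python
import os
import sys
from collections import Counter


def duplicate_names(names):
    """Return {name: count} for every name that appears more than once."""
    counts = Counter(names)
    return {name: n for name, n in counts.items() if n > 1}


def scan_directory(path):
    """Report duplicated directory entries together with inode and size.

    os.listdir() does not de-duplicate, so a directory that lists one
    name twice yields that name twice here as well.
    """
    report = {}
    for name, n in duplicate_names(os.listdir(path)).items():
        st = os.stat(os.path.join(path, name))
        report[name] = {"count": n, "inode": st.st_ino, "size": st.st_size}
    return report


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): python dupscan.py <archive directory>
    for name, info in scan_directory(sys.argv[1]).items():
        print(name, info)
```

On a healthy file system the report should be empty; any entry in it is worth saving before other scripts touch the files.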
After a while, one of the DBA scripts gzipped the file(s). Now they look like this:
$ ls -liL /export/u10/oraarch/AUCP/rdo_1_21810.arc*
1457510912 -rw-r----- 1 oracle dba 36 Feb 10 23:00 /export/u10/oraarch/AUCP/rdo_1_21810.arc.gz
1457510912 -rw-r----- 1 oracle dba 36 Feb 10 23:00 /export/u10/oraarch/AUCP/rdo_1_21810.arc.gz
These two identically named files also have the same inode. But the size is way too small.
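The 36-byte size is itself informative. As a sketch of the arithmetic, assuming the script ran plain gzip, which by default stores the original file name in the header: a gzip member (RFC 1952) consists of a 10-byte fixed header, the NUL-terminated original name, the deflate stream, and an 8-byte trailer. For the 15-character name rdo_1_21810.arc and an empty input, that comes to 10 + 16 + 2 + 8 = 36 bytes, so these files are consistent with gzip having been handed zero bytes of data. The Python below reproduces that number; `make_empty_gzip` is a hypothetical helper, and Python's `gzip` module stands in for the command-line tool.

```python
import gzip
import io


def make_empty_gzip(stored_name):
    """Return the bytes of a gzip stream that holds zero bytes of data.

    stored_name goes into the header's FNAME field, which is where the
    gzip command-line tool records the original file name by default.
    """
    buf = io.BytesIO()
    # mtime=0 keeps the output deterministic; it does not change the size.
    with gzip.GzipFile(filename=stored_name, mode="wb", fileobj=buf, mtime=0):
        pass  # write nothing: the payload is empty
    return buf.getvalue()


if __name__ == "__main__":
    blob = make_empty_gzip("rdo_1_21810.arc")
    print(len(blob), "bytes on disk,", len(gzip.decompress(blob)), "bytes of payload")
```

This does not say where the data went, only that the compressed file is the size an empty archive log would produce.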
yeah, /export/u10 is pretty hosed...
Pei
-----Original Message-----
From: Pei Ku
Sent: Thu 2/10/2005 11:16 PM
To: IT
Cc: ADS
Subject: possible OCFS /export/u10/ corruption on dbprd*
Ulf,
AUCP had problems creating archive file "/export/u10/oraarch/AUCP/rdo_1_21810.arc". After a few tries, it appeared that it was able to, except that there are *two* rdo_1_21810.arc files in the directory (by the time you look at it, it/they will probably have been gzipped). We also have a couple of zero-length gzipped redo log files (which is not normal) in there.
At least the problem has not brought any of the AUCP instances down. Manoj and I turned on archiving to an ext3 file system on each node for now; archiving to /export/u10/ is still active but made optional for now.
My guess is that /export/u10/ is corrupted in some way. I still say OCFS can't take constant file creation/removal.
We are one rev behind (1.0.12 vs 1.0.13 on ocfs.org). No guarantee that 1.0.13 contains the cure...
Pei
-----Original Message-----
From: Oracle [mailto:oracle at dbprd01.autc.com]
Sent: Thu 2/10/2005 10:26 PM
To: DBA; Page DBA; Unix Admin
Cc:
Subject: SL1:dbprd01.autc.com:050210_222600:oalert_mon> Alert Log Errors
SEVER_LVL=1 PROG=oalert_mon
**** oalert_mon.pl: DB=AUCP SID=AUCP1
[Thu Feb 10 22:25:21] ORA-19504: failed to create file "/export/u10/oraarch/AUCP/rdo_1_21810.arc"
[Thu Feb 10 22:25:21] ORA-19504: failed to create file "/export/u10/oraarch/AUCP/rdo_1_21810.arc"
[Thu Feb 10 22:25:21] ORA-27040: skgfrcre: create error, unable to create file
[Thu Feb 10 22:25:28] ORA-16038: log 12 sequence# 21810 cannot be archived
[Thu Feb 10 22:25:28] ORA-19504: failed to create file ""
[Thu Feb 10 22:25:28] ORA-00312: online log 12 thread 1: '/export/u01/oradata/AUCP/redo12m1.log'
[Thu Feb 10 22:25:28] ORA-00312: online log 12 thread 1: '/export/u01/oradata/AUCP/redo12m2.log'
[Thu Feb 10 22:25:28] ORA-16038: log 12 sequence# 21810 cannot be archived
[Thu Feb 10 22:25:28] ORA-19504: failed to create file ""
[Thu Feb 10 22:25:28] ORA-00312: online log 12 thread 1: '/export/u01/oradata/AUCP/redo12m1.log'
[Thu Feb 10 22:25:28] ORA-00312: online log 12 thread 1: '/export/u01/oradata/AUCP/redo12m2.log'
[Thu Feb 10 22:25:28] ORA-16038: log 12 sequence# 21810 cannot be archived
[Thu Feb 10 22:25:28] ORA-19504: failed to create file ""
[Thu Feb 10 22:25:28] ORA-00312: online log 12 thread 1: '/export/u01/oradata/AUCP/redo12m1.log'
[Thu Feb 10 22:25:28] ORA-00312: online log 12 thread 1: '/export/u01/oradata/AUCP/redo12m2.log'
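The alert above repeats the same few errors several times. A short sketch for collapsing such output into counts, assuming the `[timestamp] ORA-nnnnn: message` layout shown in this alert; `summarize` and `LINE_RE` are hypothetical names and are not part of oalert_mon.pl:

```python
import re
from collections import Counter

# Matches lines such as:
#   [Thu Feb 10 22:25:28] ORA-16038: log 12 sequence# 21810 cannot be archived
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<code>ORA-\d{5}):\s*(?P<msg>.*)$")


def summarize(lines):
    """Count each distinct (ORA code, message) pair; other lines are ignored."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            counts[(m.group("code"), m.group("msg"))] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "[Thu Feb 10 22:25:21] ORA-27040: skgfrcre: create error, unable to create file",
        "[Thu Feb 10 22:25:28] ORA-16038: log 12 sequence# 21810 cannot be archived",
        "[Thu Feb 10 22:25:28] ORA-16038: log 12 sequence# 21810 cannot be archived",
    ]
    for (code, msg), n in summarize(sample).items():
        print(f"{n}x {code}: {msg}")
```

Grouping on the message as well as the code keeps the two different ORA-19504 lines (one with the full path, one with an empty file name) apart.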