[Ocfs-users] OCFS file system used as archived redo destination is corrupted

Pei Ku pku at autotradecenter.com
Fri Feb 11 13:32:39 CST 2005


 
We started using an OCFS file system about 4 months ago as the shared archived redo destination for our 4-node RAC instances (HP DL380, MSA1000, RH AS 2.1). Last night we started seeing some weird behavior, and my guess is that the inode directory in the file system is getting corrupted. I've always had a bad feeling about OCFS not being very robust at handling constant file creation and deletion (which is what happens when you use it for archived redo logs).
 
ocfs-2.4.9-e-smp-1.0.12-1 is what we are using in production.
 
For now, we set up an archived redo dest on a local ext3 FS on each node and made that dest the mandatory dest; we changed the OCFS dest to an optional one. The reason we made the OCFS archived redo dest the primary dest a few months ago was that we are planning to migrate to RMAN-based backup (as opposed to the current hot backup scheme); it's easier (required?) to manage RAC archived redo logs with RMAN if the archived redo logs reside in a shared file system.
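
The mandatory/optional split described above maps onto Oracle's LOG_ARCHIVE_DEST_n parameters, whose LOCATION attribute can be followed by MANDATORY or OPTIONAL. A minimal sketch in Python that only formats the two parameter lines; the destination numbers and the local path /u99/oraarch/AUCP are made up for illustration (only the OCFS path appears in this post):

```python
def archive_dest_line(n, location, mandatory):
    """Format one LOG_ARCHIVE_DEST_n init parameter line."""
    attr = "MANDATORY" if mandatory else "OPTIONAL"
    return f"log_archive_dest_{n}='LOCATION={location} {attr}'"

# Hypothetical local ext3 path and destination numbers (assumptions).
LOCAL_EXT3 = "/u99/oraarch/AUCP"
# Shared OCFS path as quoted in the post.
SHARED_OCFS = "/export/u10/oraarch/AUCP"

lines = [
    archive_dest_line(1, LOCAL_EXT3, mandatory=True),    # archiving must succeed here
    archive_dest_line(2, SHARED_OCFS, mandatory=False),  # failures here are tolerated
]
print("\n".join(lines))
```

With the OCFS destination marked OPTIONAL, a failed write to /export/u10 should no longer block the online log from being reused, as long as the mandatory local destination succeeds.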
 
Below are some diagnostics:

$ ls -l rdo_1_21810.arc*
 
-rw-r-----    1 oracle   dba        397312 Feb 10 22:30 rdo_1_21810.arc
-rw-r-----    1 oracle   dba        397312 Feb 10 22:30 rdo_1_21810.arc
 
(they have the same inode, btw -- I had done a 'ls -li' earlier but the output had rolled off the screen)
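
A healthy file system should never return the same name twice from one directory, so the double entry in the ls output above can be checked for mechanically. A rough sketch in Python, not something we are running; the function names and the sample inode numbers are made up for illustration:

```python
import os
from collections import defaultdict

def find_duplicate_names(entries):
    """Given (name, inode) pairs, return {name: [inodes]} for names listed more than once."""
    seen = defaultdict(list)
    for name, inode in entries:
        seen[name].append(inode)
    return {name: inodes for name, inodes in seen.items() if len(inodes) > 1}

def scan_directory(path):
    """Collect (name, inode) pairs exactly as the directory listing returns them."""
    with os.scandir(path) as it:
        return [(entry.name, entry.inode()) for entry in it]

# Sample data shaped like the listing above; the inode numbers are placeholders.
sample = [
    ("rdo_1_21810.arc", 1001),
    ("rdo_1_21809.arc", 1002),
    ("rdo_1_21810.arc", 1001),
]
print(find_duplicate_names(sample))
```

On a live system the call would be find_duplicate_names(scan_directory("/export/u10/oraarch/AUCP")); any non-empty result points at a damaged directory rather than at Oracle.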
 
After a while, one of the DBA scripts gzipped the file(s). Now they look like this:
 
 $ ls -liL /export/u10/oraarch/AUCP/rdo_1_21810.arc*
1457510912 -rw-r-----    1 oracle   dba            36 Feb 10 23:00 /export/u10/oraarch/AUCP/rdo_1_21810.arc.gz
1457510912 -rw-r-----    1 oracle   dba            36 Feb 10 23:00 /export/u10/oraarch/AUCP/rdo_1_21810.arc.gz
 
These two identical entries also share the same inode, but the size is way too small.
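
One way to read the 36-byte size: a gzip stream consists of a 10-byte header, the original file name plus a terminating NUL, the compressed data, and an 8-byte trailer. The sketch below uses Python's gzip module (an assumption; the DBA script presumably ran the gzip binary) to measure a stream with an empty payload and the name rdo_1_21810.arc stored in it. If the result matches the 36 bytes in the listing, that would suggest gzip read zero bytes of redo data from the file, not a truncated amount.

```python
import gzip
import io

def empty_gzip_size(stored_name):
    """Size in bytes of a gzip stream holding zero bytes of data plus the original file name."""
    buf = io.BytesIO()
    # mtime=0 only pins the timestamp field; it does not change the length.
    with gzip.GzipFile(filename=stored_name, mode="wb", fileobj=buf, mtime=0):
        pass  # write nothing: the payload is empty
    return len(buf.getvalue())

size = empty_gzip_size("rdo_1_21810.arc")
print(size)
```

This says nothing about why the source file read as empty; it only separates "empty input" from "partially written input".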
 
yeah, /export/u10 is pretty hosed...
 
Pei 



-----Original Message----- 
From: Pei Ku 
Sent: Thu 2/10/2005 11:16 PM 
To: IT 
Cc: ADS 
Subject: possible OCFS /export/u10/ corruption on dbprd*


Ulf,
 
AUCP had problems creating archive file "/export/u10/oraarch/AUCP/rdo_1_21810.arc". After a few tries, it appeared that it was able to, except that there are now *two* rdo_1_21810.arc files in the directory (by the time you look at it, it/they will probably have been gzipped). We also have a couple of zero-length gzipped redo log files (which is not normal) in there.
 
At least the problem has not brought any of the AUCP instances down. Manoj and I turned on archiving to an ext3 file system on each node for now; archiving to /export/u10/ is still active but has been made optional.
 
My guess is that /export/u10/ is corrupted in some way. I still say OCFS can't take constant file creation/removal.
 
We are one rev behind (1.0.12 vs 1.0.13 on ocfs.org).   No guarantee that 1.0.13 contains the cure...
 
Pei

-----Original Message----- 
From: Oracle [mailto:oracle at dbprd01.autc.com] 
Sent: Thu 2/10/2005 10:26 PM 
To: DBA; Page DBA; Unix Admin 
Cc: 
Subject: SL1:dbprd01.autc.com:050210_222600:oalert_mon> Alert Log Errors



SEVER_LVL=1  PROG=oalert_mon 
**** oalert_mon.pl: DB=AUCP SID=AUCP1 
[Thu Feb 10 22:25:21] ORA-19504: failed to create file "/export/u10/oraarch/AUCP/rdo_1_21810.arc" 
[Thu Feb 10 22:25:21] ORA-19504: failed to create file "/export/u10/oraarch/AUCP/rdo_1_21810.arc" 
[Thu Feb 10 22:25:21] ORA-27040: skgfrcre: create error, unable to create file 
[Thu Feb 10 22:25:28] ORA-16038: log 12 sequence# 21810 cannot be archived 
[Thu Feb 10 22:25:28] ORA-19504: failed to create file "" 
[Thu Feb 10 22:25:28] ORA-00312: online log 12 thread 1: '/export/u01/oradata/AUCP/redo12m1.log' 
[Thu Feb 10 22:25:28] ORA-00312: online log 12 thread 1: '/export/u01/oradata/AUCP/redo12m2.log' 
[Thu Feb 10 22:25:28] ORA-16038: log 12 sequence# 21810 cannot be archived 
[Thu Feb 10 22:25:28] ORA-19504: failed to create file "" 
[Thu Feb 10 22:25:28] ORA-00312: online log 12 thread 1: '/export/u01/oradata/AUCP/redo12m1.log' 
[Thu Feb 10 22:25:28] ORA-00312: online log 12 thread 1: '/export/u01/oradata/AUCP/redo12m2.log' 
[Thu Feb 10 22:25:28] ORA-16038: log 12 sequence# 21810 cannot be archived 
[Thu Feb 10 22:25:28] ORA-19504: failed to create file "" 
[Thu Feb 10 22:25:28] ORA-00312: online log 12 thread 1: '/export/u01/oradata/AUCP/redo12m1.log' 
[Thu Feb 10 22:25:28] ORA-00312: online log 12 thread 1: '/export/u01/oradata/AUCP/redo12m2.log' 
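
The alert output above repeats the same few errors several times. A small Python sketch for collapsing such output into counts per distinct error; the line format is inferred from this one alert, and the regular expression and function name are illustrative assumptions, not part of oalert_mon.pl:

```python
import re
from collections import Counter

# Matches lines like: [Thu Feb 10 22:25:21] ORA-19504: failed to create file "..."
LINE_RE = re.compile(r"^\[(?P<ts>[^\]]+)\]\s+(?P<code>ORA-\d{5}):\s*(?P<msg>.*)$")

def summarize(lines):
    """Count occurrences of each (ORA code, message) pair, ignoring timestamps."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m:
            counts[(m.group("code"), m.group("msg"))] += 1
    return counts

# First four error lines copied from the alert above.
sample_lines = [
    '[Thu Feb 10 22:25:21] ORA-19504: failed to create file "/export/u10/oraarch/AUCP/rdo_1_21810.arc" ',
    '[Thu Feb 10 22:25:21] ORA-19504: failed to create file "/export/u10/oraarch/AUCP/rdo_1_21810.arc" ',
    '[Thu Feb 10 22:25:21] ORA-27040: skgfrcre: create error, unable to create file ',
    '[Thu Feb 10 22:25:28] ORA-16038: log 12 sequence# 21810 cannot be archived ',
]

summary = summarize(sample_lines)
for (code, msg), n in summary.items():
    print(f"{n}x {code}: {msg}")
```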
