<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf-8">
<TITLE>SL1:dbprd01.autc.com:050210_222600:oalert_mon> Alert Log Errors</TITLE>
<META content="MSHTML 6.00.2800.1400" name=GENERATOR></HEAD>
<BODY dir=ltr>
<DIV><FONT face=Arial color=#0000ff size=2></FONT> </DIV>
<DIV><FONT face=Arial color=#0000ff size=2><SPAN class=752093218-11022005>We
started using an OCFS file system about 4 months ago as the shared archived redo
destination for the 4-node RAC instances (HP DL380, MSA1000, RH AS 2.1). Since
last night we have been seeing some weird behavior, and my guess is that the
inode directory in the file system is getting corrupted. I've always had a bad
feeling about OCFS not being very robust at handling constant file creation and
deletion (which is what happens when you use it for archived redo
logs).</SPAN></FONT></DIV>
<DIV><FONT face=Arial color=#0000ff size=2><SPAN
class=752093218-11022005></SPAN></FONT> </DIV>
<DIV><FONT face=Arial color=#0000ff size=2>ocfs-2.4.9-e-smp-1.0.12-1<SPAN
class=752093218-11022005> is what we are using in
production.</SPAN></FONT></DIV>
<DIV><FONT face=Arial color=#0000ff size=2><SPAN
class=752093218-11022005></SPAN></FONT> </DIV>
<DIV><FONT face=Arial color=#0000ff size=2><SPAN class=752093218-11022005>For
now, we set up an archived redo dest on a local ext3 FS on each node and made
that dest the mandatory dest; we changed the OCFS dest to an optional one. The
reason we made the OCFS archived redo dest the primary dest a few months ago is
that we are planning to migrate to RMAN-based backup (as opposed to the current
hot backup scheme); it's easier (required?) to manage RAC archived redo logs with
RMAN if the archived redo logs reside in a shared file system.</SPAN></FONT></DIV>
<DIV><FONT face=Arial color=#0000ff size=2><SPAN
class=752093218-11022005></SPAN></FONT> </DIV>
<DIV><FONT face=Tahoma><FONT size=2><SPAN class=912302819-11022005><FONT
face=Arial color=#0000ff>Below are some
diagnostics: </FONT></SPAN><BR></FONT></FONT></DIV>
<DIV>$ ls -l rdo_1_21810.arc*</DIV>
<DIV><FONT face=Arial color=#0000ff size=2></FONT> </DIV>
<DIV>-rw-r----- 1 oracle
dba 397312 Feb 10 22:30
rdo_1_21810.arc<BR>-rw-r----- 1 oracle
dba 397312 Feb 10 22:30
rdo_1_21810.arc</DIV>
<DIV><FONT face=Arial color=#0000ff size=2></FONT> </DIV>
<DIV>(they have the same inode, btw -- I had done a 'ls -li' earlier but the
output had rolled off the screen)</DIV>
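<DIV>A minimal sketch of how duplicate directory entries like the ones above could be spotted without relying on scrollback, assuming Python is available on the node and that the duplicates are visible to an ordinary directory read. The names find_duplicate_entries and scan_directory are hypothetical and are not part of any DBA script mentioned in this thread.</DIV>
<PRE>
```python
import os
from collections import Counter


def find_duplicate_entries(names):
    """Return the sorted names that occur more than once in a directory listing."""
    counts = Counter(names)
    return sorted(name for name, n in counts.items() if n >= 2)


def scan_directory(path):
    """Read `path` once and report any name the file system returned twice or more."""
    # os.listdir passes through whatever the directory read returns, so a
    # name that is listed twice by the file system shows up twice here too.
    return find_duplicate_entries(os.listdir(path))
```
</PRE>
<DIV>On a healthy file system scan_directory should return an empty list for any directory, since a directory cannot normally hold the same name twice.</DIV>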
<DIV><FONT face=Arial color=#0000ff size=2></FONT> </DIV>
<DIV>After a while, one of the DBA scripts gzipped the file(s). Now
they look like this:</DIV>
<DIV><FONT face=Arial color=#0000ff size=2></FONT> </DIV>
<DIV> $ ls -liL
/export/u10/oraarch/AUCP/rdo_1_21810.arc*<BR>1457510912
-rw-r----- 1 oracle
dba 36 Feb 10
23:00 /export/u10/oraarch/AUCP/rdo_1_21810.arc.gz<BR>1457510912
-rw-r----- 1 oracle
dba 36 Feb 10
23:00 /export/u10/oraarch/AUCP/rdo_1_21810.arc.gz</DIV>
<DIV><FONT face=Arial color=#0000ff size=2></FONT> </DIV>
<DIV>These two identically named files also share the same inode, but the size
is way too small.</DIV>
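<DIV>One way to put the 36-byte size in context: a gzip stream that holds no data at all still carries a fixed header and trailer plus the stored original file name. The sketch below computes that size with Python's gzip module, assuming the compressing tool stores the original name in the header (the default for the gzip command line tool). The helper names are hypothetical. A match would only suggest that gzip was handed empty input; it does not prove it.</DIV>
<PRE>
```python
import gzip
import io


def gzip_size_of_empty_file(stored_name):
    """Return the byte size of a gzip stream with no data and `stored_name` in its header."""
    buf = io.BytesIO()
    with gzip.GzipFile(filename=stored_name, mode="wb", fileobj=buf, mtime=0):
        pass  # write nothing: this models gzipping a zero-length file
    return len(buf.getvalue())


def looks_like_empty_gzip(actual_size, original_name):
    """True if `actual_size` equals the size of an empty gzip stream for that name."""
    return actual_size == gzip_size_of_empty_file(original_name)
```
</PRE>
<DIV>For the name rdo_1_21810.arc the computed size is 36 bytes (10-byte header, 16 bytes for the name and its terminator, 2 bytes of empty compressed data, 8-byte trailer), which is the size shown in the listing.</DIV>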
<DIV><FONT face=Arial color=#0000ff size=2></FONT> </DIV>
<DIV>yeah, /export/u10 is pretty hosed...</DIV>
<DIV><FONT face=Arial color=#0000ff size=2></FONT> </DIV>
<DIV>Pei <BR><BR></DIV>
<BLOCKQUOTE dir=ltr style="MARGIN-RIGHT: 0px">
<DIV><FONT size=2>-----Original Message----- <BR><B>From:</B> Pei Ku
<BR><B>Sent:</B> Thu 2/10/2005 11:16 PM <BR><B>To:</B> IT <BR><B>Cc:</B> ADS
<BR><B>Subject:</B> possible OCFS /export/u10/ corruption on
dbprd*<BR><BR></FONT></DIV>
<DIV>Ulf,</DIV>
<DIV><FONT face=Arial color=#0000ff size=2></FONT> </DIV>
<DIV>AUCP had problems creating archive file <FONT
size=2>"/export/u10/oraarch/AUCP/rdo_1_21810.arc". After a few tries, it
appeared that it was able to -- except that there are *two* rdo_1_21810.arc
files in there (by the time you look at it, it/they will probably have been
gzipped). We also have a couple of zero-length gzipped redo log files
(which is not normal) in there.</FONT></DIV>
<DIV><FONT size=2></FONT> </DIV>
<DIV><FONT size=2>At least the problem has not brought any of the AUCP
instances down. Manoj and I turned on archiving to an ext3 file system
on each node for now; archiving to /export/u10/ is still active but has been
made optional.</FONT></DIV>
<DIV><FONT size=2></FONT> </DIV>
<DIV><FONT size=2>My guess is that /export/u10/ is corrupted in some way. I
still say OCFS can't take constant file creation and removal.</FONT></DIV>
<DIV><FONT size=2></FONT> </DIV>
<DIV><FONT size=2>We are one rev behind (1.0.12 vs 1.0.13 on
ocfs.org). No guarantee that 1.0.13 contains the
cure...</FONT></DIV>
<DIV><FONT size=2></FONT> </DIV>
<DIV><FONT size=2>Pei</FONT></DIV>
<BLOCKQUOTE dir=ltr style="MARGIN-RIGHT: 0px">
<DIV><FONT size=2>-----Original Message----- <BR><B>From:</B> Oracle
[mailto:oracle@dbprd01.autc.com] <BR><B>Sent:</B> Thu 2/10/2005 10:26 PM
<BR><B>To:</B> DBA; Page DBA; Unix Admin <BR><B>Cc:</B> <BR><B>Subject:</B>
SL1:dbprd01.autc.com:050210_222600:oalert_mon> Alert Log
Errors<BR><BR></FONT></DIV>
<P><FONT size=2>SEVER_LVL=1 PROG=oalert_mon</FONT> <BR><FONT
size=2>**** oalert_mon.pl: DB=AUCP SID=AUCP1</FONT> <BR><FONT size=2>[Thu
Feb 10 22:25:21] ORA-19504: failed to create file
"/export/u10/oraarch/AUCP/rdo_1_21810.arc"</FONT> <BR><FONT size=2>[Thu Feb
10 22:25:21] ORA-19504: failed to create file
"/export/u10/oraarch/AUCP/rdo_1_21810.arc"</FONT> <BR><FONT size=2>[Thu Feb
10 22:25:21] ORA-27040: skgfrcre: create error, unable to create file</FONT>
<BR><FONT size=2>[Thu Feb 10 22:25:28] ORA-16038: log 12 sequence# 21810
cannot be archived</FONT> <BR><FONT size=2>[Thu Feb 10 22:25:28] ORA-19504:
failed to create file ""</FONT> <BR><FONT size=2>[Thu Feb 10 22:25:28]
ORA-00312: online log 12 thread 1:
'/export/u01/oradata/AUCP/redo12m1.log'</FONT> <BR><FONT size=2>[Thu Feb 10
22:25:28] ORA-00312: online log 12 thread 1:
'/export/u01/oradata/AUCP/redo12m2.log'</FONT> <BR><FONT size=2>[Thu Feb 10
22:25:28] ORA-16038: log 12 sequence# 21810 cannot be archived</FONT>
<BR><FONT size=2>[Thu Feb 10 22:25:28] ORA-19504: failed to create file
""</FONT> <BR><FONT size=2>[Thu Feb 10 22:25:28] ORA-00312: online log 12
thread 1: '/export/u01/oradata/AUCP/redo12m1.log'</FONT> <BR><FONT
size=2>[Thu Feb 10 22:25:28] ORA-00312: online log 12 thread 1:
'/export/u01/oradata/AUCP/redo12m2.log'</FONT> <BR><FONT size=2>[Thu Feb 10
22:25:28] ORA-16038: log 12 sequence# 21810 cannot be archived</FONT>
<BR><FONT size=2>[Thu Feb 10 22:25:28] ORA-19504: failed to create file
""</FONT> <BR><FONT size=2>[Thu Feb 10 22:25:28] ORA-00312: online log 12
thread 1: '/export/u01/oradata/AUCP/redo12m1.log'</FONT> <BR><FONT
size=2>[Thu Feb 10 22:25:28] ORA-00312: online log 12 thread 1:
'/export/u01/oradata/AUCP/redo12m2.log'</FONT>
</P></BLOCKQUOTE></BLOCKQUOTE></BODY></HTML>