<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 TRANSITIONAL//EN">
<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=UTF-8">
<META NAME="GENERATOR" CONTENT="GtkHTML/3.16.3">
</HEAD>
<BODY>
Hi Tao,<BR>
<BR>
Thanks for your help.<BR>
Yes, we see them from the NFS client.<BR>
There are only two unusual log entries that I think are related to ocfs2:<BR>
Nov 7 04:42:05 dbu2pub kernel: (1751,3):ocfs2_encode_fh:152 ERROR: fh buffer is too small for encoding<BR>
Nov 7 10:03:24 dbu2pub kernel: (1751,3):ocfs2_encode_fh:152 ERROR: fh buffer is too small for encoding<BR>
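<BR>
A minimal sketch of how entries like these could be pulled out of a syslog file and counted, assuming the line layout shown above. The function name and the regular expression are only illustrative and are not part of ocfs2 or its tools:<BR>
<PRE>
```python
import re
from collections import Counter

# Matches kernel log lines such as:
#   Nov 7 04:42:05 dbu2pub kernel: (1751,3):ocfs2_encode_fh:152 ERROR: fh buffer is too small for encoding
# Groups: 1 = timestamp, 2 = host, 3 = function, 4 = source line, 5 = message.
OCFS2_ERROR = re.compile(
    r"^(\w{3}\s+\d+\s+[\d:]+)\s+(\S+)\s+kernel:\s+\(\d+,\d+\):(\w+):(\d+)\s+ERROR:\s+(.*)$"
)

def count_ocfs2_errors(lines):
    """Count ERROR entries per (function, message) pair."""
    counts = Counter()
    for line in lines:
        match = OCFS2_ERROR.match(line.strip())
        if match:
            counts[(match.group(3), match.group(5))] += 1
    return counts

sample = [
    "Nov 7 04:42:05 dbu2pub kernel: (1751,3):ocfs2_encode_fh:152 ERROR: fh buffer is too small for encoding",
    "Nov 7 10:03:24 dbu2pub kernel: (1751,3):ocfs2_encode_fh:152 ERROR: fh buffer is too small for encoding",
]
for key, n in count_ocfs2_errors(sample).items():
    print(key[0], n, key[1])
```
</PRE>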
<BR>
I think the stale inode and inode corruption are caused by the following:<BR>
- We update information via ocfs2, and ocfs2 distributes it to all cluster members.<BR>
- The NFS server does not know about the updated information until ocfs2 syncs it to disk.<BR>
- The NFS client does not learn of the updates either, because we use the async option for performance.<BR>
<BR>
I will adjust our application to update data via NFS first, then ocfs2, so that all layers know about the updates.<BR>
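<BR>
One way to narrow the window described in the second point might be to flush and fsync each update before readers are told about it. A minimal sketch, assuming a POSIX system; write_and_sync is a hypothetical helper, and this only asks the local kernel to push the data to stable storage. It does not make NFS client caches coherent:<BR>
<PRE>
```python
import os
import tempfile

def write_and_sync(path, data):
    """Write bytes to path and ask the kernel to flush them to stable storage."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()              # push Python's userspace buffer to the kernel
        os.fsync(f.fileno())   # ask the kernel to write the file to disk

# Usage on a scratch file; a real deployment would point at the shared volume.
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "update.dat")
    write_and_sync(target, b"new record\n")
    with open(target, "rb") as f:
        print(f.read())
```
</PRE>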
<BR>
<BR>
BRs,<BR>
Colin<BR>
<BR>
-----Original Message-----<BR>
<B>From</B>: ext TaoMa <<A HREF="mailto:ext%20TaoMa%20%3ctao.ma@oracle.com%3e">tao.ma@oracle.com</A>><BR>
<B>To</B>: Wang2, Colin (NSN - CN/Cheng Du) <<A HREF="mailto:%22Wang2,%20Colin%20%28NSN%20-%20CN/Cheng%20Du%29%22%20%3ccolin.wang2@nsn.com%3e">colin.wang2@nsn.com</A>><BR>
<B>Cc</B>: ext Sunil Mushran <<A HREF="mailto:ext%20Sunil%20Mushran%20%3csunil.mushran@oracle.com%3e">sunil.mushran@oracle.com</A>>, ocfs2-users@oss.oracle.com <<A HREF="mailto:%22ocfs2-users@oss.oracle.com%22%20%3cocfs2-users@oss.oracle.com%3e">ocfs2-users@oss.oracle.com</A>><BR>
<B>Subject</B>: Re: [Ocfs2-users] ocfs2_encode_fh:152 ERROR: fh buffer is too small for encoding<BR>
<B>Date</B>: Fri, 13 Nov 2009 06:43:12 +0800<BR>
<BR>
<PRE>
Wang2, Colin (NSN - CN/Cheng Du) wrote:
> Hi Tao,
>
> Could you give me more information about inode corruption? Thanks in
> advance.
It means that the inode corresponding to the dentry is corrupted. So
when you ls -l, the system will try to get the information from the
inode but fails. Oh, I just realized that you use NFS. So do you see
it from the NFS client or the NFS server? If it is the client, I guess a
stale inode can cause this. Then it may not be a file system corruption.
> - How to check/make sure it's an inode corruption?
I guess echo 'stat &lt;filename&gt;' | debugfs.ocfs2 /dev/sdx should have some
information for you.
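
A rough sketch of the same pipeline driven from Python, in case it needs to
be scripted. The helper names are hypothetical, the file path is a
placeholder, and it assumes debugfs.ocfs2 from ocfs2-tools is installed and
reads its commands from standard input as in the shell pipeline above:

```python
import subprocess

def build_stat_command(filename):
    """Return the debugfs.ocfs2 command text that stats one file."""
    return "stat {}\n".format(filename)

def stat_inode(device, filename):
    """Pipe a stat command into debugfs.ocfs2 and return its output.

    Mirrors the shell pipeline quoted in the mail; needs read access to
    the device and the ocfs2-tools package installed.
    """
    result = subprocess.run(
        ["debugfs.ocfs2", device],
        input=build_stat_command(filename),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout

# Example call (the file path is a placeholder, not taken from this thread):
# print(stat_inode("/dev/sdx", "/path/to/file"))
print(build_stat_command("/path/to/file"), end="")
```
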
> - How to fix inode corruption?
fsck.ocfs2.
> - How is inode corruption generated? How can it be prevented?
I don't know how to generate. Otherwise I would have already fixed it. ;)
>
> Sorry to ask so many questions. I have run into this problem a few times
> and the customer has complained about it. I hope to resolve it permanently.
So every time you run into this issue, is it on the NFS-exported volume?
As I asked above, did you see this from the NFS client or the NFS server?
If it is the NFS client, it may be caused by a stale inode.
If it is the NFS server, it may be file corruption. Do you have
anything special in your system log?
Regards,
Tao
>
> BRs,
> Colin
>
> -----Original Message-----
> *From*: ext TaoMa <<A HREF="mailto:tao.ma@oracle.com">tao.ma@oracle.com</A>>
> *To*: Wang2, Colin (NSN - CN/Cheng Du) <<A HREF="mailto:colin.wang2@nsn.com">colin.wang2@nsn.com</A>>
> *Cc*: ext Sunil Mushran <<A HREF="mailto:sunil.mushran@oracle.com">sunil.mushran@oracle.com</A>>,
> ocfs2-users@oss.oracle.com <<A HREF="mailto:ocfs2-users@oss.oracle.com">ocfs2-users@oss.oracle.com</A>>
> *Subject*: Re: [Ocfs2-users] ocfs2_encode_fh:152 ERROR: fh buffer is
> too small for encoding
> *Date*: Thu, 12 Nov 2009 23:54:23 +0800
>
> Hi Colin,
> The blinking file may be caused by corruption of the file's inode.
> I ran into it once.
>
> As for debug ocfs2, there are many ways. One is
> <A HREF="http://oss.oracle.com/projects/ocfs2-tools/dist/documentation/v1.4/debugfs.ocfs2.html">http://oss.oracle.com/projects/ocfs2-tools/dist/documentation/v1.4/debugfs.ocfs2.html</A>
>
> debugfs.ocfs2 *-l* [/tracebit/ ... [*allow*|*off*|*deny*]] ...
> can turn a lot of tracing on and off, which will show some helpful
> information in the system log.
>
> But I guess what Sunil means is the debug version of ocfs2, not how to
> debug. Since it is a production system, I am afraid a debug version
> isn't allowed on your system.
>
> Regards,
> Tao
> Wang2, Colin (NSN - CN/Cheng Du) wrote:
> > Hi Sunil,
> >
> > Please see answer in line.
> >
> > BRs,
> > Colin
> >
> > -----Original Message-----
> > *From*: ext Sunil Mushran <<A HREF="mailto:sunil.mushran@oracle.com">sunil.mushran@oracle.com</A>>
> > *To*: Wang2, Colin (NSN - CN/Cheng Du) <<A HREF="mailto:colin.wang2@nsn.com">colin.wang2@nsn.com</A>>
> > *Cc*: ocfs2-users@oss.oracle.com <<A HREF="mailto:ocfs2-users@oss.oracle.com">ocfs2-users@oss.oracle.com</A>>
> > *Subject*: Re: [Ocfs2-users] ocfs2_encode_fh:152 ERROR: fh buffer is
> > too small for encoding
> > *Date*: Wed, 11 Nov 2009 19:55:57 -0800
> >
> > Wang2, Colin (NSN - CN/Cheng Du) wrote:
> > > Based on your questions,
> > > 1. The error is a timing issue. And it's a production system, so it's hard to
> > > install a debug version.
> > > I would appreciate it if you could share some documentation about the debug
> > > version so I can test it when I have the chance.
> >
> > The error is not necessarily an ocfs2 issue. ocfs2 has 64-bit inode numbers
> > and requires the large filehandle. I am unsure what you mean by document
> > about debug version.
> > Colin:
> > I mean the method to debug ocfs2.
> >
> > > 2. Confirmed with onsite engineer.
> > > I think it's file data corruption rather than file system corruption. Here are the scenarios.
> > > The system has 2 nodes with ocfs2 filesystem, and nfs export on one node.
> > > Suppose:
> > > Node name: db1, db2
> > > Node that currently exports NFS: db1
> > > Node that mounts the exported nfs: app1
> > > A. Read/write file corruption.
> > > Shutdown app1.
> > > When checking the file with the ls command, it's blinking on db1 but ok on
> > > db2.
> > > Removing it on db2 failed too.
> > > Can't unmount and stop ocfs2 on db2.
> > > Fail over nfs to db1 and reboot db2.
> > > It's ok to delete on db1.
> > > Reboot app1, it can use the exported fs.
> > > I don't know what the error is. Why is the file blinking? Is the inode missing?
> >
> > I did not follow what you meant by "blinking". Secondly if you
> > have exported a volume, then that volume cannot be umounted.
> > That goes for all fs.
> > Colin:
> > When I run the "ls -l" command, the bad file is shown in red and blinking.
> > I am using xterm. I don't know what causes this.
> >
> > > B. Readonly file corruption.
> > > Update file, maybe from db1, maybe from db2.
> > > app1 report corruption file.
> > > Failover nfs from db1 to db2.
> > > Reboot app1, it's ok now.
> > > I think this scenario is caused by the exported nfs fs not locking the
> > > relevant file, so the partial content of a file updated on another node (like db2)
> > > is not synchronized to db1 and then to app1, and app1 reports corruption.
> > >
> > > I think this scenario can be prevented by updating the file from
> > > db1 (the node currently exporting nfs) rather than db2.
> >
> > So when you write to a file on node db2, the next read on db1 will
> > show that new data. However, there is no guarantee that app1 (which
> > has nfs mounted the volume on db2) will see the same data. The only
> > way this will work is if the application is doing odirect ios. This is an
> > inherent limitation in nfs.
> > Colin:
> > Thanks, got it. But I think we must accept the current situation, because direct I/O will reduce our performance.
> >
> >
> > BRs,
> > Colin
> >
> >
> > ------------------------------------------------------------------------
> >
> > _______________________________________________
> > Ocfs2-users mailing list
> > <A HREF="mailto:Ocfs2-users@oss.oracle.com">Ocfs2-users@oss.oracle.com</A> <<A HREF="mailto:Ocfs2-users@oss.oracle.com">mailto:Ocfs2-users@oss.oracle.com</A>>
> > <A HREF="http://oss.oracle.com/mailman/listinfo/ocfs2-users">http://oss.oracle.com/mailman/listinfo/ocfs2-users</A>
>
>
</PRE>
</BODY>
</HTML>