[Ocfs2-users] ocfs2_encode_fh:152 ERROR: fh buffer is too small for encoding

Wed Nov 11 23:36:12 PST 2009

Hi Sunil,

Please see my answers inline.

BRs,
Colin

-----Original Message-----
From: ext Sunil Mushran <sunil.mushran at oracle.com>
To: Wang2, Colin (NSN - CN/Cheng Du) <colin.wang2 at nsn.com>
Cc: ocfs2-users at oss.oracle.com <ocfs2-users at oss.oracle.com>
Subject: Re: [Ocfs2-users] ocfs2_encode_fh:152 ERROR: fh buffer is too
small for encoding
Date: Wed, 11 Nov 2009 19:55:57 -0800

Wang2, Colin (NSN - CN/Cheng Du) wrote:
> Based on your questions:
> 1. The error appears to be a timing issue. It is a production system,
> so it is hard to install a debug version.
> I would appreciate it if you could share some documentation about the
> debug version so I can test it when I have the chance.

The error is not necessarily an ocfs2 issue. ocfs2 has 64-bit inode numbers
and requires the large filehandle. I am unsure what you mean by document
about debug version.
Colin:
  I mean documentation on how to debug ocfs2.
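
For illustration only, the sketch below mimics the kind of size check
behind the "fh buffer is too small for encoding" message. It assumes a
handle made of 32-bit words that carries a 64-bit inode (block) number
plus a 32-bit generation. The function name encode_fh, the word layout,
and the sample values are hypothetical and are not taken from the ocfs2
source.

```python
import struct

WORD = 4  # assumed size of one file-handle word, in bytes


def encode_fh(blkno, generation, max_words):
    """Pack a 64-bit block number and a 32-bit generation into 32-bit words.

    Hypothetical layout for illustration; raises ValueError when the
    caller's buffer (max_words) cannot hold the three words required.
    """
    needed = 3  # two words for the 64-bit number, one for the generation
    if max_words < needed:
        raise ValueError("fh buffer is too small for encoding")
    hi = (blkno >> 32) & 0xFFFFFFFF  # upper half of the 64-bit number
    lo = blkno & 0xFFFFFFFF          # lower half of the 64-bit number
    return struct.pack("<3I", hi, lo, generation & 0xFFFFFFFF)


if __name__ == "__main__":
    fh = encode_fh(0x1_0000_0002, 7, max_words=3)
    print(len(fh) // WORD, "words:", fh.hex())
    try:
        # A caller that only offers two words cannot carry a 64-bit number.
        encode_fh(0x1_0000_0002, 7, max_words=2)
    except ValueError as err:
        print("error:", err)
```

The point of the sketch is only that a 64-bit inode number needs more
handle space than a 32-bit one, so a caller offering a smaller buffer
gets an error instead of a truncated handle.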

> 2. I confirmed with the onsite engineer.
> I think it is file data corruption, not file system corruption. Here
> are the scenarios.
> The system has 2 nodes sharing an ocfs2 filesystem, which is exported
> over NFS from one node.
> Suppose:
> Node names: db1, db2
> Node that currently exports NFS: db1
> Node that mounts the exported NFS: app1
> A. Read/write file corruption.
>     Shut down app1.
>     When checking the file with the ls command, it is blinking on db1
> but OK on db2.
>     Removing it on db2 failed too.
>     Could not unmount or stop ocfs2 on db2.
>     Fail over NFS to db1 and reboot db2.
>     Deleting on db1 then works.
>     Reboot app1; it can use the exported fs.
> I don't know what the error is. Why is the file blinking? Is an inode
> missing?

I did not follow what you meant by "blinking". Secondly, if you
have exported a volume, then that volume cannot be unmounted.
That goes for all fs.
Colin:
   When I run the "ls -l" command in an xterm, the bad file is shown in
red and blinking. I don't know what causes this.

> B. Read-only file corruption.
>    Update the file, either from db1 or from db2.
>    app1 reports a corrupted file.
>    Fail over NFS from db1 to db2.
>    Reboot app1; it is OK now.
> I think this scenario is caused by the exported NFS fs not locking the
> relevant file, so partial content of a file updated on another node
> (like db2) is not synchronized to db1 and then to app1, and app1
> reports corruption.
>
> I think this scenario can be prevented by updating the file from
> db1 (the node currently exporting NFS) rather than db2.

So when you write to a file on node db2, the next read on db1 will
show that new data. However, there is no guarantee that app1 (which
has nfs mounted the volume on db2) will see the same data. The only
way this will work is if the application is doing O_DIRECT I/Os. This is
an inherent limitation in NFS.
Colin:
  Thanks, got it. But I think we must accept the current situation,
because direct I/O would reduce our performance.
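
As a rough sketch of what "doing O_DIRECT I/Os" means for an
application, the Python below reads a byte range with the page cache
bypassed. It is Linux-only, and the 4096-byte alignment, the helper
names aligned_span and read_direct, and the use of an anonymous mmap as
the aligned buffer are assumptions made for this example. The real
alignment requirement depends on the device and filesystem.

```python
import mmap
import os

BLOCK = 4096  # assumed alignment for offset, length, and buffer address


def aligned_span(offset, length, block=BLOCK):
    """Return (start, size) of the block-aligned range covering the request."""
    start = (offset // block) * block
    end = -(-(offset + length) // block) * block  # round up to a block boundary
    return start, end - start


def read_direct(path, offset, length, block=BLOCK):
    """Read `length` bytes at `offset`, bypassing the page cache via O_DIRECT."""
    if length <= 0:
        return b""
    start, size = aligned_span(offset, length, block)
    buf = mmap.mmap(-1, size)  # anonymous mmap memory is page-aligned
    fd = os.open(path, os.O_RDONLY | os.O_DIRECT)
    try:
        got = os.preadv(fd, [buf], start)  # number of bytes actually read
    finally:
        os.close(fd)
    skip = offset - start
    data = bytes(buf[skip:min(skip + length, got)])
    buf.close()
    return data
```

Every such read goes to the server instead of the client cache, which
is the performance cost mentioned above.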

BRs,
Colin
