[Ocfs2-users] 'No space left on device' error with plenty of space.

Jason Price japrice at gmail.com
Thu Jun 10 06:27:27 PDT 2010


(Sorry Tao: I realized I had replied only to you)

I just uploaded a third output from stat_sysfs.sh to bug # 1263.  It was
taken while we were experiencing ENOSPC errors.  In my limited testing, I
was able to write a 324k file, then a 1620k file (5x324), but failed to
write a 16200k file (10x1620).
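
For anyone who wants to repeat that size test, a minimal sketch in Python
might look like the following.  The function name probe_write_sizes and the
probe file names are made up for illustration, and treating "k" as KiB is an
assumption.  It writes one file per size, forces the data to disk, and
records which sizes fail and with which errno.

```python
import errno
import os


def probe_write_sizes(directory, sizes_kib=(324, 1620, 16200)):
    """Try to write one file per size; return {size_kib: "ok" or errno name}."""
    results = {}
    chunk = b"\0" * 1024  # 1 KiB per write call
    for size in sizes_kib:
        # Hypothetical probe file name; removed again after each attempt.
        path = os.path.join(directory, "enospc_probe_%dk" % size)
        try:
            with open(path, "wb") as fh:
                for _ in range(size):
                    fh.write(chunk)
                fh.flush()
                # Push the data out now so an allocation failure shows up
                # here rather than later during writeback.
                os.fsync(fh.fileno())
            results[size] = "ok"
        except OSError as exc:
            # e.g. "ENOSPC" when the filesystem refuses the allocation
            results[size] = errno.errorcode.get(exc.errno, str(exc.errno))
        finally:
            if os.path.exists(path):
                os.remove(path)
    return results
```

Calling probe_write_sizes("/mnt/ocfs2") (the mount point is a placeholder)
returns a dict keyed by size, which makes it easy to see whether only the
larger writes are being refused.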

I should also give some context for the stat_sysfs outputs.  Here's a rough timeline:

Monday morning: start experiencing ENOSPC errors.  Start researching, while
node1 limps along (no traffic on node2).  Take stat_sysfs output (the one I
posted last to bug # 1263; it is also posted under bug # 1159).  This is
when I ran the file size test mentioned above.

I found bug # 1159 and scheduled emergency downtime to run "tunefs.ocfs2 -N 3"
on the cluster.  Everything worked fine afterwards, with traffic still on
node1.  Writing large files (60-70 MB) worked fine at that point.

Wednesday early morning: Again we start seeing ENOSPC errors.  Fail traffic
to node2, unmount/remount the OCFS2 volume on node1.  Take stat_sysfs.sh
outputs on both nodes (these are the first two that I posted to bug # 1263).
Continue researching.  After failing over to node2, writing large files works
again.

Wednesday around 11: Again ENOSPC errors start appearing.  I take the
opportunity to upgrade node1 to v1.4.7, then fail traffic to node1, then
upgrade node2 to v1.4.7.  We haven't seen the problem since (granted, that's
less than 24 hours).

This problem mostly affects the users attempting to write files via FTP.
 From the FTP daemon, I have log files which say that we're getting 'No
space left on device' errors, but I don't have info about file sizes that
are failing.
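
To pull the failures out of the FTP logs, something along these lines would
do.  This is only a sketch: the function name enospc_lines is invented, and
it assumes nothing about the log format beyond the error text appearing
somewhere on the line.

```python
def enospc_lines(lines, needle="No space left on device"):
    """Return the log lines that mention the ENOSPC error text."""
    return [line.rstrip("\n") for line in lines if needle in line]
```

It can be fed an open log file directly, since a file object iterates over
its lines; the length of the result gives the failure count for that log.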

On Wed, Jun 9, 2010 at 10:20 PM, Tao Ma <tao.ma at oracle.com> wrote:

> Hi Jason,
>
>
> On 06/09/2010 11:34 PM, Jason Price wrote:
>
>> And now it's starting to fail again.
>>
> How is the situation now?
> I checked your stat_sysfs output, and it looks like you have space for inode,
> extent alloc and local alloc (but maybe the kernel hasn't flushed the
> metadata to disk, while stat_sysfs only reads the disk). So why are you
> hitting ENOSPC? Can you describe it in more detail? Do you hit it when
> touching a new file, or when cat-ing some bytes to a file, or ...?
> If you find the failing scenario, please enable the debugfs tracing so that
> we can find out the real cause.
> debugfs.ocfs2 -l INODE allow
> debugfs.ocfs2 -l DISK_ALLOC allow
> run your test case here.
> debugfs.ocfs2 -l INODE off
> debugfs.ocfs2 -l DISK_ALLOC off
>
> Regards,
> Tao
>
>
>> --Jason
>>
>> On Wed, Jun 9, 2010 at 9:51 AM, Jason Price <japrice at gmail.com
>> <mailto:japrice at gmail.com>> wrote:
>>
>>    I've got a busy FTP/Web cluster running OCFS2 v1.4.4.
>>
>>    I've started getting "No space on device" errors when users attempt
>>    to write to the file system.  Disk utilization is about 76% with
>>    more than 100gb free.  Inode utilization is also at 76%.
>>
>>    I thought this was a manifestation of bug # 1189, so I decreased the
>>    number of nodes via tunefs.ocfs2 from 8 (the default) down to 3
>>    (there are only 2 nodes in the cluster, with no growth anticipated).
>>
>>    That got me out of the woods on Monday, but this morning the problem
>>    manifested again.
>>
>>    I've opened bug # 1263 about this issue. (link:
>>    http://oss.oracle.com/bugzilla/show_bug.cgi?id=1263 )
>>
>>    Does anyone have other ideas?
>>
>>    I'm more than happy to supply other information.
>>
>>    What seems to happen is that small writes are allowed, but bigger
>>    writes failed.  On Monday, I could write multiple 325kb files, and I
>>    could cat them together to make one file of ~2 mb, but when I tried
>>    to make a 10ish mb file, it failed.
>>
>>    --Jason
>>
>