<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="text/html;charset=UTF-8" http-equiv="Content-Type">
</head>
<body bgcolor="#ffffff" text="#000000">
Hi all,<br>
<br>
Yesterday I encountered a problem on one of our servers.<br>
<br>
One file was not accessible on one of the servers, while the rest of the
servers could read this file just fine.<br>
Every time the file was stat'ed on that one server, the following error
was logged <br>
(nothing more; these were the only ocfs2 messages logged during the
past 2 months):<br>
<blockquote>ocfs2_permission:975 ERROR: status = -2 <br>
</blockquote>
A directory listing showed the file, but running ls -la
reported 'no such file or directory' for that file.<br>
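<br>
For reference, a minimal sketch of how a directory could be checked for
entries showing this symptom (listed by readdir, but failing stat with
ENOENT), assuming Python is available on the affected node. The function
name find_ghost_entries is hypothetical and not part of any ocfs2 tooling:<br>
<pre>
```python
import errno
import os


def find_ghost_entries(directory):
    """Return names that the directory listing reports but lstat rejects with ENOENT."""
    ghosts = []
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        try:
            # lstat is what "ls -la" needs for every entry it prints.
            os.lstat(path)
        except OSError as exc:
            if exc.errno == errno.ENOENT:
                # Listed in the directory, yet "no such file or directory".
                ghosts.append(name)
            else:
                # Any other failure (EACCES, EIO, ...) is a different problem.
                raise
    return ghosts
```
</pre>
Calling find_ghost_entries on the affected directory on each node and
comparing the results would show which nodes see the inconsistency. A file
removed between the listing and the lstat call is reported as well, so this
is best run on a directory that is not being modified.<br>
<br>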
Error number -2 is -ENOENT, and from reading the source I saw that
the error is generated by a <br>
call to ocfs2_meta_lock. The only way to get this error
without further errors being printed is that <br>
ocfs2_meta_lock_update must have returned it. But looking at
ocfs2_meta_lock_update, the<br>
only way this error is generated is when the following condition is
true: (oi-&gt;ip_flags &amp; OCFS2_INODE_DELETED) <br>
But if I understand the code correctly, that path must also log another
error: <br>
"Orphaned inode %llu was deleted while we were waiting on a lock.
ip_flags = 0x%x\n"<br>
But this message is not in my log files. <br>
So I am puzzled as to how this error could be generated and what caused it in
the first place.<br>
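<br>
As a quick sanity check of the errno mapping mentioned above, a small
sketch, assuming Python on a Linux node; decode_kernel_status is a
hypothetical helper name. Kernel code reports errors as negated errno
values, so the sign is dropped before the lookup:<br>
<pre>
```python
import errno
import os


def decode_kernel_status(status):
    """Map a negative kernel status such as -2 to (errno name, message)."""
    code = abs(status)
    # errno.errorcode maps numeric codes to symbolic names on this platform.
    name = errno.errorcode.get(code, "UNKNOWN")
    return name, os.strerror(code)


print(decode_kernel_status(-2))
```
</pre>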
<br>
Our setup:<br>
<br>
Storage: EMC-ax150i (iscsi)<br>
Each machine is connected to our ax150i via open-iscsi
(open-iscsi-2.0.865-2) with multipathing.<br>
The machines all run a 2.6.21.5 kernel (with the backports from
<a class="moz-txt-link-freetext" href="http://kernel.org/pub/linux/kernel/people/mfasheh/ocfs2/backports/2.6.21">http://kernel.org/pub/linux/kernel/people/mfasheh/ocfs2/backports/2.6.21</a>
applied).<br>
<br>
The file causing the problem is under version control with
svn. <br>
It was updated that afternoon, but the problems arose late in the
evening, and as far as I can<br>
tell the file was not in use at the time it was updated.<br>
But I don't think svn can be blamed, since we use it regularly and we
haven't had any problem with<br>
it in the past 2 months (the time our production cluster has been live).<br>
<br>
The problem was easily resolved by unmounting and remounting the volume on
that one server.<br>
But that's not something I want to do often, since it involves
shutting down all services<br>
that use the volume.<br>
<br>
I think I've seen this problem once before, but at that time we didn't
have time to investigate<br>
since we were in the process of setting up the rest of our systems.<br>
<br>
Does anyone have a clue what could have caused this error and how we can
prevent it <br>
from happening in the future?<br>
<br>
Regards,<br>
<br>
Eric de Ruiter<br>
</body>
</html>