[Ocfs2-users] OCFS2 1.4 Problem on SuSE

Sunil Mushran sunil.mushran at oracle.com
Tue Sep 29 09:48:10 PDT 2009


I won't be able to help you with SLES kernel versions. As far as I know,
that fix has not yet made it into any official kernel. But I could be wrong.

In any event, this issue will be best handled by Novell.
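
On the question of which ocfs2 version is in the running kernel, here is
a minimal sketch, assuming modinfo is installed and the ocfs2 module
carries a version field. The function names extract_version and
ocfs2_module_version are made up for illustration. Note that modinfo
reads the module file on disk for the running kernel, so the result can
differ from what is loaded if the module was replaced after loading.

```shell
#!/bin/sh
# Sketch: report the version string of the ocfs2 kernel module.
# Function names here are hypothetical helpers, not part of any tool.

# Pull the "version:" field out of modinfo-style text on stdin.
# The field name is matched exactly, so "srcversion:" is ignored.
extract_version() {
    awk -F': *' '$1 == "version" { print $2; exit }'
}

# Ask modinfo about the ocfs2 module built for the running kernel.
# Returns non-zero if modinfo is missing or cannot find the module.
ocfs2_module_version() {
    out=$(modinfo ocfs2 2>/dev/null) || return 1
    printf '%s\n' "$out" | extract_version
}
```

Source the file and call ocfs2_module_version on each node. Comparing the
output across nodes shows whether they all run the same ocfs2 build.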

Angelo McComis wrote:
> All:
>
> I contacted Novell and received a PTF patch. However, we had also applied
> the current patch updates to the box, which left it a couple of builds
> newer than the patch (the patch was -39.3, our current was -42.5). We are
> still having the crash.
>
> I should clarify a little more around the application behavior...
> essentially, the application kicks off a run job, and fails due to a
> file not found error...  the working directory (in our case,
> /opt/IBM/dev/projects/) seems to disappear out from under the app,
> which causes the app to error out.  I checked my /var/log/messages and
> see nothing now during the time the testing was occurring... just the
> ssh logins from the application as it's kicking off the job.
>
> I plan to present a LUN directly to one host and create a simple ext2
> fs there. This should help determine whether the problem lies in the
> storage stack in general or in ocfs2 specifically.
>
> Also, with ocfs2 being part of the kernel, is there a way to
> determine which version of it is actually in the running kernel?
>
> Thanks,
> Angelo
>
>
> On Mon, Sep 28, 2009 at 2:35 PM, Sunil Mushran <sunil.mushran at oracle.com> wrote:
>
>> Ping Novell for issues on SLES10. The error suggests that you are
>> encountering Novell bz#524683. This has been addressed in ocfs2 1.4.4.
>> Ping Novell for a PTF kernel with the fix.
>>
>> Angelo McComis wrote:
>>
>>>  Hello --
>>>
>>> We're running a handful of OCFS2 clusters on Novell SuSE SLES 10 SP2.
>>> We are in front of IBM SVC storage, and on HP Blade hardware via the
>>> QLA 2xxx HBAs.
>>>
>>> We have an application from IBM that makes use of files in this space
>>> in a grid style environment, and we are in the process of debugging
>>> some I/O issues and crashes, but while we do, I'm wondering if there
>>> is any good reference on what constitutes a solid starting point for
>>> tuning how many concurrent accesses to a directory are allowed, or if
>>> there are specific tunables that are outside the default we need.
>>>
>>>
>>> There are some strange errors that I can't decipher:
>>>
>>> Sep 25 12:11:30 host02 kernel: (4438,4):dlmunlock_common:128 ERROR:
>>> lockres F00000000000000003b1341b545b16f: Someone is calling dlmunlock
>>> while waiting for an ast!<3>(4438,4):dlmunlock:685 ERROR: dlm status =
>>> DLM_BADPARAM
>>>
>>> Sep 25 12:11:30 host02 kernel: (4438,4):ocfs2_cancel_convert:3092
>>> ERROR: Dlm error "DLM_BADPARAM" while calling dlmunlock on resource
>>> F00000000000000003b1341b545b16f: invalid lock mode specified
>>>
>>> The symptom of the problem is that file access to the mountpoint of
>>> ocfs2 space gets gradually slower and slower until the system just
>>> crashes / becomes unresponsive when trying to access files there, cd
>>> into the directory, etc.
>>>
>>> What we've done so far:
>>>
>>> - Checked our multipath configuration - seems to be showing all paths
>>> to our disks, none offline, none failed, etc.
>>> - Checked our lvm configuration - seems to be good as well.
>>> - Checked our HBA configuration -- made some changes with regard to
>>> retry and failover, but these changes have not improved the behavior.
>>>
>>> Can anyone point me in the right direction or help me figure out
>>> which questions to even start asking here?
>>>
>>> The problem seems related to multiple / concurrent access to
>>> directories within an OCFS2 filesystem, and how DLM is behaving.
>>>
>>>
>>> Our OS ver/kernel is 2.6.16.60-0.42.5-smp  (Novell SLES10-sp2 + patches)
>>>
>>> Thanks in advance...
>>>
>>> Angelo
>>>
>>> _______________________________________________
>>> Ocfs2-users mailing list
>>> Ocfs2-users at oss.oracle.com
>>> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>>>