[Ocfs2-users] OCFS2 1.4 Problem on SuSE

Charlie Sharkey charlie.sharkey at bustech.com
Tue Sep 29 07:41:43 PDT 2009


It was mentioned:

    - Checked our lvm configuration - seems to be good as well.

Is LVM supported by OCFS2?


-----Original Message-----
From: ocfs2-users-bounces at oss.oracle.com [mailto:ocfs2-users-bounces at oss.oracle.com] On Behalf Of Angelo McComis
Sent: Monday, September 28, 2009 9:29 PM
To: Sunil Mushran
Cc: ocfs2-users at oss.oracle.com
Subject: Re: [Ocfs2-users] OCFS2 1.4 Problem on SuSE

All:

I contacted Novell and received a PTF patch. However, we had also
applied the current patch updates to the box, which left it a couple
of builds newer than the PTF (the PTF was -39.3, our current kernel is
-42.5). We are still seeing the crash.

I should clarify the application behavior a little more. Essentially,
the application kicks off a run job, which fails with a file-not-found
error. The working directory (in our case, /opt/IBM/dev/projects/)
seems to disappear out from under the app, which causes the app to
error out. I checked /var/log/messages and now see nothing logged
during the time the testing was occurring, just the ssh logins from
the application as it kicks off the job.

I plan to present a LUN directly to one host and create a simple ext2
fs there. This should help narrow the focus to either the storage
stack in general or ocfs2 specifically.

Also, with ocfs2 being part of the kernel, is there a way to
determine which version of it is actually in the running kernel?
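
One possible way to check, sketched in Python: a loaded module that
declares a version normally exposes it as /sys/module/<name>/version
(the same string that `modinfo ocfs2` reports in its version field).
The helper name module_version and the list of module names below are
assumptions for illustration, and the file may simply be absent if the
module is not loaded or was built without a version string.

```python
from pathlib import Path
from typing import Optional


def module_version(name: str, sysfs_root: str = "/sys/module") -> Optional[str]:
    """Return the version string a loaded module exports via sysfs, or None."""
    version_file = Path(sysfs_root) / name / "version"
    try:
        return version_file.read_text().strip()
    except OSError:
        # Module not loaded, no version declared, or sysfs not mounted.
        return None


if __name__ == "__main__":
    # Module names are assumed; adjust to whatever lsmod shows on the host.
    for mod in ("ocfs2", "ocfs2_dlm", "ocfs2_nodemanager"):
        print(mod, module_version(mod) or "version not exported")
```

Running this on each node and comparing the output would show whether
all cluster members are on the same ocfs2 build.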

Thanks,
Angelo


On Mon, Sep 28, 2009 at 2:35 PM, Sunil Mushran <sunil.mushran at oracle.com> wrote:
> Ping Novell for issues on SLES10. The error suggests that you are
> encountering Novell bz#524683. This has been addressed in ocfs2 1.4.4.
> Ping Novell for a PTF kernel with the fix.
>
> Angelo McComis wrote:
>>
>>  Hello --
>>
>> We're running a handful of OCFS2 clusters on Novell SuSE SLES 10 SP2.
>> We are in front of IBM SVC storage, and on HP Blade hardware via the
>> QLA 2xxx HBAs.
>>
>> We have an application from IBM that makes use of files in this space
>> in a grid-style environment, and we are in the process of debugging
>> some I/O issues and crashes. While we do, I'm wondering if there is
>> any good reference on what constitutes a solid starting point for
>> tuning how many concurrent accesses to a directory are allowed, or
>> whether there are specific tunables we need to change from the default.
>>
>>
>> There are some strange errors that I can't decipher:
>>
>> Sep 25 12:11:30 host02 kernel: (4438,4):dlmunlock_common:128 ERROR:
>> lockres F00000000000000003b1341b545b16f: Someone is calling dlmunlock
>> while waiting for an ast!<3>(4438,4):dlmunlock:685 ERROR: dlm status =
>> DLM_BADPARAM
>>
>> Sep 25 12:11:30 host02 kernel: (4438,4):ocfs2_cancel_convert:3092
>> ERROR: Dlm error "DLM_BADPARAM" while calling dlmunlock on resource
>> F00000000000000003b1341b545b16f: invalid lock mode specified
>>
>> The symptom of the problem is that file access to the mountpoint of
>> ocfs2 space gets gradually slower and slower until the system just
>> crashes / becomes unresponsive when trying to access files there, cd
>> into the directory, etc.
>>
>> What we've done so far:
>>
>> - Checked our multipath configuration - seems to be showing all paths
>> to our disks, none offline, none failed, etc.
>> - Checked our lvm configuration - seems to be good as well.
>> - Checked our HBA configuration -- made some changes with regard to
>> retry and failover, but the behavior is no better.
>>
>> Can anyone point me in the right direction or help me figure out
>> which questions to even start asking here?
>>
>> The problem seems related to multiple / concurrent access to
>> directories within an OCFS2 filesystem, and how DLM is behaving.
>>
>>
>> Our OS ver/kernel is 2.6.16.60-0.42.5-smp  (Novell SLES10-sp2 + patches)
>>
>> Thanks in advance...
>>
>> Angelo
>>
>> _______________________________________________
>> Ocfs2-users mailing list
>> Ocfs2-users at oss.oracle.com
>> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>>
>
>


