[Ocfs2-users] OCFS2 Crash

Sunil Mushran sunil.mushran at oracle.com
Wed Jun 29 11:42:08 PDT 2011


1.2.1? That's 5 years old. We've had a few fixes since then. ;)

You have to capture the oops trace to figure out the reason. One
way to get it is by using netconsole. Check the SLES 10 docs to see how to
configure netconsole, or use whatever is recommended for capturing the
oops log in that release.
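
As a rough sketch of the receiving side: netconsole sends kernel messages as UDP packets to another machine, so a small listener on a second host can save them to a file. The Python below is only an illustration, not something taken from the SLES docs. The function names `open_listener` and `capture` and the log file name are made up for the example, and port 6666 is assumed as the commonly used netconsole target port, so match it to whatever the crashing node is configured to send to.

```python
import socket


def open_listener(host="0.0.0.0", port=6666):
    """Open a UDP socket for incoming netconsole packets.

    Port 6666 is an assumption; use the target port configured on the sender.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))
    return sock


def capture(sock, log_path, max_packets=None, timeout=None):
    """Append each received datagram to log_path and return the packet count.

    With max_packets=None and timeout=None this runs until interrupted.
    """
    sock.settimeout(timeout)
    count = 0
    with open(log_path, "ab") as log:
        while max_packets is None or count < max_packets:
            try:
                data, _addr = sock.recvfrom(65535)
            except socket.timeout:
                break  # nothing arrived within the timeout
            log.write(data)
            log.flush()  # keep the file current in case the listener is killed
            count += 1
    return count


# Example (runs until interrupted):
# capture(open_listener(), "netconsole.log")
```

Run it on a machine other than the one that crashes, and leave it running until the next crash so the oops text ends up in the log file.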

On 06/29/2011 11:28 AM, B Leggett wrote:
> Hi,
> I am running OCFS2 1.2.1 on SLES 10, just the stuff right out of the box. This is a 3-node cluster that's been running for 2 years with just about zero modification. The storage is a high-end SAN and the transport is iSCSI. We went two years without an issue and all of a sudden node 1 in the cluster keeps crashing. I have never had to troubleshoot OCFS2, so I started with what I could control.
>
> I checked /var/log/messages and nothing there suggests a problem. I replaced hardware, going as far as popping the SCSI drives out, putting them in another server, and trying it with all new hardware. The problem still persists.
>
> I had the network team check the iSCSI port on the private iSCSI network and they are not seeing errors.
>
> I've checked the few OCFS2 settings in play and they all look good.
>
> My question to the group is how do I continue troubleshooting this issue? I'm not aware of any native logs, etc., to reference. I would appreciate any help that gets this diagnosis moving to a solution.
>
> Thanks,
> Bruce



