[Ocfs2-users] OCFS2 CLUSTER HANG

Javier Mugueta javier.mugueta at oracle.com
Wed Oct 27 06:35:38 PDT 2010


Hi,

I have a customer with a 2-node OCFS2 1.4 cluster on Red Hat 5.3 x86-64.

To keep the explanation simple: among other things (file reads, etc.), the software uses the cluster to write to a log file of up to 500 MB.

Last Monday a problem somewhere (the software is a two-node application server) caused the software to hang, and it seems both application servers were waiting on each other to write. Simply restarting the application servers resolved the presumed deadlock.

Question: Is OCFS2 capable of detecting the deadlock and fencing one of the nodes in this situation?

The configuration is:

# 2010-02-10: OCFS2 cluster-aware filesystem configuration
kernel.panic_on_oops = 1
kernel.panic = 30
…


[oracle@mapcms1 ocfs2]$ /etc/init.d/o2cb status
Driver for "configfs": Loaded
Filesystem "configfs": Mounted
Driver for "ocfs2_dlmfs": Loaded
Filesystem "ocfs2_dlmfs": Mounted
Checking O2CB cluster ucmcluster: Online
  Heartbeat dead threshold = 31
  Network idle timeout: 30000
  Network keepalive delay: 2000
  Network reconnect delay: 2000
Checking O2CB heartbeat: Active
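
For reference, here is a rough sketch of how the o2cb values above translate into seconds. It assumes the usual 2-second disk heartbeat interval, so that the disk heartbeat timeout is (threshold - 1) * 2 seconds, and that the network values are reported in milliseconds. The function names are made up for illustration and are not part of any OCFS2 tool.

```python
import re

# Sample taken from the o2cb status output above.
STATUS = """\
Checking O2CB cluster ucmcluster: Online
  Heartbeat dead threshold = 31
  Network idle timeout: 30000
  Network keepalive delay: 2000
  Network reconnect delay: 2000
Checking O2CB heartbeat: Active
"""

# Assumption: disk heartbeat interval of 2 seconds (the usual o2cb default).
HEARTBEAT_INTERVAL_S = 2


def parse_status(text):
    """Return {label: int} for every 'label = N' or 'label: N' line."""
    values = {}
    for line in text.splitlines():
        m = re.match(r"\s*(.+?)\s*[:=]\s*(\d+)\s*$", line)
        if m:
            values[m.group(1)] = int(m.group(2))
    return values


def effective_timeouts(values):
    """Convert the raw o2cb settings to seconds."""
    threshold = values["Heartbeat dead threshold"]
    return {
        # Heartbeat iterations missed before a node is declared dead.
        "disk_heartbeat_timeout_s": (threshold - 1) * HEARTBEAT_INTERVAL_S,
        # Assumption: network timeouts are reported in milliseconds.
        "network_idle_timeout_s": values["Network idle timeout"] / 1000,
    }


if __name__ == "__main__":
    print(effective_timeouts(parse_status(STATUS)))
```

Under those assumptions, a threshold of 31 corresponds to a 60-second disk heartbeat timeout and the network idle timeout is 30 seconds. Both describe when a node is considered unreachable; they say nothing about an application-level wait where both nodes keep heartbeating.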


Regards


