[Ocfs2-users] strange fencing behavior

Sunil Mushran sunil.mushran at oracle.com
Thu Sep 24 14:24:11 PDT 2009


You have to look at the logs of the fenced nodes. Set up netconsole
to capture their kernel logs.
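
A minimal sketch of what that setup could look like, not a tested recipe
for this cluster. It builds the value for the netconsole= module
parameter in the format given in the kernel's netconsole documentation
([src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-macaddr]) and
includes a small UDP receiver for the log host. The addresses 192.0.2.x,
the interface eth0, and the function names are placeholders chosen for
illustration; 6665 and 6666 are the documented default ports.

```python
import socket


def netconsole_param(target_ip, target_port=6666, source_port=6665,
                     source_ip="", device="", target_mac=""):
    """Return the value for the netconsole= module or boot parameter.

    Fields left empty fall back to the kernel's defaults.
    """
    return (f"{source_port}@{source_ip}/{device},"
            f"{target_port}@{target_ip}/{target_mac}")


def receive_forever(port=6666, bind_addr=""):
    """Print every netconsole datagram that arrives on the log host."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    while True:
        data, (sender, _) = sock.recvfrom(65535)
        # Prefix each message with the sending node's address so that
        # output from several fenced nodes can be told apart.
        print(f"{sender}: {data.decode(errors='replace').rstrip()}")


# Hypothetical addresses: 192.0.2.21 is a cluster node, 192.0.2.10 the log host.
param = netconsole_param("192.0.2.10", source_ip="192.0.2.21", device="eth0")
print(f"modprobe netconsole netconsole={param}")
```

Run the printed modprobe command on each node that gets fenced, and call
receive_forever() on a machine outside the cluster so the last kernel
messages survive the reboot.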

On a side note, the logs show you are hitting the following bug:
http://oss.oracle.com/bugzilla/show_bug.cgi?id=1053
Upgrade ocfs2-tools to 1.4.2.

Sean Thon wrote:
> I have 10 servers in a cluster running Debian Etch with kernel
> 2.6.26-bpo.2 and a backport of ocfs2-tools-1.4.1-1.
> I'm using AoE to export the drives from a Debian Lenny server in the
> cluster.
>
> My problem is that if I mount the ocfs2 partition on the server that is
> exporting it via AoE, the entire cluster gets fenced.  The logs on the
> server exporting the ocfs2 partition don't give much information. Is
> this a known limitation?
>
> Sep 24 14:06:20 storage0 kernel: [650570.916574] ocfs2_hb_ctl[12083]: 
> segfault at 0 ip 7f20d9c97a90 sp 7fffe238cf98 error 4 in 
> libc-2.7.so[7f20d9c1d000+14a000]
> Sep 24 14:06:20 storage0 kernel: [650570.916932] ocfs2: Unmounting 
> device (8,17) on (node 0)
> Sep 24 14:07:04 storage0 kernel: [650622.608756] ocfs2_dlm: Nodes in 
> domain ("E042658E558940E6B05EFE2DBA548DFD"): 0
> Sep 24 14:07:04 storage0 kernel: [650622.608818] kjournald starting.  
> Commit interval 5 seconds
> Sep 24 14:07:04 storage0 kernel: [650622.609924] ocfs2: Mounting device 
> (8,17) on (node 0, slot 0) with ordered data mode.
> Sep 24 14:07:29 storage0 kernel: [650651.285363] ocfs2_hb_ctl[12160]: 
> segfault at 0 ip 7f17160f3a90 sp 7fff1e7e93e8 error 4 in 
> libc-2.7.so[7f1716079000+14a000]
> Sep 24 14:07:29 storage0 kernel: [650651.285638] ocfs2: Unmounting 
> device (8,17) on (node 0)
> Sep 24 14:07:57 storage0 kernel: [650683.535472] ocfs2_dlm: Nodes in 
> domain ("1C1694511D464F16971020CE53D45401"): 0
> Sep 24 14:07:57 storage0 kernel: [650683.566388] JBD: Ignoring recovery 
> information on journal
> Sep 24 14:07:57 storage0 kernel: [650683.566388] kjournald starting.  
> Commit interval 5 seconds
> Sep 24 14:07:57 storage0 kernel: [650683.566388] ocfs2: Mounting device 
> (8,18) on (node 0, slot 10) with ordered data mode.
> Sep 24 14:07:57 storage0 kernel: [650683.566388] 
> (12231,1):ocfs2_replay_journal:1149 Recovering node 10 from slot 0 on 
> device (8,18)
> Sep 24 14:08:00 storage0 kernel: [650687.138110] kjournald starting.  
> Commit interval 5 seconds
> Sep 24 14:08:00 storage0 kernel: [650687.268898] 
> (12231,1):ocfs2_replay_journal:1149 Recovering node 2 from slot 1 on 
> device (8,18)
> Sep 24 14:08:02 storage0 kernel: [650690.547309] kjournald starting.  
> Commit interval 5 seconds
> Sep 24 14:08:03 storage0 kernel: [650690.581554] 
> (12231,1):ocfs2_replay_journal:1149 Recovering node 3 from slot 2 on 
> device (8,18)
> Sep 24 14:08:05 storage0 kernel: [650693.934489] kjournald starting.  
> Commit interval 5 seconds
> Sep 24 14:08:05 storage0 kernel: [650693.983941] 
> (12231,1):ocfs2_replay_journal:1149 Recovering node 4 from slot 3 on 
> device (8,18)
> Sep 24 14:08:08 storage0 kernel: [650697.403252] kjournald starting.  
> Commit interval 5 seconds
> Sep 24 14:08:08 storage0 kernel: [650697.478297] 
> (12231,1):ocfs2_replay_journal:1149 Recovering node 5 from slot 4 on 
> device (8,18)
> Sep 24 14:08:11 storage0 kernel: [650700.979480] kjournald starting.  
> Commit interval 5 seconds
> Sep 24 14:08:11 storage0 kernel: [650701.049023] 
> (12231,1):ocfs2_replay_journal:1149 Recovering node 6 from slot 5 on 
> device (8,18)
> Sep 24 14:08:14 storage0 kernel: [650704.525081] kjournald starting.  
> Commit interval 5 seconds
> Sep 24 14:08:14 storage0 kernel: [650704.569578] 
> (12231,1):ocfs2_replay_journal:1149 Recovering node 7 from slot 6 on 
> device (8,18)
> Sep 24 14:08:17 storage0 kernel: [650708.610043] kjournald starting.  
> Commit interval 5 seconds
> Sep 24 14:08:17 storage0 kernel: [650708.642044] 
> (12231,1):ocfs2_replay_journal:1149 Recovering node 8 from slot 7 on 
> device (8,18)
> Sep 24 14:08:20 storage0 kernel: [650712.306538] kjournald starting.  
> Commit interval 5 seconds
> Sep 24 14:08:20 storage0 kernel: [650712.366600] 
> (12231,1):ocfs2_replay_journal:1149 Recovering node 9 from slot 8 on 
> device (8,18)
> Sep 24 14:08:24 storage0 kernel: [650716.491634] kjournald starting.  
> Commit interval 5 seconds
> Sep 24 14:08:24 storage0 kernel: [650716.587527] 
> (12231,1):ocfs2_replay_journal:1149 Recovering node 1 from slot 9 on 
> device (8,18)
> Sep 24 14:08:27 storage0 kernel: [650720.039214] kjournald starting.  
> Commit interval 5 seconds



