[Ocfs2-users] kernel panic - not syncing

Sunil Mushran Sunil.Mushran at oracle.com
Mon Jan 22 10:38:14 PST 2007


o2net timeout cannot cause the o2hb panic. The two are totally
different. From the outputs, I would guess o2hb is timing out but
I cannot say for sure until I see the full logs.
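
As a rough aid for checking whether the configured threshold is actually in effect, here is a minimal Python sketch. It assumes the relationship described in the OCFS2 1.2.x FAQ, timeout = (O2CB_HEARTBEAT_THRESHOLD - 1) * 2 seconds, with a 2-second heartbeat interval (consistent with the ~1995 ms msleep entries in the quoted trace). The function names are hypothetical; only the /proc path comes from this thread.

```python
from pathlib import Path

# Assumption: o2hb writes one heartbeat every 2 seconds in the 1.2.x series.
HB_INTERVAL_SECS = 2

# Path quoted in this thread; present only on a node with o2cb loaded.
HB_DEAD_THRESHOLD_PATH = "/proc/fs/ocfs2_nodemanager/hb_dead_threshold"


def threshold_to_timeout_secs(threshold: int) -> int:
    """Seconds of missed disk heartbeats tolerated before the node fences."""
    return (threshold - 1) * HB_INTERVAL_SECS


def timeout_to_threshold(timeout_secs: int) -> int:
    """Threshold value needed to tolerate the given number of seconds."""
    return timeout_secs // HB_INTERVAL_SECS + 1


def read_effective_threshold(path: str = HB_DEAD_THRESHOLD_PATH):
    """Return the threshold the kernel is using, or None if unavailable."""
    p = Path(path)
    if not p.exists():
        return None
    return int(p.read_text().strip())


if __name__ == "__main__":
    configured = 31  # value from /etc/sysconfig/o2cb in this thread
    effective = read_effective_threshold()
    print("configured timeout (s):", threshold_to_timeout_secs(configured))
    if effective is None:
        print("hb_dead_threshold not readable on this host")
    elif effective != configured:
        print("effective threshold differs from configured:", effective)
```

Under these assumptions a threshold of 31 should tolerate about 60 seconds of stalled disk I/O, so a 10-second read stall would not fence the node if that value were really in effect.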

Andy Phillips wrote:
> It's worth pointing out that the o2net idle timer is triggering on the 
> network heartbeat, which is 10 seconds, in the current 1.2.x series.
>
>
> O2CB_HEARTBEAT_THRESHOLD has no effect on this, because it's another part
> of the code which causes the problem.
>
> see ocfs2-1.2.3/fs/ocfs2/cluster/tcp_internal.h
> #define O2NET_IDLE_TIMEOUT_SECS         10
>
> Andy
>
>
> On Mon, 2007-01-22 at 09:29 -0800, Srinivas Eeda wrote:
>   
>> The problem appears to be that I/O is taking longer than the effective O2CB_HEARTBEAT_THRESHOLD allows. Your configured value of "31" doesn't seem to be in effect.
>>
>> Index 6: took 1995 ms to do msleep
>> Index 17: took 1996 ms to do msleep
>> Index 22: took 10001 ms to do waiting for read completion.
>>
>> Can you please cat /proc/fs/ocfs2_nodemanager/hb_dead_threshold and verify?
>>
>> Thanks,
>> --Srini.
>>
>>
>>
>>
>> Consulente3 wrote:
>>     
>>> Hi all, 
>>>
>>> My test environment consists of 2 servers running CentOS 4.4.
>>> The nodes are exporting with aoe6-43 + vblade-14.
>>>
>>> kernel-2.6.9-42.0.3.EL
>>> ocfs2-tools-1.2.2-1
>>> ocfs2console-1.2.2-1
>>> ocfs2-2.6.9-42.0.3.EL-1.2.3-1
>>>
>>> /dev/etherd/e2.0 on /ocfs2 type ocfs2 (rw,_netdev,heartbeat=local)
>>> /dev/etherd/e3.0 on /ocfs2_nfs type ocfs2 (rw,_netdev,heartbeat=local)
>>>
>>> Device                FS     Nodes
>>> /dev/etherd/e2.0      ocfs2  ocfs2, becks
>>> /dev/etherd/e3.0      ocfs2  ocfs2, becks
>>>
>>> Device                FS     UUID                                  Label
>>> /dev/etherd/e2.0      ocfs2  b24cc18d-af89-4980-a75e-a87530b1b878  test1
>>> /dev/etherd/e3.0      ocfs2  101a92fd-b83b-4294-8bfc-fbaa069c3239  nfs4
>>>
>>> O2CB_HEARTBEAT_THRESHOLD=31
>>>
>>> When I try to run a stress test:
>>>
>>> Index 4: took 0 ms to do checking slots
>>> Index 5: took 2 ms to do waiting for write completion
>>> Index 6: took 1995 ms to do msleep
>>> Index 7: took 0 ms to do allocating bios for read
>>> Index 8: took 0 ms to do bio alloc read
>>> Index 9: took 0 ms to do bio add page read
>>> Index 10: took 0 ms to do submit_bio for read
>>> Index 11: took 2 ms to do waiting for read completion
>>> Index 12: took 0 ms to do bio alloc write
>>> Index 13: took 0 ms to do bio add page write
>>> Index 14: took 0 ms to do submit_bio for write
>>> Index 15: took 0 ms to do checking slots
>>> Index 16: took 1 ms to do waiting for write completion
>>> Index 17: took 1996 ms to do msleep
>>> Index 18: took 0 ms to do allocating bios for read
>>> Index 19: took 0 ms to do bio alloc read
>>> Index 20: took 0 ms to do bio add page read
>>> Index 21: took 0 ms to do submit_bio for read
>>> Index 22: took 10001 ms to do waiting for read completion
>>> (3,0):o2hb_stop_all_regions:1908 ERROR: stopping heartbeat on all active
>>> regions.
>>> Kernel panic - not syncing: ocfs2 is very sorry to be fencing this
>>> system by panicing
>>>
>>>
>>> <6>o2net: connection to node ocfs2 (num 2) at 10.1.7.107:777 has been
>>> idle for 10 seconds, shutting it down
>>> (3,0): o2net_idle_timer:1309 here are some times that might help debug
>>> the situation:
>>> (tmr: 1169487957.71650 now 1169487967.69569 dr 1169487962.88883 adv
>>> 1169487957.71671:1159487957.71674
>>> func 83bce37b2:505) 1169487901.984644:1169487901.984676)
>>>
>>> The kernel panic always occurs on the same node, and the other node
>>> keeps responding.
>>>
>>> thanks!
>>>                                                                  
>>>
>>> _______________________________________________
>>> Ocfs2-users mailing list
>>> Ocfs2-users at oss.oracle.com
>>> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>>>   
>>>       

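For anyone reading a similar o2hb trace, the following is a minimal Python sketch that pulls out the "Index N: took X ms to do Y" lines and flags the steps that stalled. The regular expression is based only on the line format shown in this thread; the function names and the 1000 ms cutoff are hypothetical choices, and msleep is ignored because it is the expected sleep between heartbeats.

```python
import re

# Matches lines such as "Index 22: took 10001 ms to do waiting for read completion",
# with or without leading ">" quote markers.
STEP_RE = re.compile(r"Index (\d+): took (\d+) ms to do (.+)")


def parse_steps(log_text):
    """Return a list of (index, milliseconds, description) tuples."""
    steps = []
    for line in log_text.splitlines():
        m = STEP_RE.search(line)
        if m:
            desc = m.group(3).strip().rstrip(".")
            steps.append((int(m.group(1)), int(m.group(2)), desc))
    return steps


def slow_steps(steps, min_ms=1000, ignore=("msleep",)):
    """Steps at or above min_ms, excluding expected sleeps."""
    return [s for s in steps if s[1] >= min_ms and s[2] not in ignore]


SAMPLE = """\
>>> Index 16: took 1 ms to do waiting for write completion
>>> Index 17: took 1996 ms to do msleep
>>> Index 21: took 0 ms to do submit_bio for read
>>> Index 22: took 10001 ms to do waiting for read completion
"""

if __name__ == "__main__":
    for index, ms, desc in slow_steps(parse_steps(SAMPLE)):
        print(f"slow step {index}: {ms} ms in '{desc}'")
```

On the trace in this thread, the only step it flags is the read completion at index 22, which points at the disk (here the AoE device) rather than the network.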

