[Ocfs2-users] kernel panic - not syncing

Srinivas Eeda srinivas.eeda at oracle.com
Mon Jan 22 09:29:39 PST 2007


The problem appears to be that I/O is taking longer than the effective heartbeat timeout. Your configured O2CB_HEARTBEAT_THRESHOLD of 31 does not seem to have taken effect.

Index 6: took 1995 ms to do msleep
Index 17: took 1996 ms to do msleep
Index 22: took 10001 ms to do waiting for read completion.

Can you please cat /proc/fs/ocfs2_nodemanager/hb_dead_threshold and verify which value is actually in effect?
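
For illustration only, here is a rough Python sketch of the arithmetic behind that suspicion. It assumes the commonly documented rule that the fence timeout is (threshold - 1) * 2 seconds, and that an unapplied setting would fall back to a 1.2.x default of 7; both are assumptions, not something read from this system. The helper names are made up for this example.

```python
import re

# Assumption: o2hb heartbeats about every 2 seconds and fences a node after
# (threshold - 1) missed intervals. Not verified against this installation.
HEARTBEAT_INTERVAL_MS = 2000

def fence_timeout_ms(threshold):
    """Approximate fence timeout in milliseconds for a given threshold."""
    return (threshold - 1) * HEARTBEAT_INTERVAL_MS

# Matches the o2hb timing lines quoted in this thread.
LINE_RE = re.compile(r"Index (\d+): took (\d+) ms to do (.+)")

def parse_timings(log_text):
    """Return (index, milliseconds, step) tuples for every timing line."""
    timings = []
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            step = match.group(3).strip().rstrip(".")
            timings.append((int(match.group(1)), int(match.group(2)), step))
    return timings

def slow_steps(timings, limit_ms=HEARTBEAT_INTERVAL_MS):
    """Steps other than the deliberate msleep that exceed one interval."""
    return [t for t in timings if t[2] != "msleep" and t[1] > limit_ms]

sample = """\
Index 6: took 1995 ms to do msleep
Index 17: took 1996 ms to do msleep
Index 22: took 10001 ms to do waiting for read completion
"""

timings = parse_timings(sample)
total = sum(ms for _, ms, _ in timings)

# Compare the elapsed time against the assumed default (7) and the
# configured value (31).
for threshold in (7, 31):
    timeout = fence_timeout_ms(threshold)
    print(threshold, timeout, total > timeout)
```

Under those assumptions, roughly 14 seconds of stalled heartbeat would exceed a 12 second timeout (threshold 7) but stay well inside a 60 second one (threshold 31), which is why the value in /proc is worth checking.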

Thanks,
--Srini.




Consulente3 wrote:
> Hi all, 
>
> my test environment consists of 2 servers running CentOS 4.4.
> The nodes export their storage with aoe6-43 + vblade-14.
>
> kernel-2.6.9-42.0.3.EL
> ocfs2-tools-1.2.2-1
> ocfs2console-1.2.2-1
> ocfs2-2.6.9-42.0.3.EL-1.2.3-1
>
> /dev/etherd/e2.0 on /ocfs2 type ocfs2 (rw,_netdev,heartbeat=local)
> /dev/etherd/e3.0 on /ocfs2_nfs type ocfs2 (rw,_netdev,heartbeat=local)
>
> Device                FS     Nodes
> /dev/etherd/e2.0      ocfs2  ocfs2, becks
> /dev/etherd/e3.0      ocfs2  ocfs2, becks
>
> Device                FS     UUID                                  Label
> /dev/etherd/e2.0      ocfs2  b24cc18d-af89-4980-a75e-a87530b1b878  test1
> /dev/etherd/e3.0      ocfs2  101a92fd-b83b-4294-8bfc-fbaa069c3239  nfs4
>
> O2CB_HEARTBEAT_THRESHOLD=31
>
> When I run a stress test, I get:
>
> Index 4: took 0 ms to do checking slots
> Index 5: took 2 ms to do waiting for write completion
> Index 6: took 1995 ms to do msleep
> Index 7: took 0 ms to do allocating bios for read
> Index 8: took 0 ms to do bio alloc read
> Index 9: took 0 ms to do bio add page read
> Index 10: took 0 ms to do submit_bio for read
> Index 11: took 2 ms to do waiting for read completion
> Index 12: took 0 ms to do bio alloc write
> Index 13: took 0 ms to do bio add page write
> Index 14: took 0 ms to do submit_bio for write
> Index 15: took 0 ms to do checking slots
> Index 16: took 1 ms to do waiting for write completion
> Index 17: took 1996 ms to do msleep
> Index 18: took 0 ms to do allocating bios for read
> Index 19: took 0 ms to do bio alloc read
> Index 20: took 0 ms to do bio add page read
> Index 21: took 0 ms to do submit_bio for read
> Index 22: took 10001 ms to do waiting for read completion
> (3,0):o2hb_stop_all_regions:1908 ERROR: stopping heartbeat on all active
> regions.
> Kernel panic - not syncing: ocfs2 is very sorry to be fencing this
> system by panicing
>
>
> <6>o2net: connection to node ocfs2 (num 2) at 10.1.7.107:777 has been
> idle for 10 seconds, shutting it down
> (3,0): o2net_idle_timer:1309 here are some times that might help debug
> the situation:
> (tmr: 1169487957.71650 now 1169487967.69569 dr 1169487962.88883 adv
> 1169487957.71671:1159487957.71674
> func 83bce37b2:505) 1169487901.984644:1169487901.984676)
>
> The kernel panic always occurs on the same node, and the other node
> keeps responding.
>
> thanks!
>                                                                  
>


