[Ocfs2-users] heartbeat write timeout

Stephan A. Rickauer stephan.rickauer at ini.phys.ethz.ch
Fri Mar 31 02:28:30 CST 2006


Stephan A. Rickauer wrote:
>> When the hb thread panics, it dumps messages indicating
>> the times it took to perform the tasks. Could you share
>> those messages?
> 
> Actually, I have not seen those messages. Give me a couple of minutes
> and I will reproduce the crash to post the numbers here.

OK, this is what I get when reducing the heartbeat threshold to the
default in /etc/sysconfig/o2cb:

---snip---
(3,0):o2hb_write_timeout: 164 ERROR: Heartbeat write timeout to device
sdb1 after 12000 milliseconds
(3,0):o2hb_stop_all_regions: 1727 ERROR: stopping heartbeat on all
active regions
Kernel panic - not syncing: ocfs2 is very sorry to be fencing this
system by panicing

<3>iscsi-sfnet:host1: ping timeout of 5 secs expired, last rx
4296316431, last ping 4296321431, now 4296326431
---snip---
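
For reference, the 12000 milliseconds in the log above is consistent with
the commonly documented relation between O2CB_HEARTBEAT_THRESHOLD and the
heartbeat write timeout, (threshold - 1) * 2 seconds, which gives 12 s for
a threshold of 7. The following is only a rough sketch of that arithmetic.
It assumes a 2-second heartbeat interval, and the function names are made
up for illustration; they are not part of ocfs2-tools.

```python
# Sketch of the threshold <-> timeout arithmetic. Not an ocfs2 API.
HEARTBEAT_INTERVAL_MS = 2000  # assumption: one heartbeat write every 2 seconds


def write_timeout_ms(threshold: int) -> int:
    """Write timeout implied by a given O2CB_HEARTBEAT_THRESHOLD value."""
    if threshold < 2:
        raise ValueError("threshold must be at least 2")
    return (threshold - 1) * HEARTBEAT_INTERVAL_MS


def threshold_for_timeout(timeout_ms: int) -> int:
    """Smallest threshold whose implied timeout is at least timeout_ms."""
    # Ceiling division, then add back the one interval the formula subtracts.
    intervals = -(-timeout_ms // HEARTBEAT_INTERVAL_MS)
    return intervals + 1


if __name__ == "__main__":
    print(write_timeout_ms(7))           # threshold that matches the log above
    print(threshold_for_timeout(60000))  # threshold needed to tolerate 60 s
```

Under these assumptions, tolerating a longer I/O stall on the iSCSI device
would mean raising the threshold in /etc/sysconfig/o2cb on every node.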

I didn't report the iscsi-sfnet message the first time, since I
believed it was a follow-up error of the ocfs2 crash. However, this is all
I have on the screen.
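
If the three numbers in the iscsi-sfnet line are jiffies (an assumption,
not verified against the driver source), the gaps between them can be
converted to seconds. The sketch below also assumes HZ=1000, which is
worth checking for this kernel build; the names are illustrative only.

```python
# Convert the timestamps from the iscsi-sfnet message into elapsed seconds.
HZ = 1000  # assumption: kernel tick rate, not confirmed for this build

# Values copied from the log line above.
last_rx = 4296316431
last_ping = 4296321431
now = 4296326431


def jiffies_to_seconds(delta: int, hz: int = HZ) -> float:
    """Elapsed time in seconds for a difference of `delta` jiffies."""
    return delta / hz


since_ping = jiffies_to_seconds(now - last_ping)
since_rx = jiffies_to_seconds(now - last_rx)

if __name__ == "__main__":
    print(f"seconds since last ping: {since_ping}")
    print(f"seconds since last rx:   {since_rx}")
```

Under those assumptions, nothing had been received from the target for
about 10 seconds when the message was printed, which is worth comparing
against the 12-second heartbeat write timeout before treating the iSCSI
message as purely a follow-up error.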


Apart from that, here is what I get when I mount my ocfs2 fs (before the
crash, of course). It may be irrelevant:

---snip---
[root@lvs02 ~]# mount /dev/sdb1 /mnt/iscsi
(2943,0):ocfs2_initialize_super:1354 max_slots for this device: 4
(2943,0):ocfs2_fill_local_node_info:1031 I am node 0
(2943,0):__dlm_print_nodes:384 Nodes in my domain ("6862E40BCE3F4A0CBB047A5ADF8FA2E6"):
(2943,0):__dlm_print_nodes:388  node 0
(2943,0):ocfs2_find_slot:267 taking node slot 0
ocfs2: Mounting device (8,17) on (node 0, slot 0)
---snip---


And here is proof that the deadline I/O scheduler is in use, plus some
additional info:

---snip---
[root@lvs02 ~]# dmesg | grep sched
Using deadline io scheduler

[root@lvs02 ~]# lspci | grep Broadcom
02:03.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5704 Gigabit Ethernet (rev 10)

[root@lvs02 ~]# uname -a
Linux lvs02.lan.ini.unizh.ch 2.6.9-34.EL #1 Thu Mar 9 06:03:30 GMT 2006 x86_64 x86_64 x86_64 GNU/Linux

[root@lvs02 ~]# rpm -qa | grep ocfs2
ocfs2console-1.2.0-1
ocfs2-2.6.9-34.EL-1.2.0-1
ocfs2-tools-1.2.0-1

[root@lvs02 ~]# cat /proc/cpuinfo | grep name
model name      : AMD Opteron(tm) Processor 254
---snip---


Let me know if you need more, or how I can help.

Thanks!

-- 

 Stephan A. Rickauer

 -----------------------------------------------------------
 Institut für Neuroinformatik          Tel: +41 44 635 30 50
 Universität / ETH Zürich              Sek: +41 44 635 30 52
 Winterthurerstrasse 190               Fax: +41 44 635 30 53
 CH-8057 Zürich                        Web:  www.ini.ethz.ch

 RSA public key: https://www.ini.ethz.ch/~stephan/pubkey.asc
 -----------------------------------------------------------

