[Ocfs2-users] Unusual Segfault (but reboot did not occur and node stayed offline)

David Murphy dmurphy at leadgeniuses.com
Tue Dec 16 09:01:50 PST 2008


My logs on Node Id 3:


Dec 16 06:44:03 web3 syslogd 1.5.0#1ubuntu1: restart.
Dec 16 08:43:31 web3 kernel: [10727560.835261] Modules linked in: vmmemctl
ocfs2 ocfs2_dlmfs ocfs2_dlm ocfs2_nodemanager configfs vmhgfs ext2
dm_round_robin crc32c libcrc32c iscsi_tcp libiscsi scsi_transport_iscsi lp
loop ipv6 parport_pc parport psmouse evdev serio_raw pcspkr i2c_piix4
i2c_core container ac button intel_agp agpgart dm_multipath dm_mod ext3 jbd
mbcache sr_mod cdrom sg sd_mod ata_piix pata_acpi floppy pcnet32 ata_generic
mii mptspi mptscsih mptbase scsi_transport_spi libata scsi_mod thermal
processor fan vmxnet vesafb fbcon tileblit font bitblit softcursor
Dec 16 08:43:31 web3 kernel: [10727560.843108] 
Dec 16 08:43:31 web3 kernel: [10727560.843900] Pid: 4856, comm: o2net Not
tainted (2.6.24-19-virtual #1)
Dec 16 08:43:31 web3 kernel: [10727560.844724] EIP: 0062:[<f8e682bb>]
EFLAGS: 00010202 CPU: 0
Dec 16 08:43:31 web3 kernel: [10727560.845566] EIP is at
__dlm_print_one_lock_resource+0x9db/0x9f0 [ocfs2_dlm]
Dec 16 08:43:31 web3 kernel: [10727560.846385] EAX: 00000001 EBX: 0000001f
ECX: 00000000 EDX: 00000000
Dec 16 08:43:31 web3 kernel: [10727560.849779] ESI: f75e8c00 EDI: 00000000
EBP: ec774700 ESP: df877d34
Dec 16 08:43:31 web3 kernel: [10727560.851900]  DS: 007b ES: 007b FS: 00d8
GS: 0000 SS: 006a
Dec 16 08:43:31 web3 kernel: [10727560.906502] ---[ end trace
989a5ffd1351fea4 ]---
Dec 16 08:44:01 web3 kernel: [10727590.622434] o2net: connection to node
deploy (num 5) at 192.168.102.12:7777 has been idle for 30.0 seconds,
shutting it down.
Dec 16 08:44:01 web3 kernel: [10727590.627319] (4,0):o2net_idle_timer:1414
here are some times that might help debug the situation: (tmr
1229438611.731225 now 1229438641.727360 dr 1229438613.731191 adv
1229438611.731227:1229438611.731228 func (a9b6ebe7:504)
1229438600.868142:1229438600.868149)
Dec 16 08:44:01 web3 kernel: [10727590.629281] o2net: connection to node
app1 (num 6) at 192.168.102.10:7777 has been idle for 30.0 seconds, shutting
it down.
Dec 16 08:44:01 web3 kernel: [10727590.630630] (4,0):o2net_idle_timer:1414
here are some times that might help debug the situation: (tmr
1229438611.731486 now 1229438641.734226 dr 1229438634.811356 adv
1229438611.731488:1229438611.731489 func (a9b6ebe7:502)
1229438610.482837:1229438610.482839)
Dec 16 08:44:01 web3 kernel: [10727590.632818] o2net: connection to node
rgapp1 (num 4) at 192.168.102.11:7777 has been idle for 30.0 seconds,
shutting it down.
Dec 16 08:44:01 web3 kernel: [10727590.634937] (4,0):o2net_idle_timer:1414
here are some times that might help debug the situation: (tmr
1229438611.736146 now 1229438641.737771 dr 1229438613.756472 adv
1229438611.736149:1229438611.736149 func (a9b6ebe7:503)
1229438611.735983:1229438611.735988)
Dec 16 08:44:01 web3 kernel: [10727590.640618] o2net: connection to node
web1 (num 1) at 192.168.102.40:7777 has been idle for 30.0 seconds, shutting
it down.
Dec 16 08:44:01 web3 kernel: [10727590.642402] (4,0):o2net_idle_timer:1414
here are some times that might help debug the situation: (tmr
1229438611.742904 now 1229438641.745604 dr 1229438617.734942 adv
1229438611.742907:1229438611.742907 func (a9b6ebe7:504)
1229438611.675070:1229438611.675075)
Dec 16 08:44:01 web3 kernel: [10727590.651745] o2net: connection to node
web2 (num 2) at 192.168.102.41:7777 has been idle for 30.0 seconds, shutting
it down.
Dec 16 08:44:01 web3 kernel: [10727590.657208] (0,0):o2net_idle_timer:1414
here are some times that might help debug the situation: (tmr
1229438611.756791 now 1229438641.756770 dr 1229438641.756769 adv
1229438611.756768:1229438611.756697 func (a9b6ebe7:507)
1229438611.756792:1229438611.746230)
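
The tmr and now values in each o2net_idle_timer line look like epoch timestamps, so the 30-second idle window can be checked by hand. Below is a minimal Python sketch of that arithmetic, assuming (this is an interpretation, not something the log states) that tmr is when the idle timer was last armed and now is when it fired. The SAMPLE string is the node 5 message from the log above with the mail wrapping removed; idle_seconds is a hypothetical helper name, not part of any OCFS2 tool.

```python
import re
from decimal import Decimal

# Copied from the first o2net_idle_timer message above (node 5),
# with the mail client's line wrapping removed.
SAMPLE = (
    "(tmr 1229438611.731225 now 1229438641.727360 dr 1229438613.731191 adv "
    "1229438611.731227:1229438611.731228 func (a9b6ebe7:504) "
    "1229438600.868142:1229438600.868149)"
)


def idle_seconds(message):
    """Return now - tmr for one idle-timer message (hypothetical helper).

    Assumption: tmr is when the idle timer was armed and now is when it
    fired, so the difference is the idle time that o2net reports.
    """
    match = re.search(r"tmr (\d+\.\d+) now (\d+\.\d+)", message)
    if match is None:
        raise ValueError("no tmr/now pair found in message")
    # Decimal keeps the microsecond digits exact; floats would also be
    # close enough for a sanity check.
    tmr, now = (Decimal(group) for group in match.groups())
    return now - tmr


print(idle_seconds(SAMPLE))
```

For the node 5 line the difference comes out just under 30 seconds, which matches the "idle for 30.0 seconds" text in the message before it.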



The other nodes ended up locking up while waiting for the death notification
of node 3.
Can anyone tell me what the kernel message above means and what I can do to
keep this from occurring again?


Thanks
David



