[Ocfs2-users] Kernel oops: ocfs2_read_blocks

Sunil Mushran sunil.mushran at oracle.com
Fri Jun 24 10:26:05 PDT 2011


How many nodes?
Does it happen on all the nodes or one in particular?
Are you running the same kernel version on all nodes?
Did this issue start reproducing after some update?
How often does it happen?

It may be best to file a bug at oss.oracle.com/bugzilla and answer
these questions there. This could be specific to Debian squeeze.

Also, attach the objdump output generated as follows:
# objdump -DSl /lib/modules/`uname -r`/kernel/fs/ocfs2/ocfs2.ko >/tmp/ocfs2.out

Ensure it is the same binary that generated the stack below.
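
Once /tmp/ocfs2.out exists, the faulting instruction can be located from
the "ocfs2_read_blocks+0x2e2/0x5ba" string in the oops. The following is a
minimal sketch, not an official tool. It assumes the usual objdump layout
(symbol headers of the form "<hex address> <name>:" and instruction lines
of the form "<hex address>:<tab>bytes"); the function names and the
DUMP_PATH constant are made up for illustration.

```python
import os
import re

# Assumed objdump layout: "00012c7e <ocfs2_read_blocks>:" for symbol headers
# and "   12f60:\t8b 06 ..." for instructions. Interleaved source lines from
# -S are skipped unless they happen to look like an instruction line.
SYMBOL_RE = re.compile(r"^([0-9a-f]+) <([^>]+)>:\s*$")
INSN_RE = re.compile(r"^\s*([0-9a-f]+):\s")

DUMP_PATH = "/tmp/ocfs2.out"  # path used in the objdump command above


def parse_location(location):
    """Split 'ocfs2_read_blocks+0x2e2/0x5ba' into (symbol, offset, size)."""
    symbol, rest = location.split("+", 1)
    offset, size = rest.split("/", 1)
    return symbol, int(offset, 16), int(size, 16)


def fault_address(lines, symbol, offset):
    """Return symbol start + offset, or None if the symbol header is absent."""
    for line in lines:
        m = SYMBOL_RE.match(line)
        if m and m.group(2) == symbol:
            return int(m.group(1), 16) + offset
    return None


def lines_around(lines, address, before=10):
    """Return the line at `address` preceded by up to `before` dump lines."""
    for i, line in enumerate(lines):
        m = INSN_RE.match(line)
        if m and int(m.group(1), 16) == address:
            return lines[max(0, i - before): i + 1]
    return []


if __name__ == "__main__":
    if os.path.exists(DUMP_PATH):
        with open(DUMP_PATH, errors="replace") as f:
            dump = f.read().splitlines()
        sym, off, _size = parse_location("ocfs2_read_blocks+0x2e2/0x5ba")
        addr = fault_address(dump, sym, off)
        if addr is None:
            print("symbol not found in", DUMP_PATH)
        else:
            print("\n".join(lines_around(dump, addr)))
    else:
        print(DUMP_PATH, "not found; run the objdump command first")
```

If the binary matches, the instruction printed last should start with the
bytes 8b 06, which is the byte marked <8b> in the Code: line of the oops.
On 32-bit x86 that encodes a load through ESI, which fits ESI and CR2 both
being 00000002.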

Also, copy and paste the following instead of the trace you posted. (I
have removed the unnecessary bits to make it more readable.)

===================================================================
BUG: unable to handle kernel NULL pointer dereference at 00000002
IP: [<fb4fdf60>] ocfs2_read_blocks+0x2e2/0x5ba [ocfs2]
*pdpt = 0000000001446001 *pde = 0000000000000000
Oops: 0000 [#1] SMP
last sysfs file: /sys/devices/pci0000:00/0000:00:07.0/0000:0d:00.0/host3/rport-3:0-7/target3:0:0/3:0:0:10/state
Modules linked in: ipt_REJECT xt_tcpudp iptable_filter ip_tables ocfs2 quota_tree
ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs
arpt_mangle arptable_filter arp_tables x_tables bonding dm_round_robin dm_multipath
scsi_dh loop radeon ttm snd_pcm drm_kms_helper snd_timer snd drm soundcore
i2c_algo_bit ipmi_si ses i2c_core ipmi_msghandler snd_page_alloc sd_mod psmouse hpwdt
pcspkr enclosure hpilo crc_t10dif serio_raw processor container power_meter button
evdev ext4 mbcache jbd2 crc16 dm_mod sg usbhid sr_mod hid cdrom ata_generic cciss
uhci_hcd thermal qla2xxx scsi_transport_fc ata_piix scsi_tgt ehci_hcd libata usbcore
nls_base thermal_sys bnx2 scsi_mod [last unloaded: scsi_wait_scan]

Pid: 32337, comm: ocfs2rec Not tainted (2.6.32-5-686-bigmem #1) ProLiant DL380 G6
EIP: 0060:[<fb4fdf60>] EFLAGS: 00010202 CPU: 10
EIP is at ocfs2_read_blocks+0x2e2/0x5ba [ocfs2]
EAX: f5647d48 EBX: f3ec31dc ECX: ffffffff EDX: 00000001
ESI: 00000002 EDI: 00000000 EBP: 00000001 ESP: f40f1e6c
  DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process ocfs2rec (pid: 32337, ti=f40f0000 task=f658aec0 task.ti=f40f0000)
Stack:
00000001 00000000 ffffffff ffffffff f5dc8dd8 f5673000 00000001 f3ec31dc
f6b34d20 00000000 f3ec3000 fb53937f 00000001 f5647d48 00000001 00000000
f3ec31dc 00000001 f3ec3000 00000000 fb50eeab f383be64 f3ec3000 00000000
Call Trace:
[<fb53937f>] ? ocfs2_refresh_slot_info+0x80/0xad [ocfs2]
[<fb50eeab>] ? ocfs2_super_lock+0x1b4/0x27c [ocfs2]
[<fb51e18f>] ? __ocfs2_recovery_thread+0xc4/0x146d [ocfs2]
[<c1006f50>] ? __switch_to+0xcf/0x141
[<c1031658>] ? finish_task_switch+0x34/0x95
[<c127e933>] ? schedule+0x7a4/0x7f1
[<c1026881>] ? __wake_up_common+0x34/0x59
[<fb51e0cb>] ? __ocfs2_recovery_thread+0x0/0x146d [ocfs2]
[<c104a34c>] ? kthread+0x61/0x66
[<c104a2eb>] ? kthread+0x0/0x66
[<c1008d87>] ? kernel_thread_helper+0x7/0x10
Code: 01
00 00 68 bd 0e 56 fb e8 d9 00 d8 c5 c7 44 24 3c 01 00 00 00 83 c4 24 eb
11 8b 14 24 89 54 24 18 eb 08 c7 44 24 18 01 00 00 00 <8b> 06 a9 00 00
01 00 74 50 83 7c 24 18 00 0f 84 5b 01 00 00 f6
EIP: [<fb4fdf60>] ocfs2_read_blocks+0x2e2/0x5ba [ocfs2] SS:ESP 0068:f40f1e6c
CR2: 0000000000000002
---[ end trace c5a96dd4578cc061 ]---
===================================================================
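
The trimming applied to the trace above (dropping the syslog timestamp,
host, and printk timestamp from every line) can be sketched as follows.
This is only an illustration: the regular expression is an assumption
based on the prefix format in the quoted log below, the function names
are made up, and lines that a mail client has wrapped must be rejoined
before running it.

```python
import re

# Assumed prefix shape, taken from the quoted log:
#   "Jun 23 02:27:45 <host> kernel: : [638743.225294] "
# optionally followed by a stray printk level marker such as "<0>".
PREFIX_RE = re.compile(
    r"^[A-Z][a-z]{2}\s+\d{1,2}\s\d{2}:\d{2}:\d{2}\s"  # syslog timestamp
    r".*?kernel:[\s:]*"                               # host and "kernel:" tag
    r"\[\s*\d+\.\d+\]\s*"                             # printk timestamp
    r"(?:<\d>\s*)?"                                   # optional level marker
)


def strip_prefix(line):
    """Remove the syslog/printk prefix from one line, if it has one."""
    return PREFIX_RE.sub("", line, count=1)


def clean_oops(text):
    """Strip prefixes from every line and drop lines left empty."""
    cleaned = (strip_prefix(line).rstrip() for line in text.splitlines())
    return "\n".join(line for line in cleaned if line)
```

Lines without a recognizable prefix pass through unchanged, so running
the function on an already cleaned trace is harmless.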


On 06/24/2011 08:48 AM, Stefan Upietz wrote:
> Hello there,
>
> we're experiencing strange behaviour with our ocfs2-enabled systems
> after one node goes down. Right now it is not possible to recreate
> this situation, for it is (alas!) on a critical system...
> This happened on a HP ProLiant DL 360 with an ocfs2 volume on a SAN.
> We're running Debian Squeeze with o2cb_ctl version 1.4.4.
> This is my first post on an oops, so if I missed any information I'd be
> glad if you gave me some hints. Here's the trace:
>
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.225294] BUG:
> unable to handle kernel NULL pointer dereference at 00000002
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.225346] IP:
> [<fb4fdf60>] ocfs2_read_blocks+0x2e2/0x5ba [ocfs2]
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.225393] *pdpt =
> 0000000001446001 *pde = 0000000000000000
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.225426] Oops: 0000
> [#1] SMP
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.225453] last sysfs
> file:
> /sys/devices/pci0000:00/0000:00:07.0/0000:0d:00.0/host3/rport-3:0-7/target3:0:0/3:0:0:10/state
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.225506] Modules
> linked in:
>    ipt_REJECT xt_tcpudp iptable_filter ip_tables ocfs2 quota_tree
> ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue
> configfs arpt_mangle arptable_filter arp_tables x_tables bonding
> dm_round_robin dm_multipath scsi_dh loop radeon ttm snd_pcm
> drm_kms_helper snd_timer snd drm soundcore i2c_algo_bit ipmi_si ses
> i2c_core ipmi_msghandler snd_page_alloc sd_mod psmouse hpwdt pcspkr
> enclosure hpilo crc_t10dif serio_raw processor container power_meter
> button evdev ext4 mbcache jbd2 crc16 dm_mod sg usbhid sr_mod hid cdrom
> ata_generic cciss uhci_hcd thermal qla2xxx scsi_transport_fc ata_piix
> scsi_tgt ehci_hcd libata usbcore nls_base thermal_sys bnx2 scsi_mod
> [last unloaded: scsi_wait_scan]
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.225952]
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.225974] Pid:
> 32337, comm: ocfs2rec Not tainted (2.6.32-5-686-bigmem #1) ProLiant DL380 G6
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.226022] EIP:
> 0060:[<fb4fdf60>] EFLAGS: 00010202 CPU: 10
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.226059] EIP is at
> ocfs2_read_blocks+0x2e2/0x5ba [ocfs2]
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.226087] EAX:
> f5647d48 EBX: f3ec31dc ECX: ffffffff EDX: 00000001
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.226116] ESI:
> 00000002 EDI: 00000000 EBP: 00000001 ESP: f40f1e6c
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.226145]  DS: 007b
> ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.226174] Process
> ocfs2rec (pid: 32337, ti=f40f0000 task=f658aec0 task.ti=f40f0000)
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.226219] Stack:
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.226240]  00000001
> 00000000 ffffffff ffffffff f5dc8dd8 f5673000 00000001 f3ec31dc
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.226286]<0>
> f6b34d20 00000000 f3ec3000 fb53937f 00000001 f5647d48 00000001 00000000
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.226350]<0>
> f3ec31dc 00000001 f3ec3000 00000000 fb50eeab f383be64 f3ec3000 00000000
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.226431] Call Trace:
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.226467]
> [<fb53937f>] ? ocfs2_refresh_slot_info+0x80/0xad [ocfs2]
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.226508]
> [<fb50eeab>] ? ocfs2_super_lock+0x1b4/0x27c [ocfs2]
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.226548]
> [<fb51e18f>] ? __ocfs2_recovery_thread+0xc4/0x146d [ocfs2]
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.226581]
> [<c1006f50>] ? __switch_to+0xcf/0x141
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.226610]
> [<c1031658>] ? finish_task_switch+0x34/0x95
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.226642]
> [<c127e933>] ? schedule+0x7a4/0x7f1
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.226670]
> [<c1026881>] ? __wake_up_common+0x34/0x59
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.226708]
> [<fb51e0cb>] ? __ocfs2_recovery_thread+0x0/0x146d [ocfs2]
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.226741]
> [<c104a34c>] ? kthread+0x61/0x66
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.226767]
> [<c104a2eb>] ? kthread+0x0/0x66
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.226795]
> [<c1008d87>] ? kernel_thread_helper+0x7/0x10
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.226822] Code: 01
> 00 00 68 bd 0e 56 fb e8 d9 00 d8 c5 c7 44 24 3c 01 00 00 00 83 c4 24 eb
> 11 8b 14 24 89 54 24 18 eb 08 c7 44 24 18 01 00 00 00 <8b> 06 a9 00 00
> 01 00 74 50 83 7c 24 18 00 0f 84 5b 01 00 00 f6
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.227056] EIP:
> [<fb4fdf60>] ocfs2_read_blocks+0x2e2/0x5ba [ocfs2] SS:ESP 0068:f40f1e6c
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.227117] CR2:
> 0000000000000002
> Jun 23 02:27:45 s_local at dante/dante kernel: : [638743.227387] ---[ end
> trace c5a96dd4578cc061 ]---