[Ocfs2-users] ocfs2 kernel BUG

Tao Ma tao.ma at oracle.com
Fri Aug 1 01:57:27 PDT 2008


Hi,
	Please tell us exactly which ocfs2 version you are running; that will 
help with the diagnosis.

Peter Selzner wrote:
> Hi,
> 
> we had these entries in /var/log/messages a few days ago:
> 
> Jul 28 23:30:47 xxx kernel: (12268,2):ocfs2_extend_file:790 ERROR: bug expression: i_size_read(inode) != (le64_to_cpu(fe->i_size) - *bytes_extended)
> Jul 28 23:30:47 xxx kernel: (12268,2):ocfs2_extend_file:790 ERROR: Inode 8323098 i_size = 1572864, dinode i_size = 1568768, bytes_extended = 0, new_i_size = 1576960 
> Jul 28 23:30:47 xxx kernel: klogd 1.4.1, ---------- state change ---------- 
> Jul 28 23:30:47 xxx kernel: ------------[ cut here ]------------
> Jul 28 23:30:47 xxx kernel: kernel BUG at fs/ocfs2/file.c:790!
> Jul 28 23:30:47 xxx kernel: invalid opcode: 0000 [#1]
> Jul 28 23:30:47 xxx kernel: SMP 
> Jul 28 23:30:47 xxx kernel: last sysfs file: /class/infiniband/mthca1/board_id
> Jul 28 23:30:47 xxx kernel: Modules linked in: ocfs2 ocfs2_dlmfs ocfs2_dlm ocfs2_nodemanager configfs cpqci mptctl mptbase ipmi_si ipmi_devintf ipmi_msghandler rdma_ucm rds ib_ucm ib_sdp rdma_cm iw_cm
> ib_addr ib_local_sa ib_ipoib ib_cm ib_sa ipv6 ib_uverbs ib_umad bonding ib_mthca ib_mad ib_core button battery ac raw loop dm_round_robin dm_multipath dm_mod usbhid hw_random ide_cd uhci_hcd e1000
> cdrom ehci_hcd bnx2 usbcore ext3 jbd ata_piix ahci libata edd fan thermal processor cciss sg qla2400 qla2300 qla2xxx firmware_class qla2xxx_conf intermodule piix sd_mod scsi_mod ide_disk ide_core
> Jul 28 23:30:47 xxx kernel: CPU:    2   
> Jul 28 23:30:47 xxx kernel: EIP:    0060:[<f9de8173>]    Tainted: P     U VLI 
> Jul 28 23:30:47 xxx kernel: EFLAGS: 00210292   (2.6.16.46-0.12-bigsmp #1) 
> Jul 28 23:30:47 xxx kernel: EIP is at ocfs2_extend_file+0x3cd/0xf9b [ocfs2]
> Jul 28 23:30:47 xxx kernel: eax: 0000008c   ebx: 00000000   ecx: ffffff00   edx: 00200286
> Jul 28 23:30:47 xxx kernel: esi: 00000000   edi: 00000000   ebp: df05f000   esp: e398de70
> Jul 28 23:30:47 xxx kernel: ds: 007b   es: 007b   ss: 0068
> Jul 28 23:30:47 xxx kernel: Process mv (pid: 12268, threadinfo=e398c000 task=f7f80660)
> Jul 28 23:30:47 xxx kernel: Stack: <0>00000000 dd4f9d88 ce48c000 00000000 00000000 00000001 cf253280 dd4f9b80 
> Jul 28 23:30:47 xxx kernel:        dd4f9ee4 0017f000 00000000 00000000 f9ddf432 e398dea8 dd4f9b80 00000000 
> Jul 28 23:30:47 xxx kernel:        00000001 e398deb4 e398deb4 ce48c000 00000000 00000000 ece0bc00 00000000 
> Jul 28 23:30:47 xxx kernel: Call Trace:
> Jul 28 23:30:47 xxx kernel:  [<f9ddf432>] ocfs2_status_completion_cb+0x0/0xa [ocfs2]
> Jul 28 23:30:47 xxx kernel:  [<f9df72f2>] ocfs2_write_lock_maybe_extend+0xb2f/0xde3 [ocfs2]
> Jul 28 23:30:47 xxx kernel:  [<f9dea85d>] ocfs2_file_write+0x125/0x24d [ocfs2]
> Jul 28 23:30:47 xxx kernel:  [<f9dea738>] ocfs2_file_write+0x0/0x24d [ocfs2]
> Jul 28 23:30:47 xxx kernel:  [<c0164714>] vfs_write+0xaa/0x152
> Jul 28 23:30:47 xxx kernel:  [<c0164d1f>] sys_write+0x3c/0x63
> Jul 28 23:30:47 xxx kernel:  [<c0103cab>] sysenter_past_esp+0x54/0x79
> Jul 28 23:30:47 xxx kernel: Code: 8b 4c 24 3c ff 71 04 ff 31 68 16 03 00 00 68 2b b5 e0 f9 ff 70 10 8b 00 ff b0 c0 00 00 00 68 b1 fd e0 f9 e8 ca a8 33 c6 83 c4 3c <0f> 0b 16 03 db fb e0 f9 8b 5c 24 20
> 8b 03 0f ae e8 89 f6 8b 74 
> 
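For reference, the failed check can be replayed with the numbers from the 
log. This is only a sketch of the arithmetic, not the kernel code: the 
variable names are made up for illustration, and reading the "Inode" value 
as the in-memory size and the "dinode" value as the on-disk size is my 
assumption from the field names in the message.

```python
# Values copied from the "ocfs2_extend_file:790 ERROR" lines quoted above.
# Variable names are illustrative only; they are not the kernel's.
inode_i_size = 1572864     # "i_size" in the log, i.e. i_size_read(inode)
dinode_i_size = 1568768    # "dinode i_size", i.e. le64_to_cpu(fe->i_size)
bytes_extended = 0         # "bytes_extended"
new_i_size = 1576960       # "new_i_size"

# The logged bug expression; the BUG is hit when it evaluates to true.
bug_fires = inode_i_size != (dinode_i_size - bytes_extended)

print(bug_fires)                       # True
print(inode_i_size - dinode_i_size)    # 4096
print(new_i_size - inode_i_size)       # 4096
```

So the two sizes disagree by 4096 bytes at the moment of the write, and the 
requested new size is another 4096 bytes beyond the larger of the two.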
> It was impossible to do "ls -al" in a certain directory: each process that
> "touched" files in this directory ended up in DEAD state (uninterruptible sleep).
> Any suggestions? Thanks.
How did this happen? Could you please describe it in more detail, e.g. 
how many nodes are in your cluster? You see the hang on one node; what 
about the other nodes, and what were they doing at the time?

Regards,
Tao


