[Ocfs2-users] kernel BUG at /usr/src/ocfs2-1.2.1/fs/ocfs2/file.c:494!

Jan Kirchhoff kirchy at gmx.de
Mon Jun 12 08:03:08 CDT 2006


Hi,

First of all, I'm new to ocfs2 and drbd.
I set up two identical servers (Athlon64, 1GB RAM, Gigabit Ethernet) with Debian Etch, compiled my own kernel (2.6.16.20),
then compiled the drbd modules and ocfs2 (modules and tools) from source.
Getting everything up and running was very easy.

I have one big 140GB partition that is synced with drbd (in c-mode) and has an ocfs2 filesystem on it. The servers will be webservers, so the data is the whole document root (mostly PDFs for download) and CGIs.

I rsynced 31GB of data from another server onto this partition last week, did some simple testing, and everything looked good. Today, though, I typed the URL of one of the servers into my browser and got nothing back except an Apache error after the CGI script hit a 3-minute timeout. The same happened on the second system :(
There has been no traffic or load on the servers other than my testing through the browser.

dmesg shows me the following (same error on both systems):

XFS mounting filesystem sda6
Ending clean XFS mount for filesystem: sda6
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
IPv6 over IPv4 tunneling driver
OCFS2 Node Manager 1.2.1 Tue Jun  6 13:24:21 CEST 2006 (build d647396d7a65bfeeaad84fa736d4dd1c)
OCFS2 DLM 1.2.1 Tue Jun  6 13:24:21 CEST 2006 (build 70adfba8f7c9ce44dac2d47ec99bb7d2)
OCFS2 DLMFS 1.2.1 Tue Jun  6 13:24:21 CEST 2006 (build 70adfba8f7c9ce44dac2d47ec99bb7d2)
OCFS2 User DLM kernel interface loaded
eth0: no IPv6 routers present
drbd0: disk( Diskless -> Attaching ) 
drbd0: drbd_bm_resize called with capacity == 351555584
drbd0: resync bitmap: bits=43944448 words=1373264
drbd0: size = 167 GB (175777792 KB)
drbd0: reading of bitmap took 152 jiffies
drbd0: recounting of set bits took additional 8 jiffies
drbd0: 0 KB marked out-of-sync by on disk bit-map.
drbd0: Found 6 transactions (276 active extents) in activity log.
drbd0: disk( Attaching -> Consistent ) 
drbd0: Writing meta data super block now.
drbd1: disk( Diskless -> Attaching ) 
drbd1: drbd_bm_resize called with capacity == 9767080
drbd1: resync bitmap: bits=1220885 words=38154
drbd1: size = 4769 MB (4883540 KB)
drbd1: reading of bitmap took 8 jiffies
drbd1: recounting of set bits took additional 0 jiffies
drbd1: 0 KB marked out-of-sync by on disk bit-map.
drbd1: Found 6 transactions (103 active extents) in activity log.
drbd1: disk( Attaching -> Consistent ) 
drbd1: Writing meta data super block now.
drbd0: conn( StandAlone -> Unconnected ) 
drbd0: conn( Unconnected -> WFConnection ) 
drbd1: conn( StandAlone -> Unconnected ) 
drbd1: conn( Unconnected -> WFConnection ) 
drbd0: conn( WFConnection -> WFReportParams ) 
drbd0: Handshake successful: DRBD Network Protocol version 80
drbd0: Peer authenticated usind 20 bytes of 'sha1' HMAC
drbd0: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapT ) disk( Consistent -> Outdated ) pdsk( DUnknown -> UpToDate ) 
drbd0: Writing meta data super block now.
drbd1: conn( WFConnection -> WFReportParams ) 
drbd1: Handshake successful: DRBD Network Protocol version 80
drbd1: Peer authenticated usind 20 bytes of 'sha1' HMAC
drbd1: peer( Unknown -> Secondary ) conn( WFReportParams -> WFBitMapT ) disk( Consistent -> Outdated ) pdsk( DUnknown -> UpToDate ) 
drbd1: Writing meta data super block now.
drbd1: conn( WFBitMapT -> WFSyncUUID ) 
drbd1: conn( WFSyncUUID -> SyncTarget ) disk( Outdated -> Inconsistent ) 
drbd1: Began resync as SyncTarget (will sync 32 KB [8 bits set]).
drbd1: Writing meta data super block now.
drbd0: conn( WFBitMapT -> WFSyncUUID ) 
drbd0: conn( WFSyncUUID -> SyncTarget ) disk( Outdated -> Inconsistent ) 
drbd0: Began resync as SyncTarget (will sync 32 KB [8 bits set]).
drbd0: Writing meta data super block now.
drbd1: Resync done (total 1 sec; paused 0 sec; 32 K/sec)
drbd1: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate ) 
drbd1: Writing meta data super block now.
drbd0: Resync done (total 1 sec; paused 0 sec; 32 K/sec)
drbd0: conn( SyncTarget -> Connected ) disk( Inconsistent -> UpToDate ) 
drbd0: Writing meta data super block now.
drbd0: peer( Secondary -> Primary ) 
drbd1: peer( Secondary -> Primary ) 
drbd0: role( Secondary -> Primary ) 
drbd0: Writing meta data super block now.
drbd1: role( Secondary -> Primary ) 
drbd1: Writing meta data super block now.
o2net: accepted connection from node portal2 (num 1) at 192.168.0.82:7777
OCFS2 1.2.1 Tue Jun  6 13:24:15 CEST 2006 (build bd2f25ba0af9677db3572e3ccd92f739)
ocfs2_dlm: Nodes in domain ("8B6DD64326394C308A4E2B2259162C78"): 0 1 
kjournald starting.  Commit interval 5 seconds
ocfs2: Mounting device (147,0) on (node 0, slot 1)
(16918,0):ocfs2_truncate_file:494 ERROR: bug expression: le64_to_cpu(fe->i_size) != i_size_read(inode)
(16918,0):ocfs2_truncate_file:494 ERROR: Inode 42363033, inode i_size = 1129 != di i_size = 1120, i_flags = 0x1
------------[ cut here ]------------
kernel BUG at /usr/src/ocfs2-1.2.1/fs/ocfs2/file.c:494!
invalid opcode: 0000 [#1]
SMP 
Modules linked in: ocfs2 sha1 ocfs2_dlmfs ocfs2_dlm ocfs2_nodemanager configfs ipv6 ext3 jbd dm_mod drbd sr_mod sbp2 ide_generic ide_disk ide_cd cdrom eth1394 mousedev tsdev psmouse ehci_hcd ohci_hcd amd74xx generic parport_pc parport evdev serio_raw usbcore ohci1394 ieee1394 nvnet ide_core rtc floppy pcspkr snd_hda_intel snd_hda_codec snd_pcm snd_timer snd soundcore snd_page_alloc
CPU:    0
EIP:    0060:[<f95374f6>]    Tainted: P      VLI
EFLAGS: 00210286   (2.6.16.20ll-wbsrv #1) 
EIP is at ocfs2_setattr+0x6a8/0x12cf [ocfs2]
eax: 00000073   ebx: 00000000   ecx: ffffffff   edx: ffffff23
esi: 00000469   edi: 00000000   ebp: d3875000   esp: e1c31eb8
ds: 007b   es: 007b   ss: 0068
Process ix.cgi (pid: 16918, threadinfo=e1c30000 task=f26d5a90)
Stack: <0>00000000 00000000 00000000 c0cc5e08 f491e800 c0cc5f7c 00000460 00000000 
       00000460 00000000 00000000 00000000 00000000 d75695e8 d75695e8 00000000 
       c0cc5e08 00002008 e1c31f38 c0162405 e904584c e1c31f38 01222222 448d600f 
Call Trace:
 [<c0162405>] notify_change+0x13f/0x2da
 [<c014a7c9>] do_truncate+0x59/0x72
 [<c014a902>] do_sys_ftruncate+0x120/0x13f
 [<c014a947>] sys_ftruncate64+0x13/0x15
 [<c010278d>] syscall_call+0x7/0xb
Code: fd ff ff ff b1 f8 fd ff ff 68 ee 01 00 00 68 37 90 55 f9 ff 70 10 8b 00 ff b0 9c 00 00 00 68 36 dc 55 f9 e8 e6 10 be c6 83 c4 30 <0f> 0b ee 01 ed d4 55 f9 8b 4d 24 39 4c 24 08 8b 55 20 0f 82 c3 
 BUG: ix.cgi/16918, lock held at task exit time!
 [c0cc5e7c] {inode_init_once}
.. held by:            ix.cgi:16918 [f26d5a90, 116]
... acquired at:               do_truncate+0x50/0x72
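
To make the mismatch in the two ocfs2 ERROR lines easier to read, here is a minimal Python sketch that pulls the inode number and both sizes out of the message. The regular expression and the function name `parse_size_mismatch` are hypothetical, written only for this one message format, and they are not part of any ocfs2 tool. Reading the second number in `(16918,0)` as the CPU is an assumption based on the `CPU:    0` line in the oops.

```python
import re

# The two ERROR lines, copied verbatim from the dmesg output above.
LOG = """\
(16918,0):ocfs2_truncate_file:494 ERROR: bug expression: le64_to_cpu(fe->i_size) != i_size_read(inode)
(16918,0):ocfs2_truncate_file:494 ERROR: Inode 42363033, inode i_size = 1129 != di i_size = 1120, i_flags = 0x1
"""

# Hypothetical pattern for this message only. The second number in the
# "(pid,N)" prefix is assumed to be the CPU.
PATTERN = re.compile(
    r"\((?P<pid>\d+),(?P<cpu>\d+)\):(?P<func>\w+):(?P<line>\d+) ERROR: "
    r"Inode (?P<inode>\d+), inode i_size = (?P<mem>\d+) != "
    r"di i_size = (?P<disk>\d+), i_flags = (?P<flags>0x[0-9a-fA-F]+)"
)


def parse_size_mismatch(text):
    """Return the fields of the first size-mismatch line, or None."""
    m = PATTERN.search(text)
    if m is None:
        return None
    return {
        "pid": int(m.group("pid")),
        "func": m.group("func"),
        "line": int(m.group("line")),
        "inode": int(m.group("inode")),
        "mem_size": int(m.group("mem")),    # "inode i_size" in the message
        "disk_size": int(m.group("disk")),  # "di i_size" in the message
        "flags": int(m.group("flags"), 16),
    }


if __name__ == "__main__":
    info = parse_size_mismatch(LOG)
    diff = info["mem_size"] - info["disk_size"]
    print("inode", info["inode"], "sizes differ by", diff, "bytes")
    # Print both sizes in hex for comparison with the register and stack dump.
    print(hex(info["mem_size"]), hex(info["disk_size"]))
```

In hex the two sizes are 0x469 and 0x460, the same values that show up as `esi: 00000469` and as `00000460` in the stack dump. That is only a numeric match; which variables those slots hold cannot be read off the trace alone.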


portal1:~# uname -a
Linux portal1 2.6.16.20ll-wbsrv #1 SMP Tue Jun 6 12:33:55 CEST 2006 i686 GNU/Linux
portal1:~# cat /proc/cpuinfo 
processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 15
model           : 47
model name      : AMD Athlon(tm) 64 Processor 3500+
stepping        : 2
cpu MHz         : 2210.343
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 1
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 syscall nx mmxext fxsr_opt lm 3dnowext 3dnow pni lahf_lm ts fid vid ttp tm stc
bogomips        : 4429.36


Can anybody help me with that?

thanks 
Jan


