[Ocfs2-users] remove locks? or copy the whole file?

Joel Becker jlbec at evilplan.org
Tue Jul 3 23:17:43 PDT 2012


On Tue, Jul 03, 2012 at 06:57:53PM -0700, Aleks Clark wrote:
> well, by 'clean', it said it was clean. the locks persisted though. I
> seriously can't believe there's no way to force lock removal. is it
> just a file somewhere I can delete?

No lock survives a full cluster restart, so this looks like on-disk
corruption.  Did fsck.ocfs2 say that it ran multiple passes, or did it
just say "clean" and exit?  Please try fsck.ocfs2 with the '-f' flag
(obviously with the filesystem not mounted on ANY node).
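
If it helps, here is a minimal sketch of that forced check as a small
Python wrapper.  The helper names and the device path are hypothetical,
not part of ocfs2-tools; the only real command it relies on is
"fsck.ocfs2 -f <device>".  It can only see mounts on the local node,
so you still have to confirm by hand that every other node has the
volume unmounted.

```python
import os
import subprocess


def is_mounted_locally(device, mounts_text):
    """Return True if `device` is a mount source in /proc/mounts-style text."""
    target = os.path.realpath(device)
    for line in mounts_text.splitlines():
        fields = line.split()
        # The first field of each /proc/mounts line is the mount source.
        if fields and os.path.realpath(fields[0]) == target:
            return True
    return False


def build_fsck_command(device):
    """Build the forced-check command; '-f' checks even if marked clean."""
    return ["fsck.ocfs2", "-f", device]


def run_forced_check(device):
    """Refuse to run if the device is mounted here, otherwise run fsck."""
    with open("/proc/mounts") as handle:
        if is_mounted_locally(device, handle.read()):
            raise RuntimeError(device + " is still mounted on this node")
    # This does NOT check the other cluster nodes; verify those yourself
    # (for example with mounted.ocfs2 from ocfs2-tools) before running.
    return subprocess.call(build_fsck_command(device))
```

You would call something like run_forced_check("/dev/your_ocfs2_device")
as root on one node, with your real device substituted for that
placeholder path.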

Joel

> 
> 
> On Tue, Jul 3, 2012 at 6:56 PM, Aleks Clark <aleks.clark at gmail.com> wrote:
> > yep, tried that, returned clean.
> >
> > On Tue, Jul 3, 2012 at 6:25 PM, herbert van.den.bergh
> > <herbert.van.den.bergh at oracle.com> wrote:
> >>
> >> One more thing: did you try running fsck.ocfs2 on it?
> >>
> >> Thanks,
> >> Herbert.
> >>
> >>
> >> On 7/3/2012 6:23 PM, herbert van.den.bergh wrote:
> >>>
> >>> Hmm doesn't mean much to me, but maybe to someone else on the list.  But
> >>> I bet their first suggestion will be to try a recent kernel...
> >>>
> >>> Thanks,
> >>> Herbert.
> >>>
> >>> On 7/3/2012 6:19 PM, Aleks Clark wrote:
> >>>>
> >>>> Nick, I don't think so, it's a 2tb partition with only 300gb used.
> >>>>
> >>>> Herb,
> >>>>
> >>>>
> >>>> Jul  3 14:47:26 castor kernel: [3488036.578659]
> >>>> (25326,0):ocfs2_rotate_tree_right:2483 ERROR: bug expression:
> >>>> path_leaf_bh(left_path) == path_leaf_bh(right_path)
> >>>> Jul  3 14:47:26 castor kernel: [3488036.578714]
> >>>> (25326,0):ocfs2_rotate_tree_right:2483 ERROR: Owner 18319883: error
> >>>> during insert of 15761664 (left path cpos 20725762) results in two
> >>>> identical paths ending at 395267
> >>>> Jul  3 14:47:26 castor kernel: [3488036.578800] ------------[ cut here
> >>>> ]------------
> >>>> Jul  3 14:47:26 castor kernel: [3488036.578826] kernel BUG at
> >>>>
> >>>> /build/buildd-linux-2.6_2.6.32-38-amd64-bk66e4/linux-2.6-2.6.32/debian/build/source_amd64_none/fs/ocfs2/alloc.c:2483!
> >>>> Jul  3 14:47:26 castor kernel: [3488036.578881] invalid opcode: 0000 [#1]
> >>>> SMP
> >>>> Jul  3 14:47:26 castor kernel: [3488036.578909] last sysfs file:
> >>>> /sys/devices/virtual/net/lo/operstate
> >>>> Jul  3 14:47:26 castor kernel: [3488036.578937] CPU 0
> >>>> Jul  3 14:47:26 castor kernel: [3488036.578960] Modules linked in:
> >>>> drbd tun ocfs2 jbd2 quota_tree raid0 ip6table_filter ip6_tables
> >>>> iptable_filter ip_tables sha1_generic ebtable_nat ebtables hmac
> >>>> x_tables lru_cache cn kvm_intel kvm ocfs2_dlmfs ocfs2_stack_o2cb
> >>>> ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs bridge stp loop
> >>>> md_mod snd_pcm snd_timer snd soundcore snd_page_alloc i2c_i801
> >>>> i2c_core pcspkr processor button psmouse joydev evdev serio_raw usbhid
> >>>> hid ext3 jbd mbcache dm_mod sd_mod crc_t10dif ahci ehci_hcd libata
> >>>> usbcore scsi_mod e1000e nls_base thermal thermal_sys [last unloaded:
> >>>> drbd]
> >>>> Jul  3 14:47:26 castor kernel: [3488036.579279] Pid: 25326, comm: kvm
> >>>> Not tainted 2.6.32-5-amd64 #1 X9SCL/X9SCM
> >>>> Jul  3 14:47:26 castor kernel: [3488036.579309] RIP:
> >>>> 0010:[<ffffffffa041177b>]  [<ffffffffa041177b>]
> >>>> ocfs2_do_insert_extent+0x5dc/0x1aaf [ocfs2]
> >>>> Jul  3 14:47:26 castor kernel: [3488036.579363] RSP:
> >>>> 0018:ffff880014839688  EFLAGS: 00010292
> >>>> Jul  3 14:47:26 castor kernel: [3488036.579390] RAX: 00000000000000bf
> >>>> RBX: 0000000000060803 RCX: 0000000000001806
> >>>> Jul  3 14:47:26 castor kernel: [3488036.579435] RDX: 0000000000000000
> >>>> RSI: 0000000000000096 RDI: 0000000000000246
> >>>> Jul  3 14:47:26 castor kernel: [3488036.579479] RBP: ffff8800148398a8
> >>>> R08: 00000000000209d0 R09: 000000000000000a
> >>>> Jul  3 14:47:26 castor kernel: [3488036.579524] R10: 0000000000000000
> >>>> R11: 0000000100000000 R12: 00000000013c4002
> >>>> Jul  3 14:47:26 castor kernel: [3488036.579568] R13: ffff88002a1e4030
> >>>> R14: 0000000000000001 R15: ffff88023c153c60
> >>>> Jul  3 14:47:26 castor kernel: [3488036.579613] FS:
> >>>> 00007f0cfef83700(0000) GS:ffff880008a00000(0000)
> >>>> knlGS:0000000000000000
> >>>> Jul  3 14:47:26 castor kernel: [3488036.579659] CS:  0010 DS: 002b ES:
> >>>> 002b CR0: 000000008005003b
> >>>> Jul  3 14:47:26 castor kernel: [3488036.579687] CR2: 00007f0d25dbf000
> >>>> CR3: 000000023ccb6000 CR4: 00000000000426e0
> >>>> Jul  3 14:47:26 castor kernel: [3488036.579732] DR0: 0000000000000000
> >>>> DR1: 0000000000000000 DR2: 0000000000000000
> >>>> Jul  3 14:47:26 castor kernel: [3488036.579776] DR3: 0000000000000000
> >>>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >>>> Jul  3 14:47:26 castor kernel: [3488036.579821] Process kvm (pid:
> >>>> 25326, threadinfo ffff880014838000, task ffff88023b999c40)
> >>>> Jul  3 14:47:26 castor kernel: [3488036.579867] Stack:
> >>>> Jul  3 14:47:26 castor kernel: [3488036.579887]  0000000000f08100
> >>>> 00000000013c4002 0000000000060803 ffff880014839718
> >>>> Jul  3 14:47:26 castor kernel: [3488036.579923]<0>   ffff880232abde80
> >>>> ffff88023b999c40 ffff88023b999c40 ffff8800148397a8
> >>>> Jul  3 14:47:26 castor kernel: [3488036.579977]<0>   ffff8800148397c8
> >>>> ffff8800148398a8 ffff88023d8027f8 0000000000f08100
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580047] Call Trace:
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580074]  [<ffffffffa04186b9>]
> >>>> ? ocfs2_insert_extent+0x5fb/0x6e6 [ocfs2]
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580108]  [<ffffffffa0442e08>]
> >>>> ? __ocfs2_journal_access+0x261/0x32a [ocfs2]
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580156]  [<ffffffffa04194da>]
> >>>> ? ocfs2_add_clusters_in_btree+0x35f/0x53c [ocfs2]
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580205]  [<ffffffffa0436a34>]
> >>>> ? ocfs2_add_inode_data+0x62/0x6e [ocfs2]
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580239]  [<ffffffffa0442f53>]
> >>>> ? ocfs2_journal_access_di+0x0/0xf [ocfs2]
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580272]  [<ffffffffa041c1d5>]
> >>>> ? ocfs2_write_begin_nolock+0x1376/0x1de2 [ocfs2]
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580321]  [<ffffffffa0466e02>]
> >>>> ? ocfs2_set_buffer_uptodate+0x15/0x60e [ocfs2]
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580370]  [<ffffffffa043a9a5>]
> >>>> ? ocfs2_validate_inode_block+0x0/0x1ab [ocfs2]
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580418]  [<ffffffffa0442f53>]
> >>>> ? ocfs2_journal_access_di+0x0/0xf [ocfs2]
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580451]  [<ffffffffa041cd57>]
> >>>> ? ocfs2_write_begin+0x116/0x1d2 [ocfs2]
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580484]  [<ffffffff810b4fd0>]
> >>>> ? generic_file_buffered_write+0x118/0x278
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580515]  [<ffffffff810b54e1>]
> >>>> ? __generic_file_aio_write+0x25f/0x293
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580548]  [<ffffffffa0434fc8>]
> >>>> ? ocfs2_prepare_inode_for_write+0x683/0x69c [ocfs2]
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580597]  [<ffffffffa042c4e2>]
> >>>> ? ocfs2_rw_lock+0x16d/0x239 [ocfs2]
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580628]  [<ffffffffa0435b19>]
> >>>> ? ocfs2_file_aio_write+0x45f/0x5da [ocfs2]
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580674]  [<ffffffff8101654b>]
> >>>> ? sched_clock+0x5/0x8
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580703]  [<ffffffff8104a4cc>]
> >>>> ? default_wake_function+0x0/0x9
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580733]  [<ffffffff810eebf2>]
> >>>> ? do_sync_write+0xce/0x113
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580762]  [<ffffffff81064f92>]
> >>>> ? autoremove_wake_function+0x0/0x2e
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580792]  [<ffffffff8105cd26>]
> >>>> ? kill_pid_info+0x31/0x3b
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580819]  [<ffffffff8105cefc>]
> >>>> ? sys_kill+0x72/0x140
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580847]  [<ffffffff810ef544>]
> >>>> ? vfs_write+0xa9/0x102
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580875]  [<ffffffff810ef5f4>]
> >>>> ? sys_pwrite64+0x57/0x77
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580902]  [<ffffffff81010b42>]
> >>>> ? system_call_fastpath+0x16/0x1b
> >>>> Jul  3 14:47:26 castor kernel: [3488036.580930] Code: 41 b8 b3 09 00
> >>>> 00 48 63 d2 48 c7 c7 6f 48 48 a0 89 0c 24 31 c0 48 c7 c1 c0 df 47 a0
> >>>> 48 89 5c 24 10 44 89 64 24 08 e8 5c 91 ee e0<0f>   0b eb fe 83 7c 24 5c
> >>>> 00 75 1a 49 8b 54 17 08 8b 5c 24 58 0f
> >>>> Jul  3 14:47:26 castor kernel: [3488036.581120] RIP
> >>>> [<ffffffffa041177b>] ocfs2_do_insert_extent+0x5dc/0x1aaf [ocfs2]
> >>>> Jul  3 14:47:26 castor kernel: [3488036.581167]  RSP<ffff880014839688>
> >>>> Jul  3 14:47:26 castor kernel: [3488036.581581] ---[ end trace
> >>>> fb597ecc3418e6d6 ]---
> >>>>
> >>>>
> >>>> On Tue, Jul 3, 2012 at 5:39 PM, Herbert van den Bergh
> >>>> <herbert.van.den.bergh at oracle.com>   wrote:
> >>>>>
> >>>>> On 07/03/2012 04:12 PM, Aleks Clark wrote:
> >>>>>>
> >>>>>> Ok, so I've got this ocfs2 cluster that's been running for a long
> >>>>>> while, hosting my VMs. All of a sudden I'm getting kernel panics
> >>>>>> originating from ocfs2 when trying to spin up one particular file.
> >>>>>> I've determined that there are several locks on this file, one of them
> >>>>>> exclusive. I restarted the whole cluster to try to get rid of it, but
> >>>>>> no go. I also tried to copy the file, both on and off of the cluster,
> >>>>>> but only half of it copied. Any way to get around either issue would
> >>>>>> be appreciated.
> >>>>>
> >>>>> The panic stack may be helpful, and any messages that the kernel spit
> >>>>> out before it.
> >>>>>
> >>>>> Thanks,
> >>>>> Herbert.
> >>>>>
> >>>>>
> >>>>
> >>> _______________________________________________
> >>> Ocfs2-users mailing list
> >>> Ocfs2-users at oss.oracle.com
> >>> https://oss.oracle.com/mailman/listinfo/ocfs2-users
> >
> >
> >
> > --
> > Aleks Clark
> 
> 
> 
> -- 
> Aleks Clark
> 

-- 

 Joel's First Law:

	Nature abhors a GUI.

			http://www.jlbec.org/
			jlbec at evilplan.org
