[Ocfs2-users] remove locks? or copy the whole file?

Aleks Clark aleks.clark at gmail.com
Tue Jul 3 23:13:04 PDT 2012


Grr. The copy I 'recovered' using dd instead of cp is totally munged.
I'd really appreciate some pointers on fixing the ocfs2 issue. I've got
data backups, but I'm not looking forward to rebuilding the whole
damned VM :/

On Tue, Jul 3, 2012 at 6:57 PM, Aleks Clark <aleks.clark at gmail.com> wrote:
> Well, by 'clean' I mean fsck said it was clean. The locks persisted,
> though. I seriously can't believe there's no way to force lock removal.
> Is it just a file somewhere I can delete?
>
>
> On Tue, Jul 3, 2012 at 6:56 PM, Aleks Clark <aleks.clark at gmail.com> wrote:
>> Yep, tried that; it returned clean.
>>
>> On Tue, Jul 3, 2012 at 6:25 PM, herbert van.den.bergh
>> <herbert.van.den.bergh at oracle.com> wrote:
>>>
>>> One more thing: did you try running fsck.ocfs2 on it?
>>>
>>> Thanks,
>>> Herbert.
>>>
>>>
>>> On 7/3/2012 6:23 PM, herbert van.den.bergh wrote:
>>>>
>>>> Hmm, that doesn't mean much to me, but maybe it does to someone else on
>>>> the list. I bet their first suggestion will be to try a recent kernel...
>>>>
>>>> Thanks,
>>>> Herbert.
>>>>
>>>> On 7/3/2012 6:19 PM, Aleks Clark wrote:
>>>>>
>>>>> Nick, I don't think so; it's a 2 TB partition with only 300 GB used.
>>>>>
>>>>> Herb,
>>>>>
>>>>>
>>>>> Jul  3 14:47:26 castor kernel: [3488036.578659]
>>>>> (25326,0):ocfs2_rotate_tree_right:2483 ERROR: bug expression:
>>>>> path_leaf_bh(left_path) == path_leaf_bh(right_path)
>>>>> Jul  3 14:47:26 castor kernel: [3488036.578714]
>>>>> (25326,0):ocfs2_rotate_tree_right:2483 ERROR: Owner 18319883: error
>>>>> during insert of 15761664 (left path cpos 20725762) results in two
>>>>> identical paths ending at 395267
>>>>> Jul  3 14:47:26 castor kernel: [3488036.578800] ------------[ cut here
>>>>> ]------------
>>>>> Jul  3 14:47:26 castor kernel: [3488036.578826] kernel BUG at
>>>>>
>>>>> /build/buildd-linux-2.6_2.6.32-38-amd64-bk66e4/linux-2.6-2.6.32/debian/build/source_amd64_none/fs/ocfs2/alloc.c:2483!
>>>>> Jul  3 14:47:26 castor kernel: [3488036.578881] invalid opcode: 0000 [#1]
>>>>> SMP
>>>>> Jul  3 14:47:26 castor kernel: [3488036.578909] last sysfs file:
>>>>> /sys/devices/virtual/net/lo/operstate
>>>>> Jul  3 14:47:26 castor kernel: [3488036.578937] CPU 0
>>>>> Jul  3 14:47:26 castor kernel: [3488036.578960] Modules linked in:
>>>>> drbd tun ocfs2 jbd2 quota_tree raid0 ip6table_filter ip6_tables
>>>>> iptable_filter ip_tables sha1_generic ebtable_nat ebtables hmac
>>>>> x_tables lru_cache cn kvm_intel kvm ocfs2_dlmfs ocfs2_stack_o2cb
>>>>> ocfs2_dlm ocfs2_nodemanager ocfs2_stackglue configfs bridge stp loop
>>>>> md_mod snd_pcm snd_timer snd soundcore snd_page_alloc i2c_i801
>>>>> i2c_core pcspkr processor button psmouse joydev evdev serio_raw usbhid
>>>>> hid ext3 jbd mbcache dm_mod sd_mod crc_t10dif ahci ehci_hcd libata
>>>>> usbcore scsi_mod e1000e nls_base thermal thermal_sys [last unloaded:
>>>>> drbd]
>>>>> Jul  3 14:47:26 castor kernel: [3488036.579279] Pid: 25326, comm: kvm
>>>>> Not tainted 2.6.32-5-amd64 #1 X9SCL/X9SCM
>>>>> Jul  3 14:47:26 castor kernel: [3488036.579309] RIP:
>>>>> 0010:[<ffffffffa041177b>]  [<ffffffffa041177b>]
>>>>> ocfs2_do_insert_extent+0x5dc/0x1aaf [ocfs2]
>>>>> Jul  3 14:47:26 castor kernel: [3488036.579363] RSP:
>>>>> 0018:ffff880014839688  EFLAGS: 00010292
>>>>> Jul  3 14:47:26 castor kernel: [3488036.579390] RAX: 00000000000000bf
>>>>> RBX: 0000000000060803 RCX: 0000000000001806
>>>>> Jul  3 14:47:26 castor kernel: [3488036.579435] RDX: 0000000000000000
>>>>> RSI: 0000000000000096 RDI: 0000000000000246
>>>>> Jul  3 14:47:26 castor kernel: [3488036.579479] RBP: ffff8800148398a8
>>>>> R08: 00000000000209d0 R09: 000000000000000a
>>>>> Jul  3 14:47:26 castor kernel: [3488036.579524] R10: 0000000000000000
>>>>> R11: 0000000100000000 R12: 00000000013c4002
>>>>> Jul  3 14:47:26 castor kernel: [3488036.579568] R13: ffff88002a1e4030
>>>>> R14: 0000000000000001 R15: ffff88023c153c60
>>>>> Jul  3 14:47:26 castor kernel: [3488036.579613] FS:
>>>>> 00007f0cfef83700(0000) GS:ffff880008a00000(0000)
>>>>> knlGS:0000000000000000
>>>>> Jul  3 14:47:26 castor kernel: [3488036.579659] CS:  0010 DS: 002b ES:
>>>>> 002b CR0: 000000008005003b
>>>>> Jul  3 14:47:26 castor kernel: [3488036.579687] CR2: 00007f0d25dbf000
>>>>> CR3: 000000023ccb6000 CR4: 00000000000426e0
>>>>> Jul  3 14:47:26 castor kernel: [3488036.579732] DR0: 0000000000000000
>>>>> DR1: 0000000000000000 DR2: 0000000000000000
>>>>> Jul  3 14:47:26 castor kernel: [3488036.579776] DR3: 0000000000000000
>>>>> DR6: 00000000ffff0ff0 DR7: 0000000000000400
>>>>> Jul  3 14:47:26 castor kernel: [3488036.579821] Process kvm (pid:
>>>>> 25326, threadinfo ffff880014838000, task ffff88023b999c40)
>>>>> Jul  3 14:47:26 castor kernel: [3488036.579867] Stack:
>>>>> Jul  3 14:47:26 castor kernel: [3488036.579887]  0000000000f08100
>>>>> 00000000013c4002 0000000000060803 ffff880014839718
>>>>> Jul  3 14:47:26 castor kernel: [3488036.579923]<0>   ffff880232abde80
>>>>> ffff88023b999c40 ffff88023b999c40 ffff8800148397a8
>>>>> Jul  3 14:47:26 castor kernel: [3488036.579977]<0>   ffff8800148397c8
>>>>> ffff8800148398a8 ffff88023d8027f8 0000000000f08100
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580047] Call Trace:
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580074]  [<ffffffffa04186b9>]
>>>>> ? ocfs2_insert_extent+0x5fb/0x6e6 [ocfs2]
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580108]  [<ffffffffa0442e08>]
>>>>> ? __ocfs2_journal_access+0x261/0x32a [ocfs2]
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580156]  [<ffffffffa04194da>]
>>>>> ? ocfs2_add_clusters_in_btree+0x35f/0x53c [ocfs2]
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580205]  [<ffffffffa0436a34>]
>>>>> ? ocfs2_add_inode_data+0x62/0x6e [ocfs2]
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580239]  [<ffffffffa0442f53>]
>>>>> ? ocfs2_journal_access_di+0x0/0xf [ocfs2]
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580272]  [<ffffffffa041c1d5>]
>>>>> ? ocfs2_write_begin_nolock+0x1376/0x1de2 [ocfs2]
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580321]  [<ffffffffa0466e02>]
>>>>> ? ocfs2_set_buffer_uptodate+0x15/0x60e [ocfs2]
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580370]  [<ffffffffa043a9a5>]
>>>>> ? ocfs2_validate_inode_block+0x0/0x1ab [ocfs2]
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580418]  [<ffffffffa0442f53>]
>>>>> ? ocfs2_journal_access_di+0x0/0xf [ocfs2]
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580451]  [<ffffffffa041cd57>]
>>>>> ? ocfs2_write_begin+0x116/0x1d2 [ocfs2]
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580484]  [<ffffffff810b4fd0>]
>>>>> ? generic_file_buffered_write+0x118/0x278
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580515]  [<ffffffff810b54e1>]
>>>>> ? __generic_file_aio_write+0x25f/0x293
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580548]  [<ffffffffa0434fc8>]
>>>>> ? ocfs2_prepare_inode_for_write+0x683/0x69c [ocfs2]
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580597]  [<ffffffffa042c4e2>]
>>>>> ? ocfs2_rw_lock+0x16d/0x239 [ocfs2]
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580628]  [<ffffffffa0435b19>]
>>>>> ? ocfs2_file_aio_write+0x45f/0x5da [ocfs2]
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580674]  [<ffffffff8101654b>]
>>>>> ? sched_clock+0x5/0x8
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580703]  [<ffffffff8104a4cc>]
>>>>> ? default_wake_function+0x0/0x9
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580733]  [<ffffffff810eebf2>]
>>>>> ? do_sync_write+0xce/0x113
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580762]  [<ffffffff81064f92>]
>>>>> ? autoremove_wake_function+0x0/0x2e
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580792]  [<ffffffff8105cd26>]
>>>>> ? kill_pid_info+0x31/0x3b
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580819]  [<ffffffff8105cefc>]
>>>>> ? sys_kill+0x72/0x140
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580847]  [<ffffffff810ef544>]
>>>>> ? vfs_write+0xa9/0x102
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580875]  [<ffffffff810ef5f4>]
>>>>> ? sys_pwrite64+0x57/0x77
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580902]  [<ffffffff81010b42>]
>>>>> ? system_call_fastpath+0x16/0x1b
>>>>> Jul  3 14:47:26 castor kernel: [3488036.580930] Code: 41 b8 b3 09 00
>>>>> 00 48 63 d2 48 c7 c7 6f 48 48 a0 89 0c 24 31 c0 48 c7 c1 c0 df 47 a0
>>>>> 48 89 5c 24 10 44 89 64 24 08 e8 5c 91 ee e0<0f>   0b eb fe 83 7c 24 5c
>>>>> 00 75 1a 49 8b 54 17 08 8b 5c 24 58 0f
>>>>> Jul  3 14:47:26 castor kernel: [3488036.581120] RIP
>>>>> [<ffffffffa041177b>] ocfs2_do_insert_extent+0x5dc/0x1aaf [ocfs2]
>>>>> Jul  3 14:47:26 castor kernel: [3488036.581167]  RSP<ffff880014839688>
>>>>> Jul  3 14:47:26 castor kernel: [3488036.581581] ---[ end trace
>>>>> fb597ecc3418e6d6 ]---
>>>>>
>>>>>
>>>>> On Tue, Jul 3, 2012 at 5:39 PM, Herbert van den Bergh
>>>>> <herbert.van.den.bergh at oracle.com>   wrote:
>>>>>>
>>>>>> On 07/03/2012 04:12 PM, Aleks Clark wrote:
>>>>>>>
>>>>>>> Ok, so I've got this ocfs2 cluster that's been running for a long
>>>>>>> while, hosting my VMs. All of a sudden I'm getting kernel panics
>>>>>>> originating from ocfs2 when trying to spin up one particular file.
>>>>>>> I've determined that there are several locks on this file, one of them
>>>>>>> exclusive. I restarted the whole cluster to try to get rid of it, but
>>>>>>> no go. I also tried to copy the file, both on and off of the cluster,
>>>>>>> but only half of it copied. Any way to get around either issue would
>>>>>>> be appreciated.
>>>>>>
>>>>>> The panic stack may be helpful, and any messages that the kernel spit
>>>>>> out before it.
>>>>>>
>>>>>> Thanks,
>>>>>> Herbert.
>>>>>>
>>>>>>
>>>>>
>>
>>
>>
>> --
>> Aleks Clark
>
>
>
> --
> Aleks Clark



-- 
Aleks Clark


