[Ocfs2-users] ocfs2 mount point hangs

Ishmael Tsoaela ishmaelt3 at gmail.com
Tue Sep 13 02:01:19 PDT 2016


Hi Eric,

Sorry, here are the other two syslogs and the debug output, in case you need them.

nodeD
root@nodeD:~# sudo debugfs.ocfs2 -R stats /dev/rbd1
        Revision: 0.90
        Mount Count: 0   Max Mount Count: 20
        State: 0   Errors: 0
        Check Interval: 0   Last Check: Tue Aug  2 15:41:12 2016
        Creator OS: 0
        Feature Compat: 3 backup-super strict-journal-super
        Feature Incompat: 592 sparse inline-data xattr
        Tunefs Incomplete: 0
        Feature RO compat: 1 unwritten
        Root Blknum: 5   System Dir Blknum: 6
        First Cluster Group Blknum: 3
        Block Size Bits: 12   Cluster Size Bits: 12
        Max Node Slots: 16
        Extended Attributes Inline Size: 256
        Label:
        UUID: 238F878003E7455FA5B01CC884D1047F
        Hash: 919897149 (0x36d4843d)
        DX Seed[0]: 0x00000000
        DX Seed[1]: 0x00000000
        DX Seed[2]: 0x00000000
        Cluster stack: classic o2cb
        Inode: 2   Mode: 00   Generation: 1754092981 (0x688d55b5)
        FS Generation: 1754092981 (0x688d55b5)
        CRC32: 00000000   ECC: 0000
        Type: Unknown   Attr: 0x0   Flags: Valid System Superblock
        Dynamic Features: (0x0)
        User: 0 (root)   Group: 0 (root)   Size: 0
        Links: 0   Clusters: 640000000
        ctime: 0x57a0a2f8 -- Tue Aug  2 15:41:12 2016
        atime: 0x0 -- Thu Jan  1 02:00:00 1970
        mtime: 0x57a0a2f8 -- Tue Aug  2 15:41:12 2016
        dtime: 0x0 -- Thu Jan  1 02:00:00 1970
        ctime_nsec: 0x00000000 -- 0
        atime_nsec: 0x00000000 -- 0
        mtime_nsec: 0x00000000 -- 0
        Refcount Block: 0
        Last Extblk: 0   Orphan Slot: 0
        Sub Alloc Slot: Global   Sub Alloc Bit: 65535
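
For reference, the hex ctime/mtime values above decode to the dates shown in the stats. A minimal Python sketch (the timestamp is taken from the output above; note debugfs.ocfs2 prints local time, which here appears to be UTC+2, since epoch 0 shows as 02:00:00):

```python
from datetime import datetime, timezone

# ctime/mtime from the debugfs.ocfs2 stats above
ts = 0x57a0a2f8

# Decode to UTC; debugfs.ocfs2 itself prints local time (UTC+2 here)
dt = datetime.fromtimestamp(ts, tz=timezone.utc)
print(dt.isoformat())  # 2016-08-02T13:41:12+00:00
```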


nodeB

root@nodeB:~# sudo debugfs.ocfs2 -R stats /dev/rbd1
        Revision: 0.90
        Mount Count: 0   Max Mount Count: 20
        State: 0   Errors: 0
        Check Interval: 0   Last Check: Tue Aug  2 15:41:12 2016
        Creator OS: 0
        Feature Compat: 3 backup-super strict-journal-super
        Feature Incompat: 592 sparse inline-data xattr
        Tunefs Incomplete: 0
        Feature RO compat: 1 unwritten
        Root Blknum: 5   System Dir Blknum: 6
        First Cluster Group Blknum: 3
        Block Size Bits: 12   Cluster Size Bits: 12
        Max Node Slots: 16
        Extended Attributes Inline Size: 256
        Label:
        UUID: 238F878003E7455FA5B01CC884D1047F
        Hash: 919897149 (0x36d4843d)
        DX Seed[0]: 0x00000000
        DX Seed[1]: 0x00000000
        DX Seed[2]: 0x00000000
        Cluster stack: classic o2cb
        Inode: 2   Mode: 00   Generation: 1754092981 (0x688d55b5)
        FS Generation: 1754092981 (0x688d55b5)
        CRC32: 00000000   ECC: 0000
        Type: Unknown   Attr: 0x0   Flags: Valid System Superblock
        Dynamic Features: (0x0)
        User: 0 (root)   Group: 0 (root)   Size: 0
        Links: 0   Clusters: 640000000
        ctime: 0x57a0a2f8 -- Tue Aug  2 15:41:12 2016
        atime: 0x0 -- Thu Jan  1 02:00:00 1970
        mtime: 0x57a0a2f8 -- Tue Aug  2 15:41:12 2016
        dtime: 0x0 -- Thu Jan  1 02:00:00 1970
        ctime_nsec: 0x00000000 -- 0
        atime_nsec: 0x00000000 -- 0
        mtime_nsec: 0x00000000 -- 0
        Refcount Block: 0
        Last Extblk: 0   Orphan Slot: 0
        Sub Alloc Slot: Global   Sub Alloc Bit: 65535
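
Since every node should see the identical superblock, a quick way to diff the key identity fields out of two stats dumps is a rough sketch like the following (the field names match the debugfs.ocfs2 output above; the stats text is abbreviated here, but in practice it would be the full captured output from each node):

```python
# Abbreviated `debugfs.ocfs2 -R stats` output from the nodes above.
NODE_D = """\
        Max Node Slots: 16
        UUID: 238F878003E7455FA5B01CC884D1047F
        Cluster stack: classic o2cb
        FS Generation: 1754092981 (0x688d55b5)
"""
NODE_B = NODE_D  # nodeB printed byte-identical stats in this case

def parse_stats(text):
    """Extract single 'Key: value' pairs from debugfs.ocfs2 -R stats output.

    Lines with more than one colon (e.g. 'Last Check: Tue Aug  2 15:41:12')
    are skipped for simplicity; the identity fields we need have exactly one.
    """
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if line.count(":") == 1:
            key, val = line.split(":", 1)
            fields[key.strip()] = val.strip()
    return fields

KEYS = ("UUID", "Max Node Slots", "Cluster stack", "FS Generation")
d, b = parse_stats(NODE_D), parse_stats(NODE_B)
mismatches = [k for k in KEYS if d.get(k) != b.get(k)]
print(mismatches)  # [] -- every node sees the same superblock
```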




The request shown in the attached screenshot (df-h.PNG) just hangs.

On Tue, Sep 13, 2016 at 10:37 AM, Ishmael Tsoaela <ishmaelt3 at gmail.com> wrote:
> Thanks for the response
>
>
> 1.  the disk is a shared ceph rbd device
>
>  #rbd showmapped
> id pool            image                 snap device
> 1  vmimages        block_vmimages        -    /dev/rbd1
>
>
> 2. ocfs2 has been working well for 2 months now, with a reboot 12 days ago
>
> 3. All 3 ceph nodes have the rbd image mapped and ocfs2 mounted
>
> commands used
>
> #sudo rbd map block_vmimages  --pool vmimages --name
>
> #sudo mount /dev/rbd/vmimages/block_vmimages /mnt/vmimages/
> /dev/rbd1
>
> 4.
> root@nodeC:~# sudo debugfs.ocfs2 -R stats /dev/rbd1
>         Revision: 0.90
>         Mount Count: 0   Max Mount Count: 20
>         State: 0   Errors: 0
>         Check Interval: 0   Last Check: Tue Aug  2 15:41:12 2016
>         Creator OS: 0
>         Feature Compat: 3 backup-super strict-journal-super
>         Feature Incompat: 592 sparse inline-data xattr
>         Tunefs Incomplete: 0
>         Feature RO compat: 1 unwritten
>         Root Blknum: 5   System Dir Blknum: 6
>         First Cluster Group Blknum: 3
>         Block Size Bits: 12   Cluster Size Bits: 12
>         Max Node Slots: 16
>         Extended Attributes Inline Size: 256
>         Label:
>         UUID: 238F878003E7455FA5B01CC884D1047F
>         Hash: 919897149 (0x36d4843d)
>         DX Seed[0]: 0x00000000
>         DX Seed[1]: 0x00000000
>         DX Seed[2]: 0x00000000
>         Cluster stack: classic o2cb
>         Inode: 2   Mode: 00   Generation: 1754092981 (0x688d55b5)
>         FS Generation: 1754092981 (0x688d55b5)
>         CRC32: 00000000   ECC: 0000
>         Type: Unknown   Attr: 0x0   Flags: Valid System Superblock
>         Dynamic Features: (0x0)
>         User: 0 (root)   Group: 0 (root)   Size: 0
>         Links: 0   Clusters: 640000000
>         ctime: 0x57a0a2f8 -- Tue Aug  2 15:41:12 2016
>         atime: 0x0 -- Thu Jan  1 02:00:00 1970
>         mtime: 0x57a0a2f8 -- Tue Aug  2 15:41:12 2016
>         dtime: 0x0 -- Thu Jan  1 02:00:00 1970
>         ctime_nsec: 0x00000000 -- 0
>         atime_nsec: 0x00000000 -- 0
>         mtime_nsec: 0x00000000 -- 0
>         Refcount Block: 0
>         Last Extblk: 0   Orphan Slot: 0
>         Sub Alloc Slot: Global   Sub Alloc Bit: 65535
>
>
>
> thanks for the assistance
>
>
> On Tue, Sep 13, 2016 at 10:23 AM, Eric Ren <zren at suse.com> wrote:
>> Hi,
>>
>> On 09/13/2016 03:16 PM, Ishmael Tsoaela wrote:
>>>
>>> Hi All,
>>>
>>> I have an ocfs2 mount point shared across 3 ceph cluster nodes, and
>>> suddenly I cannot read from or write to it, although the cluster is
>>> clean and showing no errors.
>>
>> 1. What is your ocfs2 shared disk? I mean, is it a disk exported by an
>> iscsi target, or a ceph rbd device?
>> 2. Did you verify that ocfs2 worked well before this? If so, how?
>> 3. Could you give more details on how the ceph nodes use ocfs2?
>> 4. Please provide the output of:
>>        #sudo debugfs.ocfs2 -R stats /dev/sda
>>>
>>>
>>>
>>> Are there any other logs I can check?
>>
>> All log messages should go to /var/log/messages; could you attach the
>> whole log file?
>>
>> Eric
>>>
>>>
>>> There are some log in kern.log about
>>>
>>>
>>> kern.log
>>>
>>> Sep 13 08:10:18 nodeB kernel: [1104431.300882] kernel BUG at
>>>
>>> /build/linux-lts-wily-Vv6Eyd/linux-lts-wily-4.2.0/fs/ocfs2/suballoc.c:2419!
>>> Sep 13 08:10:18 nodeB kernel: [1104431.345504] invalid opcode: 0000 [#1]
>>> SMP
>>> Sep 13 08:10:18 nodeB kernel: [1104431.370081] Modules linked in:
>>> vhost_net vhost macvtap macvlan ocfs2 quota_tree rbd libceph ipmi_si
>>> mpt3sas mpt2sas raid_class scsi_transport_sas mptctl mptbase
>>> xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4
>>> iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4
>>> xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 xt_tcpudp
>>> ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter
>>> ip_tables x_tables dell_rbu ocfs2_dlmfs ocfs2_stack_o2cb ocfs2_dlm
>>> ocfs2_nodemanager ocfs2_stackglue configfs bridge stp llc binfmt_misc
>>> ipmi_devintf kvm_amd dcdbas kvm input_leds joydev amd64_edac_mod
>>> crct10dif_pclmul edac_core shpchp i2c_piix4 fam15h_power crc32_pclmul
>>> edac_mce_amd ipmi_ssif k10temp aesni_intel aes_x86_64 lrw gf128mul
>>> 8250_fintek glue_helper acpi_power_meter mac_hid serio_raw ablk_helper
>>> cryptd ipmi_msghandler xfs libcrc32c lp parport ixgbe dca hid_generic
>>> uas usbhid vxlan usb_storage ip6_udp_tunnel hid udp_tunnel ptp psmouse
>>> bnx2 pps_core megaraid_sas mdio [last unloaded: ipmi_si]
>>> Sep 13 08:10:18 nodeB kernel: [1104431.898986] CPU: 10 PID: 65016
>>> Comm: cp Not tainted 4.2.0-27-generic #32~14.04.1-Ubuntu
>>> Sep 13 08:10:18 nodeB kernel: [1104432.012469] Hardware name: Dell
>>> Inc. PowerEdge R515/0RMRF7, BIOS 2.0.2 10/22/2012
>>> Sep 13 08:10:18 nodeB kernel: [1104432.134659] task: ffff880a61dca940
>>> ti: ffff88084a5ac000 task.ti: ffff88084a5ac000
>>> Sep 13 08:10:18 nodeB kernel: [1104432.265260] RIP:
>>> 0010:[<ffffffffc062026b>]  [<ffffffffc062026b>]
>>> _ocfs2_free_suballoc_bits+0x4db/0x4e0 [ocfs2]
>>> Sep 13 08:10:18 nodeB kernel: [1104432.406559] RSP:
>>> 0018:ffff88084a5af798  EFLAGS: 00010246
>>> Sep 13 08:10:18 nodeB kernel: [1104432.479958] RAX: 0000000000000000
>>> RBX: ffff881acebcb000 RCX: ffff881fcd372e00
>>> Sep 13 08:10:18 nodeB kernel: [1104432.630768] RDX: ffff881fd0d4dc30
>>> RSI: ffff88197e351bc8 RDI: ffff880fd127b2b0
>>> Sep 13 08:10:18 nodeB kernel: [1104432.789688] RBP: ffff88084a5af818
>>> R08: 0000000000000002 R09: 0000000000007e00
>>> Sep 13 08:10:18 nodeB kernel: [1104432.950053] R10: ffff880d39a21020
>>> R11: ffff88084a5af550 R12: 00000000000000fa
>>> Sep 13 08:10:18 nodeB kernel: [1104433.113014] R13: 0000000000005ab1
>>> R14: 0000000000000000 R15: ffff880fb2d43000
>>> Sep 13 08:10:18 nodeB kernel: [1104433.276484] FS:
>>> 00007fcc68373840(0000) GS:ffff881fdde80000(0000)
>>> knlGS:0000000000000000
>>> Sep 13 08:10:18 nodeB kernel: [1104433.440016] CS:  0010 DS: 0000 ES:
>>> 0000 CR0: 000000008005003b
>>> Sep 13 08:10:18 nodeB kernel: [1104433.521496] CR2: 00005647b2ee6d80
>>> CR3: 0000000198b93000 CR4: 00000000000406e0
>>> Sep 13 08:10:18 nodeB kernel: [1104433.681357] Stack:
>>> Sep 13 08:10:18 nodeB kernel: [1104433.758498]  0000000000000000
>>> ffff880fd127b2e8 ffff881fc6655f08 00005bab00000000
>>> Sep 13 08:10:18 nodeB kernel: [1104433.913655]  ffff881fd0c51d80
>>> ffff88197e351bc8 ffff880fd127b330 ffff880e9eaa6000
>>> Sep 13 08:10:18 nodeB kernel: [1104434.068609]  ffff88197e351bc8
>>> ffffffff817ba6d6 0000000000000001 000000001ac592b1
>>> Sep 13 08:10:18 nodeB kernel: [1104434.223347] Call Trace:
>>> Sep 13 08:10:18 nodeB kernel: [1104434.298560]  [<ffffffff817ba6d6>] ?
>>> mutex_lock+0x16/0x37
>>> Sep 13 08:10:18 nodeB kernel: [1104434.374183]  [<ffffffffc0621bca>]
>>> _ocfs2_free_clusters+0xea/0x200 [ocfs2]
>>> Sep 13 08:10:18 nodeB kernel: [1104434.449628]  [<ffffffffc061ecb0>] ?
>>> ocfs2_put_slot+0xe0/0xe0 [ocfs2]
>>> Sep 13 08:10:18 nodeB kernel: [1104434.523971]  [<ffffffffc061ecb0>] ?
>>> ocfs2_put_slot+0xe0/0xe0 [ocfs2]
>>> Sep 13 08:10:18 nodeB kernel: [1104434.595803]  [<ffffffffc06234e5>]
>>> ocfs2_free_clusters+0x15/0x20 [ocfs2]
>>> Sep 13 08:10:18 nodeB kernel: [1104434.666614]  [<ffffffffc05d6037>]
>>> __ocfs2_flush_truncate_log+0x247/0x560 [ocfs2]
>>> Sep 13 08:10:18 nodeB kernel: [1104434.806017]  [<ffffffffc05d25a6>] ?
>>> ocfs2_num_free_extents+0x56/0x120 [ocfs2]
>>> Sep 13 08:10:18 nodeB kernel: [1104434.946141]  [<ffffffffc05db258>]
>>> ocfs2_remove_btree_range+0x4e8/0x760 [ocfs2]
>>> Sep 13 08:10:18 nodeB kernel: [1104435.086490]  [<ffffffffc05dc720>]
>>> ocfs2_commit_truncate+0x180/0x590 [ocfs2]
>>> Sep 13 08:10:18 nodeB kernel: [1104435.158189]  [<ffffffffc06022b0>] ?
>>> ocfs2_allocate_extend_trans+0x130/0x130 [ocfs2]
>>> Sep 13 08:10:18 nodeB kernel: [1104435.297235]  [<ffffffffc05f7e2c>]
>>> ocfs2_truncate_file+0x39c/0x610 [ocfs2]
>>> Sep 13 08:10:18 nodeB kernel: [1104435.368060]  [<ffffffffc05fe650>] ?
>>> ocfs2_read_inode_block+0x10/0x20 [ocfs2]
>>> Sep 13 08:10:18 nodeB kernel: [1104435.505117]  [<ffffffffc05fa2d7>]
>>> ocfs2_setattr+0x4b7/0xa50 [ocfs2]
>>> Sep 13 08:10:18 nodeB kernel: [1104435.574617]  [<ffffffffc064c4fd>] ?
>>> ocfs2_xattr_get+0x9d/0x130 [ocfs2]
>>> Sep 13 08:10:18 nodeB kernel: [1104435.643722]  [<ffffffff8120705e>]
>>> notify_change+0x1ae/0x380
>>> Sep 13 08:10:18 nodeB kernel: [1104435.712037]  [<ffffffff811e8436>]
>>> do_truncate+0x66/0xa0
>>> Sep 13 08:10:18 nodeB kernel: [1104435.778685]  [<ffffffff811f8527>]
>>> path_openat+0x277/0x1330
>>> Sep 13 08:10:18 nodeB kernel: [1104435.845776]  [<ffffffffc05f2bed>] ?
>>> __ocfs2_cluster_unlock.isra.36+0x7d/0xb0 [ocfs2]
>>> Sep 13 08:10:18 nodeB kernel: [1104435.977677]  [<ffffffff811fae8a>]
>>> do_filp_open+0x7a/0xd0
>>> Sep 13 08:10:18 nodeB kernel: [1104436.043693]  [<ffffffff811f9f8f>] ?
>>> getname_flags+0x4f/0x1f0
>>> Sep 13 08:10:18 nodeB kernel: [1104436.108385]  [<ffffffff81208006>] ?
>>> __alloc_fd+0x46/0x110
>>> Sep 13 08:10:18 nodeB kernel: [1104436.171504]  [<ffffffff811ea509>]
>>> do_sys_open+0x129/0x260
>>> Sep 13 08:10:18 nodeB kernel: [1104436.232889]  [<ffffffff811ea65e>]
>>> SyS_open+0x1e/0x20
>>> Sep 13 08:10:18 nodeB kernel: [1104436.294292]  [<ffffffff817bc3b2>]
>>> entry_SYSCALL_64_fastpath+0x16/0x75
>>> Sep 13 08:10:18 nodeB kernel: [1104436.356257] Code: 65 c0 48 c7 c6 e0
>>> 44 65 c0 41 b6 e2 48 8d 5d c8 48 8b 78 28 44 89 24 24 31 c0 49 c7 c4
>>> e2 ff ff ff e8 9a 8d 01 00 e9 c4 fd ff ff <0f> 0b 0f 0b 90 0f 1f 44 00
>>> 00 55 48 89 e5 41 57 41 89 cf b9 01
>>> Sep 13 08:10:18 nodeB kernel: [1104436.549534] RIP
>>> [<ffffffffc062026b>] _ocfs2_free_suballoc_bits+0x4db/0x4e0 [ocfs2]
>>> Sep 13 08:10:18 nodeB kernel: [1104436.681076]  RSP <ffff88084a5af798>
>>> Sep 13 08:10:18 nodeB kernel: [1104436.834529] ---[ end trace
>>> 5f4b84ac539ed56c ]---
>>>
>>> _______________________________________________
>>> Ocfs2-users mailing list
>>> Ocfs2-users at oss.oracle.com
>>> https://oss.oracle.com/mailman/listinfo/ocfs2-users
>>>
>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: nodeb.syslog
Type: application/octet-stream
Size: 15906 bytes
Desc: not available
Url : http://oss.oracle.com/pipermail/ocfs2-users/attachments/20160913/6e0ba0bd/attachment-0002.obj 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: noded.syslog
Type: application/octet-stream
Size: 35376 bytes
Desc: not available
Url : http://oss.oracle.com/pipermail/ocfs2-users/attachments/20160913/6e0ba0bd/attachment-0003.obj 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: df-h.PNG
Type: image/png
Size: 2565 bytes
Desc: not available
Url : http://oss.oracle.com/pipermail/ocfs2-users/attachments/20160913/6e0ba0bd/attachment-0001.png 

