[Ocfs2-users] OCFS2 cluster won't come up and stay up

Tony Rios tony at tonyrios.com
Fri Dec 2 13:35:42 PST 2011


Bug #1338 has been created.

http://oss.oracle.com/bugzilla/show_bug.cgi?id=1338
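
For anyone triaging similar logs before attaching them to a bugzilla, here is a minimal sketch (not part of the original thread) that pulls the cluster-related lines out of a kernel log so they are not buried in boot messages. The subsystem names in the pattern (ocfs2, o2net, o2hb, o2cb, dlm) and the function names are assumptions chosen for illustration; adjust them to whatever your logs actually contain.

```python
import re

# Assumed list of substrings that mark OCFS2-related kernel messages.
# These are illustrative guesses, not taken from the thread.
CLUSTER_PATTERN = re.compile(r"ocfs2|o2net|o2hb|o2cb|dlm", re.IGNORECASE)


def filter_cluster_lines(lines):
    """Return only the lines that mention an OCFS2-related subsystem."""
    return [line.rstrip("\n") for line in lines if CLUSTER_PATTERN.search(line)]


def filter_log_file(path):
    """Read a log file (hypothetical helper) and return its cluster-related lines."""
    # errors="replace" keeps the scan going if the log has stray bytes.
    with open(path, errors="replace") as fh:
        return filter_cluster_lines(fh)
```

Calling filter_log_file() on a saved copy of the log from each node gives a short view for a first look. The full, unfiltered logs from all nodes are still what should be attached to the bug, as requested below.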


On Dec 1, 2011, at 6:36 PM, Sunil Mushran wrote:

> To analyze this, one needs the logs, and a bugzilla is a good placeholder for them.
> 
> On Dec 1, 2011, at 6:05 PM, Tony Rios <tony at tonyrios.com> wrote:
> 
>> Sunil,
>> Is submitting a bug report the only answer?
>> I'm happy to send in this information, but in the meantime, can I take the cluster down entirely and reset it so we can get these servers back online and talking again?
>> Tony
>> 
>> On Dec 1, 2011, at 5:05 PM, Sunil Mushran wrote:
>> 
>>> Node 3 is joining the domain. It is having problems getting the superblock cluster lock.
>>> Create a bugzilla on oss.oracle.com and attach /var/log/messages from all nodes.
>>> If you have netconsole set up, attach those logs too.
>>> 
>>> On 12/01/2011 04:55 PM, Tony Rios wrote:
>>>> I'm having an issue today where I just can't seem to keep all the servers in the cluster online.
>>>> They aren't losing network connectivity and I can ping the iSCSI host just fine and the host is logged in.
>>>> 
>>>> These are the errors from dmesg when I try to mount the filesystem:
>>>> 
>>>> root at pedge36:~# dmesg
>>>> [    0.000000] Initializing cgroup subsys cpuset
>>>> [    0.000000] Initializing cgroup subsys cpu
>>>> [    0.000000] Linux version 2.6.38-10-generic (buildd at yellow) (gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu4) ) #46-Ubuntu SMP Tue Jun 28 15:07:17 UTC 2011 (Ubuntu 2.6.38-10.46-generic 2.6.38.7)
>>>> [    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-2.6.38-10-generic root=UUID=3cd859b8-2605-4a38-8767-a6d1f99d53bd ro debug ignore_loglevel
>>>> [    0.000000] BIOS-provided physical RAM map:
>>>> [    0.000000]  BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
>>>> [    0.000000]  BIOS-e820: 0000000000100000 - 00000000effc0000 (usable)
>>>> [    0.000000]  BIOS-e820: 00000000effc0000 - 00000000effcfc00 (ACPI data)
>>>> [    0.000000]  BIOS-e820: 00000000effcfc00 - 00000000effff000 (reserved)
>>>> [    0.000000]  BIOS-e820: 00000000f0000000 - 00000000f4000000 (reserved)
>>>> [    0.000000]  BIOS-e820: 00000000fec00000 - 00000000fed00400 (reserved)
>>>> [    0.000000]  BIOS-e820: 00000000fed13000 - 00000000feda0000 (reserved)
>>>> [    0.000000]  BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
>>>> [    0.000000]  BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
>>>> [    0.000000]  BIOS-e820: 0000000100000000 - 00000001ffffe000 (usable)
>>>> [    0.000000]  BIOS-e820: 00000001ffffe000 - 0000000200000000 (reserved)
>>>> [    0.000000]  BIOS-e820: 0000000200000000 - 0000000210000000 (usable)
>>>> [    0.000000] debug: ignoring loglevel setting.
>>>> [    0.000000] NX (Execute Disable) protection: active
>>>> [    0.000000] DMI 2.3 present.
>>>> [    0.000000] DMI: Dell Computer Corporation PowerEdge 850/0Y8628, BIOS A04 08/22/2006
>>>> [    0.000000] e820 update range: 0000000000000000 - 0000000000010000 (usable) ==>  (reserved)
>>>> [    0.000000] e820 remove range: 00000000000a0000 - 0000000000100000 (usable)
>>>> [    0.000000] No AGP bridge found
>>>> [    0.000000] last_pfn = 0x210000 max_arch_pfn = 0x400000000
>>>> [    0.000000] MTRR default type: uncachable
>>>> [    0.000000] MTRR fixed ranges enabled:
>>>> [    0.000000]   00000-9FFFF write-back
>>>> [    0.000000]   A0000-BFFFF uncachable
>>>> [    0.000000]   C0000-CBFFF write-protect
>>>> [    0.000000]   CC000-EBFFF uncachable
>>>> [    0.000000]   EC000-FFFFF write-protect
>>>> [    0.000000] MTRR variable ranges enabled:
>>>> [    0.000000]   0 base 000000000 mask E00000000 write-back
>>>> [    0.000000]   1 base 200000000 mask FF0000000 write-back
>>>> [    0.000000]   2 base 0F0000000 mask FF0000000 uncachable
>>>> [    0.000000]   3 disabled
>>>> [    0.000000]   4 disabled
>>>> [    0.000000]   5 disabled
>>>> [    0.000000]   6 disabled
>>>> [    0.000000]   7 disabled
>>>> [    0.000000] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
>>>> [    0.000000] e820 update range: 00000000f0000000 - 0000000100000000 (usable) ==>  (reserved)
>>>> [    0.000000] last_pfn = 0xeffc0 max_arch_pfn = 0x400000000
>>>> [    0.000000] found SMP MP-table at [ffff8800000fe710] fe710
>>>> [    0.000000] initial memory mapped : 0 - 20000000
>>>> [    0.000000] init_memory_mapping: 0000000000000000-00000000effc0000
>>>> [    0.000000]  0000000000 - 00efe00000 page 2M
>>>> [    0.000000]  00efe00000 - 00effc0000 page 4k
>>>> [    0.000000] kernel direct mapping tables up to effc0000 @ 1fffa000-20000000
>>>> [    0.000000] init_memory_mapping: 0000000100000000-0000000210000000
>>>> [    0.000000]  0100000000 - 0210000000 page 2M
>>>> [    0.000000] kernel direct mapping tables up to 210000000 @ effb6000-effc0000
>>>> [    0.000000] RAMDISK: 366d0000 - 37360000
>>>> [    0.000000] ACPI: RSDP 00000000000fd160 00014 (v00 DELL  )
>>>> [    0.000000] ACPI: RSDT 00000000000fd174 00038 (v01 DELL   PE850    00000001 MSFT 0100000A)
>>>> [    0.000000] ACPI: FACP 00000000000fd1b8 00074 (v01 DELL   PE850    00000001 MSFT 0100000A)
>>>> [    0.000000] ACPI: DSDT 00000000effc0000 01C19 (v01 DELL   PE830    00000001 MSFT 0100000E)
>>>> [    0.000000] ACPI: FACS 00000000effcfc00 00040
>>>> [    0.000000] ACPI: APIC 00000000000fd22c 00074 (v01 DELL   PE850    00000001 MSFT 0100000A)
>>>> [    0.000000] ACPI: SPCR 00000000000fd2a0 00050 (v01 DELL   PE850    00000001 MSFT 0100000A)
>>>> [    0.000000] ACPI: HPET 00000000000fd2f0 00038 (v01 DELL   PE830    00000001 MSFT 0100000A)
>>>> [    0.000000] ACPI: MCFG 00000000000fd328 0003C (v01 DELL   PE830    00000001 MSFT 0100000A)
>>>> [    0.000000] ACPI: Local APIC address 0xfee00000
>>>> [    0.000000] No NUMA configuration found
>>>> [    0.000000] Faking a node at 0000000000000000-0000000210000000
>>>> [    0.000000] Initmem setup node 0 0000000000000000-0000000210000000
>>>> [    0.000000]   NODE_DATA [00000001ffff9000 - 00000001ffffdfff]
>>>> [    0.000000]  [ffffea0000000000-ffffea00073fffff] PMD ->  [ffff8801f7e00000-ffff8801feffffff] on node 0
>>>> [    0.000000] Zone PFN ranges:
>>>> [    0.000000]   DMA      0x00000010 ->  0x00001000
>>>> [    0.000000]   DMA32    0x00001000 ->  0x00100000
>>>> [    0.000000]   Normal   0x00100000 ->  0x00210000
>>>> [    0.000000] Movable zone start PFN for each node
>>>> [    0.000000] early_node_map[4] active PFN ranges
>>>> [    0.000000]     0: 0x00000010 ->  0x000000a0
>>>> [    0.000000]     0: 0x00000100 ->  0x000effc0
>>>> [    0.000000]     0: 0x00100000 ->  0x001ffffe
>>>> [    0.000000]     0: 0x00200000 ->  0x00210000
>>>> [    0.000000] On node 0 totalpages: 2096974
>>>> [    0.000000]   DMA zone: 56 pages used for memmap
>>>> [    0.000000]   DMA zone: 7 pages reserved
>>>> [    0.000000]   DMA zone: 3921 pages, LIFO batch:0
>>>> [    0.000000]   DMA32 zone: 14280 pages used for memmap
>>>> [    0.000000]   DMA32 zone: 964600 pages, LIFO batch:31
>>>> [    0.000000]   Normal zone: 15232 pages used for memmap
>>>> [    0.000000]   Normal zone: 1098878 pages, LIFO batch:31
>>>> [    0.000000] ACPI: PM-Timer IO Port: 0x808
>>>> [    0.000000] ACPI: Local APIC address 0xfee00000
>>>> [    0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
>>>> [    0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
>>>> [    0.000000] ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
>>>> [    0.000000] ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
>>>> [    0.000000] ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
>>>> [    0.000000] IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
>>>> [    0.000000] ACPI: IOAPIC (id[0x03] address[0xfec10000] gsi_base[32])
>>>> [    0.000000] IOAPIC[1]: apic_id 3, version 32, address 0xfec10000, GSI 32-55
>>>> [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
>>>> [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
>>>> [    0.000000] ACPI: IRQ0 used by override.
>>>> [    0.000000] ACPI: IRQ2 used by override.
>>>> [    0.000000] ACPI: IRQ9 used by override.
>>>> [    0.000000] Using ACPI (MADT) for SMP configuration information
>>>> [    0.000000] ACPI: HPET id: 0xffffffff base: 0xfed00000
>>>> [    0.000000] SMP: Allowing 2 CPUs, 0 hotplug CPUs
>>>> [    0.000000] nr_irqs_gsi: 72
>>>> [    0.000000] PM: Registered nosave memory: 00000000000a0000 - 0000000000100000
>>>> [    0.000000] PM: Registered nosave memory: 00000000effc0000 - 00000000effcf000
>>>> [    0.000000] PM: Registered nosave memory: 00000000effcf000 - 00000000effd0000
>>>> [    0.000000] PM: Registered nosave memory: 00000000effd0000 - 00000000effff000
>>>> [    0.000000] PM: Registered nosave memory: 00000000effff000 - 00000000f0000000
>>>> [    0.000000] PM: Registered nosave memory: 00000000f0000000 - 00000000f4000000
>>>> [    0.000000] PM: Registered nosave memory: 00000000f4000000 - 00000000fec00000
>>>> [    0.000000] PM: Registered nosave memory: 00000000fec00000 - 00000000fed00000
>>>> [    0.000000] PM: Registered nosave memory: 00000000fed00000 - 00000000fed13000
>>>> [    0.000000] PM: Registered nosave memory: 00000000fed13000 - 00000000feda0000
>>>> [    0.000000] PM: Registered nosave memory: 00000000feda0000 - 00000000fee00000
>>>> [    0.000000] PM: Registered nosave memory: 00000000fee00000 - 00000000fee10000
>>>> [    0.000000] PM: Registered nosave memory: 00000000fee10000 - 00000000ffb00000
>>>> [    0.000000] PM: Registered nosave memory: 00000000ffb00000 - 0000000100000000
>>>> [    0.000000] PM: Registered nosave memory: 00000001ffffe000 - 0000000200000000
>>>> [    0.000000] Allocating PCI resources starting at f4000000 (gap: f4000000:ac00000)
>>>> [    0.000000] Booting paravirtualized kernel on bare hardware
>>>> [    0.000000] setup_percpu: NR_CPUS:256 nr_cpumask_bits:256 nr_cpu_ids:2 nr_node_ids:1
>>>> [    0.000000] PERCPU: Embedded 28 pages/cpu @ffff8800efc00000 s84416 r8192 d22080 u1048576
>>>> [    0.000000] pcpu-alloc: s84416 r8192 d22080 u1048576 alloc=1*2097152
>>>> [    0.000000] pcpu-alloc: [0] 0 1
>>>> [    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 2067399
>>>> [    0.000000] Policy zone: Normal
>>>> [    0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-2.6.38-10-generic root=UUID=3cd859b8-2605-4a38-8767-a6d1f99d53bd ro debug ignore_loglevel
>>>> [    0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
>>>> [    0.000000] Checking aperture...
>>>> [    0.000000] No AGP bridge found
>>>> [    0.000000] Calgary: detecting Calgary via BIOS EBDA area
>>>> [    0.000000] Calgary: Unable to locate Rio Grande table in EBDA - bailing!
>>>> [    0.000000] Memory: 8178472k/8650752k available (5941k kernel code, 262856k absent, 209424k reserved, 5016k data, 956k init)
>>>> [    0.000000] SLUB: Genslabs=15, HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
>>>> [    0.000000] Hierarchical RCU implementation.
>>>> [    0.000000]    RCU dyntick-idle grace-period acceleration is enabled.
>>>> [    0.000000]    RCU-based detection of stalled CPUs is disabled.
>>>> [    0.000000] NR_IRQS:16640 nr_irqs:512 16
>>>> [    0.000000] Console: colour dummy device 80x25
>>>> [    0.000000] console [tty0] enabled
>>>> [    0.000000] allocated 83886080 bytes of page_cgroup
>>>> [    0.000000] please try 'cgroup_disable=memory' option if you don't want memory cgroups
>>>> [    0.000000] hpet clockevent registered
>>>> [    0.000000] Fast TSC calibration using PIT
>>>> [    0.000000] Detected 3000.094 MHz processor.
>>>> [    0.010004] Calibrating delay loop (skipped), value calculated using timer frequency.. 6000.18 BogoMIPS (lpj=30000940)
>>>> [    0.010017] pid_max: default: 32768 minimum: 301
>>>> [    0.010056] Security Framework initialized
>>>> [    0.010082] AppArmor: AppArmor initialized
>>>> [    0.010088] Yama: becoming mindful.
>>>> [    0.012092] Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes)
>>>> [    0.022482] Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes)
>>>> [    0.024244] Mount-cache hash table entries: 256
>>>> [    0.024453] Initializing cgroup subsys ns
>>>> [    0.024463] ns_cgroup deprecated: consider using the 'clone_children' flag without the ns_cgroup.
>>>> [    0.024472] Initializing cgroup subsys cpuacct
>>>> [    0.024481] Initializing cgroup subsys memory
>>>> [    0.024495] Initializing cgroup subsys devices
>>>> [    0.024501] Initializing cgroup subsys freezer
>>>> [    0.024507] Initializing cgroup subsys net_cls
>>>> [    0.024512] Initializing cgroup subsys blkio
>>>> [    0.024574] CPU: Physical Processor ID: 0
>>>> [    0.024580] CPU: Processor Core ID: 0
>>>> [    0.024586] mce: CPU supports 4 MCE banks
>>>> [    0.024603] CPU0: Thermal monitoring enabled (TM1)
>>>> [    0.024612] using mwait in idle threads.
>>>> [    0.027748] ACPI: Core revision 20110112
>>>> [    0.029308] ftrace: allocating 24323 entries in 96 pages
>>>> [    0.030085] Setting APIC routing to flat
>>>> [    0.030516] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
>>>> [    0.136419] CPU0: Intel(R) Pentium(R) D CPU 3.00GHz stepping 04
>>>> [    0.140000] Performance Events: Netburst events, Netburst P4/Xeon PMU driver.
>>>> [    0.140000] ... version:                0
>>>> [    0.140000] ... bit width:              40
>>>> [    0.140000] ... generic registers:      18
>>>> [    0.140000] ... value mask:             000000ffffffffff
>>>> [    0.140000] ... max period:             0000007fffffffff
>>>> [    0.140000] ... fixed-purpose events:   0
>>>> [    0.140000] ... event mask:             000000000003ffff
>>>> [    0.140000] Booting Node   0, Processors  #1 Ok.
>>>> [    0.300021] Brought up 2 CPUs
>>>> [    0.300030] Total of 2 processors activated (12000.49 BogoMIPS).
>>>> [    0.300847] devtmpfs: initialized
>>>> [    0.302451] print_constraints: dummy:
>>>> [    0.302485] Time:  0:41:31  Date: 12/02/11
>>>> [    0.302546] NET: Registered protocol family 16
>>>> [    0.302672] Trying to unpack rootfs image as initramfs...
>>>> [    0.310474] ACPI: bus type pci registered
>>>> [    0.310570] PCI: MMCONFIG for domain 0000 [bus 00-3f] at [mem 0xf0000000-0xf3ffffff] (base 0xf0000000)
>>>> [    0.310580] PCI: MMCONFIG at [mem 0xf0000000-0xf3ffffff] reserved in E820
>>>> [    0.340577] PCI: Using configuration type 1 for base access
>>>> [    0.342112] bio: create slab<bio-0>  at 0
>>>> [    0.342934] ACPI: EC: Look up EC in DSDT
>>>> [    0.345243] ACPI: Interpreter enabled
>>>> [    0.345252] ACPI: (supports S0 S4 S5)
>>>> [    0.345278] ACPI: Using IOAPIC for interrupt routing
>>>> [    0.349231] ACPI: No dock devices found.
>>>> [    0.349239] HEST: Table not found.
>>>> [    0.349246] PCI: Ignoring host bridge windows from ACPI; if necessary, use "pci=use_crs" and report a bug
>>>> [    0.349794] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
>>>> [    0.350838] pci_root PNP0A03:00: host bridge window [io  0x0000-0x0cf7] (ignored)
>>>> [    0.350848] pci_root PNP0A03:00: host bridge window [io  0x0d00-0xffff] (ignored)
>>>> [    0.350856] pci_root PNP0A03:00: host bridge window [mem 0x000a0000-0x000bffff] (ignored)
>>>> [    0.350864] pci_root PNP0A03:00: host bridge window [mem 0xf0000000-0xfebfffff] (ignored)
>>>> [    0.350884] pci 0000:00:00.0: [8086:2778] type 0 class 0x000600
>>>> [    0.350946] pci 0000:00:01.0: [8086:2779] type 1 class 0x000604
>>>> [    0.350996] pci 0000:00:01.0: PME# supported from D0 D3hot D3cold
>>>> [    0.351005] pci 0000:00:01.0: PME# disabled
>>>> [    0.351066] pci 0000:00:1c.0: [8086:27d0] type 1 class 0x000604
>>>> [    0.351137] pci 0000:00:1c.0: PME# supported from D0 D3hot D3cold
>>>> [    0.351145] pci 0000:00:1c.0: PME# disabled
>>>> [    0.351178] pci 0000:00:1c.4: [8086:27e0] type 1 class 0x000604
>>>> [    0.351248] pci 0000:00:1c.4: PME# supported from D0 D3hot D3cold
>>>> [    0.351256] pci 0000:00:1c.4: PME# disabled
>>>> [    0.351285] pci 0000:00:1c.5: [8086:27e2] type 1 class 0x000604
>>>> [    0.351355] pci 0000:00:1c.5: PME# supported from D0 D3hot D3cold
>>>> [    0.351363] pci 0000:00:1c.5: PME# disabled
>>>> [    0.351391] pci 0000:00:1d.0: [8086:27c8] type 0 class 0x000c03
>>>> [    0.351443] pci 0000:00:1d.0: reg 20: [io  0xbce0-0xbcff]
>>>> [    0.351484] pci 0000:00:1d.1: [8086:27c9] type 0 class 0x000c03
>>>> [    0.351537] pci 0000:00:1d.1: reg 20: [io  0xbcc0-0xbcdf]
>>>> [    0.351577] pci 0000:00:1d.2: [8086:27ca] type 0 class 0x000c03
>>>> [    0.351629] pci 0000:00:1d.2: reg 20: [io  0xbca0-0xbcbf]
>>>> [    0.351680] pci 0000:00:1d.7: [



