[Ocfs2-users] OCFS2 cluster won't come up and stay up

Sunil Mushran sunil.mushran at oracle.com
Thu Dec 1 18:36:07 PST 2011


To analyze this, one needs the logs, and a bugzilla is a good placeholder for them.

On Dec 1, 2011, at 6:05 PM, Tony Rios <tony at tonyrios.com> wrote:

> Sunil,
> Is submitting a bug report the only answer?
> I'm happy to send in this information, but can I take the cluster down entirely and sort of reset it so we can get these servers back online and talking again in the meantime?
> Tony
> 
> On Dec 1, 2011, at 5:05 PM, Sunil Mushran wrote:
> 
>> Node 3 is joining the domain. It is having problems getting the superblock cluster lock.
>> Create a bugzilla on oss.oracle.com and attach /var/log/messages from all nodes.
>> If you have netconsole set up, attach those logs too.
>> 
>> On 12/01/2011 04:55 PM, Tony Rios wrote:
>>> I'm having an issue today where I just can't seem to keep all the servers in the cluster online.
>>> They aren't losing network connectivity and I can ping the iSCSI host just fine and the host is logged in.
>>> 
>>> These are the errors from dmesg when I try to mount the filesystem:
>>> 
>>> root@pedge36:~# dmesg
>>> [    0.000000] Initializing cgroup subsys cpuset
>>> [    0.000000] Initializing cgroup subsys cpu
>>> [    0.000000] Linux version 2.6.38-10-generic (buildd@yellow) (gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu4) ) #46-Ubuntu SMP Tue Jun 28 15:07:17 UTC 2011 (Ubuntu 2.6.38-10.46-generic 2.6.38.7)
>>> [    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-2.6.38-10-generic root=UUID=3cd859b8-2605-4a38-8767-a6d1f99d53bd ro debug ignore_loglevel
>>> [    0.000000] BIOS-provided physical RAM map:
>>> [    0.000000]  BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
>>> [    0.000000]  BIOS-e820: 0000000000100000 - 00000000effc0000 (usable)
>>> [    0.000000]  BIOS-e820: 00000000effc0000 - 00000000effcfc00 (ACPI data)
>>> [    0.000000]  BIOS-e820: 00000000effcfc00 - 00000000effff000 (reserved)
>>> [    0.000000]  BIOS-e820: 00000000f0000000 - 00000000f4000000 (reserved)
>>> [    0.000000]  BIOS-e820: 00000000fec00000 - 00000000fed00400 (reserved)
>>> [    0.000000]  BIOS-e820: 00000000fed13000 - 00000000feda0000 (reserved)
>>> [    0.000000]  BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
>>> [    0.000000]  BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
>>> [    0.000000]  BIOS-e820: 0000000100000000 - 00000001ffffe000 (usable)
>>> [    0.000000]  BIOS-e820: 00000001ffffe000 - 0000000200000000 (reserved)
>>> [    0.000000]  BIOS-e820: 0000000200000000 - 0000000210000000 (usable)
>>> [    0.000000] debug: ignoring loglevel setting.
>>> [    0.000000] NX (Execute Disable) protection: active
>>> [    0.000000] DMI 2.3 present.
>>> [    0.000000] DMI: Dell Computer Corporation PowerEdge 850/0Y8628, BIOS A04 08/22/2006
>>> [    0.000000] e820 update range: 0000000000000000 - 0000000000010000 (usable) ==>  (reserved)
>>> [    0.000000] e820 remove range: 00000000000a0000 - 0000000000100000 (usable)
>>> [    0.000000] No AGP bridge found
>>> [    0.000000] last_pfn = 0x210000 max_arch_pfn = 0x400000000
>>> [    0.000000] MTRR default type: uncachable
>>> [    0.000000] MTRR fixed ranges enabled:
>>> [    0.000000]   00000-9FFFF write-back
>>> [    0.000000]   A0000-BFFFF uncachable
>>> [    0.000000]   C0000-CBFFF write-protect
>>> [    0.000000]   CC000-EBFFF uncachable
>>> [    0.000000]   EC000-FFFFF write-protect
>>> [    0.000000] MTRR variable ranges enabled:
>>> [    0.000000]   0 base 000000000 mask E00000000 write-back
>>> [    0.000000]   1 base 200000000 mask FF0000000 write-back
>>> [    0.000000]   2 base 0F0000000 mask FF0000000 uncachable
>>> [    0.000000]   3 disabled
>>> [    0.000000]   4 disabled
>>> [    0.000000]   5 disabled
>>> [    0.000000]   6 disabled
>>> [    0.000000]   7 disabled
>>> [    0.000000] x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
>>> [    0.000000] e820 update range: 00000000f0000000 - 0000000100000000 (usable) ==>  (reserved)
>>> [    0.000000] last_pfn = 0xeffc0 max_arch_pfn = 0x400000000
>>> [    0.000000] found SMP MP-table at [ffff8800000fe710] fe710
>>> [    0.000000] initial memory mapped : 0 - 20000000
>>> [    0.000000] init_memory_mapping: 0000000000000000-00000000effc0000
>>> [    0.000000]  0000000000 - 00efe00000 page 2M
>>> [    0.000000]  00efe00000 - 00effc0000 page 4k
>>> [    0.000000] kernel direct mapping tables up to effc0000 @ 1fffa000-20000000
>>> [    0.000000] init_memory_mapping: 0000000100000000-0000000210000000
>>> [    0.000000]  0100000000 - 0210000000 page 2M
>>> [    0.000000] kernel direct mapping tables up to 210000000 @ effb6000-effc0000
>>> [    0.000000] RAMDISK: 366d0000 - 37360000
>>> [    0.000000] ACPI: RSDP 00000000000fd160 00014 (v00 DELL  )
>>> [    0.000000] ACPI: RSDT 00000000000fd174 00038 (v01 DELL   PE850    00000001 MSFT 0100000A)
>>> [    0.000000] ACPI: FACP 00000000000fd1b8 00074 (v01 DELL   PE850    00000001 MSFT 0100000A)
>>> [    0.000000] ACPI: DSDT 00000000effc0000 01C19 (v01 DELL   PE830    00000001 MSFT 0100000E)
>>> [    0.000000] ACPI: FACS 00000000effcfc00 00040
>>> [    0.000000] ACPI: APIC 00000000000fd22c 00074 (v01 DELL   PE850    00000001 MSFT 0100000A)
>>> [    0.000000] ACPI: SPCR 00000000000fd2a0 00050 (v01 DELL   PE850    00000001 MSFT 0100000A)
>>> [    0.000000] ACPI: HPET 00000000000fd2f0 00038 (v01 DELL   PE830    00000001 MSFT 0100000A)
>>> [    0.000000] ACPI: MCFG 00000000000fd328 0003C (v01 DELL   PE830    00000001 MSFT 0100000A)
>>> [    0.000000] ACPI: Local APIC address 0xfee00000
>>> [    0.000000] No NUMA configuration found
>>> [    0.000000] Faking a node at 0000000000000000-0000000210000000
>>> [    0.000000] Initmem setup node 0 0000000000000000-0000000210000000
>>> [    0.000000]   NODE_DATA [00000001ffff9000 - 00000001ffffdfff]
>>> [    0.000000]  [ffffea0000000000-ffffea00073fffff] PMD ->  [ffff8801f7e00000-ffff8801feffffff] on node 0
>>> [    0.000000] Zone PFN ranges:
>>> [    0.000000]   DMA      0x00000010 ->  0x00001000
>>> [    0.000000]   DMA32    0x00001000 ->  0x00100000
>>> [    0.000000]   Normal   0x00100000 ->  0x00210000
>>> [    0.000000] Movable zone start PFN for each node
>>> [    0.000000] early_node_map[4] active PFN ranges
>>> [    0.000000]     0: 0x00000010 ->  0x000000a0
>>> [    0.000000]     0: 0x00000100 ->  0x000effc0
>>> [    0.000000]     0: 0x00100000 ->  0x001ffffe
>>> [    0.000000]     0: 0x00200000 ->  0x00210000
>>> [    0.000000] On node 0 totalpages: 2096974
>>> [    0.000000]   DMA zone: 56 pages used for memmap
>>> [    0.000000]   DMA zone: 7 pages reserved
>>> [    0.000000]   DMA zone: 3921 pages, LIFO batch:0
>>> [    0.000000]   DMA32 zone: 14280 pages used for memmap
>>> [    0.000000]   DMA32 zone: 964600 pages, LIFO batch:31
>>> [    0.000000]   Normal zone: 15232 pages used for memmap
>>> [    0.000000]   Normal zone: 1098878 pages, LIFO batch:31
>>> [    0.000000] ACPI: PM-Timer IO Port: 0x808
>>> [    0.000000] ACPI: Local APIC address 0xfee00000
>>> [    0.000000] ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
>>> [    0.000000] ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
>>> [    0.000000] ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
>>> [    0.000000] ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
>>> [    0.000000] ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
>>> [    0.000000] IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
>>> [    0.000000] ACPI: IOAPIC (id[0x03] address[0xfec10000] gsi_base[32])
>>> [    0.000000] IOAPIC[1]: apic_id 3, version 32, address 0xfec10000, GSI 32-55
>>> [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
>>> [    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
>>> [    0.000000] ACPI: IRQ0 used by override.
>>> [    0.000000] ACPI: IRQ2 used by override.
>>> [    0.000000] ACPI: IRQ9 used by override.
>>> [    0.000000] Using ACPI (MADT) for SMP configuration information
>>> [    0.000000] ACPI: HPET id: 0xffffffff base: 0xfed00000
>>> [    0.000000] SMP: Allowing 2 CPUs, 0 hotplug CPUs
>>> [    0.000000] nr_irqs_gsi: 72
>>> [    0.000000] PM: Registered nosave memory: 00000000000a0000 - 0000000000100000
>>> [    0.000000] PM: Registered nosave memory: 00000000effc0000 - 00000000effcf000
>>> [    0.000000] PM: Registered nosave memory: 00000000effcf000 - 00000000effd0000
>>> [    0.000000] PM: Registered nosave memory: 00000000effd0000 - 00000000effff000
>>> [    0.000000] PM: Registered nosave memory: 00000000effff000 - 00000000f0000000
>>> [    0.000000] PM: Registered nosave memory: 00000000f0000000 - 00000000f4000000
>>> [    0.000000] PM: Registered nosave memory: 00000000f4000000 - 00000000fec00000
>>> [    0.000000] PM: Registered nosave memory: 00000000fec00000 - 00000000fed00000
>>> [    0.000000] PM: Registered nosave memory: 00000000fed00000 - 00000000fed13000
>>> [    0.000000] PM: Registered nosave memory: 00000000fed13000 - 00000000feda0000
>>> [    0.000000] PM: Registered nosave memory: 00000000feda0000 - 00000000fee00000
>>> [    0.000000] PM: Registered nosave memory: 00000000fee00000 - 00000000fee10000
>>> [    0.000000] PM: Registered nosave memory: 00000000fee10000 - 00000000ffb00000
>>> [    0.000000] PM: Registered nosave memory: 00000000ffb00000 - 0000000100000000
>>> [    0.000000] PM: Registered nosave memory: 00000001ffffe000 - 0000000200000000
>>> [    0.000000] Allocating PCI resources starting at f4000000 (gap: f4000000:ac00000)
>>> [    0.000000] Booting paravirtualized kernel on bare hardware
>>> [    0.000000] setup_percpu: NR_CPUS:256 nr_cpumask_bits:256 nr_cpu_ids:2 nr_node_ids:1
>>> [    0.000000] PERCPU: Embedded 28 pages/cpu @ffff8800efc00000 s84416 r8192 d22080 u1048576
>>> [    0.000000] pcpu-alloc: s84416 r8192 d22080 u1048576 alloc=1*2097152
>>> [    0.000000] pcpu-alloc: [0] 0 1
>>> [    0.000000] Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 2067399
>>> [    0.000000] Policy zone: Normal
>>> [    0.000000] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-2.6.38-10-generic root=UUID=3cd859b8-2605-4a38-8767-a6d1f99d53bd ro debug ignore_loglevel
>>> [    0.000000] PID hash table entries: 4096 (order: 3, 32768 bytes)
>>> [    0.000000] Checking aperture...
>>> [    0.000000] No AGP bridge found
>>> [    0.000000] Calgary: detecting Calgary via BIOS EBDA area
>>> [    0.000000] Calgary: Unable to locate Rio Grande table in EBDA - bailing!
>>> [    0.000000] Memory: 8178472k/8650752k available (5941k kernel code, 262856k absent, 209424k reserved, 5016k data, 956k init)
>>> [    0.000000] SLUB: Genslabs=15, HWalign=64, Order=0-3, MinObjects=0, CPUs=2, Nodes=1
>>> [    0.000000] Hierarchical RCU implementation.
>>> [    0.000000]    RCU dyntick-idle grace-period acceleration is enabled.
>>> [    0.000000]    RCU-based detection of stalled CPUs is disabled.
>>> [    0.000000] NR_IRQS:16640 nr_irqs:512 16
>>> [    0.000000] Console: colour dummy device 80x25
>>> [    0.000000] console [tty0] enabled
>>> [    0.000000] allocated 83886080 bytes of page_cgroup
>>> [    0.000000] please try 'cgroup_disable=memory' option if you don't want memory cgroups
>>> [    0.000000] hpet clockevent registered
>>> [    0.000000] Fast TSC calibration using PIT
>>> [    0.000000] Detected 3000.094 MHz processor.
>>> [    0.010004] Calibrating delay loop (skipped), value calculated using timer frequency.. 6000.18 BogoMIPS (lpj=30000940)
>>> [    0.010017] pid_max: default: 32768 minimum: 301
>>> [    0.010056] Security Framework initialized
>>> [    0.010082] AppArmor: AppArmor initialized
>>> [    0.010088] Yama: becoming mindful.
>>> [    0.012092] Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes)
>>> [    0.022482] Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes)
>>> [    0.024244] Mount-cache hash table entries: 256
>>> [    0.024453] Initializing cgroup subsys ns
>>> [    0.024463] ns_cgroup deprecated: consider using the 'clone_children' flag without the ns_cgroup.
>>> [    0.024472] Initializing cgroup subsys cpuacct
>>> [    0.024481] Initializing cgroup subsys memory
>>> [    0.024495] Initializing cgroup subsys devices
>>> [    0.024501] Initializing cgroup subsys freezer
>>> [    0.024507] Initializing cgroup subsys net_cls
>>> [    0.024512] Initializing cgroup subsys blkio
>>> [    0.024574] CPU: Physical Processor ID: 0
>>> [    0.024580] CPU: Processor Core ID: 0
>>> [    0.024586] mce: CPU supports 4 MCE banks
>>> [    0.024603] CPU0: Thermal monitoring enabled (TM1)
>>> [    0.024612] using mwait in idle threads.
>>> [    0.027748] ACPI: Core revision 20110112
>>> [    0.029308] ftrace: allocating 24323 entries in 96 pages
>>> [    0.030085] Setting APIC routing to flat
>>> [    0.030516] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
>>> [    0.136419] CPU0: Intel(R) Pentium(R) D CPU 3.00GHz stepping 04
>>> [    0.140000] Performance Events: Netburst events, Netburst P4/Xeon PMU driver.
>>> [    0.140000] ... version:                0
>>> [    0.140000] ... bit width:              40
>>> [    0.140000] ... generic registers:      18
>>> [    0.140000] ... value mask:             000000ffffffffff
>>> [    0.140000] ... max period:             0000007fffffffff
>>> [    0.140000] ... fixed-purpose events:   0
>>> [    0.140000] ... event mask:             000000000003ffff
>>> [    0.140000] Booting Node   0, Processors  #1 Ok.
>>> [    0.300021] Brought up 2 CPUs
>>> [    0.300030] Total of 2 processors activated (12000.49 BogoMIPS).
>>> [    0.300847] devtmpfs: initialized
>>> [    0.302451] print_constraints: dummy:
>>> [    0.302485] Time:  0:41:31  Date: 12/02/11
>>> [    0.302546] NET: Registered protocol family 16
>>> [    0.302672] Trying to unpack rootfs image as initramfs...
>>> [    0.310474] ACPI: bus type pci registered
>>> [    0.310570] PCI: MMCONFIG for domain 0000 [bus 00-3f] at [mem 0xf0000000-0xf3ffffff] (base 0xf0000000)
>>> [    0.310580] PCI: MMCONFIG at [mem 0xf0000000-0xf3ffffff] reserved in E820
>>> [    0.340577] PCI: Using configuration type 1 for base access
>>> [    0.342112] bio: create slab<bio-0>  at 0
>>> [    0.342934] ACPI: EC: Look up EC in DSDT
>>> [    0.345243] ACPI: Interpreter enabled
>>> [    0.345252] ACPI: (supports S0 S4 S5)
>>> [    0.345278] ACPI: Using IOAPIC for interrupt routing
>>> [    0.349231] ACPI: No dock devices found.
>>> [    0.349239] HEST: Table not found.
>>> [    0.349246] PCI: Ignoring host bridge windows from ACPI; if necessary, use "pci=use_crs" and report a bug
>>> [    0.349794] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
>>> [    0.350838] pci_root PNP0A03:00: host bridge window [io  0x0000-0x0cf7] (ignored)
>>> [    0.350848] pci_root PNP0A03:00: host bridge window [io  0x0d00-0xffff] (ignored)
>>> [    0.350856] pci_root PNP0A03:00: host bridge window [mem 0x000a0000-0x000bffff] (ignored)
>>> [    0.350864] pci_root PNP0A03:00: host bridge window [mem 0xf0000000-0xfebfffff] (ignored)
>>> [    0.350884] pci 0000:00:00.0: [8086:2778] type 0 class 0x000600
>>> [    0.350946] pci 0000:00:01.0: [8086:2779] type 1 class 0x000604
>>> [    0.350996] pci 0000:00:01.0: PME# supported from D0 D3hot D3cold
>>> [    0.351005] pci 0000:00:01.0: PME# disabled
>>> [    0.351066] pci 0000:00:1c.0: [8086:27d0] type 1 class 0x000604
>>> [    0.351137] pci 0000:00:1c.0: PME# supported from D0 D3hot D3cold
>>> [    0.351145] pci 0000:00:1c.0: PME# disabled
>>> [    0.351178] pci 0000:00:1c.4: [8086:27e0] type 1 class 0x000604
>>> [    0.351248] pci 0000:00:1c.4: PME# supported from D0 D3hot D3cold
>>> [    0.351256] pci 0000:00:1c.4: PME# disabled
>>> [    0.351285] pci 0000:00:1c.5: [8086:27e2] type 1 class 0x000604
>>> [    0.351355] pci 0000:00:1c.5: PME# supported from D0 D3hot D3cold
>>> [    0.351363] pci 0000:00:1c.5: PME# disabled
>>> [    0.351391] pci 0000:00:1d.0: [8086:27c8] type 0 class 0x000c03
>>> [    0.351443] pci 0000:00:1d.0: reg 20: [io  0xbce0-0xbcff]
>>> [    0.351484] pci 0000:00:1d.1: [8086:27c9] type 0 class 0x000c03
>>> [    0.351537] pci 0000:00:1d.1: reg 20: [io  0xbcc0-0xbcdf]
>>> [    0.351577] pci 0000:00:1d.2: [8086:27ca] type 0 class 0x000c03
>>> [    0.351629] pci 0000:00:1d.2: reg 20: [io  0xbca0-0xbcbf]
>>> [    0.351680] pci 0000:00:1d.7: [


