[Ocfs2-users] OCFS2 initiating reboot on production machine.

Srinivas Eeda srinivas.eeda at oracle.com
Fri May 28 09:33:18 PDT 2010


May 24 02:10:49 ewhpbc3bl7 kernel: (26,0):o2hb_write_timeout:172 ERROR: 
Heartbeat write timeout to device sda1 after 60000 milliseconds
May 24 02:10:49 ewhpbc3bl7 kernel: (26,0):o2hb_stop_all_regions:1967 
ERROR: stopping heartbeat on all active regions.

This means a heartbeat write took longer than 60 seconds. You didn't
paste the whole message, so I'm not sure which call took longer. Check
your storage.
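
If it helps when going through /var/log/messages, here is a rough Python sketch that pulls the device and timeout out of the o2hb_write_timeout line and converts the timeout into the matching O2CB_HEARTBEAT_THRESHOLD value. It assumes the usual O2CB relation, timeout = (threshold - 1) * 2 seconds; the function names are made up for illustration and are not part of any OCFS2 tool.

```python
import re

# Matches the o2hb write-timeout message quoted above.
TIMEOUT_RE = re.compile(
    r"o2hb_write_timeout:\d+\s+ERROR:\s+Heartbeat write timeout to device "
    r"(?P<device>\S+) after (?P<ms>\d+) milliseconds"
)


def parse_write_timeout(log_text):
    """Return a list of (device, timeout_ms) for each write timeout found."""
    # The mail client wrapped the log line, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    return [(m.group("device"), int(m.group("ms")))
            for m in TIMEOUT_RE.finditer(flat)]


def threshold_for_timeout(timeout_ms, interval_s=2):
    """Assumed relation: timeout = (threshold - 1) * interval_s seconds."""
    return timeout_ms // (interval_s * 1000) + 1


sample = """May 24 02:10:49 ewhpbc3bl7 kernel: (26,0):o2hb_write_timeout:172 ERROR:
Heartbeat write timeout to device sda1 after 60000 milliseconds"""

for device, ms in parse_write_timeout(sample):
    print(device, ms, threshold_for_timeout(ms))
```

Under that assumption, the 60000 ms in your log corresponds to a threshold of 31, so the node had already waited a full minute for the heartbeat write to complete before it fenced.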

On 5/28/2010 9:21 AM, Devender Narula wrote:
>
>     Hi Team,
>      
>     My OCFS2 1.4.1 installation, running on RHEL 5.4 with 11g Release 1
>     software, is rebooting the machine with the error message below. Can
>     you please suggest what the cause is and how we can fix it?
>      
>     Please help me with this.
>      
>     Regards,
>      
>     Devender
>      
>     ----------------
>     /var/log/messages output.
>      
>     May 24 02:10:49 ewhpbc3bl7 kernel: (26,0):o2hb_write_timeout:172
>     ERROR: Heartbeat write timeout to device sda1 after 60000 milliseconds
>     May 24 02:10:49 ewhpbc3bl7 kernel:
>     (26,0):o2hb_stop_all_regions:1967 ERROR: stopping heartbeat on all
>     active regions.
>     May 24 02:10:49 ewhpbc3bl7 kernel: ocfs2 is very sorry to be
>     fencing this system by restarting
>     May 24 02:13:41 ewhpbc3bl7 syslogd 1.4.1: restart.
>     May 24 02:13:41 ewhpbc3bl7 kernel: klogd 1.4.1, log source =
>     /proc/kmsg started.
>     May 24 02:13:41 ewhpbc3bl7 kernel: Linux version
>     2.6.18-164.11.1.el5 (mockbuild at ls20-bc2-13.build.redhat.com)
>     (gcc version 4.1.2 20080704 (Red Hat 4.1.2-46)) #1 SMP Wed Jan 6
>     13:26:04 EST 2010
>     May 24 02:13:41 ewhpbc3bl7 kernel: Command line: ro root=LABEL=/
>     acpi=off apm=off rhgb quiet
>     May 24 02:13:41 ewhpbc3bl7 kernel: BIOS-provided physical RAM map:
>     May 24 02:13:41 ewhpbc3bl7 kernel:  BIOS-e820: 0000000000010000 -
>     000000000009f400 (usable)
>     May 24 02:13:41 ewhpbc3bl7 kernel:  BIOS-e820: 000000000009f400 -
>     00000000000a0000 (reserved)
>     May 24 02:13:41 ewhpbc3bl7 kernel:  BIOS-e820: 00000000000f0000 -
>     0000000000100000 (reserved)
>     May 24 02:13:41 ewhpbc3bl7 kernel:  BIOS-e820: 0000000000100000 -
>     00000000d762f000 (usable)
>     May 24 02:13:41 ewhpbc3bl7 kernel:  BIOS-e820: 00000000d762f000 -
>     00000000d763c000 (ACPI data)
>     May 24 02:13:41 ewhpbc3bl7 kernel:  BIOS-e820: 00000000d763c000 -
>     00000000d763d000 (usable)
>     May 24 02:13:41 ewhpbc3bl7 kernel:  BIOS-e820: 00000000d763d000 -
>     00000000dc000000 (reserved)
>     May 24 02:13:41 ewhpbc3bl7 kernel:  BIOS-e820: 00000000fec00000 -
>     00000000fee10000 (reserved)
>     May 24 02:13:41 ewhpbc3bl7 kernel:  BIOS-e820: 00000000ff800000 -
>     0000000100000000 (reserved)
>     May 24 02:13:41 ewhpbc3bl7 kernel:  BIOS-e820: 0000000100000000 -
>     0000000227fff000 (usable)
>     May 24 02:13:41 ewhpbc3bl7 kernel: DMI 2.6 present.
>     May 24 02:13:41 ewhpbc3bl7 kernel: No NUMA configuration found
>     May 24 02:13:41 ewhpbc3bl7 kernel: Faking a node at
>     0000000000000000-0000000227fff000
>     May 24 02:13:41 ewhpbc3bl7 kernel: Bootmem setup node 0
>     0000000000000000-0000000227fff000
>     May 24 02:13:41 ewhpbc3bl7 kernel: Memory for crash kernel (0x0 to
>     0x0) notwithin permissible range
>     May 24 02:13:41 ewhpbc3bl7 kernel: disabling kdump
>     May 24 02:13:41 ewhpbc3bl7 kernel: Intel MultiProcessor
>     Specification v1.4
>     May 24 02:13:41 ewhpbc3bl7 kernel:     Virtual Wire compatibility
>     mode.
>     May 24 02:13:41 ewhpbc3bl7 kernel: OEM ID: HP       Product ID:
>     PROLIANT     APIC at: 0xFEE00000
>     May 24 02:13:41 ewhpbc3bl7 kernel: Processor #16 6:10 APIC version 20
>     May 24 02:13:41 ewhpbc3bl7 kernel: Processor #0 6:10 APIC version 20
>     May 24 02:13:41 ewhpbc3bl7 kernel: Processor #2 6:10 APIC version 20
>     May 24 02:13:41 ewhpbc3bl7 kernel: Processor #4 6:10 APIC version 20
>     May 24 02:13:41 ewhpbc3bl7 kernel: Processor #6 6:10 APIC version 20
>     May 24 02:13:41 ewhpbc3bl7 kernel: Processor #18 6:10 APIC version 20
>     May 24 02:13:41 ewhpbc3bl7 kernel: Processor #20 6:10 APIC version 20
>     May 24 02:13:41 ewhpbc3bl7 kernel: Processor #22 6:10 APIC version 20
>     May 24 02:13:41 ewhpbc3bl7 kernel: I/O APIC #8 Version 32 at
>     0xFEC00000.
>     May 24 02:13:41 ewhpbc3bl7 kernel: I/O APIC #0 Version 32 at
>     0xFEC80000.
>     May 24 02:13:41 ewhpbc3bl7 kernel: Setting APIC routing to clustered