[Ocfs-devel] Re: URGENT: OCFS2 hang - 32 node cluster POC
Wim Coekaerts
wim.coekaerts at oracle.com
Wed Aug 9 19:24:29 PDT 2006
alt-sysrq-t should still work w/ netdump configured
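
A minimal sketch of how that could be prepared before reproducing the hang, assuming magic SysRq is not already enabled on the nodes. The function names and the two overridable path variables are illustrative only; the /proc paths are the standard kernel interfaces. It has to be run as root while the box is still responsive.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of any OCFS2 tooling.
# The path variables can be overridden so the functions can be dry-run
# against ordinary files instead of the real /proc entries.
SYSRQ_CTL="${SYSRQ_CTL:-/proc/sys/kernel/sysrq}"
SYSRQ_TRIGGER="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"

# Enable the magic SysRq key so Alt-SysRq-T is honoured on the console.
enable_sysrq() {
    echo 1 > "$SYSRQ_CTL"
}

# Request the same task dump as Alt-SysRq-T from a shell.
# Only useful while a shell is still usable, e.g. to confirm that the
# dump actually reaches the kernel log before the hang is reproduced.
dump_tasks() {
    echo t > "$SYSRQ_TRIGGER"
}
```

Calling enable_sysrq before starting the large copy, then pressing Alt-SysRq-T on the console once the machine freezes, should produce the task dump. To keep the setting across reboots, kernel.sysrq = 1 can be set in /etc/sysctl.conf.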
On Thu, Aug 10, 2006 at 12:22:39PM +1000, Colin Laird wrote:
> The problem is that during the hang you can't get on to the box; it's
> completely dead.
>
> Something we have found is that the heartbeat is set to 7; on the test
> cluster, which has worked fine, it is at 61. We are setting this value
> to 61 across the cluster.
>
> Sunil Mushran wrote:
> >Run:
> ># top
> ># vmstat 1
> ># iostat -x /dev/emcpowerb 1
> >
> >The latter two you can save to a file. For top, just monitor cpu usage
> >and see if any process is hogging all of it.
> >
> >Colin Laird wrote:
> >>and the fstab settings:
> >>
> >># This file is edited by fstab-sync - see 'man fstab-sync' for details
> >>/dev/VolGroup00/LogVol01 /             ext3   defaults                       1 1
> >>LABEL=/boot              /boot         ext3   defaults                       1 2
> >>none                     /dev/pts      devpts gid=5,mode=620                 0 0
> >>none                     /dev/shm      tmpfs  defaults                       0 0
> >>/dev/VolGroup00/LogVol02 /home         ext3   defaults                       1 2
> >>none                     /proc         proc   defaults                       0 0
> >>none                     /sys          sysfs  defaults                       0 0
> >>/dev/VolGroup00/LogVol00 swap          swap   defaults                       0 0
> >>/dev/emcpowerb           /ocfs2        ocfs2  _netdev                        0 0
> >>/dev/hda                 /media/cdrom  auto   pamconsole,exec,noauto,managed 0 0
> >>/dev/fd0                 /media/floppy auto   pamconsole,exec,noauto,managed 0 0
> >>
> >>We are not storing the voting disk and cluster reg for RAC in here.
> >>
> >>Thanks
> >>
> >>
> >>Colin Laird wrote:
> >>>Hi,
> >>>
> >>>We are in the middle of a very large bid (Centrelink, Australia)
> >>>with time at a premium, so PLEASE HELP. We have been experiencing
> >>>machine hangs whenever we do large copies (5-18G) into OCFS2,
> >>>either from ftp or from local disk. The whole machine just freezes
> >>>and we need to turn it off and on. We now cannot get the data for
> >>>the POC available across the nodes!
> >>>
> >>>The setup is:
> >>>
> >>>32 clustered Dell 6850 nodes running RHEL4 U3 - Linux
> >>>c2.au.oracle.com 2.6.9-34.ELsmp #1 SMP Fri Feb 24 16:56:28 EST 2006
> >>>x86_64 x86_64 x86_64 GNU/Linux
> >>>
> >>>We have the following ocfs2 packages installed:
> >>>ocfs2-2.6.9-34.ELsmp-1.2.3-1
> >>>ocfs2-2.6.9-34.EL-1.2.3-1
> >>>ocfs2-tools-debuginfo-1.2.1-1
> >>>ocfs2-2.6.9-34.ELlargesmp-1.2.3-1
> >>>ocfs2console-1.2.1-1
> >>>ocfs2-tools-1.2.1-1
> >>>
> >>>We have elevator=deadline set as per instructions too.
> >>>
> >>>We are currently looking for a log to see if we can find anything.
> >>>The system and ftp logs show nothing.
> >>>
> >>>Can anyone provide any pointers? Have we missed applying anything?
> >>>
> >>>Thanks,
> >>>
> >>>--
> >>>Colin Laird
> >>>Principal Solutions Consultant
> >>>
> >>>Oracle New Zealand Ltd
> >>>Level 10
> >>>Todd Building
> >>>93-97 Customhouse Quay
> >>>Wellington
> >>>New Zealand
> >>>
> >>>main: +64 4 978 5400
> >>>ddi: +64 4 978 5423
> >>>mob: +64 21 617 025
> >>>fax: +64 4 978 5401
> >>
>