[Ocfs2-users] heartbeat write timeout

Sunil Mushran Sunil.Mushran at oracle.com
Tue Apr 18 12:59:54 CDT 2006


Can you share that workload?

Diane Petersen wrote:
> I also set elevator=deadline but didn't see any change in fencing
> behavior until I increased O2CB_HEARTBEAT_THRESHOLD to 16 (30 second
> timeout).
>
> The issue we were seeing was fencing at precisely 5:15pm every Saturday,
> but we couldn't trace the problem to any specific event or activity
> occurring at that time. However, we created a test job that was very
> write intensive on the ocfs2 partition and were then able to crash the
> nodes at will every time we ran it. Since changing the THRESHOLD as
> described above, neither node has fenced or crashed. It has now been
> several weeks since we made this change.
>
> Configuration: 2 node RAC cluster, EMC shared storage, Linux x86-64 
> RH4 update 2, OCFS2, 10.2.0.2 database standard edition.
>
> Diane Petersen
> Sr. Oracle DBA
> ServerCare, Inc.
>
> "Weller, Michael" <michael.weller at itz-essen.de> wrote:
>
>     I don't know if I mentioned this to the list: elevator=deadline
>     and raising the THRESHOLD to 14 solved my self-fencing issues.
>
>     (We'll see what happens under a possibly extreme load).
>
>     Michael.
>
>     ---
>
>     Dr. Michael Weller
>
>     ITZ Informationstechnologie GmbH
>     Consulting/Systemengineering
>     Bismarckstrasse 57
>     D-45128 Essen
>
>     Phone Office +49 201 24714 28
>     FAX Office +49 201 24714 33
>     Phone Mobile +49 172 2178078
>     E-Mail mailto:michael.weller at itz-essen.de
>
>     > -----Original Message-----
>     > From: ocfs2-users-bounces at oss.oracle.com
>     > [mailto:ocfs2-users-bounces at oss.oracle.com] On Behalf Of Zunker, Christian
>     > Sent: Tuesday, April 18, 2006 15:21
>     > To: ocfs2-users at oss.oracle.com
>     > Subject: Re: [Ocfs2-users] heartbeat write timeout
>     >
>     > Hi,
>     >
>     > I experienced the same problems. The elevator=deadline parameter
>     > didn't help, but increasing the threshold to 60 did. The threshold
>     > could probably be lower, but I haven't tested that. Another posting
>     > recommends a timeout between 60 and 90 seconds, which would mean a
>     > threshold between 31 and 46.
>     >
>     > I'll test this later.
>     >
>     > Best regards,
>     > Christian
>     >
>     >
>     > -----Original Message-----
>     > From: ocfs2-users-bounces at oss.oracle.com
>     > [mailto:ocfs2-users-bounces at oss.oracle.com] On Behalf Of Weller, Michael
>     > Sent: Sunday, April 2, 2006 14:18
>     > To: Silviu Marin-Caea; ocfs2-users at oss.oracle.com
>     > Subject: Re: [Ocfs2-users] heartbeat write timeout
>     >
>     > Thanks for the hints, I'll try that.
>     >
>     > With regard to the updates, while I generally agree, I can't update
>     > the kernel here, because we'd lose the vendor warranty in that case.
>     > I know this is an odd concept, but that's how it works. We'd even
>     > lose Oracle support, because the kernel update would void HP SAN
>     > support.
>     >
>     > I mentioned SAN failover, which for example does not work with the
>     > current kernel and the current Qlogic driver (nor even with the not
>     > so current HP-checked variant).
>     >
>     > Anyway, I'll try your suggestions on Monday and drop the list a
>     > note if it worked.
>     >
>     > Thanks,
>     > Michael.
>     >
>     > > -----Original Message-----
>     > > From: ocfs2-users-bounces at oss.oracle.com
>     > > [mailto:ocfs2-users-bounces at oss.oracle.com] On Behalf Of Silviu Marin-Caea
>     > > Sent: Sunday, April 2, 2006 08:26
>     > > To: ocfs2-users at oss.oracle.com
>     > > Subject: Re: [Ocfs2-users] heartbeat write timeout
>     > >
>     > > On Saturday 01 April 2006 22:36, Weller, Michael wrote:
>     > >
>     > > > we are bound to SLES9SP3 (and EXACTLY that, nothing less, not a
>     > > > patch more)
>     > >
>     > > Having the latest updates does not hurt; on the contrary, it helps.
>     > > For example, the latest kernel has OCFS2 1.1.8, while the kernel
>     > > from SP3 has 1.1.7. There are a number of bugfixes.
>     > >
>     > > SLES updates really do have a purpose. Apply them after testing on
>     > > a non-production system.
>     > >
>     > > > It locks up immediately. Definitely nothing like a 12s timeout
>     > > > expires.
>     > >
>     > > It just looks immediate; the 12s actually do expire.
>     > >
>     > > > You mention a FAQ regarding some config option which I didn't
>     > > > come across up to now, where can I find it?
>     > >
>     > > /boot/grub/menu.lst
>     > >
>     > > change elevator=cfq to elevator=deadline
>     > >
>     > > http://oss.oracle.com/projects/ocfs2/
>     > > scroll down, look at the red text
>     > >
>     > > > Which options would you recommend to fix the problem or at least
>     > > > make locks much less likely?
>     > >
>     > > You could also increase the timeout:
>     > >
>     > > /etc/sysconfig/o2cb
>     > >
>     > > # O2CB_HEARTBEAT_THRESHOLD: Iterations before a node is considered dead.
>     > > O2CB_HEARTBEAT_THRESHOLD=16
>     > >
>     > >
>     > > _______________________________________________
>     > > Ocfs2-users mailing list
>     > > Ocfs2-users at oss.oracle.com
>     > > http://oss.oracle.com/mailman/listinfo/ocfs2-users
>
>
>
>   
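
The threshold and timeout figures quoted above (16 for 30 seconds, 31 to 46 for 60 to 90 seconds, and the 12s mentioned earlier in the thread) all fit timeout = (threshold - 1) * 2 seconds. The sketch below just does that arithmetic. The 2 second interval is an assumption inferred from those figures, not taken from the O2CB documentation, and the function names are hypothetical, so check it against your own O2CB version before relying on it.

```python
# Assumed relation, inferred from the figures quoted in this thread:
#   timeout_seconds = (threshold - 1) * HEARTBEAT_INTERVAL_SECONDS
HEARTBEAT_INTERVAL_SECONDS = 2  # assumption: one heartbeat iteration every 2 s


def threshold_to_timeout(threshold: int) -> int:
    """Return the heartbeat timeout in seconds for an O2CB_HEARTBEAT_THRESHOLD value."""
    if threshold < 2:
        raise ValueError("threshold must be at least 2")
    return (threshold - 1) * HEARTBEAT_INTERVAL_SECONDS


def timeout_to_threshold(timeout_seconds: int) -> int:
    """Return the smallest threshold whose timeout is at least timeout_seconds."""
    if timeout_seconds <= 0:
        raise ValueError("timeout must be positive")
    # Ceiling division, so the resulting timeout never falls short of the request.
    intervals = -(-timeout_seconds // HEARTBEAT_INTERVAL_SECONDS)
    return intervals + 1


if __name__ == "__main__":
    # Threshold values that were mentioned in this thread, plus 7 for the 12s case.
    for threshold in (7, 14, 16, 31, 46, 60):
        seconds = threshold_to_timeout(threshold)
        print(f"O2CB_HEARTBEAT_THRESHOLD={threshold} -> {seconds} s")
```

Under this assumption, the threshold of 14 reported above would be a 26 second timeout and the threshold of 60 would be 118 seconds.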


