[Ocfs2-users] o2net: connect to node has been idle for 10 secs

Mon Aug 7 11:38:59 PDT 2006

Hi,

    Thanks for the responses. Before anyone thinks it's a network
problem, we're running with active/active bonding on a pair of
dedicated gigabit switches, with flow control on. Neither the Linux
systems nor the network switches register any changes or faults
on the network. OCFS2 runs on one pair of interfaces, on one class
C network, and RAC runs on another pair of interfaces, on a different
class C network. RAC does not have a problem.

    We can also ping across the ocfs2 heartbeat LAN without
interruption or dropped packets while the heartbeat fails.

    As this system is production, we've backed off using ocfs2 
and have gone with NFS in the interim. I'm still committed to using
ocfs2, but we need a stable production platform. I'll be seeing if
I can replicate the issue on a testbed in the next few days.

    I notice from the timer values printed on the console as the
node dies that o2net_advance_rx is called without triggering the
code path that includes either o2net_process_message or
o2net_check_handshake, as the tv_timer timestamp is earlier than the
adv_start and adv_end timestamps. So either there is an incomplete
packet, or some sort of erroneous message, that comes in just before
we stop for 10 seconds and then drop dead.
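
    For anyone comparing a similar console line, the intervals can be
worked out with a short script. This is only a minimal sketch. It assumes
that "tmr" is the tv_timer value, "dr" the last data-ready time, "adv"
the start:end of the last o2net_advance_rx call and "func" the start:end
of the last message handler; that reading is inferred from the field
names in the message, not confirmed against the source. The names
parse_idle_times and summarize are made up for illustration and are not
part of ocfs2.

```python
import re
from decimal import Decimal

# Timer dump copied from the o2net_idle_timer console message quoted below.
LOG = ("(tmr 1154545576.798263 now 1154545586.796978 dr 1154545576.798238 "
       "adv 1154545576.798291:1154545576.798293 func (06aac8a1:1) "
       "1154545566.800782:1154545566.800787)")

PATTERN = re.compile(
    r"tmr (?P<tmr>[\d.]+) now (?P<now>[\d.]+) dr (?P<dr>[\d.]+) "
    r"adv (?P<adv_start>[\d.]+):(?P<adv_end>[\d.]+) "
    r"func \((?P<key>[0-9a-f]+):(?P<type>\d+)\) "
    r"(?P<func_start>[\d.]+):(?P<func_end>[\d.]+)"
)


def parse_idle_times(line):
    """Return the timestamps of one timer dump as exact Decimal seconds."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not an o2net idle-timer line")
    # Decimal keeps microsecond precision, which floats lose at this magnitude.
    return {name: Decimal(value)
            for name, value in match.groupdict().items()
            if name not in ("key", "type")}


def summarize(times):
    """Compute the intervals (in seconds) relative to the timer value."""
    return {
        "idle_for": times["now"] - times["tmr"],
        "data_ready_before_timer": times["tmr"] - times["dr"],
        "advance_after_timer": times["adv_start"] - times["tmr"],
        "last_handler_before_timer": times["tmr"] - times["func_end"],
    }


if __name__ == "__main__":
    for name, value in summarize(parse_idle_times(LOG)).items():
        print(f"{name:28s} {value} s")
```

    Under those assumptions, the last o2net_advance_rx started a few
tens of microseconds after the timer value, while the last message
handler finished roughly ten seconds before it.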

      Andy

On Mon, 2006-08-07 at 11:14 -0700, Alexei_Roudnev wrote:
> Btw.
> 
> Network convergence time is BY DEFAULT 30 seconds MINIMUM. It means that
> in no case can I guarantee idle time on the interconnect of less than 40 seconds.
> 
> Read STP Ethernet protocol, if you have questions.
> 
> Our FibreChannel network has, for comparison, a 60 second timeout setting for
> MPIO switchover. NetApp has a 40 second failover time.
> Protocols such as EGRP have 20 - 30 seconds reconvergence time (OSPF is
> faster - in some cases).
> 
> So.
> 
> If OCFSv2 sees idle time > 10 seconds, it has NO RIGHT to reboot. It
> can do anything special, such as go to _recovery mode_, trying to reconnect
> and ensuring that other nodes can lose connection with it, but if it
> self-fences or reboots before trying for 60 - 90 seconds, it violates network
> time frames and does not make any sense.
> 
> Moreover, as I said before - if we have no outstanding IO on OCFS, there is
> not much sense in rebooting even if the system must self-fence - it makes much
> more sense to _reconnect_ to the cluster or make some other attempt to start
> from scratch. Otherwise, any OCFS, even passive (not used), increases system
> instability thousands of times.
> 
> ----- Original Message ----- 
> From: "Andy Phillips" <Andrew.Phillips at betfair.com>
> To: "Sunil Mushran" <Sunil.Mushran at oracle.com>
> Cc: "ocfs2-users" <ocfs2-users at oss.oracle.com>
> Sent: Thursday, August 03, 2006 10:20 AM
> Subject: Re: [Ocfs2-users] o2net: connect to node has been idle for 10 secs
> 
> 
> > Hello,
> >
> >   It's doubly odd then. We'll need to schedule an upgrade to 1.2.3.
> > In the mean time, we've scheduled a cron job that touches a file
> > on each ocfs2 file system every 3 seconds. This should ensure a
> > constant flow of traffic assuming metadata updates travel across
> > the interconnect.
> >
> >   I've noticed that there is one other person who seems to have
> > seen this problem -
> > http://oss.oracle.com/pipermail/ocfs2-users/2006-July/000612.html
> >    but they were on an old version of kernel and fs code. Any idea
> > as to what the underlying cause may be if it's not a dropped packet?
> >
> >   Would you also mind letting me know what those two line changes were,
> > just for my own interest's sake.
> >
> >    Thanks for the quick response.
> >
> >        Andy
> >
> >
> >
> >
> > On Thu, 2006-08-03 at 09:44 -0700, Sunil Mushran wrote:
> > > 1. o2net talks tcp. It should be able to handle this.
> > > 2. If the cluster is active and the nodes are communicating,
> > > the keepalive packet is rarely sent. It only sends the packet
> > > if it does not hear from the other node for 5 secs.
> > > 3. Try the same with 1.2.3. (We made 2 important 1 line fixes.)
> > > 4. If this does happen again, and you are interested, we
> > > could always give you a drop that dumps the stack of
> > > all the procs, to get a better feel for the situation.
> > >
> > > Andy Phillips wrote:
> > > > Hello,
> > > >
> > > >    Apologies for following up on myself.
> > > >
> > > > in ocfs2/cluster/tcp_internal.h
> > > > #define O2NET_KEEPALIVE_DELAY_SECS      5
> > > > #define O2NET_IDLE_TIMEOUT_SECS         10
> > > >
> > > >
> > > > >    Is this really sensible? Potentially, given a small variance in
> > > > > system clocks, the loss of a single keepalive packet (assuming that
> > > > > o2net_sc_send_keep_req is the only thing keeping the connection alive)
> > > > > could cause a node to self-fence and reboot.
> > > >
> > > >    Would
> > > > #define O2NET_KEEPALIVE_DELAY_SECS      5
> > > > #define O2NET_IDLE_TIMEOUT_SECS         20
> > > >
> > > >    Cause any problems?
> > > >
> > > >    Andy
> > > >
> > > >
> > > >
> > > > On Thu, 2006-08-03 at 12:41 +0100, Andy Phillips wrote:
> > > >
> > > >> Hello,
> > > >>
> > > >>    I've a two node 10gR2 rac cluster on a pair of sun opteron boxes.
> > > >> Redhat AS 4.3 2.6.9-34.0.1.ELsmp x86_64. ocfs 1.2.2. RAC is using
> > > >> ASM to talk to the data files, but we have 3 ocfs2 filesystems up
> > > >> to share dba files, and the usual bits and bobs.
> > > >>
> > > > >>    Things were fine until, on a mostly idle system, this happened out
> > > > >> of the blue:
> > > >>
> > > > >> Aug  2 19:06:27 fred kernel: o2net: connection to node barney (num 0) at
> > > > >> 172.16.6.10:7777 has been idle for 10 seconds, shutting it down.
> > > > >> Aug  2 19:06:27 fred kernel: (0,7):o2net_idle_timer:1309 here are some
> > > > >> times that might help debug the situation: (tmr 1154545576.798263 now
> > > > >> 1154545586.796978 dr 1154545576.798238 adv
> > > > >> 1154545576.798291:1154545576.798293 func (06aac8a1:1)
> > > > >> 1154545566.800782:1154545566.800787)
> > > > >> Aug  2 19:06:27 fred kernel: o2net: no longer connected to node barney
> > > > >> (num 0) at 172.16.6.10:7777
> > > >> Aug  2 19:08:33 fred kernel: (25,7):o2quo_make_decision:143 ERROR:
> > > >> fencing this node because it is connected to
> > > >> a half-quorum of 1 out of 2 nodes which doesn't include the lowest
> > > >> active node 0
> > > >> Aug  2 19:08:33 fred kernel: (25,7):o2hb_stop_all_regions:1908 ERROR:
> > > >> stopping heartbeat on all active regions.
> > > >>
> > > >>    And the node then halted.
> > > >>
> > > >>    Barney is node 0. The systems were idle. We've hammered the ocfs2
> > > >> file systems, and set o2cb_heartbeat_threshold to 61. All is good and
> > > >> stable under heavy i/o.
> > > >>
> > > >>    The interconnect is a bonded interface, with two gig cards, each
> > > >> connected (with flow control on) to two separate FESX424 switches.
> > > > >> The switches don't register any problems at this time, nor does linux
> > > >> register any interface issues.
> > > >>
> > > > >>    I'm looking at the source code at the moment, but nothing is leaping
> > > > >> out at me. Any ideas? Do the timer debug lines above mean anything to
> > > > >> anyone?
> > > >>
> > > >>   Thanks
> > > >>    Andy
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>
> > > >>
> > >
> > > ________________________________________________________________________
> > > In order to protect our email recipients, Betfair use SkyScan from
> > > MessageLabs to scan all Incoming and Outgoing mail for viruses.
> > >
> > > ________________________________________________________________________
> > -- 
> > Andy Phillips, FRAS
> > Systems Architect, Information Systems.
> > Betfair.com
> >
> > Direct Line: 0208 834 8436
> >
> > Betfair Limited (Company No.5140986), Winslow Road, Hammersmith
> > Embankment, London W6 9HP, United Kingdom, +44 208 834 8000, +44 208 834
> > 8501 (direct). The information in this e-mail and any attachment is
> > confidential, may contain legal advice protected by privilege and is
> > intended only for the named recipient(s). The e-mail may not be
> > disclosed or used by any person other than the addressee, nor may it be
> > copied in any way. If you are not a named recipient please notify the
> > sender immediately and delete any copies of this message. Any
> > unauthorized copying, disclosure or distribution of the material in this
> > e-mail is strictly forbidden. Any view or opinions presented are solely
> > those of the author and do not necessarily represent those of the
> > company.
> >
> > _______________________________________________
> > Ocfs2-users mailing list
> > Ocfs2-users at oss.oracle.com
> > http://oss.oracle.com/mailman/listinfo/ocfs2-users
> >
> 
> 
-- 
Andy Phillips, FRAS
Systems Architect, Information Systems.
Betfair.com   

Direct Line: 0208 834 8436
