[Ocfs2-users] re: how should ocfs2 react to nic hardware issue

Alexei_Roudnev Alexei_Roudnev at exigengroup.com
Thu Nov 30 16:44:30 PST 2006


These guys never designed clusters, as far as I can see. There are a few _BASIC_
rules for making a cluster reliable:
- A cluster cannot rely on a single heartbeat. In our case it means _you must
be able to configure 3 - 4 different heartbeat channels_.
- A cluster must test/verify an external object to determine whether or not it is
still connected to the network (a rough sketch follows below). It means that node1
and node2 should both test the connection to the default router/gateway and
self-fence (using the shared disk to inform the other party about the decision)
if they lose that connection.
Otherwise we have what we have - my experiments show that both OCFSv2 and
Oracle RAC cannot work reliably in 2 node configurations, and both are
very unstable in any configuration if they experience a network failure (such
as a network switch reboot).
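
For illustration, a very rough sketch of the kind of external check I mean
(Linux; the gateway address and the fence action are placeholders only - a
real agent must first record its decision on the shared disk and coordinate
with the other node before doing anything drastic):

    #!/bin/sh
    # Crude connectivity watchdog: ping the default gateway and
    # self-fence if it stays unreachable (illustrative only).
    GATEWAY=192.168.1.1              # placeholder - your real default gateway
    FAILURES=0
    while true; do
        if ping -c 1 -w 2 $GATEWAY >/dev/null 2>&1; then
            FAILURES=0
        else
            FAILURES=`expr $FAILURES + 1`
        fi
        if [ $FAILURES -ge 3 ]; then
            # record the decision on the shared disk here, then self-fence
            echo b > /proc/sysrq-trigger    # immediate reboot (needs kernel.sysrq=1)
        fi
        sleep 5
    done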

If you open any _PRODUCTION GRADE_ cluster _such as VCS or even heartbeat2_
you will see all these measures well implemented, and
they really work.

I have seen funny situations. If the network fails, then node1 in RAC decides to
fence and reboot, while node2 in OCFSv2 decides the same (nobody guarantees that
they will both pick the same master host). And if we add a heartbeat cluster
and forget to use a SERIAL connection as an additional heartbeat channel, then we
have one more party which runs around and kills its neighbours (via STONITH).



----- Original Message ----- 
From: "SCOTT, Gavin" <gavin.l.scott at baesystems.com>
To: <ocfs2-users at oss.oracle.com>
Sent: Thursday, November 30, 2006 3:02 PM
Subject: RE: [Ocfs2-users] re: how should ocfs2 react to nic hardware issue


I've recently gone through this, however I addressed it as a RAC/CRS
issue, as it didn't really relate to OCFS2. But at the risk of being
slightly off topic, I'll cover it briefly here anyway.

In a 2 node RAC cluster, a failure of the interconnect NIC on node 2
causes node 2 to get evicted from the cluster and it reboots. This is
all fine.

I would have thought that a hardware failure of the interconnect NIC on
node 1 would cause node 1 to evict and fence, leaving node 2 to take
over. However this is not the case. Node 2 still evicts and fences,
leaving the faulty node 1 to continue. After some back and forth to
Oracle, they eventually confirmed that this is the correct behaviour,
and that down time on the cluster is necessary to return to full
functionality. Not ideal.

The next problem is that the instance on Node 1 would crash about 5
minutes after NIC failure, reporting at the database level that it could
not locate the NIC (this is on RHEL 4). According to the RAC/CRS people,
I now have to raise this as a TAR with the RDBMS team, however I have
not done this yet.

As for users hanging when the NIC fails, this could be attributable to the
TCP/IP settings that determine when a connection is deemed to have failed.
You need to tune the IP settings on both the server and client side to
ensure that the failure is detected in a timely fashion. Depending on the
OS, there are various settings (on VMS: tcp_keepalive, tcp_keepcnt,
tcp_keepintvl, tcp_keepinit), and with the default values a failover in
this situation will not happen for several hours.
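
On Linux (e.g. RHEL 4) the rough equivalents are the net.ipv4.tcp_keepalive_*
sysctls; something along these lines should bring detection down from hours to
a couple of minutes, but treat the values as examples and test them against
your own network (they only affect sockets that enable SO_KEEPALIVE):

    # Illustrative keepalive tuning (defaults shown in comments)
    sysctl -w net.ipv4.tcp_keepalive_time=60     # idle seconds before first probe (default 7200)
    sysctl -w net.ipv4.tcp_keepalive_intvl=10    # seconds between probes (default 75)
    sysctl -w net.ipv4.tcp_keepalive_probes=6    # failed probes before the connection is dropped (default 9)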

Note that you will not see these delays on an instance or node failure;
that is where the CRS MISSCOUNT parameter, which determines when to evict
a node from the cluster, comes into play.
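
If you want to check what MISSCOUNT is actually set to on 10gR2, something
like the following should work (assuming $ORA_CRS_HOME points at your CRS
installation; I believe the Linux default is 60 seconds, but verify on
Metalink):

    # Query the current CSS misscount value (10gR2 CRS)
    $ORA_CRS_HOME/bin/crsctl get css misscount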

Note also that tuning IP settings will affect the entire network and
that this needs to be balanced to ensure short network glitches don't
trigger a failover.

There is some good info on Metalink on this, but as Peter implied, this
looks more like a CRS and TCP/IP issue than an OCFS2 one.

This info is a quick rehash from some time ago, so sorry if it's got
some holes. Feel free to contact me offline, as I did go through quite a
lot of digging to resolve this.

Regards,
Gavin

-----Original Message-----
From: ocfs2-users-bounces at oss.oracle.com
[mailto:ocfs2-users-bounces at oss.oracle.com] On Behalf Of Adam Kenger
Sent: Friday, 1 December 2006 07:34
To: Peter Santos
Cc: ocfs2-users at oss.oracle.com
Subject: Re: [Ocfs2-users] re: how should ocfs2 react to nic hardware
issue

Peter - depending on how you have your RAC cluster set up, the hang on
the front end is not that unexpected.  It depends on how the user was
connected and how the TAF policy was set up.  Eth0 is your public
interface, I assume.  Was the user connecting to the IP on that
interface or to the VIP set up by RAC?  When you down eth0, the VIP on
that interface should get pushed over onto one of the other 2 nodes
in the cluster.  If you're connecting to a "service" rather than an actual
"instance", there should be no hang on the front end.  If you're
connected directly to the instance on that node, then you'll
be out of luck if you disconnect that instance.  As an example, this
is what the corresponding tnsnames.ora file looks like:

MYDBSERVICE =
   (DESCRIPTION =
     (ADDRESS_LIST =
       (ADDRESS = (PROTOCOL = TCP)(HOST = node1-vip)(PORT = 1521))
       (ADDRESS = (PROTOCOL = TCP)(HOST = node2-vip)(PORT = 1521))
       (ADDRESS = (PROTOCOL = TCP)(HOST = node3-vip)(PORT = 1521))
     )
     (CONNECT_DATA =
       (SERVICE_NAME = mydbservice.db.mydomain.com)
     )
   )

MYDB1 =
   (DESCRIPTION =
     (ADDRESS = (PROTOCOL = TCP)(HOST = node1-vip)(PORT = 1521))
     (CONNECT_DATA =
       (SERVER = DEDICATED)
       (SERVICE_NAME = mydb.db.mydomain.com)
       (INSTANCE_NAME = mydb1)
     )
   )

If you connect to the service "MYDBSERVICE" you can survive the
failure of any given node; you'd seamlessly fail over onto one of
the other nodes.  If you connect directly to the "MYDB1" instance,
you'll be out of luck if you drop the connection to it.
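
For reference, a TAF-enabled variant of the MYDBSERVICE entry would look
something like this (treat it as a sketch; the RETRIES and DELAY values are
only examples):

MYDBSERVICE =
   (DESCRIPTION =
     (ADDRESS_LIST =
       (LOAD_BALANCE = yes)
       (ADDRESS = (PROTOCOL = TCP)(HOST = node1-vip)(PORT = 1521))
       (ADDRESS = (PROTOCOL = TCP)(HOST = node2-vip)(PORT = 1521))
       (ADDRESS = (PROTOCOL = TCP)(HOST = node3-vip)(PORT = 1521))
     )
     (CONNECT_DATA =
       (SERVICE_NAME = mydbservice.db.mydomain.com)
       (FAILOVER_MODE =
         (TYPE = SELECT)
         (METHOD = BASIC)
         (RETRIES = 20)
         (DELAY = 5)
       )
     )
   )

With TYPE = SELECT, in-flight queries are replayed on a surviving instance
after the failover; with TYPE = SESSION only the session itself is
re-established.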

As far as o2cb goes, you are right I believe.  Eventually, it will be
determined that the node is no longer heartbeating and will either
panic or reboot.
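
If you want to see how long that takes, the knob I believe controls it is
O2CB_HEARTBEAT_THRESHOLD in /etc/sysconfig/o2cb - roughly, a node is declared
dead after (threshold - 1) * 2 seconds of missed disk heartbeats (check the
OCFS2 FAQ for your version). As a sketch:

    # /etc/sysconfig/o2cb (values illustrative)
    O2CB_ENABLED=true
    O2CB_BOOTCLUSTER=ocfs2
    O2CB_HEARTBEAT_THRESHOLD=31   # roughly 60 seconds before a node is fenced

A change only takes effect after restarting the o2cb service on each node.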

For your testing, just be careful you're not confusing the OCFS2
layer with the Oracle CRS/RAC layers.

Comments welcome....

Hope that helps

Adam




On Nov 30, 2006, at 3:30 PM, Peter Santos wrote:

> guys,
> I'm trying to test how my 10gR2 oracle cluster (3 nodes) on SuSE
> reacts to a network card hardware failure.
> I have eth0 and eth1 as my network cards, I took down eth0 (ifdown
> eth0) to see what would happen and
> I didn't get any reaction from the o2cb service. This is probably
> the correct behavior since my
> /etc/ocfs2/cluster.conf uses eth1 as the connection channel?
>
> If I take down eth1 I suspect o2cb will eventually reboot the
> machine right? I'm not using any bonding.
>
> My concern is that when I took down eth0, I had a user logged into
> the instance and everything just "hung" for
> that user, until I manually took down the instance with
> "SRVCTL"... then the user connection failed over to
> a working instance.
>
> Anyway, just trying to get some general knowledge of the behavior
> of o2cb in order to understand
> my testing.
>
> - -peter
>
_______________________________________________
Ocfs2-users mailing list
Ocfs2-users at oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users



