[Ocfs2-commits] branch, o2net-delay-enotconn, created. ocfs2-1.4.0-207-g4d6018a

svn-commits at oss.oracle.com svn-commits at oss.oracle.com
Wed Oct 6 17:53:49 PDT 2010


This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "The ocfs2 filesystem version 1.4".

The branch, o2net-delay-enotconn has been created
        at  4d6018aa6e45c906d5a963c7fb38f2278993d880 (commit)

- Log -----------------------------------------------------------------
commit 4d6018aa6e45c906d5a963c7fb38f2278993d880
Author: Srinivas Eeda <srinivas.eeda at oracle.com>
Date:   Wed Oct 6 17:09:07 2010 -0700

    o2net: correct keepalive message protocol
    
    Currently a keepalive packet is sent to another node if no message has been
    received from that node for O2NET_KEEPALIVE_DELAY_MS milliseconds. The
    keepalive is not resent until the other node sends a message.
    
    This behavior works because we rely on TCP, which guarantees message
    delivery. However, the intention of this feature was to send a keepalive
    message once per keepalive interval. This patch sends a message for every
    keepalive time interval.
    
    Signed-off-by: Srinivas Eeda <srinivas.eeda at oracle.com>
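
The difference between the two keepalive behaviors described in the commit above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the o2net code: the function name, its parameters, and the idea of arming the timer at time 0 are hypothetical, and `delay_ms` merely stands in for O2NET_KEEPALIVE_DELAY_MS.

```python
def keepalive_send_times(recv_times_ms, delay_ms, end_ms, rearm_after_send):
    """Return the times (in ms) at which keepalives would be sent.

    Hypothetical model, not the kernel implementation:
      recv_times_ms    -- times at which a message arrives from the peer
      delay_ms         -- idle interval before a keepalive is sent
      end_ms           -- end of the simulated window
      rearm_after_send -- False: one keepalive per idle period (old behavior)
                          True:  one keepalive per interval (patched behavior)
    """
    sent = []
    # Every received message re-arms the idle timer; the end of the window
    # is treated as a final event so the last idle period is also covered.
    events = sorted(t for t in recv_times_ms if t < end_ms) + [end_ms]
    armed_at = 0  # assumption: the timer is armed when the connection starts
    for next_event in events:
        fire = armed_at + delay_ms
        while fire < next_event:
            sent.append(fire)
            if not rearm_after_send:
                break  # old behavior: stay silent until the peer speaks
            fire += delay_ms  # patched behavior: re-arm after each send
        armed_at = next_event
    return sent


# One message from the peer at t=1000 ms, then silence until t=10000 ms.
old = keepalive_send_times([1000], 2000, 10000, rearm_after_send=False)
new = keepalive_send_times([1000], 2000, 10000, rearm_after_send=True)
print(old)  # [3000]
print(new)  # [3000, 5000, 7000, 9000]
```

In this model the old behavior sends a single keepalive per idle period, while the patched behavior keeps sending one per interval for as long as the peer stays silent.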

commit 5a074929b7a3c3b23c4837bce0ad14f4b6ef577a
Author: Srinivas Eeda <srinivas.eeda at oracle.com>
Date:   Wed Oct 6 17:09:59 2010 -0700

    o2net: delay enotconn for sends receives till quorum decision
    
    When an ocfs2 network heartbeat times out between two nodes, the o2net layer
    breaks the socket connection and returns -ENOTCONN to processes that are
    trying to send/receive messages to/from the other node. It also queues a
    quorum decision to be made after the disk timeout to resolve split brain.
    
    The fix still queues the quorum decision after the network heartbeat timeout
    but avoids the socket disconnect. The disconnect is delayed until the
    O2HB_NODE_DOWN_CB event, which is triggered on the surviving node after the
    node evictions happen. The surviving node then signals -ENOTCONN to processes
    waiting to send/receive messages to/from the evicted node. If the network
    connection comes back before the eviction, the quorum decision is cancelled
    and messaging resumes.
    
    Signed-off-by: Srinivas Eeda <srinivas.eeda at oracle.com>
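
The delayed -ENOTCONN handling described in the commit above can be pictured as a small state machine. The sketch below is an illustration under assumptions, not the o2net implementation: the `PeerLink` class and its method names are hypothetical, and each method only stands in for the event named in its comment.

```python
import errno


class PeerLink:
    """Hypothetical model of one node's connection to a peer."""

    def __init__(self):
        self.connected = True        # socket is kept open
        self.quorum_pending = False  # a quorum decision is queued
        self.pending = []            # messages waiting on the peer

    def send(self, msg):
        # Senders only see -ENOTCONN once the peer has been evicted.
        if not self.connected:
            return -errno.ENOTCONN
        self.pending.append(msg)
        return 0

    def on_idle_timeout(self):
        # Network heartbeat timeout: queue the quorum decision,
        # but do not tear down the socket.
        if self.connected:
            self.quorum_pending = True

    def on_network_restored(self):
        # The connection came back before eviction: cancel the
        # queued quorum decision and let messaging resume.
        if self.connected:
            self.quorum_pending = False

    def on_node_down(self):
        # Stands in for the O2HB_NODE_DOWN_CB event on the survivor:
        # disconnect now and fail every waiter with -ENOTCONN.
        self.connected = False
        self.quorum_pending = False  # the decision has been made
        failed = [(msg, -errno.ENOTCONN) for msg in self.pending]
        self.pending.clear()
        return failed


link = PeerLink()
link.send("lock request")
link.on_idle_timeout()           # decision queued, socket still open
print(link.send("second"))       # 0: sends are still accepted
print(len(link.on_node_down()))  # 2: both waiters are failed on eviction
```

The point of the model is that a heartbeat timeout alone never produces -ENOTCONN; only the node-down event does, and a restored connection before that event simply clears the queued decision.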

commit 22668a6462b555f5357a31352ee15b92141e3c8a
Author: Srinivas Eeda <srinivas.eeda at oracle.com>
Date:   Wed Oct 6 17:10:39 2010 -0700

    o2net: rollback reconnect on network timeout.
    
    This patch rolls back an earlier fix that tried to re-establish the network
    connection when a network timeout happens. The reconnect recycled sockets,
    which lost messages and in turn caused hangs.
    
    Signed-off-by: Srinivas Eeda <srinivas.eeda at oracle.com>

-----------------------------------------------------------------------


hooks/post-receive
-- 
The ocfs2 filesystem version 1.4


