<div dir="ltr"><div class="gmail_default" style="font-family:courier new,monospace;font-size:small">Hi Shencanquan / Srini,<br><br></div><div class="gmail_default" style="font-family:courier new,monospace;font-size:small">
Thanks for the comments.<br><br>If I am willing to sacrifice the kernel I/O that is pending, is there a way to do it? What I need is to stop heartbeat when the heartbeat region is not reachable.<br><br>In my case the host also carries other types of filesystems that users rely on, and I cannot justify a host reboot to those users.<br>
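For reference, a hedged sketch of what stopping heartbeat by hand might look like with the stock ocfs2-tools; the ocfs2_hb_ctl flags and the init-script path are assumptions to verify against your installation, and the UUID is just the region from the kern.log messages later in this thread:<br><br>

```shell
# Hedged sketch, not a tested procedure: verify the flags against
# `ocfs2_hb_ctl --help` on your build. The UUID is the heartbeat
# region quoted from kern.log below.
UUID=A4E98618A3744717A65AF04E943D035A

# Build the commands first so they can be reviewed before running.
KILL_HB="ocfs2_hb_ctl -K -u $UUID"     # kill heartbeat for this region
STOP_O2CB="/etc/init.d/o2cb offline"   # then take the cluster stack offline

echo "$KILL_HB"
echo "$STOP_O2CB"
```
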
<br></div><div class="gmail_default" style="font-family:courier new,monospace;font-size:small">Thanks,<br><br>Vineeth<br></div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Sun, Jun 2, 2013 at 3:19 AM, shencanquan <span dir="ltr"><<a href="mailto:shencanquan@huawei.com" target="_blank">shencanquan@huawei.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div bgcolor="#FFFFFF" text="#000000"><div class="im">
On 2013/6/1 1:09, Srinivas Eeda wrote:
<blockquote type="cite">
<div>The reason nodes are fenced during
network failures is because we need to guarantee that no i/o's
are going to happen from this fenced node. If you just change
the fs to read-only we still cannot guarantee that there are no
inflight-io's from this node from previous writes.<br>
<br>
</div>
</blockquote></div>
I agree.<br>
Setting ocfs2 to read-only only prevents I/O from user-space
applications; writes already in the kernel (for example, dirty pages
in the page cache or writes currently in flight) may still go out to the SAN.<br>
<br>
The best way is to use a SCSI-3 Persistent Group Reservation to fence
the node.<div><div class="h5"><br>
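A hedged illustration of what that kind of fencing looks like with sg_persist from sg3_utils; the reservation keys and the device are placeholders, and a production cluster would use a fence agent (e.g. fence_scsi) rather than raw commands:<br><br>

```shell
# Hedged sketch: device and reservation keys are placeholders.
DEV=/dev/sdX
MY_KEY=0x1    # surviving node's registered key
BAD_KEY=0x2   # key of the node being fenced

# PREEMPT AND ABORT removes the victim's registration and aborts its
# outstanding tasks, so no in-flight write from it can land on the LUN.
# prout-type 5 = Write Exclusive, Registrants Only.
FENCE="sg_persist --out --preempt-abort --param-rk=$MY_KEY --param-sark=$BAD_KEY --prout-type=5 $DEV"

echo "$FENCE"
```
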
<br>
<blockquote type="cite">
<div> <br>
On 05/31/2013 08:33 AM, Vineeth Thampi wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div class="gmail_default" style="font-family:courier new,monospace;font-size:small">Hi,<br>
<br>
</div>
<div class="gmail_default" style="font-family:courier new,monospace;font-size:small">I have been working around
the issue of node fencing in case of a heartbeat failure /
network timeout. I modified o2quo_fence_self() in quorum.c
to make all ocfs2 filesystems read-only (RO). When tested it worked like
a charm and the filesystems were made RO, but I am not able
to umount the filesystems or stop the O2CB service.<br>
<br>
</div>
<div class="gmail_default" style="font-family:courier new,monospace;font-size:small">Is there any way by which I
could ask O2CB to abort heartbeat and treat the filesystem
as LOCAL instead of GLOBAL? <br>
<br>
</div>
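On the LOCAL-versus-GLOBAL point: as far as I know there is no way to flip a mounted volume, but once a volume is unmounted, tunefs.ocfs2 can rewrite its mount type so it no longer needs O2CB at all. A hedged sketch (the device is a placeholder):<br><br>

```shell
# Hedged sketch: run only on an *unmounted* volume; /dev/sdX is a
# placeholder for the real ocfs2 device.
DEV=/dev/sdX
TO_LOCAL="tunefs.ocfs2 -M local $DEV"   # convert to a local (non-cluster) mount

echo "$TO_LOCAL"
```
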
<div class="gmail_default" style="font-family:courier new,monospace;font-size:small">The following is the code
change that I made.<br>
<br>
**************************************************<br>
static void make_fs_RO(struct super_block *sb, void *arg)<br>
{<br>
struct ocfs2_super *osb = OCFS2_SB(sb);<br>
<br>
sb->s_flags |= MS_RDONLY;<br>
ocfs2_set_osb_flag(osb, OCFS2_OSB_ERROR_FS);<br>
ocfs2_set_ro_flag(osb, *(int *)arg);<br>
}<br>
<br>
/* this is horribly heavy-handed. It should instead flip the file<br>
* system RO and call some userspace script. */<br>
static void o2quo_fence_self(void)<br>
{<br>
<br>
<b>...</b><br>
<br>
case O2NM_FENCE_RESET:<br>
printk(KERN_ERR "*** Hard failure in O2CB, all ocfs2 "<br>
"filesystems made RO ***\n");<br>
<br>
/* Iterate through all ocfs2 super blocks and make each of them RO */<br>
fs_type = get_fs_type("ocfs2");<br>
if (fs_type)<br>
iterate_supers_type(fs_type, make_fs_RO, &hard_reset);<br>
<br>
break;<br>
<b>...</b><br>
<br>
}<br>
***************************************************************<br>
<br>
<br>
</div>
<div class="gmail_default" style="font-family:courier new,monospace;font-size:small">The error from kern.log:<br>
<br>
=======================================<br>
May 31 16:08:18 localhost kernel: [ 5434.076126]
(kworker/u:2,577,3):dlm_send_remote_convert_request:395
ERROR: Error -107 when sending message 504 (key 0xcfe4a084)
to node 0<br>
May 31 16:08:18 localhost kernel: [ 5434.076178] o2dlm:
Waiting on the death of node 0 in domain
A4E98618A3744717A65AF04E943D035A<br>
=======================================<br>
<br>
</div>
<div class="gmail_default" style="font-family:courier new,monospace;font-size:small"> Any pointers would be much
appreciated.<br>
<br>
</div>
<div class="gmail_default" style="font-family:courier new,monospace;font-size:small">Thanks,<br>
<br>
Vineeth<br>
</div>
</div>
<br>
<fieldset></fieldset>
<br>
<pre>_______________________________________________
Ocfs2-users mailing list
<a href="mailto:Ocfs2-users@oss.oracle.com" target="_blank">Ocfs2-users@oss.oracle.com</a>
<a href="https://oss.oracle.com/mailman/listinfo/ocfs2-users" target="_blank">https://oss.oracle.com/mailman/listinfo/ocfs2-users</a></pre>
</blockquote>
<br>
<br>
</blockquote>
<br>
</div></div></div>
</blockquote></div><br></div>