<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=gb2312">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
        {font-family:宋体;
        panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:"\@宋体";
        panose-1:2 1 6 0 3 1 1 1 1 1;}
@font-face
        {font-family:Consolas;
        panose-1:2 11 6 9 2 2 4 3 2 4;}
@font-face
        {font-family:"microsoft yahei";
        panose-1:0 0 0 0 0 0 0 0 0 0;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        text-align:justify;
        text-justify:inter-ideograph;
        font-size:10.5pt;
        font-family:"Calibri","sans-serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:#0563C1;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:#954F72;
        text-decoration:underline;}
span.EmailStyle17
        {mso-style-type:personal-compose;
        font-family:"Calibri","sans-serif";
        color:windowtext;}
.MsoChpDefault
        {mso-style-type:export-only;}
/* Page Definitions */
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:72.0pt 90.0pt 72.0pt 90.0pt;}
div.WordSection1
        {page:WordSection1;}
--></style><!--[if gte mso 9]><xml>
<o:shapedefaults v:ext="edit" spidmax="1026" />
</xml><![endif]--><!--[if gte mso 9]><xml>
<o:shapelayout v:ext="edit">
<o:idmap v:ext="edit" data="1" />
</o:shapelayout></xml><![endif]-->
</head>
<body lang="ZH-CN" link="#0563C1" vlink="#954F72" style="text-justify-trim:punctuation">
<div class="WordSection1">
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">Hi everyone,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">I have run into an OCFS2 issue.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">The OS is Oracle Linux 6.5, using the latest Oracle UEK kernel 3.8.13-26.1.1.el6uek.x86_64.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">There are two nodes in the OCFS2 cluster, and both use an iSCSI SAN as shared storage.
<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">The OCFS2 cluster uses global heartbeat mode. There are three iSCSI LUNs: one is used as the heartbeat device, and the other two are formatted as OCFS2 volumes with mkfs.ocfs2 and mounted on each node.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">The problem occurs when I intentionally log out of one iSCSI LUN (an OCFS2 volume) with the command: iscsiadm -m node -T xxx -u<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">After 5 minutes or more, a large number of identical log messages start being written to the syslog (/var/log/messages). The contents are as follows:<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">Feb 26 16:06:44 tony kernel: (kworker/u:0,5141,0):ocfs2_dir_foreach_blk_id:1778 ERROR: Unable to read inode block for dir 520<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">Feb 26 16:06:44 tony kernel: (kworker/u:0,5141,0):ocfs2_dir_foreach_blk_id:1778 ERROR: Unable to read inode block for dir 520<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">Feb 26 16:06:44 tony kernel: (kworker/u:0,5141,0):ocfs2_dir_foreach_blk_id:1778 ERROR: Unable to read inode block for dir 520<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">Feb 26 16:06:44 tony kernel: (kworker/u:0,5141,0):ocfs2_dir_foreach_blk_id:1778 ERROR: Unable to read inode block for dir 520<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">Feb 26 16:06:44 tony kernel: (kworker/u:0,5141,0):ocfs2_dir_foreach_blk_id:1778 ERROR: Unable to read inode block for dir 520<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">Feb 26 16:06:44 tony kernel: (kworker/u:0,5141,0):ocfs2_dir_foreach_blk_id:1778 ERROR: Unable to read inode block for dir 520<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">Feb 26 16:06:44 tony kernel: (kworker/u:0,5141,0):ocfs2_dir_foreach_blk_id:1778 ERROR: Unable to read inode block for dir 520<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">Feb 26 16:06:44 tony kernel: (kworker/u:0,5141,0):ocfs2_dir_foreach_blk_id:1778 ERROR: Unable to read inode block for dir 520<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">Feb 26 16:06:44 tony kernel: (kworker/u:0,5141,0):ocfs2_dir_foreach_blk_id:1778 ERROR: Unable to read inode block for dir 520<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">Feb 26 16:06:44 tony kernel: (kworker/u:0,5141,0):ocfs2_dir_foreach_blk_id:1778 ERROR: Unable to read inode block for dir 520<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">Feb 26 16:06:44 tony kernel: (kworker/u:0,5141,0):ocfs2_dir_foreach_blk_id:1778 ERROR: Unable to read inode block for dir 520<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"><o:p> </o:p></span></p>
<p class="MsoNormal"><i><span lang="EN-US">.............................................................................................<o:p></o:p></span></i></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">The syslog file grows quickly and eventually fills all the remaining space on the / filesystem, which leaves the host blocked and unresponsive.<o:p></o:p></span></p>
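<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">As a rough way to gauge how many of these messages have accumulated, here is a minimal sketch. The function name count_ocfs2_read_errors is made up for illustration, and it assumes the log lives at /var/log/messages as on this host:<o:p></o:p></span></p>

```shell
# Hypothetical helper: count log lines emitted by ocfs2_dir_foreach_blk_id.
# The first argument is the log file; it defaults to /var/log/messages,
# the syslog path mentioned in this report.
count_ocfs2_read_errors() {
    log_file="${1:-/var/log/messages}"
    # -F: match a fixed string, -c: print only the number of matching lines
    grep -F -c 'ocfs2_dir_foreach_blk_id' -- "$log_file"
}
```

<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">Running count_ocfs2_read_errors a few seconds apart should give an idea of how fast the file is growing.<o:p></o:p></span></p>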
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">According to the error logs, the messages are logged by the function ocfs2_dir_foreach_blk_id in the source file fs/ocfs2/dir.c:<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">static int ocfs2_dir_foreach_blk_id(struct inode *inode,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"> u64 *f_version,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"> loff_t *f_pos, void *priv,<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"> filldir_t filldir, int *filldir_err)<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">{<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"> int ret, i, filldir_ret;<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"> unsigned long offset = *f_pos;<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"> struct buffer_head *di_bh = NULL;<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"> struct ocfs2_dinode *di;<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"> struct ocfs2_inline_data *data;<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"> struct ocfs2_dir_entry *de;<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"> ret = ocfs2_read_inode_block(inode, &di_bh);<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"> if (ret) {<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"> mlog(ML_ERROR, "Unable to read inode block for dir %llu\n",<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"> (unsigned long long)OCFS2_I(inode)->ip_blkno);<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"> goto out;<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"> }<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"> di = (struct ocfs2_dinode *)di_bh->b_data;<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"> data = &di->id2.i_data;<o:p></o:p></span></p>
<p class="MsoNormal"><i><span lang="EN-US">.............................................................................................<o:p></o:p></span></i></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">I can disable mlog(ML_ERROR) logging with the command: debugfs.ocfs2 -l ERROR off. However, a kernel worker thread then consumes a large amount of CPU, and it cannot be killed.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">5141 root 20 0 0 0 0 R
<b>97.2</b> 0.0 33:03.89 <b>kworker/u:0</b> <o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">2464 root 20 0 193m 28m 6212 S 1.0 2.8 0:19.48 Xorg
<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">3331 root 20 0 289m 8972 4944 S 0.7 0.9 0:06.58 gnome-terminal
<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">2941 root 20 0 130m 4804 1512 S 0.3 0.5 0:00.29 gconfd-2
<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">2990 root 20 0 299m 7268 5136 S 0.3 0.7 0:03.71 wnck-applet
<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">3056 root 20 0 272m 6572 4092 S 0.3 0.6 0:00.21 notification-da
<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">6073 root 20 0 15088 1196 852 R 0.3 0.1 0:00.36 top <o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">If I umount the OCFS2 volume within 5 minutes of the logout, the problem does not happen, and the volume can be re-mounted successfully. After 5 minutes or more, however, the OCFS2 volume can no longer be unmounted: the umount process hangs. Even if I reconnect the iSCSI LUN, the mount operation also hangs, and the OCFS2 volume cannot be mounted anymore.<o:p></o:p></span></p>
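<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">For anyone trying to reproduce the sequence that still works for me (logout, then umount inside the 5-minute window), here is a minimal dry-run sketch. The function name, the RUNNER variable and the mount point argument are made up for illustration; only the iscsiadm and umount invocations come from what I actually ran:<o:p></o:p></span></p>

```shell
# Dry-run by default: commands are only printed.
# Set RUNNER to an empty string to really execute them (as root).
RUNNER="${RUNNER-echo}"

# Hypothetical helper: log out of the iSCSI target, then unmount the
# OCFS2 volume right away, before the 5-minute window has passed.
logout_then_umount() {
    target="$1"      # iSCSI target name (the "xxx" in this report)
    mountpoint="$2"  # mount point of the OCFS2 volume (assumed)
    $RUNNER iscsiadm -m node -T "$target" -u
    $RUNNER umount "$mountpoint"
}
```

<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">In dry-run mode, calling logout_then_umount with a target name and a mount point just prints the two commands in order, so the sequence can be checked before running it for real.<o:p></o:p></span></p>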
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">This may be a bug in OCFS2. For now I have to reboot the host to recover. Has this issue already been fixed, or is there any other way to avoid it?<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">Thanks a lot!<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"><o:p> </o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas">Tony Zhang<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-family:Consolas"><o:p> </o:p></span></p>
</div>
</body>
</html>