[Ocfs2-devel]
Bug 48 "[kernel 2.6 porting] System halt during reboot after mount
an OCFS volume." in bugzilla is fixed.
Sonic Zhang
sonic.zhang at intel.com
Wed Mar 24 15:37:50 CST 2004
Hi all,
I have root-caused and fixed bug 48 "[kernel 2.6 porting] System halt
during reboot after mount an OCFS volume.".
In the current OCFS v2 driver, ocfs_volume_thread, ocfs_recv_thread and
ocfs_commit_thread are assumed to be terminated by the ocfs_dismount_volume
routine. But when the system reboots, all processes and kernel threads
receive SIGTERM before the ocfs_dismount_volume routine is called.
These kernel threads don't exit correctly. For example, they don't know
that they should leave their loop after receiving SIGTERM, and they don't
clear their task_struct pointers in ocfs_super to indicate their status.
That's the cause of the system halt in the ocfs_dismount_volume routine
when the system reboots.
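The exit protocol described above can be sketched in user space. This is only
an illustration, not the driver code: struct worker_state, run_worker and
dismount_must_wait are hypothetical names, a plain flag stands in for the
task_struct pointer kept in ocfs_super, and raise() stands in for the SIGTERM
delivered at reboot.

```c
#include <signal.h>

/* Set by the handler; plays the role of OCFS_FLAG_SHUTDOWN_VOL_THREAD. */
static volatile sig_atomic_t shutdown_requested = 0;

static void on_sigterm(int signo)
{
    (void)signo;
    shutdown_requested = 1;
}

/* Hypothetical stand-in for the thread bookkeeping kept in ocfs_super. */
struct worker_state {
    int task_alive;  /* non-zero while the loop runs, like a non-NULL task pointer */
    int iterations;  /* number of loop passes completed */
};

/*
 * Run the worker loop. SIGTERM is raised on pass number `term_after`
 * to mimic the signal sent at reboot. Returns 0 on success, or -1 if
 * the handler could not be installed.
 */
static int run_worker(struct worker_state *w, int term_after)
{
    if (signal(SIGTERM, on_sigterm) == SIG_ERR)
        return -1;

    shutdown_requested = 0;
    w->iterations = 0;
    w->task_alive = 1;

    while (!shutdown_requested) {
        w->iterations++;
        if (w->iterations >= term_after)
            raise(SIGTERM);  /* simulated reboot signal */
    }

    /* Clear the marker on the way out so dismount knows the worker is gone. */
    w->task_alive = 0;
    return 0;
}

/* Dismount only has to signal and wait if the worker is still marked alive. */
static int dismount_must_wait(const struct worker_state *w)
{
    return w->task_alive != 0;
}
```

Calling run_worker(&w, 3) and then dismount_must_wait(&w) shows the intended
outcome: the loop ends once the signal arrives, the marker is cleared, and
the dismount path has nothing left to wait for.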
I attach a patch to fix this bug. Please review.
Thank you
This patch is against svn version 807.
----------------------------------------------------------------
--- ocfs2.old/src/journal.c 2004-03-22 16:02:55.000000000 +0800
+++ ocfs2/src/journal.c 2004-03-22 16:09:57.000000000 +0800
@@ -1034,12 +1034,13 @@
/* The OCFS_JOURNAL_IN_SHUTDOWN will signal to commit_cache to not
* drop the trans_lock (which we want to hold until we
* completely destroy the journal. */
- if (osb->commit && osb->commit->c_task) {
- /* Wait for the commit thread */
- LOG_TRACE_STR ("Waiting for ocfs2commit to exit....");
- send_sig (SIGINT, osb->commit->c_task, 0);
- wait_for_completion(&osb->commit->c_complete);
- osb->commit->c_task = NULL;
+ if (osb->commit) {
+ if(osb->commit->c_task) {
+ /* Wait for the commit thread */
+ LOG_TRACE_STR ("Waiting for ocfs2commit to exit....");
+ send_sig (SIGINT, osb->commit->c_task, 0);
+ wait_for_completion(&osb->commit->c_complete);
+ }
ocfs_free(osb->commit);
}
@@ -1808,7 +1809,7 @@
break;
}
-
+ commit->c_task = NULL;
/* Flush all scheduled tasks */
#if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,0)
--- ocfs2.old/src/nm.c.old 2004-03-23 17:09:29.000000000 +0800
+++ ocfs2/src/nm.c 2004-03-24 10:18:35.000000000 +0800
@@ -118,6 +118,8 @@
OcfsIpcCtxt.recv_sock = NULL;
}
+ OcfsIpcCtxt.task = NULL;
+
/* signal main thread of ipcdlm's exit */
complete (&(OcfsIpcCtxt.complete));
@@ -249,6 +251,8 @@
__u64 cfg_seq_num;
int which, pruned, prune_iters = 0;
struct buffer_head *bh = NULL;
+ int signr;
+ siginfo_t info;
LOG_ENTRY ();
@@ -258,6 +262,7 @@
sprintf (proc, "ocfs2nm-%d", osb->osb_id);
ocfs_daemonize (proc, strlen(proc));
+ allow_signal(SIGTERM);
osb->dlm_task = current;
@@ -437,7 +442,11 @@
osb->hbt = 50 + j;
}
set_current_state (TASK_INTERRUPTIBLE);
- schedule_timeout (osb->hbt - j);
+ if( schedule_timeout (osb->hbt - j) < osb->hbt -j ) {
+ signr = dequeue_signal_lock(current, &current->blocked, &info);
+ if(signr == SIGTERM)
+ OcfsGlobalCtxt.flags |= OCFS_FLAG_SHUTDOWN_VOL_THREAD;
+ }
}
/* Flush all scheduled tasks */
@@ -447,6 +456,8 @@
flush_scheduled_tasks ();
#endif
+ osb->dlm_task = NULL;
+
complete (&(osb->dlm_complete));
eek:
LOG_EXIT_LONG (0);