[Oracleasm-commits] jlbec commits r355 - branches/oracleasm-1.0/kernel

svn-commits@oss.oracle.com svn-commits at oss.oracle.com
Fri Jul 28 15:11:18 CDT 2006


Author: jlbec
Date: 2006-07-28 15:11:17 -0500 (Fri, 28 Jul 2006)
New Revision: 355

Modified:
   branches/oracleasm-1.0/kernel/oracleasm.c
Log:

ASM error testing uncovered a very tight race affecting the
asm_request->r_status field.  The asm_update_user_ioc() call would pull
the userland status field in, because userspace might have set
ASM_CANCELLED, and it would add the values to r_status:

    get_user(tmp_status, user_ioc->status_asm_ioc);
    r->r_status |= tmp_status

The function would then use the asm_request to update the user_ioc with
all the new information.  Note, however, that this r_status update was
done without any locking (bad Joel!).

Under normal I/O, this is never a problem.  asm_update_user_ioc() is
called in two places - right after a submit and when userspace picks
up a completion.  If userspace is picking up a completion, the I/O is
done as far as the kernel is concerned, and no other thread can be
touching it.  Safe.  Right after a submit, it's pretty much safe to
assume that the I/O will take some time, even a milisecond is enough.
The update to userspace happens before the I/O paths can touch r_status.

However, the testing was explicitly pulling a disk.  So, any I/O
submitted to the block/scsi layers would immediately complete.  This
happens in irq or softirq context, so it happens on another cpu.  And
we race:

    cpu1			cpu2
    ----			----
    submit_bh()
    asm_update_user_ioc()
        get_user(tmp_status)
				asm_finish_io()
				spin_lock_irqsave()
    r_status |= tmp_status	r_status |= ASM_COMPLETED
				spin_unlock_irqrestore()

Without locking, those two updates can happen at the same time, and
there is nothing keeping the CPU caches in sync.  cpu1 never sees the
ASM_COMPLETED set on r_status, and squashes right over it.  Effectively,
the I/O never completes, and hangs around in the driver forever.

The fix is twofold.  First, we lock the r_status update in
asm_update_user_ioc().  That's fine and dandy, but we really need to
have a consistent view of the request as we update userspace.  Since
asm_update_user_ioc() is full of put_user() calls, we can't take a
lock around them.  Instead, under the lock still, we copy off the
request.  We can then drop the lock and use the copy to update
userspace.  This gives us a safe update, and gives userspace a
consistent view of the request.



Modified: branches/oracleasm-1.0/kernel/oracleasm.c
===================================================================
--- branches/oracleasm-1.0/kernel/oracleasm.c	2006-07-18 21:35:54 UTC (rev 354)
+++ branches/oracleasm-1.0/kernel/oracleasm.c	2006-07-28 20:11:17 UTC (rev 355)
@@ -911,10 +911,13 @@
 }
 
 
-static int asm_update_user_ioc(struct asm_request *r)
+static int asm_update_user_ioc(struct asmfs_file_info *afi,
+			       struct asm_request *r)
 {
+	struct asm_request copy;
 	asm_ioc *ioc;
 	u16 tmp_status;
+	unsigned long flags;
 
 	ioc = r->r_ioc;
 	mlog(ML_IOC, "User IOC is 0x%p\n", ioc);
@@ -923,30 +926,43 @@
 	mlog(ML_IOC, "Getting tmp_status\n");
 	if (get_user(tmp_status, &(ioc->status_asm_ioc)))
 		return -EFAULT;
+
+	/*
+	 * We're going to store off a copy of the request so we can
+	 * provide a consistent view to userspace.
+	 *
+	 * And so we can get/put_user() without locking :-)
+	 */
+	spin_lock_irqsave(&afi->f_lock, flags);
 	r->r_status |= tmp_status;
-	mlog(ML_IOC, "Putting r_status (0x%08X)\n", r->r_status);
-	if (put_user(r->r_status, &(ioc->status_asm_ioc)))
+	copy = *r;
+	spin_unlock_irqrestore(&afi->f_lock, flags);
+
+	/* From here on, ONLY TRUST copy */
+
+	mlog(ML_IOC, "Putting r_status (0x%08X)\n", copy.r_status);
+	if (put_user(copy.r_status, &(ioc->status_asm_ioc)))
 		return -EFAULT;
-	if (r->r_status & ASM_ERROR) {
-		mlog(ML_IOC, "Putting r_error (0x%08X)\n", r->r_error);
-		if (put_user(r->r_error, &(ioc->error_asm_ioc)))
+	if (copy.r_status & ASM_ERROR) {
+		mlog(ML_IOC, "Putting r_error (0x%08X)\n", copy.r_error);
+		if (put_user(copy.r_error, &(ioc->error_asm_ioc)))
 			return -EFAULT;
 	}
-	if (r->r_status & ASM_COMPLETED) {
-		if (put_user(r->r_elapsed, &(ioc->elaptime_asm_ioc)))
+	if (copy.r_status & ASM_COMPLETED) {
+		if (put_user(copy.r_elapsed, &(ioc->elaptime_asm_ioc)))
 			return -EFAULT;
 	}
 	mlog(ML_IOC,
 	     "r_status:0x%08X, bitmask:0x%08X, combined:0x%08X\n",
-	     r->r_status,
+	     copy.r_status,
 	     (ASM_SUBMITTED | ASM_COMPLETED | ASM_ERROR),
-	     (r->r_status & (ASM_SUBMITTED | ASM_COMPLETED | ASM_ERROR)));
-	if (r->r_status & ASM_FREE) {
+	     (copy.r_status & (ASM_SUBMITTED | ASM_COMPLETED | ASM_ERROR)));
+	if (copy.r_status & ASM_FREE) {
 		u64 z = 0ULL;
 		if (copy_to_user(&(ioc->reserved_asm_ioc),
 				 &z, sizeof(ioc->reserved_asm_ioc)))
 			return -EFAULT;
-	} else if (r->r_status &
+	} else if (copy.r_status &
 		   (ASM_SUBMITTED | ASM_ERROR)) {
 		u64 key = (u64)(unsigned long)r;
 		mlog(ML_IOC, "Putting key 0x%p on asm_ioc 0x%p\n",
@@ -1468,7 +1484,7 @@
 #endif  /* RED_HAT_LINUX_KERNEL */
 
 out:
-	ret = asm_update_user_ioc(r);
+	ret = asm_update_user_ioc(afi, r);
 
 	mlog_exit(ret);
 	return ret;
@@ -1573,7 +1589,7 @@
 
 	spin_unlock_irq(&afi->f_lock);
 
-	ret = asm_update_user_ioc(r);
+	ret = asm_update_user_ioc(afi, r);
 
 	mlog(ML_REQUEST, "Freeing request 0x%p\n", r);
 	asm_request_free(r);
@@ -1612,7 +1628,7 @@
 
 	*ioc = r->r_ioc;
 	
-	ret = asm_update_user_ioc(r);
+	ret = asm_update_user_ioc(afi, r);
 
 	asm_request_free(r);
 




More information about the Oracleasm-commits mailing list