[Ocfs2-tools-devel] Re: [PATCH 1/2] The io_cache implementation

Joel Becker Joel.Becker at oracle.com
Tue Mar 27 01:57:45 PDT 2007


On Tue, Mar 27, 2007 at 01:43:51AM -0700, Joel Becker wrote:
> On Tue, Mar 27, 2007 at 02:48:45PM +0800, tao.ma wrote:
> > >+		ret = io_cache_read_one_block(channel, blkno, data);
> > >  
> > Should data change accordingly? data + i * channel->io_blksize?
> > The same error happens in io_cache_write_block.
> 
> > >+	memcpy(icb->icb_buf, data, channel->io_blksize);
> > >+	io_cache_seen(ic, icb);
> > >+
> > >+	ret = unix_io_write_block(channel, blkno, 1, icb->icb_buf);
> > >+	return ret;
> > >+}
> > >  
> > I think the "memcpy" should be moved after the success of 
> > "unix_io_write_block" so that we can keep consistent of the buffer and 
> > the real content on the disk.
> 
> 	Hmm, while I liked the idea of "writing out of the cache", you
> make a valid point that we want the cache to reflect the disk.  On the
> other hand, we have no idea what the disk looks like when a write fails
> - it could be an EIO that actually modified the platter.
> 	Perhaps just remove the block from the cache on failure to force
> a real re-read from disk?

	Ok, here's a patch against my committed code that fixes both
problems.  It also consolidates the logic for removing a block from the
lookup table and for stealing a block from the LRU.  What do you think?
	Also, this makes it clear that my test program doesn't test
multiblock reads or writes.  I'll have to add that tomorrow.

Joel

Index: libocfs2/unix_io.c
===================================================================
--- libocfs2/unix_io.c	(revision 1325)
+++ libocfs2/unix_io.c	(working copy)
@@ -212,6 +212,29 @@ static void io_cache_seen(struct io_cach
 	list_add_tail(&icb->icb_list, &ic->ic_lru);
 }
 
+static void io_cache_disconnect(struct io_cache *ic,
+				struct io_cache_block *icb)
+{
+	/*
+	 * This icb should longer be looked up.
+	 * If rb_parent is NULL, it's already disconnected.
+	 */
+	if (icb->icb_node.rb_parent) {
+		rb_erase(&icb->icb_node, &ic->ic_lookup);
+		memset(&icb->icb_node, 0, sizeof(struct rb_node));
+	}
+}
+
+static struct io_cache_block *io_cache_pop_lru(struct io_cache *ic)
+{
+	struct io_cache_block *icb;
+
+	icb = list_entry(ic->ic_lru.next, struct io_cache_block, icb_list);
+	io_cache_disconnect(ic, icb);
+
+	return icb;
+}
+
 static errcode_t io_cache_read_one_block(io_channel *channel, int64_t blkno,
 					 char *data)
 {
@@ -223,14 +246,8 @@ static errcode_t io_cache_read_one_block
 	if (icb)
 		goto found;
 
-	/*
-	 * Ok, this blkno isn't in the cache.  Steal something.
-	 */
-	icb = list_entry(ic->ic_lru.next, struct io_cache_block, icb_list);
-	if (icb->icb_node.rb_parent) {
-		rb_erase(&icb->icb_node, &ic->ic_lookup);
-		memset(&icb->icb_node, 0, sizeof(struct rb_node));
-	}
+	/* Ok, this blkno isn't in the cache.  Steal something. */
+	icb = io_cache_pop_lru(ic);
 
 	/*
 	 * If the read fails, we leave the block at the end of the LRU
@@ -258,7 +275,7 @@ static errcode_t io_cache_read_block(io_
 	int i;
 	errcode_t ret = 0;
 
-	for (i = 0; i < count; i++, blkno++) {
+	for (i = 0; i < count; i++, blkno++, data += channel->io_blksize) {
 		ret = io_cache_read_one_block(channel, blkno, data);
 		if (ret)
 			break;
@@ -278,14 +295,8 @@ static errcode_t io_cache_write_one_bloc
 	if (icb)
 		goto found;
 
-	/*
-	 * Ok, this blkno isn't in the cache.  Steal something.
-	 */
-	icb = list_entry(ic->ic_lru.next, struct io_cache_block, icb_list);
-	if (icb->icb_node.rb_parent) {
-		rb_erase(&icb->icb_node, &ic->ic_lookup);
-		memset(&icb->icb_node, 0, sizeof(struct rb_node));
-	}
+	/* Ok, this blkno isn't in the cache.  Steal something. */
+	icb = io_cache_pop_lru(ic);
 
 	icb->icb_blkno = blkno;
 	io_cache_insert(ic, icb);
@@ -295,6 +306,9 @@ found:
 	io_cache_seen(ic, icb);
 
 	ret = unix_io_write_block(channel, blkno, 1, icb->icb_buf);
+	if (ret)
+		io_cache_disconnect(ic, icb);
+
 	return ret;
 }
 
@@ -305,7 +319,7 @@ static errcode_t io_cache_write_block(io
 	int i;
 	errcode_t ret = 0;
 
-	for (i = 0; i < count; i++, blkno++) {
+	for (i = 0; i < count; i++, blkno++, data += channel->io_blksize) {
 		ret = io_cache_write_one_block(channel, blkno, data);
 		if (ret)
 			break;
-- 

To spot the expert, pick the one who predicts the job will take the
longest and cost the most.

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127



More information about the Ocfs2-tools-devel mailing list