[Ocfs2-devel] [PATCH] ocfs2: Avoid livelock in ocfs2_readpage()
Jan Kara
jack at suse.cz
Mon Jun 27 03:47:46 PDT 2011
On Sun 26-06-11 00:26:44, Joel Becker wrote:
> On Thu, Jun 23, 2011 at 10:51:47PM +0200, Jan Kara wrote:
> > When someone writes to an inode, readers accessing the same inode via
> > ocfs2_readpage() just busyloop trying to get ip_alloc_sem because
> > do_generic_file_read() looks up the page again and retries ->readpage()
> > when previous attempt failed with AOP_TRUNCATED_PAGE. When there are enough
> > readers, they can occupy all CPUs and in non-preempt kernel the system is
> > deadlocked because writer holding ip_alloc_sem is never run to release the
> > semaphore. Fix the problem by making reader block on ip_alloc_sem to break
> > the busy loop.
> >
> > Signed-off-by: Jan Kara <jack at suse.cz>
> > ---
> > fs/ocfs2/aops.c | 8 ++++++++
> > 1 files changed, 8 insertions(+), 0 deletions(-)
> >
> > diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
> > index ac97bca..0919e8f 100644
> > --- a/fs/ocfs2/aops.c
> > +++ b/fs/ocfs2/aops.c
> > @@ -290,7 +290,15 @@ static int ocfs2_readpage(struct file *file, struct page *page)
> >  	}
> >  
> >  	if (down_read_trylock(&oi->ip_alloc_sem) == 0) {
> > +		/*
> > +		 * Unlock the page and cycle ip_alloc_sem so that we don't
> > +		 * busyloop waiting for ip_alloc_sem to unlock
> > +		 */
> >  		ret = AOP_TRUNCATED_PAGE;
> > +		unlock_page(page);
> > +		unlock = 0;
> > +		down_read(&oi->ip_alloc_sem);
> > +		up_read(&oi->ip_alloc_sem);
> >  		goto out_inode_unlock;
> >  	}
>
> Question: First, is it safe to drop the page lock here?
> Can all callers of readpage (not just g_f_a_r()) handle that?
We do exactly the same thing in ocfs2_inode_lock_with_page(), so we
definitely don't introduce a new problem. And every caller of readpage() is
*supposed* to handle the AOP_TRUNCATED_PAGE return value (because page reading
can race with truncate), in which case the page must be returned unlocked. So it
should be safe. Admittedly I did not bother to check the call sites.
Honza
--
Jan Kara <jack at suse.cz>
SUSE Labs, CR