[Ocfs2-devel] The root cause analysis about buffer read getting starvation

Junxiao Bi junxiao.bi at oracle.com
Sun Dec 20 23:36:58 PST 2015


On 12/21/2015 10:53 AM, Gang He wrote:
> Hello Mark and all,
> 
> 
[ snip ]

>> > 
>> > You are correct here - the change was introduced to solve a deadlock between
>> > page lock and ip_alloc_sem(). Basically, ->readpage is going to be called
>> > with the page lock held and we need to be aware of that.
> Hello guys, my main question is: why did we change the ip_alloc_sem lock/unlock position from SLES10 to SLES11?
> In SLES10, we take ip_alloc_sem before calling generic_file_read() or generic_file_write_nolock() in file.c,
> but in SLES11, we take ip_alloc_sem in ocfs2_readpage() in aops.c. Moreover, the order of taking and releasing the page lock and ip_alloc_sem is NOT consistent between the read and write paths.
> I just want to know the background behind this code evolution. If we kept taking ip_alloc_sem before calling generic_file_aio_read() in SLES11, could the deadlock be avoided?
> Then we would not need to take the lock in a nonblocking way in read_page(), buffer reads would not starve in that case, and the read/write IO behavior would be the same as in SLES10.
Holding locks across generic_file_read() will stop readers and writers
from running in parallel.
For ip_alloc_sem, losing that parallelism is bad, as a reader and a writer
may touch different pages.
For inode_lock, it looks acceptable: readers and writers running in parallel
cause a lock ping-pong issue and keep truncating and flushing the page
cache, which gives bad performance. Of course, the recursive locking issue
needs fixing, or it will be very easy to run into a deadlock.

Thanks,
Junxiao.
> 
> Thanks
> Gang
> 



