[Ocfs2-devel] Long io response time doubt

Eric Ren zren at suse.com
Wed Nov 25 17:49:35 PST 2015


Hi Joseph,

You've cleared up my confusion! Very good explanation~

Thanks,
Eric

On 11/26/15 09:34, Joseph Qi wrote:
> Hi Eric,
> A convert has two types, upconvert and downconvert. Please also note that
> PR and EX are not compatible.
> Assume the read node has gotten PR first. When the write node then wants EX,
> the read node must downconvert PR to NL. When the read node wants PR
> again, the write node must downconvert EX to PR (the highest
> compatible level), and then the read node can upconvert NL to PR. And so forth.
> So both the read and write nodes will do upconverts and downconverts.
> The code you pasted calls into fs/dlm, which I am not familiar with :(
> I think you can list your questions and send them to cluster-devel.
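
The sequence described above can be sketched as a small toy model. This is only an illustration under stated assumptions: the names (`enum mode`, `struct node`, `acquire()`) are made up for this sketch and are not the fs/dlm or ocfs2 definitions, and only the three levels mentioned in this thread (NL, PR, EX) are modeled.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock modes (hypothetical names): only the levels discussed here. */
enum mode { MODE_NL, MODE_PR, MODE_EX };

/* NL is compatible with everything; PR only with NL and PR; EX only with NL. */
static bool modes_compatible(enum mode held, enum mode wanted)
{
	if (held == MODE_NL || wanted == MODE_NL)
		return true;
	return held == MODE_PR && wanted == MODE_PR;
}

/* Highest mode a holder may keep while another node is granted 'wanted'. */
static enum mode highest_compat(enum mode wanted)
{
	switch (wanted) {
	case MODE_EX: return MODE_NL;
	case MODE_PR: return MODE_PR;
	default:      return MODE_EX;
	}
}

/* One cluster node in the toy model, with counters for both convert types. */
struct node {
	enum mode held;
	int upconverts;
	int downconverts;
};

/* 'req' asks for 'wanted'; 'other' downconverts first if it blocks the request. */
static void acquire(struct node *req, struct node *other, enum mode wanted)
{
	if (!modes_compatible(other->held, wanted)) {
		other->held = highest_compat(wanted);
		other->downconverts++;
	}
	assert(modes_compatible(other->held, wanted));
	if (req->held < wanted) {
		req->held = wanted;
		req->upconverts++;
	}
}
```

Starting both nodes at NL and calling `acquire()` for the reader (PR), then the writer (EX), then the reader again (PR) makes each node count at least one upconvert and one downconvert, which is the ping-pong described in the quoted text.
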
>
> Thanks,
> Joseph
>
> On 2015/11/24 18:05, Eric Ren wrote:
>> Sorry, I forgot to add the pieces of code flow...
>>
>> On reading node:
>>
>>   3)  dlm_ast-4278  =>  ocfs2dc-4277
>>   ------------------------------------------
>>
>>   3)               |  ocfs2_process_blocked_lock() {
>>   3)               |    ocfs2_unblock_lock() {
>>   3)   0.116 us    |      ocfs2_prepare_cancel_convert();
>>   3)               |      ocfs2_cancel_convert() {
>>   3)               |        user_dlm_unlock() {
>>   3)               |          dlm_unlock() {
>>   3)   0.120 us    |            dlm_find_lockspace_local();
>>   3)   0.158 us    |            find_lkb();
>>   3)               |            cancel_lock() {
>>   3)               |              validate_unlock_args() {
>>   3)   0.093 us    |                del_timeout();
>>   3)   0.782 us    |              }
>>   3)               |              _cancel_lock() {
>>   3)               |                send_common() {
>>   3)   0.189 us    |                  add_to_waiters();
>>   3)               |                  create_message() {
>>   3)               |                    _create_message() {
>>   3)               |                      dlm_lowcomms_get_buffer() {
>>   3)   0.156 us    |                        nodeid2con();
>>   3)   1.680 us    |                      }
>>   3)   0.108 us    |                      dlm_our_nodeid();
>>   3)   2.821 us    |                    }
>>   3)   3.319 us    |                  }
>>   3)   0.094 us    |                  send_args();
>>   3)               |                  send_message() {
>>   3)   0.070 us    |                    dlm_message_out();
>>   3)   9.485 us    |                    dlm_lowcomms_commit_buffer();
>>   3) + 10.609 us   |                  }
>>   3) + 16.054 us   |                }
>>   3) + 16.632 us   |              }
>>   3)   0.156 us    |              put_rsb();
>>   3) + 19.044 us   |            }
>>   3)               |            dlm_put_lkb() {
>>   3)   0.094 us    |              __put_lkb();
>>   3)   0.632 us    |            }
>>   3)   0.074 us    |            dlm_put_lockspace();
>>   3) + 22.513 us   |          }
>>   3) + 23.028 us   |        }
>>   3) + 23.727 us   |      }
>>   3) + 25.004 us   |    }
>>   3)               |    ocfs2_schedule_blocked_lock() {
>>   3)   0.073 us    |      lockres_set_flags();
>>   3)   0.592 us    |    }
>>   3) + 26.852 us   |  }
>>   ------------------------------------------
>>   3)  ocfs2dc-4277  =>  dlm_ast-4278
>>   ------------------------------------------
>>
>>   3)               |  process_asts() {
>>   3)   0.202 us    |    dlm_rem_lkb_callback();
>>   3)   0.081 us    |    dlm_rem_lkb_callback();
>>   3)               |    fsdlm_lock_ast_wrapper() {
>>   3)               |      ocfs2_unlock_ast() {
>>   3)   0.099 us    |        ocfs2_get_inode_osb();
>>   3)   1.290 us    |        ocfs2_wake_downconvert_thread();
>>   3)               |        lockres_clear_flags() {
>>   3)   8.539 us    |          lockres_set_flags();
>>   3)   9.096 us    |        }
>>   3) + 12.055 us   |      }
>>   3) + 12.673 us   |    }
>>   3)               |    dlm_put_lkb() {
>>   3)   0.161 us    |      __put_lkb();
>>   3)   0.718 us    |    }
>>   3) + 16.133 us   |  }
>>
>>
>> On writing node:
>>
>>   3)  kworker-443   =>  ocfs2dc-4456
>>   ------------------------------------------
>>
>>   3)               |  ocfs2_process_blocked_lock() {
>>   3)               |    ocfs2_unblock_lock() {
>>   3)   0.269 us    |      ocfs2_prepare_cancel_convert();
>>   3)               |      ocfs2_cancel_convert() {
>>   3)               |        user_dlm_unlock() {
>>   3)               |          dlm_unlock() {
>>   3)   0.321 us    |            dlm_find_lockspace_local();
>>   3)   0.286 us    |            find_lkb();
>>   3)               |            cancel_lock() {
>>   3)               |              validate_unlock_args() {
>>   3)   0.122 us    |                del_timeout();
>>   3)   0.901 us    |              }
>>   3)               |              _cancel_lock() {
>>   3)               |                do_cancel() {
>>   3)               |                  revert_lock() {
>>   3)               |                    move_lkb() {
>>   3)   0.155 us    |                      del_lkb();
>>   3)   0.243 us    |                      add_lkb();
>>   3)   1.778 us    |                    }
>>   3)   2.577 us    |                  }
>>   3)               |                  queue_cast() {
>>   3)   0.102 us    |                    del_timeout();
>>   3)               |                    dlm_add_ast() {
>>   3)   0.165 us    |                      dlm_add_lkb_callback();
>>   3) + 14.492 us   |                    }
>>   3) + 16.381 us   |                  }
>>   3) + 20.384 us   |                }
>>   3)               |                grant_pending_locks() {
>>   3)               |                  grant_pending_convert() {
>>   3)               |                    can_be_granted() {
>>   3)   0.143 us    |                      _can_be_granted();
>>   3)   0.906 us    |                    }
>>   3)   1.900 us    |                  }
>>   3)   2.738 us    |                }
>>   3) + 24.670 us   |              }
>>   3)   0.154 us    |              put_rsb();
>>   3) + 28.068 us   |            }
>>   3)               |            dlm_put_lkb() {
>>   3)   0.163 us    |              __put_lkb();
>>   3)   1.029 us    |            }
>>   3)   0.195 us    |            dlm_put_lockspace();
>>   3) + 34.035 us   |          }
>>   3) + 34.914 us   |        }
>>   3) + 35.919 us   |      }
>>   3) + 37.864 us   |    }
>>   3)               |    ocfs2_schedule_blocked_lock() {
>>   3)   0.210 us    |      lockres_set_flags();
>>   0)               |  process_asts() {
>>   3)   0.998 us    |    }
>>   0)   0.215 us    |    dlm_rem_lkb_callback();
>>   3) + 40.671 us   |  }
>>   0)   0.084 us    |    dlm_rem_lkb_callback();
>>   0)               |    fsdlm_lock_ast_wrapper() {
>>   0)               |      ocfs2_unlock_ast() {
>>   0)   0.088 us    |        ocfs2_get_inode_osb();
>>   0)   9.498 us    |        ocfs2_wake_downconvert_thread();
>>   0)               |        lockres_clear_flags() {
>>   0)   1.272 us    |          lockres_set_flags();
>>   0)   1.757 us    |        }
>>   0) + 13.396 us   |      }
>>   0) + 13.983 us   |    }
>>   0)               |    dlm_put_lkb() {
>>   0)   0.136 us    |      __put_lkb();
>>   0)   0.641 us    |    }
>>   0) + 17.224 us   |  }
>>
>>
>> Thanks,
>> Eric
>> On 11/24/15 18:02, Eric Ren wrote:
>>> Hi Joseph,
>>>
>>> I used ftrace's function tracer to record some code flow. One question confuses me:
>>> why is ocfs2_cancel_convert() called here in the ocfs2dc thread? In other words, what do we expect it
>>> to do here?
>>>
>>> ocfs2_unblock_lock() {
>>>       ...
>>>       if (lockres->l_flags & OCFS2_LOCK_BUSY) {
>>>               ...
>>>               ocfs2_cancel_convert();
>>>               ...
>>>       }
>>> }
>>>
>>> From what I understand, ocfs2_cancel_convert()->ocfs2_dlm_unlock()->user_dlm_unlock()->dlm_unlock(DLM_LKF_CANCEL) puts
>>> the lock back on the granted queue at its old grant mode. In my case, as you know, two nodes read/write the same shared file.
>>> I think an up-conversion can only happen on the writing node (PR->EX), while on the reading node no up-conversion is needed, right?
>>>
>>> But the following output from the writing and reading nodes shows that ocfs2_cancel_convert() has been called on both nodes. Why could
>>> this happen in this scenario?
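
To make the cancel behaviour in question concrete, here is a minimal toy sketch of "put the lock back on the granted queue at its old grant mode". All names (`struct toy_lock`, `start_convert()`, `cancel_convert()`, `unblock()`) are hypothetical and exist only for this illustration; they are not the fs/dlm or ocfs2 implementation. One simplification to note: the sketch clears the busy flag directly in the cancel step, whereas in the traces above the flag is cleared later, from ocfs2_unlock_ast().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy modes and queue states (hypothetical names, not the fs/dlm ones). */
enum mode { MODE_IV = -1, MODE_NL, MODE_PR, MODE_EX };
enum queue { Q_GRANTED, Q_CONVERT };

struct toy_lock {
	enum mode granted;    /* mode currently held */
	enum mode requested;  /* mode of the in-flight convert, MODE_IV if none */
	enum queue queue;     /* queue the lock currently sits on */
	bool busy;            /* stands in for OCFS2_LOCK_BUSY */
};

/* Start a convert: the lock moves to the convert queue and becomes busy. */
static void start_convert(struct toy_lock *l, enum mode wanted)
{
	assert(!l->busy);  /* the toy allows one convert in flight at a time */
	l->requested = wanted;
	l->queue = Q_CONVERT;
	l->busy = true;
}

/* Cancel an in-flight convert: back to the granted queue at the old mode.
 * Returns true if there was a convert to cancel. */
static bool cancel_convert(struct toy_lock *l)
{
	if (l->queue != Q_CONVERT)
		return false;
	l->requested = MODE_IV;
	l->queue = Q_GRANTED;
	l->busy = false;
	return true;
}

/* Downconvert-thread step: a busy lock has its pending convert cancelled and
 * must be retried later; an idle lock is downconverted right away. */
static bool unblock(struct toy_lock *l, enum mode downconvert_to)
{
	if (l->busy) {
		cancel_convert(l);
		return false;
	}
	l->granted = downconvert_to;
	return true;
}
```

In this model, a lock held at NL with an NL->PR convert in flight is reverted to NL by the first `unblock()` call, and only a second call performs the downconvert. That would be one way a node that mostly reads could still reach the cancel path, assuming it also issues up-conversions.
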
>>>
>>> On 11/16/15 09:40, Joseph Qi wrote:
>>>>> Sorry, I'm confused about b). You mean b) is also part of ocfs2cmt's
>>>>> work? Does b) have something to do with a)? And what's the meaning of "evict inode"?
>>>>> Actually, I can hardly understand the idea of b).
>>>> You can go through the code flow:
>>>> iput->iput_final->evict->evict_inode->ocfs2_evict_inode
>>>> ->ocfs2_clear_inode->ocfs2_checkpoint_inode->ocfs2_start_checkpoint
>>>>
>>>> It happens when one node no longer uses the inode (but does not
>>>> delete it) and frees its related lockres.
>>> OK, thanks~
>>>
>>> Eric
>
>



