[Ocfs2-users] null pointer dereference
Pawel
pzlist at mp.pl
Fri Aug 24 00:45:04 PDT 2012
On 2012-08-22 18:23, srinivas eeda wrote:
> The crash looks similar to what the patch
> https://oss.oracle.com/pipermail/ocfs2-devel/2012-January/008469.html
> is trying to address. The fix is not yet accepted because, as explained in
> the patch description, we need to fix the master node to skip sending
> a BAST after receiving an unlock message.
>
> Regarding ERROR: status = -17: what storage do you use? It could be due to
> stale data.
The storage is 400G in size.
OCFS2 runs over AoE.
>
> On 8/22/2012 2:25 AM, Pawel wrote:
>> That was done multiple times.
>> Moreover, the filesystem was recreated with mkfs.
>> Still the same behavior...
>>
>>
>> Pawel
>>
>> On 2012-08-22 04:21, Sunil Mushran wrote:
>>> You may want to run a full fsck on the fs.
>>>
>>> fsck.ocfs2 -fy /dev/xxxx
>>>
>>> On Tue, Aug 21, 2012 at 12:49 AM, Pawel <pzlist at mp.pl> wrote:
>>>
>>> Hi,
>>> After upgrading ocfs2 my cluster is unstable.
>>>
>>> At least once per week I can see:
>>> kernel panic: Null pointer dereference at 00048
>>> o2dlm_blocking_ast_wrapper + 0x8/0x20 [ocfs2_stack_o2cb]
>>> stack:
>>> dlm_do_local_bast [ocfs2_dlm]
>>> dlm_lookup_lockers [ocfs2_dlm]
>>> dlm_proxy_ast_handler
>>> add_timer
>>> ..
>>>
>>> After that, a deadlock sometimes happens on other nodes. Restarting the
>>> entire cluster solves the issue.
>>> I see in the log:
>>> (dlm_thread,7227,3):dlm_send_proxy_ast_msg:484 ERROR:
>>> ECB9442E19A94EAC896641BFADD55E4B: res
>>> M0000000000000001f411c900000000,
>>> error -107 send AST to node 4
>>> (dlm_thread,7227,3):dlm_flush_asts:605 ERROR: status = -107
>>> o2net: No connection established with node 4 after 10.0 seconds,
>>> giving up.
>>> o2net: No connection established with node 4 after 10.0 seconds,
>>> giving up.
>>> o2net: No connection established with node 4 after 10.0 seconds,
>>> giving up.
>>> (dlm_thread,7227,4):dlm_send_proxy_ast_msg:484 ERROR:
>>> ECB9442E19A94EAC896641BFADD55E4B: res
>>> M0000000000000001f411c900000000,
>>> error -107 send AST to node 4
>>> (dlm_thread,7227,4):dlm_flush_asts:605 ERROR: status = -107
>>> o2cb: o2dlm has evicted node 4 from domain
>>> ECB9442E19A94EAC896641BFADD55E4B
>>> o2cb: o2dlm has evicted node 4 from domain
>>> ECB9442E19A94EAC896641BFADD55E4B
>>> o2dlm: Begin recovery on domain ECB9442E19A94EAC896641BFADD55E4B
>>> for node 4
>>> o2dlm: Node 5 (he) is the Recovery Master for the dead node 4 in
>>> domain
>>> ECB9442E19A94EAC896641BFADD55E4B
>>> o2dlm: End recovery on domain ECB9442E19A94EAC896641BFADD55E4B
>>>
>>>
>>> Additionally, about 4 times per day I see:
>>>
>>> ocfs2_check_dir_for_entry:2119 ERROR: status = -17
>>> ocfs2_mknod:459 ERROR: status = -17
>>> ocfs2_create:629 ERROR: status = -17
>>>
>>>
>>> I currently use kernel 3.4.2.
>>> My filesystem was created with:
>>> -N 8 -b 4096 -C 32768 --fs-features
>>> backup-super,strict-journal-super,sparse,extended-slotmap,inline-data,metaecc,xattr,indexed-dirs,refcount,discontig-bg,unwritten,usrquota,grpquota
>>>
>>> Could you tell me what could make my system unstable? Which
>>> feature?
>>>
>>> Thanks for any help
>>>
>>> Pawel
>>>
>>>
>>> _______________________________________________
>>> Ocfs2-users mailing list
>>> Ocfs2-users at oss.oracle.com
>>> https://oss.oracle.com/mailman/listinfo/ocfs2-users
>>>
>>>
>>
>>
>>
>
>
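For reference, the negative status values in the logs quoted above look like negated Linux errno codes. A minimal sketch for decoding them, assuming the standard Linux numbering (17 is EEXIST, 107 is ENOTCONN); the lookup table and the helper name decode_status are illustrative only and are not part of OCFS2 or its tools:

```python
import re

# Assumption: standard Linux errno numbering. The two values are hardcoded
# so the result does not depend on the OS this script runs on.
LINUX_ERRNO = {
    17: ("EEXIST", "File exists"),
    107: ("ENOTCONN", "Transport endpoint is not connected"),
}

# Matches the two forms seen in the quoted logs:
#   "ERROR: status = -17"  and  "error -107 send AST to node 4"
STATUS_RE = re.compile(r"(?:status = |error )-(\d+)")


def decode_status(line):
    """Return the errno name for a log line, or None if no status is found."""
    match = STATUS_RE.search(line)
    if match is None:
        return None
    code = int(match.group(1))
    name, _description = LINUX_ERRNO.get(code, ("UNKNOWN", ""))
    return name


if __name__ == "__main__":
    for sample in (
        "ocfs2_mknod:459 ERROR: status = -17",
        "error -107 send AST to node 4",
    ):
        print(sample, "->", decode_status(sample))
```

Read this way, -17 in ocfs2_mknod means the directory entry already exists, and -107 in dlm_send_proxy_ast_msg means the connection to node 4 was down when the AST was sent.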