[Ocfs2-users] null pointer dereference

srinivas eeda srinivas.eeda at oracle.com
Wed Aug 22 09:23:07 PDT 2012


The crash looks similar to the one that the patch at
https://oss.oracle.com/pipermail/ocfs2-devel/2012-January/008469.html
is trying to address. The fix has not been accepted yet because, as
explained in the patch description, we also need to fix the master node
so that it skips sending a BAST after receiving an unlock message.

Regarding ERROR: status = -17, what storage do you use? It could be due
to stale data.
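
For reading the status values in the logs quoted below: kernel code
conventionally returns negated errno values, so -17 and -107 can be looked
up in the Linux errno table. A minimal sketch of that lookup follows. The
table is hardcoded with the two Linux values seen in this thread, and the
helper name decode_status is only illustrative, not part of any ocfs2 tool.

```python
# Linux errno numbers for the two codes seen in this thread.
# Hardcoded (an assumption of this sketch) so the result does not depend
# on the platform the script runs on.
LINUX_ERRNO = {
    17: ("EEXIST", "File exists"),
    107: ("ENOTCONN", "Transport endpoint is not connected"),
}


def decode_status(status):
    """Describe a negative kernel status code such as -17 (hypothetical helper)."""
    name, text = LINUX_ERRNO.get(-status, ("UNKNOWN", "not in this table"))
    return f"status = {status}: {name} ({text})"


if __name__ == "__main__":
    for status in (-17, -107):
        print(decode_status(status))
```

Read this way, -17 from ocfs2_mknod means the name being created was
reported as already present in the directory, and -107 from
dlm_send_proxy_ast_msg means there was no network connection to node 4.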

On 8/22/2012 2:25 AM, Pawel wrote:
> It was done multiple times.
> What is more, the file system was recreated with mkfs.
> Still the same behavior...
>
>
> Pawel
>
> On 2012-08-22 04:21, Sunil Mushran wrote:
>> You may want to run a full fsck on the fs.
>>
>> fsck.ocfs2 -fy /dev/xxxx
>>
>> On Tue, Aug 21, 2012 at 12:49 AM, Pawel <pzlist at mp.pl> wrote:
>>
>>     Hi,
>>     After upgrading ocfs2 my cluster is unstable.
>>
>>     At least once per week I see:
>>     kernel panic: Null pointer dereference  at 00048
>>     o2dlm_blocking_ast_wrapper + 0x8/0x20 [ocfs2_stack_o2cb]
>>     stack:
>>     dlm_do_local_bast [ocfs2_dlm]
>>     dlm_lookup_lockers [ocfs2_dlm]
>>     dlm_proxy_ast_handler
>>     add_timer
>>     ..
>>
>>     After that, a deadlock sometimes happens on the other nodes.
>>     Restarting the entire cluster resolves the issue.
>>     I see this in the log:
>>     (dlm_thread,7227,3):dlm_send_proxy_ast_msg:484 ERROR:
>>     ECB9442E19A94EAC896641BFADD55E4B: res
>>     M0000000000000001f411c900000000,
>>     error -107 send AST to node 4
>>     (dlm_thread,7227,3):dlm_flush_asts:605 ERROR: status = -107
>>     o2net: No connection established with node 4 after 10.0 seconds,
>>     giving up.
>>     o2net: No connection established with node 4 after 10.0 seconds,
>>     giving up.
>>     o2net: No connection established with node 4 after 10.0 seconds,
>>     giving up.
>>     (dlm_thread,7227,4):dlm_send_proxy_ast_msg:484 ERROR:
>>     ECB9442E19A94EAC896641BFADD55E4B: res
>>     M0000000000000001f411c900000000,
>>     error -107 send AST to node 4
>>     (dlm_thread,7227,4):dlm_flush_asts:605 ERROR: status = -107
>>     o2cb: o2dlm has evicted node 4 from domain
>>     ECB9442E19A94EAC896641BFADD55E4B
>>     o2cb: o2dlm has evicted node 4 from domain
>>     ECB9442E19A94EAC896641BFADD55E4B
>>     o2dlm: Begin recovery on domain ECB9442E19A94EAC896641BFADD55E4B
>>     for node 4
>>     o2dlm: Node 5 (he) is the Recovery Master for the dead node 4 in
>>     domain
>>     ECB9442E19A94EAC896641BFADD55E4B
>>     o2dlm: End recovery on domain ECB9442E19A94EAC896641BFADD55E4B
>>
>>
>>     Additionally, ~4 times per day I see:
>>
>>     ocfs2_check_dir_for_entry:2119 ERROR: status = -17
>>     ocfs2_mknod:459 ERROR: status = -17
>>     ocfs2_create:629 ERROR: status = -17
>>
>>
>>     I currently use kernel 3.4.2.
>>     My filesystem was created with:
>>     -N 8 -b 4096 -C 32768 --fs-features
>>     backup-super,strict-journal-super,sparse,extended-slotmap,inline-data,metaecc,xattr,indexed-dirs,refcount,discontig-bg,unwritten,usrquota,grpquota
>>
>>     Could you tell me what could make my system unstable? Which feature?
>>
>>     Thanks for any help
>>
>>     Pawel
>>
>>
>>     _______________________________________________
>>     Ocfs2-users mailing list
>>     Ocfs2-users at oss.oracle.com <mailto:Ocfs2-users at oss.oracle.com>
>>     https://oss.oracle.com/mailman/listinfo/ocfs2-users
>>
>>