[Ocfs2-devel] [2.6.6 svn 1364]System hang randomly when writing
to the same file from different processes of the same node
Mark Fasheh
mark.fasheh at oracle.com
Fri Aug 20 11:43:15 CDT 2004
Are all your nodes updated to r1364, by the way? That would make a big
difference, as the voting flags got juggled around a bit (sorry!). Otherwise it
looks like it's hung doing a TRUNCATE_PAGES message, which would be very
troubling indeed. If both nodes *are* in fact running 1364, would you mind
posting your test code so I can give it a try? Thanks,
--Mark
On Fri, Aug 20, 2004 at 04:24:37PM +0800, Chen, Yukun wrote:
> Hi all
>
> Steps to duplicate:
>
> 1. Do some operations, such as mkdir and touch, on node A and node B
>
> 2. On node A, process1 writes to a file at a specific position (such as
> offset 1000), 100 times
>
> 2. Also on node A, at the same time, process2 writes to the same file at the
> same position, 100 times
>
> Repeat steps 1-2 several times; the system will hang with the following
> messages found on node A:
>
> state=1, lockid=22765568, flags = 0x1000, asked type = 5 master = 1, state =
> 0x0, type = 5
> (18397) ERROR at /tmp/trunk/src/dlm.c, 461: status = -110
> (18397) ERROR at /tmp/trunk/src/vote.c, 910: inode 5558, vote_status=0,
> vote_state=1, lockid=22765568, flags = 0x1000, asked type = 5 master = 1, state
> = 0x0, type = 5
> ...
>
> On node B, error message from dmesg:
>
> Call Trace:
> recalc_task_prio
> schedule
> ocfs_comm_process_msg
> ocfs_dlm_recv_msg
> worker_thread
> ocfs_dlm_recv_msg
> default_wake_function
> ....
>
> Any ideas on it? Thanks.
>
> Aaron
> Intel China Software Lab
> Tel: 8621-52574545 Ext.1587
> E-mail: yukun.chen at intel.com
>
> _______________________________________________
> Ocfs2-devel mailing list
> Ocfs2-devel at oss.oracle.com
> http://oss.oracle.com/mailman/listinfo/ocfs2-devel
--
Mark Fasheh
Software Developer, Oracle Corp
mark.fasheh at oracle.com
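
The reproduction steps quoted above (two processes on one node, each writing
100 times at offset 1000 of the same file) could be sketched roughly as
follows. This is not the reporter's test code, which does not appear in this
message; it is a minimal guess at an equivalent. The function names, the
16-byte payload, and the use of pwrite() rather than lseek() plus write() are
all assumptions, and step 1 (mkdir and touch on both nodes) is not covered.

```python
import os
import sys
import tempfile


def writer(path, offset, count, payload):
    """Write `payload` at byte `offset` of `path`, `count` times in a row."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        for _ in range(count):
            os.pwrite(fd, payload, offset)
    finally:
        os.close(fd)


def run_concurrent_writes(path, offset=1000, count=100, nprocs=2):
    """Fork `nprocs` children that write to the same offset concurrently.

    Returns the list of child exit codes (0 means the child finished
    all of its writes without raising).
    """
    pids = []
    for i in range(nprocs):
        pid = os.fork()
        if pid == 0:
            # Child process: do the writes, then exit without returning
            # into the parent's code path.
            status = 1
            try:
                # Hypothetical payload: 16 bytes of 'A', 'B', ... per child.
                writer(path, offset, count, bytes([ord("A") + i]) * 16)
                status = 0
            finally:
                os._exit(status)
        pids.append(pid)
    # Parent: wait for every child and collect its exit code.
    return [os.WEXITSTATUS(os.waitpid(pid, 0)[1]) for pid in pids]


if __name__ == "__main__":
    # Pass a path on the OCFS2 mount as the first argument; the temp-dir
    # default only exercises the script on a local filesystem.
    default = os.path.join(tempfile.gettempdir(), "same_offset_write.dat")
    target = sys.argv[1] if len(sys.argv) > 1 else default
    print(run_concurrent_writes(target))
```

On a healthy filesystem both children should exit with status 0. A hang like
the one reported would presumably show up as the parent never returning from
waitpid().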