[Ocfs2-users] Strange problems (deadlock) in ocfs2 (rpm 1.2.4-2 and svn 2982) - dlm related?

Marcus Alves Grando marcus.grando at terra.com.br
Mon Mar 5 12:29:15 PST 2007


Sunil Mushran wrote:
> 
> # tcpdump -i <eth1> -C 10 -W 15 -s 10000 -Sw /tmp/`hostname -s`_tcpdump.log -ttt 'port 7777' &
> 
> Initiate tcpdumps on the other 3 nodes. Start the dd's on one node.
> Kill that node. Let it boot back up.
> 
> When you see the problem, do:
> 
> # ps -e -o pid,stat,comm,wchan=WIDE-WCHAN-COLUMN
> 
> 
> Stop the tcpdumps and make them available to me via some ftp site or 
> whatever.
> 
> Also, file a bugzilla for tracking purposes.

Ok. Already opened: http://oss.oracle.com/bugzilla/show_bug.cgi?id=858

Thanks
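
A minimal sketch for narrowing the ps output requested above down to the stuck processes, assuming a procps-style ps and a POSIX awk are available. The function name filter_dstate is hypothetical and not part of ocfs2 or its tools:

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): keep the header row plus every
# process whose STAT column starts with "D" (uninterruptible sleep).
# Expects "ps -e -o pid,stat,comm,wchan=..." style output on stdin,
# where STAT is the second column.
filter_dstate() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# Example use on a live node (not executed here):
#   ps -e -o pid,stat,comm,wchan=WIDE-WCHAN-COLUMN | filter_dstate
```

The wchan column of the remaining rows shows where in the kernel each process is sleeping, which is the part worth attaching to the bugzilla entry.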

> 
> Marcus Alves Grando wrote:
>> Sunil Mushran wrote:
>>> How many nodes in the cluster?
>>
>> Four.
>>
>>>
>>> Marcus Alves Grando wrote:
>>>> Hi list,
>>>>
>>>> I have some problems testing ocfs2. My test consists of:
>>>>
>>>> #server1: dd if=/dev/random of=/ocfs2_1/test &
>>>> #server1: dd if=/dev/random of=/ocfs2_2/test &
>>>> #server1: dd if=/dev/random of=/ocfs2_3/test &
>>>> ...
>>>> #server1: dd if=/dev/random of=/ocfs2_12/test &
>>>> #server1:<Ctrl><Alt><SysRQ>B
>>
>> Correct is: <Alt>+<SysRQ>+b
>>
>> Regards
>>
>>>>
>>>> After that, another node begins recovery. After some time (about 3 min),
>>>> recovery is done. When server1 boots and tries to mount all the ocfs2
>>>> filesystems, a problem occurs. Most filesystems mount, but one
>>>> doesn't. On another node, when I try to access this filesystem (with ls
>>>> or cd), the shell freezes. With ps I can see the status of that process:
>>>> "D+" (uninterruptible sleep).
>>>>
>>>> Today I'm using svn revision 2982 (ocfs2-1.2 branch), and it doesn't help.
>>>> ocfs2-tools is 1.2.3. I also tested the ocfs2-1.2.3 and ocfs2-1.2.4 RedHat
>>>> AS4 rpms, without success. The servers run RedHat AS4.4 with all
>>>> updates applied.
>>>>
>>>> The only way to bring this filesystem back online is to reboot all nodes. :(
>>>>
>>>> Does anyone know about this problem or have a fix for it? Maybe a
>>>> dlm-related issue? I see many dlm-related commits in git...
>>>>
>>>> Regards
>>>>
>>
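
The reproduction steps quoted above can be sketched as a small dry-run helper. This is only an illustration: it assumes the mount points are numbered consecutively from /ocfs2_1 to /ocfs2_12, and the function name print_dd_commands is hypothetical. It prints the commands rather than running them, so nothing is written to any filesystem:

```shell
#!/bin/sh
# Hypothetical dry-run helper (not from the thread): print one dd command
# per mount point /ocfs2_1 .. /ocfs2_N instead of executing it.
# Assumption: the mounts are numbered consecutively from 1 to N.
print_dd_commands() {
    count=$1
    i=1
    while [ "$i" -le "$count" ]; do
        printf 'dd if=/dev/random of=/ocfs2_%d/test &\n' "$i"
        i=$((i + 1))
    done
}

# Print the twelve commands used in the test on server1.
print_dd_commands 12
```

Review the printed commands before running any of them on a test node.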

-- 
Marcus Alves Grando <marcus.grando [] terra.com.br>
Engineering Support 1
Terra Networks Brasil S/A
Tel: 55 (51) 3284-4238

Qual é a sua Terra?


