[Ocfs2-users] out of memory... doing heavy IO on ocfs2 is wasting (low) memory?!

Kurt C. Hackel kurt.hackel at oracle.com
Mon Aug 21 18:06:32 PDT 2006


Hi,

Peter McMahon wrote:
> Any advice on the best approach for backing up an OCFS2
> volume with hundreds of thousands of files, such as an
> APPL_TOP (or multiple APPL_TOPs as in our case)?
>
Same as my previous advice.  If you are able to unmount from any one 
node in the cluster and remount, you will clear the cache.  If this is 
not possible, you will have to wait for our next release.  At this time, 
the patch for fixing this is cleaning up memory, but we are experiencing 
some other problems (hangs) that need to be addressed before it can even 
be checked into source control.
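
For anyone who has to live with the unmount/remount workaround in the
meantime, it may help to watch low memory so the remount can be scheduled
before the OOM killer fires.  Below is a minimal Python sketch, assuming a
32-bit highmem kernel that exposes LowTotal and LowFree in /proc/meminfo.
The function names and the 10% threshold are illustrative assumptions and
are not part of ocfs2 or ocfs2-tools.

```python
import re


def parse_meminfo(text):
    """Return {field: value_in_kB} parsed from /proc/meminfo-style text."""
    fields = {}
    for line in text.splitlines():
        m = re.match(r"^(\w+):\s+(\d+)\s*kB", line)
        if m:
            fields[m.group(1)] = int(m.group(2))
    return fields


def low_memory_pressure(fields, min_free_fraction=0.1):
    """True when LowFree is below min_free_fraction of LowTotal.

    min_free_fraction is a hypothetical threshold; tune it for your nodes.
    """
    low_total = fields.get("LowTotal")
    low_free = fields.get("LowFree")
    if not low_total or low_free is None:
        # Fields are absent on kernels without a lowmem/highmem split.
        return False
    return low_free < min_free_fraction * low_total


if __name__ == "__main__":
    with open("/proc/meminfo") as f:
        info = parse_meminfo(f.read())
    if low_memory_pressure(info):
        print("LowFree is low; consider unmounting and remounting the volume")
```

Run from cron on each node, something like this only tells you when to
remount; it does not free any memory by itself.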

Thanks
kurt

>
> --- Alexander Finger <a.finger at fotofinder.net> wrote:
>
>> Hello!
>>
>> Thanks for the fast reply.
>>
>> Kurt Hackel wrote:
>>> Hi,
>>>
>>> Alexander Finger wrote:
>>>> Hello,
>>>>
>>>> my problem: When I want to create a large number of small files on
>>>> any node at my ocfs2 cluster, after some time the oom killer starts
>>>> killing processes because of low LowMem. All error messages and
>>>> memory stats are at the end of this mail.
>>> This is a known issue that is being currently fixed for the next
>>> scheduled release.  At this time, once a node masters a lock resource
>>> (from the filesystem this would happen if the node were the first node
>>> to access that file) it cannot drop the mastery of that resource until
>>> it unmounts.  The fix is nontrivial but I'm almost done with it.  Once
>>> the fix is done it will need extensive testing.
>> This is very bad... I have already prepared the whole cluster (9 nodes)
>> and thought I was "close" to deployment. During functional testing the
>> cluster's behavior was "normal" (bonnie & iozone reported good results)
>> after setting the scheduler to deadline and doing other fine tuning,
>> but it crashed within minutes when I tried to copy our production data
>> into it. It takes just minutes to crash the cluster because I need it
>> to hold about 10 million files (each about 3-5 kB).
>>
>> So I would suggest you send your fix to me for testing... once it's
>> done.  ;-) ... please!
>>>> The only way to avoid this behavior is to unmount the ocfs2 partition
>>>> after some disk operations, because LowMem (LowFree) stays low until
>>>> unmount... I searched the web and found many descriptions of this
>>>> error, but no answer how to handle this problem.
>>> Correct.  The only current workaround is to unmount, or to attempt to
>>> spread the lock resources out across all the nodes of the cluster
>>> (which may be impossible in your usage case).
>> Wonderful, how can I spread the resources? I did not notice such an
>> option in the documentation. The ocfs2 volume is needed "just" to store
>> a fast changing and very large directory tree, containing metadata files
>> (xml). I do not use it (at this point) for database(s) or anything
>> else. The cluster has a size of ~ 290 GB. If you need further
>> information to judge whether spreading the lock resources to other
>> nodes may help me, I'll be happy to send it to you.
>>
>>
>> Best regards,
>>
>> Alexander
>>
>> -- 
>> Fotofinder GmbH         USt-IdNr. DE812854514
>> Software Entwicklung    Web: http://www.fotofinder.net/
>> Potsdamer Str. 96       Tel: +49 30 25792890
>> 10785 Berlin            Fax: +49 30 257928999
>>



