[Ocfs2-users] ocfs2 - Kernel panic on many write/read from both

Marek Królikowski admin at wset.edu.pl
Wed Dec 21 08:22:19 PST 2011


Hello,
After 24 hours of testing without quota and without any of the features,
the kernel no longer gives me an oops, but when I use debugfs I still see:
TEST-MAIL1 ~ # echo "ls //orphan_dir:0001"|debugfs.ocfs2 /dev/dm-0|wc
debugfs.ocfs2 1.6.4
    154     764    7163
TEST-MAIL1 ~ # echo "ls //orphan_dir:0000"|debugfs.ocfs2 /dev/dm-0|wc
debugfs.ocfs2 1.6.4
    531    2649   24882
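
For repeated checks, the two commands above could be wrapped in a small
loop. This is only a sketch: the helper name count_orphans and the DEBUGFS
override variable are made up for illustration, and the zero-padded
four-digit slot names are inferred from //orphan_dir:0000 and
//orphan_dir:0001 above. It prints the line count (the first column that
wc reports) for each slot.

```shell
#!/bin/sh
# Sketch: count the lines that "ls //orphan_dir:NNNN" prints for each slot.
# Assumptions (not from the thread): the helper name count_orphans, the
# DEBUGFS override variable, and the %04d slot naming.

DEBUGFS="${DEBUGFS:-debugfs.ocfs2}"

# count_orphans DEVICE NSLOTS
# Prints "slot <NNNN> <lines>" for slots 0 .. NSLOTS-1.
count_orphans() {
    dev=$1
    nslots=$2
    slot=0
    while [ "$slot" -lt "$nslots" ]; do
        name=$(printf '%04d' "$slot")
        # Same pipeline as in the mail, but keep only the line count.
        lines=$(echo "ls //orphan_dir:$name" | "$DEBUGFS" "$dev" | wc -l | tr -d ' ')
        echo "slot $name $lines"
        slot=$((slot + 1))
    done
}
```

On the test cluster this might be called as "count_orphans /dev/dm-0 2",
assuming two slots as shown above. A count that keeps growing between runs
would point to orphan entries not being cleaned up.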

Thanks


-----Original Message-----
From: Srinivas Eeda
Sent: Tuesday, December 20, 2011 8:50 PM
To: Marek Królikowski
Cc: ocfs2-users at oss.oracle.com
Subject: Re: [Ocfs2-users] ocfs2 - Kernel panic on many write/read from both

The link prompts for username/passwd.

On top of the changes you made, please add the following patch

diff -uNrp linux-2.6.32.x86_64.orig/fs/ocfs2/dlmglue.c linux-2.6.32.x86_64/fs/ocfs2/dlmglue.c
--- linux-2.6.32.x86_64.orig/fs/ocfs2/dlmglue.c 2011-11-28 21:51:21.000000000 -0800
+++ linux-2.6.32.x86_64/fs/ocfs2/dlmglue.c 2011-11-28 22:04:55.000000000 -0800
@@ -3808,6 +3808,8 @@ static int ocfs2_dentry_convert_worker(s
  * for a downconvert.
  */
  d_delete(dentry);
+ if (dentry)
+ d_drop(dentry);
  dput(dentry);

  spin_lock(&dentry_attach_lock);


The patches that I mentioned earlier were made to address a deadlock when
quotas are enabled, but I am not sure what the deadlock was. If you
are willing to help, I would suggest the following plan.

1. Disable quotas, revert the patches that I pointed out earlier, add
the above patch as well, and run your test case. You shouldn't see any
more orphans. To verify, please run the echo command I mentioned.

2. If you are not seeing any more orphans, problem 1 is solved. Now
enable quotas and run the tests. If you see any deadlock, run the
following on all nodes and provide us the messages files.
  a) echo t > /proc/sysrq-trigger
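
Step a) could be scripted per node roughly as follows. This is a sketch
under assumptions: the helper name dump_tasks and the SYSRQ_TRIGGER,
MESSAGES and FLUSH_WAIT variables are hypothetical, and /var/log/messages
is only a guess at where syslog writes the kernel log on these nodes. It
must run as root on each node.

```shell
#!/bin/sh
# Sketch: trigger a task-state dump and keep a copy of the messages file.
# Assumptions (not from the thread): the helper name dump_tasks, the
# override variables below, and the /var/log/messages location.

SYSRQ_TRIGGER="${SYSRQ_TRIGGER:-/proc/sysrq-trigger}"
MESSAGES="${MESSAGES:-/var/log/messages}"

# dump_tasks OUTDIR
# Writes 't' to the sysrq trigger, waits briefly so syslog can write the
# dump out, then copies the messages file to OUTDIR/messages.<nodename>.
dump_tasks() {
    outdir=$1
    mkdir -p "$outdir" || return 1
    echo t > "$SYSRQ_TRIGGER" || return 1
    sleep "${FLUSH_WAIT:-2}"
    cp "$MESSAGES" "$outdir/messages.$(uname -n)"
}
```

Running "dump_tasks /tmp/ocfs2-debug" on every node while the deadlock is
present would leave one messages copy per node to send to the list.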

Thanks,
--Srini

Marek Królikowski wrote:
> Hello,
> Thank you for the answer.
> The main problem is that I need quota, because this will be the /home
> directory for my maildir users.
> A few days ago, as I said, I contacted Sunil Mushran and he told me
> to remove these patches. I did, but it didn't help. Take a look:
> https://wizja2.tktelekom.pl/ocfs2/
> Thanks
>
> -----Original Message-----
> From: Srinivas Eeda
> Sent: Tuesday, December 20, 2011 7:58 PM
> To: Marek Królikowski
> Cc: ocfs2-users at oss.oracle.com
> Subject: Re: [Ocfs2-users] ocfs2 - Kernel panic on many write/read from 
> both
>
> Marek Królikowski wrote:
>> Sorry, I didn't copy everything:
>> TEST-MAIL1# echo "ls //orphan_dir:0000"|debugfs.ocfs2 /dev/dm-0|wc
>> debugfs.ocfs2 1.6.4
>> 5239722 26198604 246266859
> ^^^^^ those numbers (5239722, 6074335) are the problem. They tell us
> that the orphan directory is flooded with files. This is
> because of the change in unlink behavior introduced by patch
> "ea455f8ab68338ba69f5d3362b342c115bea8e13".
>
> If you are interested in the details: in the normal unlink case, an entry
> for the file being deleted is created in the orphan directory as an
> intermediate step, and the entry is cleared towards the end of the unlink
> process. But because of that patch, the entry doesn't get cleared and
> sticks around.
>
> OCFS2 has a function called orphan scan, which runs as part of a
> thread. It takes an EX lock on the orphan scan lock and then scans to
> clear all entries, but it can't because the open lock is still around.
> Since this can take longer because of the huge number of entries
> being created, *new deletes will get delayed* as they need the EX lock.
>
> So what can be done? For now, if you are not using the quota feature, you
> should get a new kernel by backing out the following patches
>
> 5fd131893793567c361ae64cbeb28a2a753bbe35
> f7b1aa69be138ad9d7d3f31fa56f4c9407f56b6a
> ea455f8ab68338ba69f5d3362b342c115bea8e13
>
> or periodically umount the file system on all nodes and remount whenever
> the problem becomes severe.
>
> Thanks,
> --Srini
>
>> TEST-MAIL1# echo "ls //orphan_dir:0001"|debugfs.ocfs2 /dev/dm-0|wc
>> debugfs.ocfs2 1.6.4
>> 6074335 30371669 285493670
>>  TEST-MAIL2 ~ # echo "ls //orphan_dir:0000"|debugfs.ocfs2 /dev/dm-0|wc
>> debugfs.ocfs2 1.6.4
>> 5239722 26198604 246266859
>> TEST-MAIL2 ~ # echo "ls //orphan_dir:0001"|debugfs.ocfs2 /dev/dm-0|wc
>> debugfs.ocfs2 1.6.4
>> 6074335 30371669 285493670
>>  Thanks for Your help.
>>  *From:* Marek Królikowski <mailto:admin at wset.edu.pl>
>> *Sent:* Tuesday, December 20, 2011 6:39 PM
>> *To:* ocfs2-users at oss.oracle.com <mailto:ocfs2-users at oss.oracle.com>
>> *Subject:* Re: [Ocfs2-users] ocfs2 - Kernel panic on many write/read from 
>> both
>>
>> > I think you are running into a known issue. Are there a lot of orphan
>> > files in the orphan directory? I am not sure if the problem is still
>> > there. If not, please run the same test and, once you see the same
>> > symptoms, please run the following and provide me the output
>> >
>> > echo "ls //orphan_dir:0000"|debugfs.ocfs2 <device>|wc
>> > echo "ls //orphan_dir:0001"|debugfs.ocfs2 <device>|wc
>> Hello,
>> Thank you for the answer. Strangely, I didn't get the email with your
>> answer.
>> This is what you asked for:
>> TEST-MAIL1# echo "ls //orphan_dir:0000"|debugfs.ocfs2 /dev/dm-0|wc
>> debugfs.ocfs2 1.6.4
>> 5239722 26198604 246266859
>>  TEST-MAIL2 ~ # echo "ls //orphan_dir:0000"|debugfs.ocfs2 /dev/dm-0|wc
>> debugfs.ocfs2 1.6.4
>> 5239722 26198604 246266859
>> This is my test cluster, so if you need more tests done, please tell me
>> and I will run them for you.
>> Thanks
>> ------------------------------------------------------------------------
>>
>> _______________________________________________
>> Ocfs2-users mailing list
>> Ocfs2-users at oss.oracle.com
>> http://oss.oracle.com/mailman/listinfo/ocfs2-users
> 



