[Ocfs2-users] Did anything substantial change between 1.2.4 and 1.3.9?

Herbert van den Bergh herbert.van.den.bergh at oracle.com
Mon Apr 21 17:33:29 PDT 2008


Does the web server that the proxy server talks to have any extended 
debugging you can turn on?  In particular, would it be able to log 
timestamps of things it does, so you can narrow down where the hiccup 
occurs?  A brute force method to do this would be to run strace -T on 
all server processes, and look for things that take much longer than 
they should, like disk reads exceeding 100ms, or other syscalls taking 
much longer than usual.  Ideally you'd have some timing around code you 
suspect, and log a message if the time exceeds some configurable limit.
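
For example, a rough sketch of that brute-force approach (the PIDs, the
output path, and the 0.1s threshold below are only placeholders):

  # attach to the suspect server processes and record how long each syscall takes
  strace -T -tt -f -o /tmp/web.strace -p <pid1> -p <pid2>

  # afterwards, list syscalls that took longer than ~0.1s (crude filter on the <...> durations)
  awk -F'[<>]' '$(NF-1) > 0.1' /tmp/web.strace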

Thanks,
Herbert.


mike wrote:
> You're right, it -is- possible, but if you look at it (and I can log
> it for hours) it only seems to do that right before I get a timeout
> message from the proxy. The two appear to be related.
>
> I will continue to monitor this and make sure that my hypothesis is
> correct. Something is flaking out every so often.
>
> I get this on my nginx proxy server:
>
> 2008/04/21 17:37:01 [error] 1256#0: *7406286 upstream timed out (110:
> Connection timed out) while reading response header from upstream,
> client: 1.2.3.4, server: lvs01.domain.com, request: "GET /someURL.php
> HTTP/1.1", upstream: "http://10.13.5.12:80/someURL.php", host:
> "somedomain.com", referrer: "http://somedomain.com/someURL.php"
>
> That only happens after it's sitting for 3 real-time seconds waiting
> for a reply from the server. Note: this happens no matter what proxy
> and webserver I use. It does not seem to be anything related to that.
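
(If the 3-second cutoff is being imposed by the proxy itself rather than by
the backend, it would be governed by the nginx proxy timeouts; a hypothetical
snippet with illustrative values only:

  location / {
      proxy_pass            http://backends;   # "backends" is a placeholder upstream name
      proxy_connect_timeout 3;                 # seconds allowed to establish the upstream connection
      proxy_read_timeout    3;                 # seconds allowed between reads of the upstream response
  }

Raising those values would only hide the stall, of course; the interesting
question is still why the backend goes quiet.)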
>
>
> On 4/21/08, Herbert van den Bergh <herbert.van.den.bergh at oracle.com> wrote:
>   
>> Mike,
>>
>> Are you sure it's not possible for sdb to be idle for just 1 second?  If you
>> look at the interval right after the one you pointed out, you'll see r/s is
>> 2.97 and w/s is .99, so it did 3 reads and 1 write in that one second
>> interval.  The device appears to be used very little.  I think it's quite
>> possible that some 1 second intervals have no reads or writes at all, don't
>> you think?
>>
>> Thanks,
>> Herbert.
>>
>> mike wrote:
>>     
>>> Thanks.
>>>
>>> If I have the opportunity to run the (buggy) new kernel again I will
>>> try this. That is definitely a problem, and I think I need to set the
>>> oracle behavior to crash and not auto-reboot for this to be effective,
>>> right?
>>>
>>> That is just one issue:
>>> 1) 2.6.24-16 with load completely crashes the node producing the largest i/o
>>> 2) 2.6.22-19 utilization drops to 0% and causes a hiccup randomly (I
>>> don't see a pattern, and no batch jobs or other things are running at the
>>> time it happens) - this is more important, as it is still happening
>>> even though I'm running the more "stable" kernel.
>>>
>>>
>>> On 4/21/08, Sunil Mushran <Sunil.Mushran at oracle.com> wrote:
>>>>
>>>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=Documentation/networking/netconsole.txt;h=3c2f2b3286385337ce5ec24afebd4699dd1e6e0a;hb=HEAD
>>>>
>>>> netconsole is a facility to capture oops traces. It is not a console
>>>> per se and does not require a head/gtk/x11 etc. to work. The link above
>>>> explains the usage, etc.
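
For reference, a minimal netconsole setup along the lines of that document
might look like this (interface name, addresses, port, and MAC below are
placeholders):

  # on the node that oopses: send kernel messages over UDP to 10.0.0.2:6666 via eth0
  modprobe netconsole netconsole=@/eth0,6666@10.0.0.2/00:11:22:33:44:55

  # on the receiving machine: capture whatever arrives on that port
  nc -u -l -p 6666 | tee oops.log   # some netcat flavors want: nc -u -l 6666
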
>>>> mike wrote:
>>>>> Well, these are headless production servers, CLI only: no GTK, no X11.
>>>>> Also, I am not running the newer kernels (and I can't...). It looks like
>>>>> I cannot run a hybrid of 2.6.24-16 and 2.6.22-19; whichever one has
>>>>> mounted the drive first is the winner.
>>>>>
>>>>> If I mix them, I can get the 2.6.24 nodes to mount, then the older ones
>>>>> give the "number too large" error or whatever. So I can't currently
>>>>> use one server on my cluster to test, because it would require
>>>>> upgrading all of them just for this test.
>>>>>
>>>>> On 4/21/08, Sunil Mushran <Sunil.Mushran at oracle.com> wrote:
>>>>>>
>>>>>> Setting up netconsole does not require a reboot. The idea is to
>>>>>> catch the oops trace when the oops happens. Without that trace,
>>>>>> we are flying blind.
>>>>>>
>>>>>> mike wrote:
>>>>>>> Since these are production I can't do much.
>>>>>>>
>>>>>>> But I did get an error (it's not happening as much, but it still blips
>>>>>>> here and there).
>>>>>>>
>>>>>>> Notice that /dev/sdb (my iscsi target using ocfs2) hits 0.00%
>>>>>>> utilization, 3 seconds before my proxy says "hey, timeout" - every
>>>>>>> other second there is -always- some utilization going on.
>>>>>>>
>>>>>>> What could be steps to figure out this issue? Using debugfs.ocfs2 or
>>>>>>> something?
>>>>>>>
>>>>>>> It's mounted as:
>>>>>>> /dev/sdb1 on /home type ocfs2 (rw,_netdev,noatime,data=writeback,heartbeat=local)
>>>>>>>
>>>>>>> I know I'm not being much help, but I'm willing to try almost anything
>>>>>>> as long as it doesn't cause downtime or require cluster-wide changes
>>>>>>> (since those require downtime...). I want to try to go back to
>>>>>>> 2.6.24-16 with data=writeback and see if that fixes the crashing
>>>>>>> issue, but if I'm already having issues like this, perhaps I should
>>>>>>> resolve this before moving up.
>>>>>>>
>>>>>>> [root at web03 ~]# cat /root/web03-iostat.txt
>>>>>>>
>>>>>>> Time: 02:11:46 PM
>>>>>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>>>>>            3.71    0.00   27.23    8.91    0.00   60.15
>>>>>>>
>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>>>>>>> sda               0.00    54.46    0.00  309.90     0.00  2914.85     9.41    23.08   74.47   0.93  28.71
>>>>>>> sdb              12.87     0.00   17.82    0.00   245.54     0.00    13.78     0.33   17.78  18.33  32.67
>>>>>>>
>>>>>>> Time: 02:11:47 PM
>>>>>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>>>>>            0.25    0.00   26.24    2.23    0.00   71.29
>>>>>>>
>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>>>>>>> sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>>>>>> sdb               5.94     0.00   22.77    0.99   228.71     0.99     9.67     0.42   17.92  17.08  40.59
>>>>>>>
>>>>>>> Time: 02:11:48 PM   <- THIS HAS THE ISSUE
>>>>>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>>>>>            0.00    0.00   25.99    0.00    0.00   74.01
>>>>>>>
>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>>>>>>> sda               0.00    10.89    0.00    2.97     0.00   110.89    37.33     0.00    0.00   0.00   0.00
>>>>>>> sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>>>>>>
>>>>>>> Time: 02:11:49 PM
>>>>>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>>>>>            0.25    0.00   14.85    0.99    0.00   83.91
>>>>>>>
>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>>>>>>> sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>>>>>> sdb               0.99     0.00    2.97    0.99    30.69     0.99     8.00     0.07   17.50  17.50   6.93
>>>>>>>
>>>>>>> Time: 02:11:50 PM
>>>>>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>>>>>            0.74    0.00    1.24    1.73    0.00   96.29
>>>>>>>
>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>>>>>>> sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>>>>>> sdb               0.99     0.00    5.94    0.00    55.45     0.00     9.33     0.07   11.67  11.67   6.93
>>>>>>>
>>>>>>> Time: 02:11:51 PM
>>>>>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>>>>>            0.00    0.00    1.24   16.34    0.00   82.43
>>>>>>>
>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>>>>>>> sda               0.00   153.47    0.00  494.06     0.00  5156.44    10.44    55.62  107.23   1.16  57.43
>>>>>>> sdb               2.97     0.00   11.88    0.99   117.82     0.99     9.23     0.26   13.08  20.00  25.74
>>>>>>>
>>>>>>> Time: 02:11:52 PM
>>>>>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>>>>>            0.00    0.00    0.25    3.22    0.00   96.53
>>>>>>>
>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>>>>>>> sda               0.00     0.00    0.00   16.83     0.00   158.42     9.41     0.13  164.71   1.18   1.98
>>>>>>> sdb               1.98     0.00    2.97    0.00    39.60     0.00    13.33     0.13   73.33  43.33  12.87
>>>>>>>
>>>>>>> Time: 02:11:53 PM
>>>>>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>>>>>            0.50    0.00    0.25    4.70    0.00   94.55
>>>>>>>
>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>>>>>>> sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00   0.00   0.00
>>>>>>> sdb               5.94     0.00   11.88    0.99   141.58     0.99    11.08     0.20   15.38  15.38  19.80
>>>>>>>
>>>>>>> Time: 02:11:54 PM
>>>>>>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>>>>>>            3.96    0.00   10.15    0.74    0.00   85.15
>>>>>>>
>>>>>>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
>>>>>>> sda               0.00    20.79    0.00    4.95     0.00   205.94    41.60     0.00    0.00   0.00   0.00
>>>>>>> sdb               4.95     0.00    5.94    0.00    87.13     0.00    14.67     0.07   11.67  11.67   6.93
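
For what it's worth, per-second extended statistics with timestamps like the
above can be collected with something along these lines (exact flags depend
on the sysstat version):

  iostat -x -t 1 | tee /root/web03-iostat.txt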
>>>>>>>
>>>>>>> On 4/21/08, Sunil Mushran <Sunil.Mushran at oracle.com> wrote:
>>>>>>>>
>>>>>>>> Do you have the panic output... kernel stack trace. We'll need
>>>>>>>> that to figure this out. Without that, we can only speculate.
>>>>>>>>
>>>>>>>> mike wrote:
>>>>>>>>>
>>>>>>>>> On 4/21/08, Tao Ma <tao.ma at oracle.com> wrote:
>>>>>>>>>>
>>>>>>>>>> mike wrote:
>>>>>>>>>>>
>>>>>>>>>>> I have changed my kernel back to 2.6.22-14-server, and now I don't
>>>>>>>>>>> get the kernel panics. It seems like an issue with 2.6.24-16 and
>>>>>>>>>>> some i/o made it crash...
>>>>>>>>>>>
>>>>>>>>>> OK, so it seems that it is a bug in the ocfs2 kernel module, not in
>>>>>>>>>> ocfs2-tools.
>>>>>>>>
>>>>>>>> :)
>>>>>>>>
>>>>>>>>>> Then could you please describe in more detail how the kernel panic
>>>>>>>>>> happens?
>>>>>>>>>
>>>>>>>>> Yeah, this specific issue seems like a kernel issue.
>>>>>>>>>
>>>>>>>>> I don't know; these are production systems and I am already getting
>>>>>>>>> angry customers. I can't really test anymore. Both are standard
>>>>>>>>> Ubuntu kernels.
>>>>>>>>>
>>>>>>>>> Okay: 2.6.22-14-server (I think still minor file access issues)
>>>>>>>>> Breaks under load: 2.6.24-16-server
>>>>>>>>>
>>>>>>>>>>> However, I am still getting file access timeouts once in a while.
>>>>>>>>>>> I am nervous about putting more load on the setup.
>>>>>>>>>>
>>>>>>>>>> Also, please provide more details about it.
>>>>>>>>>
>>>>>>>>> I am using nginx for a frontend load balancer, and nginx for the
>>>>>>>>> webserver as well. This doesn't seem to be related to the webserver
>>>>>>>>> at all, though; it was happening before this.
>>>>>>>>>
>>>>>>>>> lvs01 proxies traffic in to web01, web02, and web03 (currently using
>>>>>>>>> nginx; before, I was using LVS/ipvsadm).
>>>>>>>>>
>>>>>>>>> Every so often, one of the webservers sends me back
>>>>>>>>>
>>>>>>>>>>> [root at raid01 .batch]# cat /etc/default/o2cb
>>>>>>>>>>>
>>>>>>>>>>> # O2CB_ENABLED: 'true' means to load the driver on boot.
>>>>>>>>>>> O2CB_ENABLED=true
>>>>>>>>>>>
>>>>>>>>>>> # O2CB_BOOTCLUSTER: If not empty, the name of a cluster to start.
>>>>>>>>>>> O2CB_BOOTCLUSTER=mycluster
>>>>>>>>>>>
>>>>>>>>>>> # O2CB_HEARTBEAT_THRESHOLD: Iterations before a node is considered dead.
>>>>>>>>>>> O2CB_HEARTBEAT_THRESHOLD=7
>>>>>>>>>>>
>>>>>>>>>> This value is a little small, so how did you build up your shared
>>>>>>>>>> disk (iSCSI or ...)? The most common value I have heard of is 61,
>>>>>>>>>> which is about 120 secs. I don't know the reason; maybe Sunil can
>>>>>>>>>> tell you. ;)
>>>>>>>>>> You can also refer to
>>>>>>>>>> http://oss.oracle.com/projects/ocfs2/dist/documentation/ocfs2_faq.html#TIMEOUT.
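
(For what it's worth, assuming the usual conversion described in that FAQ
applies here, the disk heartbeat timeout works out to roughly:

  timeout = (O2CB_HEARTBEAT_THRESHOLD - 1) * 2 seconds
  61 -> (61 - 1) * 2 = 120 seconds
   7 -> ( 7 - 1) * 2 =  12 seconds

so the threshold of 7 shown above allows a much shorter window than the
commonly used 61.)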
>>>>>>>>>>> # O2CB_IDLE_TIMEOUT_MS: Time in ms before a network connection is considered dead.
>>>>>>>>>>> O2CB_IDLE_TIMEOUT_MS=10000
>>>>>>>>>>>
>>>>>>>>>>> # O2CB_KEEPALIVE_DELAY_MS: Max time in ms before a keepalive packet is sent
>>>>>>>>>>> O2CB_KEEPALIVE_DELAY_MS=5000
>>>>>>>>>>>
>>>>>>>>>>> # O2CB_RECONNECT_DELAY_MS: Min time in ms between connection attempts
>>>>>>>>>>> O2CB_RECONNECT_DELAY_MS=2000
>>>>>>>>>>>
>>>>>>>>>>> On 4/21/08, Tao Ma <tao.ma at oracle.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Hi Mike,
>>>>>>>>>>>>   Are you sure it is caused by the update of ocfs2-tools?
>>>>>>>>>>>> AFAIK, ocfs2-tools only includes tools like mkfs, fsck and tunefs
>>>>>>>>>>>> etc. So if you don't make any change to the disk (by using these
>>>>>>>>>>>> new tools), it shouldn't cause a kernel panic, since they are all
>>>>>>>>>>>> user space tools.
>>>>>>>>>>>> Then there is only one other possibility. Have you modified
>>>>>>>>>>>> /etc/sysconfig/o2cb (this is the place for RHEL; not sure of the
>>>>>>>>>>>> place in ubuntu)? I have checked the rpm package for RHEL; it will
>>>>>>>>>>>> update /etc/sysconfig/o2cb, and this file has some timeouts
>>>>>>>>>>>> defined in it.
>>>>>>>>>>>> So do you have a backup of this file? If yes, please restore it to
>>>>>>>>>>>> see whether it helps (I can't say for sure that it will).
>>>>>>>>>>>> If not, do you remember the old values of the timeouts you set for
>>>>>>>>>>>> ocfs2? If yes, you can use o2cb configure to set them yourself.
>>>>>>>>>>>>
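
For reference, the interactive reconfiguration Tao mentions is typically run
like this (a sketch; the exact prompts vary by o2cb version, the cluster
should be offline while changing them, and the values must match on every
node):

  /etc/init.d/o2cb configure
  # prompts for, among other things, the heartbeat dead threshold
  # and the network idle/keepalive/reconnect timeouts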
>>> _______________________________________________
>>> Ocfs2-users mailing list
>>> Ocfs2-users at oss.oracle.com
>>> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>>>
>>>
>>>       



More information about the Ocfs2-users mailing list