[Ocfs2-users] OCFS2 performance and debug help
Adelino Monteiro
adelino.monteiro at gmail.com
Wed Apr 11 06:31:04 PDT 2012
Some more information that could help debug these issues. I just
ran ps axl, whose WCHAN column shows the kernel function in which
each process is sleeping.
[cld at BO01 ~]$ ps axl | grep D
F UID   PID  PPID PRI  NI     VSZ     RSS WCHAN  STAT TTY      TIME COMMAND
0 500  1502 23156  20   0  103308     840 -      R+   pts/3    0:00 grep D
1   0  1697     2   0 -20       0       0 msleep S<   ?        1:33 [o2hb-DBAAFEC1F3]
1   0  1712     2   0 -20       0       0 msleep S<   ?        1:35 [o2hb-D465A573D9]
1   0  1757     2   0 -20       0       0 msleep S<   ?        1:34 [o2hb-1D67B38925]
1   0 17525     2   0 -20       0       0 msleep S<   ?        0:03 [o2hb-83D827E8AB]
1   0 17532     2  20   0       0       0 jbd2_j D    ?        0:01 [jbd2/sdg-65]
4   0 19524     1  20   0 1587484  919776 sync_b Dl   pts/0   20:49 ruby /usr/local/rvm/gems/ruby-1.9.2-p180/bin/rake fixes:covers[6] --trace
4   0 24576     1  20   0 1449236  705832 ocfs2_ Dl   ?       136:36 ruby /usr/local/rvm/gems/ruby-1.9.2-p180/bin/rake jobs:work RAILS_ENV=production QUEUE=p_covers --trace
0 500 24712     1  20   0 1873140 1134856 start_ Dl   ?      3488:11 ruby /usr/local/rvm/gems/ruby-1.9.2-p180/bin/rake jobs:work RAILS_ENV=production QUEUE=pl_covers_soap --trace
0 500 24714     1  20   0 1249756  511468 start_ Dl   ?      1002:28 ruby /usr/local/rvm/gems/ruby-1.9.2-p180/bin/rake jobs:work RAILS_ENV=production QUEUE=pl_covers --trace
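One caveat about the listing above: grep D matches every line that contains a capital D, so it also returns the header, the grep itself, and the o2hb threads whose names happen to contain one. Below is a minimal sketch in Python of a stricter filter, assuming the 13-column layout that ps axl printed here. The function name uninterruptible is made up for illustration. It keeps only rows whose STAT field starts with D, and it tolerates rows where the command was wrapped onto the next line.

```python
def uninterruptible(ps_axl_text):
    """Return (pid, wchan, command) for rows whose STAT starts with 'D'.

    Assumes the column order printed by `ps axl`:
    F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
    """
    rows = []
    for line in ps_axl_text.splitlines():
        # At most 13 fields, so COMMAND keeps its internal spaces.
        fields = line.split(None, 12)
        # Skip the header, blank lines, and wrapped command fragments.
        if len(fields) < 12 or not fields[2].isdigit():
            continue
        if fields[9].startswith("D"):
            command = fields[12] if len(fields) > 12 else ""
            rows.append((int(fields[2]), fields[8], command))
    return rows


# The header and three rows taken from the listing above.
sample = """\
F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
1 0 1697 2 0 -20 0 0 msleep S< ? 1:33 [o2hb-DBAAFEC1F3]
1 0 17532 2 20 0 0 0 jbd2_j D ? 0:01 [jbd2/sdg-65]
0 500 1502 23156 20 0 103308 840 - R+ pts/3 0:00 grep D
"""

for pid, wchan, command in uninterruptible(sample):
    print(pid, wchan, command)
```

On this sample only the jbd2 thread survives the filter. The WCHAN names in the listing are cut off after six characters (ocfs2_, start_), so they only hint at where each process is blocked. With procps, something like ps -eo pid,stat,wchan:32,args should print wider names, though that has not been tried on these servers.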
Does anyone have tips to help debug these issues?
Adelino
On Mon, Apr 9, 2012 at 7:23 PM, Adelino Monteiro
<adelino.monteiro at gmail.com> wrote:
> Hello,
>
> The slowdowns are new, but we are also only now beginning to have more
> reads; before, we were mainly filling up the filesystem. Most of the
> files (I would say 95%) are not changed after copying; they are only
> read. Here is a simple df -h from the mounted partitions:
>
> /dev/sdd 14T 3.2T 11T 24% /mnt/3
> /dev/sde 14T 1.2T 13T 9% /mnt/4
> /dev/sdh 14T 2.8T 11T 21% /mnt/7
> /dev/sdb 14T 13T 1.5T 90% /mnt/1
> /dev/sdf 14T 11T 3.0T 79% /mnt/5
>
> As you can see there is only one that is 90% full but the problems are
> on all of them now.
>
> To see whether the problem was related to the partition, we copied
> the contents (a simple cp /mnt/6/* /mnt/5) from one partition to
> another and, surprisingly or not, the issue also occurs on the
> "new" partition.
>
> I just tried to copy 240 MB to this partition: the copy stalled for
> 9 minutes, then resumed, and 30 seconds later everything was copied.
>
> The hardware this runs on is a DELL MD 3200 in a VMware ESX 5 environment.
>
> I would love to give you some numbers; just let me know the commands I
> need to run.
>
> Adelino
>
> On Mon, Apr 9, 2012 at 5:35 PM, Joel Becker <jlbec at evilplan.org> wrote:
>> On Thu, Apr 05, 2012 at 05:11:59PM +0100, Adelino Monteiro wrote:
>>> Hello all,
>>>
>>> For 4 months now I have been using OCFS in an environment with 7
>>> partitions of 14 TB each, running Oracle Linux 6.2, and until last
>>> week everything was fine.
>>> Now, however, we're running into severe performance problems when doing
>>> simple copies.
>>>
>>> I have one of the 7 partitions mounted RW on one server and RO on 4
>>> other servers. I did a simple cp of various files on the RW server,
>>> and during that copy the process went into D state and even a simple
>>> df, for instance, blocked. It took minutes for something that should
>>> be immediate. This is happening on any of those partitions.
>>
>> Hey Adelino,
>> I'd love to understand your problems. You say you've been
>> running these systems for four months. Are the slowdowns new?
>> Was anything happening on the RO servers at the time?
>> Especially touching the same files or directories? How full are the
>> filesystems? How much change do they have (that is, are the files
>> long-lived or constantly being deleted and created)?
>>
>> Joel
>>
>> --
>>
>> "Also, all of life's big problems include the words 'indictment' or
>> 'inoperable.' Everything else is small stuff."
>> - Alton Brown
>>
>> http://www.jlbec.org/
>> jlbec at evilplan.org
>
>
>
> --
> Cumprimentos / Best Regards
>
> Adelino Monteiro
--
Cumprimentos / Best Regards
Adelino Monteiro