[Ocfs2-users] High load average on Apache Cluster with drbd + ocfs2

Joel Becker Joel.Becker at oracle.com
Wed Mar 3 03:26:04 PST 2010


On Wed, Mar 03, 2010 at 09:47:08PM +1100, Brad Plant wrote:
> Turns out I'd hit a free space fragmentation problem. While df reported I had heaps of free space (>50% from memory!), I couldn't write (echo >>) to the log files on the problem web server. Note that you'll find you can still create small files and append to small files, but not the larger apache log files.
> 
> The fact that it happens late at night was very confusing, but eventually made sense. As the day goes on, the log files get bigger and bigger pieces of contiguous free space are required to extend the file. Eventually, a contiguous piece of free space cannot be found and your writes will start to fail.

	That is my assumption as well, which is why I asked for the
debugfs output.  The output will give us layout information.
	There are actually two fragmentation issues here.  First, free
space is fragmented.  This prevents us from allocating metadata blocks.
These metadata blocks are used in the housekeeping of files with many
data extents.  This fits your description of lots of free space, yet
files are unable to grow.
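	A toy sketch of the first issue, in Python.  It is not ocfs2's
allocator; the bitmap, the helper names, and the "needed" sizes are all
hypothetical.  It only shows how half the disk can be free while no
contiguous run is large enough to satisfy an allocation.

```python
def free_runs(bitmap):
    """Return the length of each contiguous run of free (0) clusters."""
    runs, current = [], 0
    for bit in bitmap:
        if bit == 0:
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def can_allocate(bitmap, needed):
    """True if some free run holds `needed` contiguous clusters."""
    return any(run >= needed for run in free_runs(bitmap))


# Hypothetical bitmap: every other cluster is in use, so 50% is free.
bitmap = [1, 0] * 8
runs = free_runs(bitmap)
total_free = sum(runs)      # what a df-style count would report
largest_run = max(runs)     # what a contiguous allocation can actually use

print(f"free clusters: {total_free}, largest contiguous run: {largest_run}")
print("can allocate 4 contiguous clusters:", can_allocate(bitmap, 4))
```

	In this model a df-style count sees eight free clusters, yet a
request for four contiguous clusters fails because no run is longer
than one.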
	The second issue is fragmentation of the actual file.  Log files
are written in small hunks.  As each node takes turns extending the log
file, the hunks come from disparate parts of the disk.  This means the
log files have lots of small extents instead of a few large ones.  This
is why we recommend separate log files for each node.  Separate log
files would grow in a much more contiguous fashion, allowing fewer,
larger extents.  With fewer extents, they would need fewer metadata
blocks to grow.
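
	The second issue can be sketched the same way.  This is a
simplified model, not the real allocation path: it assumes each node
hands out clusters from its own region of the disk (the starting
offsets 0 and 1000 are made up), that every extend adds one cluster,
and that physically adjacent clusters merge into one extent.  The file
names are hypothetical.

```python
def count_extents(clusters):
    """Count runs of physically adjacent clusters, in file order."""
    extents = 0
    prev = None
    for c in clusters:
        if prev is None or c != prev + 1:
            extents += 1
        prev = c
    return extents


def simulate(writes, shared):
    """Two nodes take turns extending a log; return extents per file."""
    next_free = {0: 0, 1: 1000}  # hypothetical per-node allocation cursors
    files = {}
    for i in range(writes):
        node = i % 2  # nodes alternate
        name = "access.log" if shared else f"access.log.node{node}"
        files.setdefault(name, []).append(next_free[node])
        next_free[node] += 1
    return {name: count_extents(cl) for name, cl in files.items()}


print("shared log:   ", simulate(10, shared=True))
print("per-node logs:", simulate(10, shared=False))
```

	Under these assumptions the shared log ends up with one extent
per write, while each per-node log stays a single extent.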

> A *partial* fix went into 2.6.33. It's partial because it doesn't fix the free space fragmentation issue but rather allows the problem node to steal some free space from the node that is still ok. All it does is prolong the problem a little such that writes will start to fail on both nodes at the same time.

	This alleviates the first problem somewhat, as you point out.
It does nothing for the second problem.
	We have code coming that will help files like these allocate
larger extents, reducing their fragmentation.  However, it will never
solve the alternating extend problem that comes from having both nodes
extend the same log file.

Joel

-- 

"Reality is merely an illusion, albeit a very persistent one."
        - Albert Einstein

Joel Becker
Principal Software Developer
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127


