[Ocfs2-users] ENOSPC
David Johle
djohle at industrialinfo.com
Tue Mar 23 17:31:19 PDT 2010
In light of prior issues with lock contention caused by writing Apache
logs to shared files, I have started storing them locally on each node.
I wrote a script to combine them nightly, before the statistics
generator kicks off for the previous day's traffic analysis.
This script, using logresolvemerge.pl, writes its output back to the
shared volume for easy reference later. I figured I would not have
issues with this, as it's a large amount of sequential writes from a
single node at an off-peak time. However, it has been hanging, with
the merger using a lot of CPU.
I'm pretty sure I'm running into the famous "free space
fragmentation" problem, but wanted to confirm that this was the case
or see if there was additional troubleshooting I can do.
Here's the disk, plenty of overall free space:
Filesystem          1K-blocks     Used Available Use% Mounted on
/dev/mapper/mpath1  209725440 85311460 124413980  41% /san/live-websites
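
The df figures above are aggregate block counts, so they cannot show
whether the free space is contiguous. A minimal sketch of reading the
same counters from Python follows. The function names are hypothetical,
and the rounding up of Use% is an assumption based on how GNU df appears
to behave.

```python
import os


def use_percent(used_kb, avail_kb):
    """Use% as used / (used + available), rounded up to a whole percent.

    Rounding up is an assumption about how df presents the figure.
    """
    total = used_kb + avail_kb
    if total == 0:
        return 0
    return (used_kb * 100 + total - 1) // total


def space_summary(path):
    """Return (total_kb, used_kb, avail_kb) for the filesystem holding path.

    These are totals only: they do not say how fragmented the free
    space is, so a write can still fail while 'avail_kb' looks healthy.
    """
    st = os.statvfs(path)
    total_kb = st.f_blocks * st.f_frsize // 1024
    used_kb = (st.f_blocks - st.f_bfree) * st.f_frsize // 1024
    avail_kb = st.f_bavail * st.f_frsize // 1024
    return total_kb, used_kb, avail_kb


if __name__ == "__main__":
    total_kb, used_kb, avail_kb = space_summary("/san/live-websites")
    print(total_kb, used_kb, avail_kb, f"{use_percent(used_kb, avail_kb)}%")
```

With the numbers from the df output, use_percent(85311460, 124413980)
gives 41, which matches the Use% column.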
While the merge was using 100% of a CPU core, the merged file was not
growing and very little I/O was actually reaching the shared volume,
so I ran strace to see what it was doing and got this:
# strace -p 16844
Process 16844 attached - interrupt to quit
read(3, "1\" 200 936 \"http://www.industria"..., 4096) = 4096
write(1, ".NET CLR 1.1.4322; .NET CLR 2.0."..., 4096) = -1 ENOSPC (No space left on device)
read(4, "oration&locationName=South+Jerse"..., 4096) = 4096
write(1, "ivers=8&ngPipelines=600&kvtl230="..., 4096) = -1 ENOSPC (No space left on device)
read(4, "1\" 200 936 \"http://www.industria"..., 4096) = 4096
write(1, "gan+Boulevard&locationCSZ=Salem%"..., 4096) = -1 ENOSPC (No space left on device)
read(3, "HTTP/1.0\" 200 4096 \"-\" \"WinampMP"..., 4096) = 4096
write(1, "elta=.375&zoomlevel=6&label=Sout"..., 4096) = -1 ENOSPC (No space left on device)
read(4, "HTTP/1.0\" 200 4096 \"-\" \"WinampMP"..., 4096) = 4096
write(1, "ident/4.0; .NET CLR 1.1.4322; .N"..., 4096) = -1 ENOSPC (No space left on device)
read(3, "0 36516 \"-\" \"Mozilla/5.0 (compat"..., 4096) = 4096
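
The trace shows the merger carrying on with reads and writes after
every write(1, ...) returns ENOSPC, which would account for a busy CPU
and an output file that does not grow. A minimal sketch of a copy loop
that stops at the first ENOSPC instead follows. It is in Python rather
than the Perl of logresolvemerge.pl, and the names OutOfSpace,
write_all and copy_stream are hypothetical, not part of any existing
tool.

```python
import errno
import os


class OutOfSpace(Exception):
    """Raised when the destination reports ENOSPC."""


def write_all(fd, data):
    """Write all of data to fd, handling short writes.

    ENOSPC is turned into OutOfSpace so the caller stops at once
    instead of retrying; any other error is re-raised unchanged.
    """
    view = memoryview(data)
    while view:
        try:
            written = os.write(fd, view)
        except OSError as exc:
            if exc.errno == errno.ENOSPC:
                raise OutOfSpace(os.strerror(errno.ENOSPC)) from exc
            raise
        view = view[written:]


def copy_stream(src_fd, dst_fd, chunk_size=4096):
    """Copy src_fd to dst_fd in chunks; return the number of bytes copied."""
    total = 0
    while True:
        chunk = os.read(src_fd, chunk_size)
        if not chunk:  # end of input
            return total
        write_all(dst_fd, chunk)
        total += len(chunk)
```

A nightly wrapper could catch OutOfSpace, leave the per-node logs in
place, and send an alert, so the job fails quickly rather than spinning.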
Now I'm really worried about cluster stability, since other routine
writes might start failing soon. I know the typical workaround is to
reduce the number of node slots, but I don't have any excess slots to
spare. Are there any other tricks to reduce free space fragmentation?