[Ocfs2-users] Hangups
Andy Herrero
andy at internetborder.se
Wed Nov 16 03:36:37 PST 2011
Hi all!
I have a four-node OCFS2-kluster running RHEL 5.7 (x86_64) on ProLiants DL360-G5's
with 6 GB RAM each. Two webservers running Apache and two MySQL-servers. I've
shut down OCFS2 on the DB-servers since it's never really been used there, so it's only
live on the webservers.
OCFS2 is a 14-disk RAID-10 connected via dedicated QLogic iSCSI-NICs via a Gigabit
switch. The serves are connected to the same switch on a separate VLAN for the
heartbeat. They also have dedicated Gbps-NICs for frontend and backend traffic.
The problem is that last couple of weeks the write-performance has intermittently
slowed down to a crawl and right now the system is pretty much unusable. httpd has
its DocumentRoot on /san and these processes often go into D-state. "iostat -dmx 1"
often reports ~99 %util and writing anything hangs disturbingly often. My biggest
problem right now is that I've got very little to work with (that is no juicy kernel-panics
and such) and the only OCFS-related stuff in the logs look like this:
---CUT---
Nov 14 14:13:49 web02 kernel: INFO: task httpd:3959 blocked for more than 120 seconds.
Nov 14 14:13:49 web02 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 14 14:13:49 web02 kernel: httpd D ffff810001004420 0 3959 3149 3962
3954 (NOTLB)
Nov 14 14:13:49 web02 kernel: ffff81018c407ea8 0000000000000082 ffff81019d600140 ffffffff8875b54b
Nov 14 14:13:49 web02 kernel: ffff810184210000 000000000000000a ffff81018eb957a0 ffffffff80314b60
Nov 14 14:13:49 web02 kernel: 000000b90056ee85 000000000018d3ad ffff81018eb95988 0000000000000003
Nov 14 14:13:49 web02 kernel: Call Trace:
Nov 14 14:13:49 web02 kernel: [<ffffffff8875b54b>] :ocfs2:ocfs2_inode_lock_full+0x5fb/0xfe2
Nov 14 14:13:49 web02 kernel: [<ffffffff8001b035>] cp_new_stat+0xe5/0xfd
Nov 14 14:13:49 web02 kernel: [<ffffffff80063c53>] __mutex_lock_slowpath+0x60/0x9b
Nov 14 14:13:49 web02 kernel: [<ffffffff80063c9d>] .text.lock.mutex+0xf/0x14
Nov 14 14:13:49 web02 kernel: [<ffffffff80013f1b>] generic_file_llseek+0x2a/0x8b
Nov 14 14:13:49 web02 kernel: [<ffffffff80025788>] sys_lseek+0x40/0x60
Nov 14 14:13:49 web02 kernel: [<ffffffff8005d116>] system_call+0x7e/0x83
---CUT---
Also, I see these kind of errors regarding eth4 (heartbeat-network):
# grep 'e1000e.*Detected' /var/log/messages
Nov 13 13:25:00 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 14 09:10:55 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 14 21:54:49 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 14 21:58:35 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 15 01:42:33 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 15 13:34:01 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 15 20:55:24 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
Nov 16 02:42:55 web02 kernel: e1000e 0000:0b:00.0: eth4: Detected Hardware Unit Hang:
However, they happen relatively seldom and I've tried flood-pinging hosts over that VLAN and
I've never experienced timeouts nor packet-loss so really don't think that's the issue.
I managed to catch an strace running a dd to the drive and it hanged during the 8th 10MB-write for
approx 1 minute and 30 seconds. Things are looking surprisingly cheerful right now as only maybe
1 out of 10 tries hangs like this.
---CUT---
# strace dd if=/dev/zero of=/san/testfile bs=1M count=10
...
open("/san/testfile", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 1
...
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576<APPROX 1,5
MIN DELAY>) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
read(0, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
write(1, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 1048576) = 1048576
close(0) = 0
close(1) = 0
...
write(2, "10+0 records in\n10+0 records out"..., 3310+0 records in
10+0 records out
) = 33
write(2, "10485760 bytes (10 MB) copied", 2910485760 bytes (10 MB) copied) = 29
write(2, ", 86.8347 seconds, 121 kB/s\n", 28, 86.8347 seconds, 121 kB/s
) = 28
...
---CUT---
I've made sure the disk-enclosure is OK, running latest firmware and experiencing no errors,
no other NIC-related problems other than the one mentioned above (eth4).
I realize that there's not much pointing towards OCFS having problems but I cannot for the life
of me comprehend the nature of this problem. If nothing else, I'd appreciate any help in determining
that OCFS can be eliminated from the equation! I've added some more information below. Please
let me know if you need any additional information!
Thanks in advance,
-- Andy Herrero
# uname -a
Linux web01 2.6.18-274.7.1.el5 #1 SMP Mon Oct 17 11:57:14 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
# rpm -qa | grep ocfs
ocfs2-tools-1.4.4-1.el5
ocfs2-2.6.18-238.9.1.el5-1.4.7-1.el5
ocfs2-2.6.18-274.el5-1.4.7-1.el5
ocfs2-2.6.18-274.7.1.el5-1.4.7-1.el5
# cat /etc/ocfs2/cluster.conf
cluster:
name = ocfs2
node_count = 4
node:
name = web01
cluster = ocfs2
number = 0
ip_address = 10.60.2.11
ip_port = 7777
node:
name = web02
cluster = ocfs2
number = 1
ip_address = 10.60.2.12
ip_port = 7777
node:
name = db01
cluster = ocfs2
number = 2
ip_address = 10.60.2.13
ip_port = 7777
node:
name = db02
cluster = ocfs2
number = 3
ip_address = 10.60.2.14
ip_port = 7777
# cat /etc/sysconfig/o2cb
#
# This is a configuration file for automatic startup of the O2CB
# driver. It is generated by running /etc/init.d/o2cb configure.
# On Debian based systems the preferred method is running
# 'dpkg-reconfigure ocfs2-tools'.
#
# O2CB_ENABLED: 'true' means to load the driver on boot.
O2CB_ENABLED=true
# O2CB_STACK: The name of the cluster stack backing O2CB.
O2CB_STACK=o2cb
# O2CB_BOOTCLUSTER: If not empty, the name of a cluster to start.
O2CB_BOOTCLUSTER=ocfs2
# O2CB_HEARTBEAT_THRESHOLD: Iterations before a node is considered dead.
O2CB_HEARTBEAT_THRESHOLD=61
# O2CB_IDLE_TIMEOUT_MS: Time in ms before a network connection is considered dead.
O2CB_IDLE_TIMEOUT_MS=
# O2CB_KEEPALIVE_DELAY_MS: Max time in ms before a keepalive packet is sent
O2CB_KEEPALIVE_DELAY_MS=
# O2CB_RECONNECT_DELAY_MS: Min time in ms between connection attempts
O2CB_RECONNECT_DELAY_MS=
# mounted.ocfs2 -f
Device FS Nodes
/dev/sda1 ocfs2 web01, web02
# df -h /dev/sda1
Filesystem Size Used Avail Use% Mounted on
/dev/sda1 6.4T 5.4T 1001G 85% /san
# df -i /dev/sda1
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/sda1 1708654583 1446362212 262292371 85% /san
# debugfs.ocfs2 -R "ls -l //" /dev/sda1
6 drwxr-xr-x 7 0 0 4096 17-Dec-2008 16:17 .
6 drwxr-xr-x 7 0 0 4096 17-Dec-2008 16:17 ..
7 -rw-r--r-- 1 0 0 0 17-Dec-2008 16:17 bad_blocks
8 -rw-r--r-- 1 0 0 831488 17-Dec-2008 16:17 global_inode_alloc
9 -rw-r--r-- 1 0 0 4096 17-Dec-2008 16:47 slot_map
10 -rw-r--r-- 1 0 0 1048576 17-Dec-2008 16:17 heartbeat
11 -rw-r--r-- 1 0 0 6998649171968 17-Dec-2008 16:17 global_bitmap
12 drwxr-xr-x 2 0 0 221184 16-Nov-2011 11:08 orphan_dir:0000
13 drwxr-xr-x 2 0 0 32768 16-Nov-2011 12:01 orphan_dir:0001
14 drwxr-xr-x 2 0 0 45056 14-Nov-2011 13:45 orphan_dir:0002
15 drwxr-xr-x 2 0 0 73728 14-Nov-2011 17:07 orphan_dir:0003
16 drwxr-xr-x 2 0 0 4096 17-Dec-2008 16:17 orphan_dir:0004
17 -rw-r--r-- 1 0 0 310378496 17-Dec-2008 16:17 extent_alloc:0000
18 -rw-r--r-- 1 0 0 62914560 17-Dec-2008 16:17 extent_alloc:0001
19 -rw-r--r-- 1 0 0 243269632 17-Dec-2008 16:17 extent_alloc:0002
20 -rw-r--r-- 1 0 0 146800640 17-Dec-2008 16:17 extent_alloc:0003
21 -rw-r--r-- 1 0 0 0 17-Dec-2008 16:17 extent_alloc:0004
22 -rw-r--r-- 1 0 0 1044381696 17-Dec-2008 16:17 inode_alloc:0000
23 -rw-r--r-- 1 0 0 792723456 17-Dec-2008 16:17 inode_alloc:0001
24 -rw-r--r-- 1 0 0 1132462080 17-Dec-2008 16:17 inode_alloc:0002
25 -rw-r--r-- 1 0 0 1040187392 17-Dec-2008 16:17 inode_alloc:0003
26 -rw-r--r-- 1 0 0 0 17-Dec-2008 16:17 inode_alloc:0004
27 -rw-r--r-- 1 0 0 268435456 17-Dec-2008 16:40 journal:0000
28 -rw-r--r-- 1 0 0 268435456 17-Dec-2008 16:40 journal:0001
29 -rw-r--r-- 1 0 0 268435456 17-Dec-2008 16:40 journal:0002
30 -rw-r--r-- 1 0 0 268435456 17-Dec-2008 16:40 journal:0003
31 -rw-r--r-- 1 0 0 268435456 17-Dec-2008 16:40 journal:0004
32 -rw-r--r-- 1 0 0 0 17-Dec-2008 16:17 local_alloc:0000
33 -rw-r--r-- 1 0 0 0 17-Dec-2008 16:17 local_alloc:0001
34 -rw-r--r-- 1 0 0 0 17-Dec-2008 16:17 local_alloc:0002
35 -rw-r--r-- 1 0 0 0 17-Dec-2008 16:17 local_alloc:0003
36 -rw-r--r-- 1 0 0 0 17-Dec-2008 16:17 local_alloc:0004
37 -rw-r--r-- 1 0 0 0 17-Dec-2008 16:17 truncate_log:0000
38 -rw-r--r-- 1 0 0 0 17-Dec-2008 16:17 truncate_log:0001
39 -rw-r--r-- 1 0 0 0 17-Dec-2008 16:17 truncate_log:0002
40 -rw-r--r-- 1 0 0 0 17-Dec-2008 16:17 truncate_log:0003
41 -rw-r--r-- 1 0 0 0 17-Dec-2008 16:17 truncate_log:0004
# hdparm -tT /dev/sda1
/dev/sda1:
Timing cached reads: 22396 MB in 2.00 seconds = 11225.69 MB/sec
Timing buffered disk reads: 238 MB in 3.00 seconds = 79.32 MB/sec
More information about the Ocfs2-users
mailing list