[Ocfs2-tools-devel] ocfs2-tools-o2cb-1.8.2 - critical issue o2cb
Eugene Istomin
E.Istomin at edss.ee
Fri Mar 15 10:26:39 PDT 2013
OK, thanks anyway.
A couple of hours ago I cloned git master and built RPMs with as few patches
as possible.
Here is the project:
https://build.opensuse.org/package/show?package=ocfs2-tools&project=home%3Aedssvirt%3Abranches%3Anetwork%3Aha-clustering
Here is the build log:
https://build.opensuse.org/package/live_build_log?arch=x86_64&package=ocfs2-tools&project=home%3Aedssvirt%3Abranches%3Anetwork%3Aha-clustering&repository=openSUSE_12.2
Here are the RPMs:
http://download.opensuse.org/repositories/home:/edssvirt:/branches:/network:/ha-clustering/openSUSE_12.2/x86_64/
The results stay the same:
# cat /etc/ocfs2/cluster.conf
cluster:
heartbeat_mode = global
node_count = 3
name = storage2
node:
number = 0
cluster = storage2
ip_port = 7777
ip_address = 10.251.2.11
name = tsc-hv01
node:
number = 1
cluster = storage2
ip_port = 7777
ip_address = 10.251.2.12
name = tsc-hv02
node:
number = 2
cluster = storage2
ip_port = 7777
ip_address = 10.251.2.13
name = tsc-hv03
heartbeat:
cluster = storage2
region = 68677F18B1654877BB92D78D400E7E51
# o2cb -vvv add-node storage2 tsc-hv04 --ip 10.251.2.14
Using config file '/etc/ocfs2/cluster.conf'
Add node 'tsc-hv04' in cluster 'storage2' having ip '10.251.2.14', port '-1'
and number '-1'
Validated IP address '10.251.2.14'
Validated node number '1' <--- so strange
Added node 'tsc-hv04' in cluster 'storage2' having ip '10.251.2.14', port
'7777' and number '1'
# o2cb -vvv add-node storage2 tsc-hv05 --ip 10.251.2.15 --number 6
Using config file '/etc/ocfs2/cluster.conf'
Add node 'tsc-hv05' in cluster 'storage2' having ip '10.251.2.15', port '-1'
and number '6'
Validated IP address '10.251.2.15'
Validated node number '6' <-- ok here
Added node 'tsc-hv05' in cluster 'storage2' having ip '10.251.2.15', port
'7777' and number '6'
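The node numbers assigned throughout this thread are consistent with o2cb picking the lowest number not already present in the config it has parsed. That is an inference from the transcripts here, not something taken from the o2cb source. Under that assumption, a tool that only sees node 0 would hand out number 1 even though nodes 1 and 2 already exist in the file, which matches the "so strange" result above. A minimal Python sketch; `lowest_free_node_number` and the `max_nodes` bound are hypothetical names and values used only for illustration:

```python
def lowest_free_node_number(used, max_nodes=255):
    """Return the smallest non-negative integer not present in `used`.

    `max_nodes` is an assumed upper bound for illustration only.
    """
    taken = set(used)
    for candidate in range(max_nodes):
        if candidate not in taken:
            return candidate
    raise ValueError("no free node number left")


# All three existing nodes are seen: the next number is 3.
print(lowest_free_node_number({0, 1, 2}))  # 3

# Only the first node stanza is seen: the next number collides with node 1.
print(lowest_free_node_number({0}))  # 1
```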
cat /etc/ocfs2/cluster.conf
cluster:
heartbeat_mode = global
node_count = 5
name = storage2
node:
number = 0
cluster = storage2
ip_port = 7777
ip_address = 10.251.2.11
name = tsc-hv01
heartbeat:
cluster = storage2
region = 68677F18B1654877BB92D78D400E7E51
Can anyone help us?
--
Best regards,
Eugene Istomin
Senior System Administrator
EDS Systems
E.Istomin at edss.ee
Work: +372-640-96-01
On Friday 15 March 2013 08:30:55 Sunil Mushran wrote:
This is a tool issue. Not kernel.
Did you try building 1.8.2 from oss.oracle.com/git? It was working fine when I
last worked on it.
Maybe someone else on this list can assist you further. Specifically someone
needs to put a
breakpoint in o2cb_config_store() as that is where we issue the write. Could be
it is not being
called. The flow is simple enough.... and looks correct on the git tree.
On Fri, Mar 15, 2013 at 1:24 AM, Eugene Istomin <E.Istomin at edss.ee> wrote:
Sunil,
I use packages from
https://build.opensuse.org/package/show?project=network%3Aha-clustering&package=ocfs2-tools
Here is the compile log:
https://build.opensuse.org/package/rawlog?arch=x86_64&package=ocfs2-tools&project=network%3Aha-clustering&repository=openSUSE_12.2
I spent 4 hours yesterday trying different compilation variants to rule out
problems with Linux kernel and package versions; all results are the same.
--
Best regards,
Eugene Istomin
Senior System Administrator
EDS Systems
E.Istomin at edss.ee
Work: +372-640-96-01
On Friday 15 March 2013 10:14:23 Eugene Istomin wrote:
Hello Sunil,
Here are the steps to reproduce this issue:
1) Delete current conf
# rm /etc/ocfs2/cluster.conf
2) Create cluster & autocreate conf
# /tmp/o2cb-1.8.2 -vvv add-cluster storage
3) # cat /etc/ocfs2/cluster.conf
cluster:
heartbeat_mode = local
node_count = 0
name = storage
4) Adding 2 nodes using o2cb 1.8.2
# /tmp/o2cb-1.8.2 -vvv add-node storage tsc-hv01 --ip 10.251.2.11
Using config file '/etc/ocfs2/cluster.conf'
Add node 'tsc-hv01' in cluster 'storage' having ip '10.251.2.11', port '-1'
and number '-1'
Validated IP address '10.251.2.11'
Validated node number '0'
Added node 'tsc-hv01' in cluster 'storage' having ip '10.251.2.11', port
'7777' and number '0'
# /tmp/o2cb-1.8.2 -vvv add-node storage tsc-hv02 --ip 10.251.2.12
Using config file '/etc/ocfs2/cluster.conf'
Add node 'tsc-hv02' in cluster 'storage' having ip '10.251.2.12', port '-1'
and number '-1'
Validated IP address '10.251.2.12'
Validated node number '1'
Added node 'tsc-hv02' in cluster 'storage' having ip '10.251.2.12', port
'7777' and number '1'
5) Checking conf for nodes
# cat /etc/ocfs2/cluster.conf
cluster:
heartbeat_mode = local
node_count = 2
name = storage
node:
number = 0
cluster = storage
ip_port = 7777
ip_address = 10.251.2.11
name = tsc-hv01
6) Let's try 1.8.0
# /tmp/o2cb-1.8.0 -vvv add-node storage tsc-hv03 --ip 10.251.2.13
Using config file '/etc/ocfs2/cluster.conf'
Add node 'tsc-hv03' in cluster 'storage' having ip '10.251.2.13', port '-1'
and number '-1'
Validated IP address '10.251.2.13'
Validated node number '1'
Added node 'tsc-hv03' in cluster 'storage' having ip '10.251.2.13', port
'7777' and number '1'
# /tmp/o2cb-1.8.0 -vvv add-node storage tsc-hv04 --ip 10.251.2.14
Using config file '/etc/ocfs2/cluster.conf'
Add node 'tsc-hv04' in cluster 'storage' having ip '10.251.2.14', port '-1'
and number '-1'
Validated IP address '10.251.2.14'
Validated node number '2'
Added node 'tsc-hv04' in cluster 'storage' having ip '10.251.2.14', port
'7777' and number '2'
7) Checking conf for nodes
# cat /etc/ocfs2/cluster.conf
cluster:
heartbeat_mode = local
node_count = 4
name = storage
node:
number = 0
cluster = storage
ip_port = 7777
ip_address = 10.251.2.11
name = tsc-hv01
node:
number = 1
cluster = storage
ip_port = 7777
ip_address = 10.251.2.13
name = tsc-hv03
node:
number = 2
cluster = storage
ip_port = 7777
ip_address = 10.251.2.14
name = tsc-hv04
--
Best regards,
Eugene Istomin
Senior System Administrator
EDS Systems
E.Istomin at edss.ee
Work: +372-640-96-01
On Thursday 14 March 2013 20:22:42 Sunil Mushran wrote:
So you are saying 1.8.2 is broken. Enable verbose tracing. That may tell us
more.
Do "o2cb -vvv add-node ..." to enable verbose tracing.
On Thu, Mar 14, 2013 at 2:49 PM, Eugene Istomin <E.Istomin at edss.ee> wrote:
OK, let me explain the problem.
1) # cat /etc/ocfs2/cluster.conf
cluster:
heartbeat_mode = global
node_count = 1
name = storage
node:
number = 1
cluster = storage
ip_port = 7777
ip_address = 10.251.2.11
name = tsc-hv01
2) o2cb 1.8.0
# ./o2cb-1.8.0 -V
o2cb-1.8.0 1.8.0
# ./o2cb-1.8.0 add-node storage tsc-hv02 --ip 10.251.2.12
# ./o2cb-1.8.0 add-node storage tsc-hv03 --ip 10.251.2.13
# cat /etc/ocfs2/cluster.conf
cluster:
heartbeat_mode = global
node_count = 3
name = storage
node:
number = 1
cluster = storage
ip_port = 7777
ip_address = 10.251.2.11
name = tsc-hv01
node:
number = 0
cluster = storage
ip_port = 7777
ip_address = 10.251.2.12
name = tsc-hv02
node:
number = 2
cluster = storage
ip_port = 7777
ip_address = 10.251.2.13
name = tsc-hv03
Seems ok
3) o2cb 1.8.2
# ./o2cb-1.8.2 add-node storage tsc-hv04 --ip 10.251.2.14
# ./o2cb-1.8.2 add-node storage tsc-hv05 --ip 10.251.2.15
# cat /etc/ocfs2/cluster.conf
cluster:
heartbeat_mode = global
node_count = 5
name = storage
node:
number = 1
cluster = storage
ip_port = 7777
ip_address = 10.251.2.11
name = tsc-hv01
All the other node records have disappeared. strace shows that the valid
config is opened, but the new conf consists of only the first node in the list
(although node_count is incremented).
I tried the latest git; the results are the same.
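The symptom described above (node_count incremented while the node stanzas vanish) can be checked mechanically. The following is a minimal Python sketch, assuming the stanza layout shown in this thread: a `name:` header line followed by `key = value` lines. The function names are hypothetical and this is not how o2cb itself parses the file.

```python
def parse_cluster_conf(text):
    """Parse cluster.conf text into a list of (stanza_name, {key: value}) pairs."""
    stanzas = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines separate stanzas
        if "=" in line:
            if not stanzas:
                raise ValueError(f"key outside any stanza: {line!r}")
            key, _, value = line.partition("=")
            stanzas[-1][1][key.strip()] = value.strip()
        elif line.endswith(":"):
            stanzas.append((line[:-1], {}))
        else:
            raise ValueError(f"unrecognised line: {line!r}")
    return stanzas


def node_count_mismatch(text):
    """Return (declared, actual) when node_count disagrees with the node stanzas, else None."""
    stanzas = parse_cluster_conf(text)
    declared = next(int(kv["node_count"]) for name, kv in stanzas if name == "cluster")
    actual = sum(1 for name, _ in stanzas if name == "node")
    return None if declared == actual else (declared, actual)


# The file as left behind by o2cb 1.8.2 in the report above.
BROKEN = """\
cluster:
heartbeat_mode = global
node_count = 5
name = storage

node:
number = 1
cluster = storage
ip_port = 7777
ip_address = 10.251.2.11
name = tsc-hv01
"""

print(node_count_mismatch(BROKEN))  # (5, 1)
```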
--
Best regards,
Eugene Istomin
Senior System Administrator
EDS Systems
E.Istomin at edss.ee
Work: +372-640-96-01
On Thursday 14 March 2013 14:41:18 Sunil Mushran wrote:
add-node only adds to the local config file. You have to do add-node on all
nodes... followed by register-cluster on all nodes.
Until that is done, the cluster will refuse to mount new volumes on any node.
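The procedure described above (add-node on every node, then register-cluster on every node) can be written down as a per-host command plan. A minimal Python sketch that only builds the command lines and does not run anything; how the commands reach each host (ssh, configuration management) is deliberately left out, and the function names are hypothetical:

```python
import shlex


def membership_commands(cluster, node_name, ip):
    """Commands to run on ONE host so that it learns about a new node."""
    return [
        ["o2cb", "add-node", cluster, node_name, "--ip", ip],
        ["o2cb", "register-cluster", cluster],
    ]


def plan_for_hosts(hosts, cluster, node_name, ip):
    """Map every host to the same pair of commands; each host edits its own local config."""
    return {host: membership_commands(cluster, node_name, ip) for host in hosts}


plan = plan_for_hosts(["tsc-hv01", "tsc-hv02"], "storage", "tsc-hv03", "10.251.2.13")
for host, commands in plan.items():
    for argv in commands:
        print(f"{host}: {shlex.join(argv)}")
```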
On Thu, Mar 14, 2013 at 2:37 PM, Eugene Istomin <E.Istomin at edss.ee> wrote:
Additional log:
tsc-hv01:/tmp # o2cb add-node storage tsc-hv03 --ip 10.251.2.13
tsc-hv01:/tmp # /tmp/o2cb list-nodes --oneline storage
node: 1 tsc-hv01 10.251.2.11:7777 storage
node: 2 tsc-hv02 10.251.2.12:7777 storage
node: 0 tsc-hv03 10.251.2.13:7777 storage
tsc-hv01:/tmp # o2cb add-node storage tsc-hv04 --ip 10.251.2.14
tsc-hv01:/tmp # /tmp/o2cb list-nodes --oneline storage
node: 1 tsc-hv01 10.251.2.11:7777 storage
node: 2 tsc-hv02 10.251.2.12:7777 storage
node: 0 tsc-hv03 10.251.2.13:7777 storage
node: 3 tsc-hv04 10.251.2.14:7777 storage
# cat /etc/ocfs2/cluster.conf
cluster:
heartbeat_mode = global
node_count = 4
name = storage
node:
number = 1
cluster = storage
ip_port = 7777
ip_address = 10.251.2.11
name = tsc-hv01
node:
number = 2
cluster = storage
ip_port = 7777
ip_address = 10.251.2.12
name = tsc-hv02
node:
number = 0
cluster = storage
ip_port = 7777
ip_address = 10.251.2.13
name = tsc-hv03
node:
number = 3
cluster = storage
ip_port = 7777
ip_address = 10.251.2.14
name = tsc-hv04
but
tsc-hv01:/tmp # /sbin/o2cb add-node storage tsc-hv05 --ip 10.251.2.15
tsc-hv01:/tmp # cat /etc/ocfs2/cluster.conf
cluster:
heartbeat_mode = global
node_count = 5
name = storage
node:
number = 1
cluster = storage
ip_port = 7777
ip_address = 10.251.2.11
name = tsc-hv01
--
Best regards,
Eugene Istomin
Senior System Administrator
EDS Systems
E.Istomin at edss.ee
Work: +372-640-96-01
On Thursday 14 March 2013 23:33:58 Eugene Istomin wrote:
Thanks for the answer,
# /sbin/o2cb -V
o2cb.old 1.8.2
# /sbin/o2cb list-nodes --oneline storage
node: 1 tsc-hv01 10.251.2.11:7777 storage
but
# /tmp/o2cb -V
o2cb 1.8.0
# /tmp/o2cb list-nodes --oneline storage
node: 1 tsc-hv01 10.251.2.11:7777 storage
node: 2 tsc-hv02 10.251.2.12:7777 storage
--
Best regards,
Eugene Istomin
Senior System Administrator
EDS Systems
E.Istomin at edss.ee
Work: +372-640-96-01
On Thursday 14 March 2013 14:20:53 Sunil Mushran wrote:
strace is hard to read.
list-nodes --oneline prints the nodes that have been registered. If a node
shows fewer than in the config file, then the cluster needs to be (re)registered
on that node.
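To see at a glance which nodes named in the config file are absent from a `list-nodes --oneline` listing, the two can be compared by node name. A minimal Python sketch, assuming the five-field line format that appears in this thread (`node: <number> <name> <ip>:<port> <cluster>`); the function names are hypothetical:

```python
def parse_oneline(output):
    """Parse `list-nodes --oneline` output into {name: details}."""
    nodes = {}
    for line in output.splitlines():
        fields = line.split()
        if len(fields) != 5 or fields[0] != "node:":
            continue  # skip anything that is not a node line
        _, number, name, addr, cluster = fields
        ip, _, port = addr.rpartition(":")
        nodes[name] = {
            "number": int(number),
            "ip": ip,
            "port": int(port),
            "cluster": cluster,
        }
    return nodes


def missing_from_listing(config_names, listing_output):
    """Names present in the config file but absent from the listing."""
    return sorted(set(config_names) - set(parse_oneline(listing_output)))


listing = "node: 1 tsc-hv01 10.251.2.11:7777 storage\n"
print(missing_from_listing(["tsc-hv01", "tsc-hv02"], listing))  # ['tsc-hv02']
```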
On Thu, Mar 14, 2013 at 12:21 PM, Eugene Istomin <E.Istomin at edss.ee> wrote:
Hello Sunil,
we have a critical issue in the o2cb part of ocfs2 1.8.2: listing nodes and
adding a node do not work correctly with the ocfs2 config file (cluster.conf).
We see this issue on 3 different Linux systems (kernels 3.2 - 3.8), so I think
it might be a general o2cb problem.
#####
Here is some debug info
# cat /etc/ocfs2/cluster.conf
cluster:
heartbeat_mode = global
node_count = 2
name = storage
node:
number = 1
cluster = storage
ip_port = 7777
ip_address = 10.251.2.11
name = tsc-hv01
node:
number = 2
cluster = storage
ip_port = 7777
ip_address = 10.251.2.12
name = tsc-hv02
With ocfs2 1.8.0 (returns 2 nodes):
# strace -s 2048 ./o2cb list-nodes --oneline storage
stat("/etc/ocfs2/cluster.conf", {st_mode=S_IFREG|0644, st_size=261, ...}) = 0
open("/etc/ocfs2/cluster.conf", O_RDONLY) = 3
read(3, "cluster:\n\theartbeat_mode = global\n\tnode_count = 2\n\tname =
storage\n\nnode:\n\tnumber = 1\n\tcluster = storage\n\tip_port =
7777\n\tip_address = 10.251.2.11\n\tname = tsc-hv01\n\nnode:\n\tnumber =
2\n\tcluster = storage\n\tip_port = 7777\n\tip_address = 10.251.2.12\n\tname =
tsc-hv02\n\n", 4000) = 261
read(3, "", 4000) = 0
close(3) = 0
write(1, "node: 1 tsc-hv01 10.251.2.11:7777 storage\n", 42node: 1 tsc-hv01
10.251.2.11:7777 storage
) = 42
write(1, "node: 2 tsc-hv02 10.251.2.12:7777 storage\n", 42node: 2 tsc-hv02
10.251.2.12:7777 storage
) = 42
exit_group(0) = ?
With ocfs2 1.8.2 (returns 1 node, but the config has 2 nodes):
# strace -s 2048 /sbin/o2cb list-nodes --oneline storage
stat("/etc/ocfs2/cluster.conf", {st_mode=S_IFREG|0644, st_size=261, ...}) = 0
open("/etc/ocfs2/cluster.conf", O_RDONLY) = 3
read(3, "cluster:\n\theartbeat_mode = global\n\tnode_count = 2\n\tname =
storage\n\nnode:\n\tnumber = 1\n\tcluster = storage\n\tip_port =
7777\n\tip_address = 10.251.2.11\n\tname = tsc-hv01\n\nnode:\n\tnumber =
2\n\tcluster = storage\n\tip_port = 7777\n\tip_address = 10.251.2.12\n\tname =
tsc-hv02\n\n", 4000) = 261
read(3, "", 4000) = 0
close(3) = 0
write(1, "node: 1 tsc-hv01 10.251.2.11:7777 storage\n", 42node: 1 tsc-hv01
10.251.2.11:7777 storage
) = 42
exit_group(0) = ?
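Both traces show the same 261-byte read, so the two versions receive identical input and differ only in what they do with it. The on-disk layout can be read off the escaped buffer: a header line such as `cluster:` or `node:`, tab-indented `key = value` lines, and a blank line after each stanza. A minimal Python sketch that renders that layout from structured data; the helper names are hypothetical and the key order simply follows the buffer shown above:

```python
CLUSTER_KEYS = ("heartbeat_mode", "node_count", "name")
NODE_KEYS = ("number", "cluster", "ip_port", "ip_address", "name")


def render_stanza(kind, values, keys):
    """Render one stanza: header, tab-indented 'key = value' lines, blank line."""
    lines = [f"{kind}:"]
    lines += [f"\t{key} = {values[key]}" for key in keys]
    return "\n".join(lines) + "\n\n"


def render_cluster_conf(cluster, nodes):
    """Render the cluster stanza followed by one stanza per node."""
    text = render_stanza("cluster", cluster, CLUSTER_KEYS)
    for node in nodes:
        text += render_stanza("node", node, NODE_KEYS)
    return text


cluster = {"heartbeat_mode": "global", "node_count": 2, "name": "storage"}
nodes = [
    {"number": 1, "cluster": "storage", "ip_port": 7777,
     "ip_address": "10.251.2.11", "name": "tsc-hv01"},
    {"number": 2, "cluster": "storage", "ip_port": 7777,
     "ip_address": "10.251.2.12", "name": "tsc-hv02"},
]

text = render_cluster_conf(cluster, nodes)
print(len(text.encode("ascii")))  # 261, the st_size reported by stat() in both traces
```

Rendering the expected file this way gives a byte-exact reference to diff against whatever a given o2cb build writes back.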
I can mail you any info you need. Please help us resolve this issue.
--
Best regards,
Eugene Istomin
Senior System Administrator
EDS Systems
E.Istomin at edss.ee
Work: +372-640-96-01
_______________________________________________
Ocfs2-tools-devel mailing list
Ocfs2-tools-devel at oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-tools-devel