[Ocfs2-tools-devel] ocfs2-tools-o2cb-1.8.2 - critical issue o2cb

Eugene Istomin E.Istomin at edss.ee
Thu Mar 14 14:49:20 PDT 2013


OK, let me explain the problem.


1)  # cat /etc/ocfs2/cluster.conf
cluster:
        heartbeat_mode = global
        node_count = 1
        name = storage

node:
        number = 1
        cluster = storage
        ip_port = 7777
        ip_address = 10.251.2.11
        name = tsc-hv01


2) o2cb 1.8.0

# ./o2cb-1.8.0 -V
o2cb-1.8.0 1.8.0

# ./o2cb-1.8.0 add-node storage tsc-hv02 --ip 10.251.2.12
# ./o2cb-1.8.0 add-node storage tsc-hv03 --ip 10.251.2.13
# cat /etc/ocfs2/cluster.conf
cluster:
        heartbeat_mode = global
        node_count = 3
        name = storage

node:
        number = 1
        cluster = storage
        ip_port = 7777
        ip_address = 10.251.2.11
        name = tsc-hv01

node:
        number = 0
        cluster = storage
        ip_port = 7777
        ip_address = 10.251.2.12
        name = tsc-hv02

node:
        number = 2
        cluster = storage
        ip_port = 7777
        ip_address = 10.251.2.13
        name = tsc-hv03


This looks correct.



3) o2cb 1.8.2
# ./o2cb-1.8.2 add-node storage tsc-hv04 --ip 10.251.2.14
# ./o2cb-1.8.2 add-node storage tsc-hv05 --ip 10.251.2.15
# cat /etc/ocfs2/cluster.conf
cluster:
        heartbeat_mode = global
        node_count = 5
        name = storage

node:
        number = 1
        cluster = storage
        ip_port = 7777
        ip_address = 10.251.2.11
        name = tsc-hv01



All of the other node records have disappeared. strace shows that the valid
config is opened, but the new config consists of only the first node in the
list (although node_count is incremented).

I also tried the latest git; the results are the same.
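
The symptom above (node_count goes up while the node stanzas are lost) can be checked mechanically. What follows is only a minimal sketch in Python, assuming the stanza layout shown in this thread ("name:" headers followed by "key = value" lines) and a single cluster stanza per file. The function names and the BROKEN_CONF sample are illustrative and are not part of ocfs2-tools.

```python
# Hypothetical helper, not part of ocfs2-tools: compares the declared
# node_count with the number of "node:" stanzas actually present.

def parse_stanzas(text):
    """Return a list of (stanza_name, {key: value}) pairs."""
    stanzas = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if "=" in line:
            if stanzas:  # ignore key/value lines that precede any header
                key, _, value = line.partition("=")
                stanzas[-1][1][key.strip()] = value.strip()
        elif line.endswith(":"):
            stanzas.append((line[:-1], {}))
    return stanzas


def node_count_mismatch(text):
    """Return (declared, actual); the file is consistent when both are equal."""
    stanzas = parse_stanzas(text)
    # Assumes one cluster stanza; several would have their counts summed.
    declared = sum(int(attrs.get("node_count", 0))
                   for name, attrs in stanzas if name == "cluster")
    actual = sum(1 for name, _ in stanzas if name == "node")
    return declared, actual


# Sample text copied from the file written by o2cb 1.8.2 in this report.
BROKEN_CONF = """\
cluster:
        heartbeat_mode = global
        node_count = 5
        name = storage

node:
        number = 1
        cluster = storage
        ip_port = 7777
        ip_address = 10.251.2.11
        name = tsc-hv01
"""

declared, actual = node_count_mismatch(BROKEN_CONF)
print(f"declared={declared} actual={actual}")  # prints "declared=5 actual=1"
```

Running a check like this before and after each add-node would show at which invocation the file was truncated.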

-- 
Best regards,
Eugene Istomin
Senior System Administrator
EDS Systems
E.Istomin at edss.ee
Work: +372-640-96-01


On Thursday 14 March 2013 14:41:18 Sunil Mushran wrote:

add-node only adds to the local config file. You have to do add-node on all 
nodes... followed by register-cluster on all nodes.

Until that is done, the cluster will refuse to mount new volumes on any node.
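
The procedure described in this reply can be sketched as a small Python helper that only builds the command lines to run locally on each cluster member; it does not execute anything. The add-node form is the one used elsewhere in this thread. Passing the cluster name as the argument to register-cluster is an assumption, and the function name is hypothetical.

```python
# Illustrative only: builds command strings, it does not run them.
# "add-node <cluster> <node> --ip <address>" is the form used in this thread;
# giving the cluster name to register-cluster is an assumption.

def commands_for_each_node(cluster, new_nodes):
    """Return the commands to run on every cluster member, in order."""
    cmds = [f"o2cb add-node {cluster} {name} --ip {ip}"
            for name, ip in new_nodes]
    # Registration comes last, once the local config lists every node.
    cmds.append(f"o2cb register-cluster {cluster}")
    return cmds


NEW_NODES = [("tsc-hv04", "10.251.2.14"), ("tsc-hv05", "10.251.2.15")]

for cmd in commands_for_each_node("storage", NEW_NODES):
    print(cmd)
```

The same list would be run on every member, since add-node only changes the local config file.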




On Thu, Mar 14, 2013 at 2:37 PM, Eugene Istomin <E.Istomin at edss.ee> wrote:

Additional log:
 
tsc-hv01:/tmp # o2cb add-node storage tsc-hv03 --ip 10.251.2.13
tsc-hv01:/tmp # /tmp/o2cb list-nodes --oneline storage
node: 1 tsc-hv01 10.251.2.11:7777 storage
node: 2 tsc-hv02 10.251.2.12:7777 storage
node: 0 tsc-hv03 10.251.2.13:7777 storage
 
tsc-hv01:/tmp # o2cb add-node storage tsc-hv04 --ip 10.251.2.14
tsc-hv01:/tmp # /tmp/o2cb list-nodes --oneline storage
node: 1 tsc-hv01 10.251.2.11:7777 storage
node: 2 tsc-hv02 10.251.2.12:7777 storage
node: 0 tsc-hv03 10.251.2.13:7777 storage
node: 3 tsc-hv04 10.251.2.14:7777 storage
 
 
# cat /etc/ocfs2/cluster.conf
cluster:
        heartbeat_mode = global
        node_count = 4
        name = storage

node:
        number = 1
        cluster = storage
        ip_port = 7777
        ip_address = 10.251.2.11
        name = tsc-hv01

node:
        number = 2
        cluster = storage
        ip_port = 7777
        ip_address = 10.251.2.12
        name = tsc-hv02

node:
        number = 0
        cluster = storage
        ip_port = 7777
        ip_address = 10.251.2.13
        name = tsc-hv03

node:
        number = 3
        cluster = storage
        ip_port = 7777
        ip_address = 10.251.2.14
        name = tsc-hv04
 
 
but
 
 
tsc-hv01:/tmp # /sbin/o2cb add-node storage tsc-hv05 --ip 10.251.2.15
tsc-hv01:/tmp # cat /etc/ocfs2/cluster.conf
cluster:
        heartbeat_mode = global
        node_count = 5
        name = storage

node:
        number = 1
        cluster = storage
        ip_port = 7777
        ip_address = 10.251.2.11
        name = tsc-hv01
 
 
 
 
 

On Thursday 14 March 2013 23:33:58 Eugene Istomin wrote:

Thanks for the answer,
 
 
# /sbin/o2cb -V
o2cb.old 1.8.2
 
# /sbin/o2cb list-nodes --oneline storage
node: 1 tsc-hv01 10.251.2.11:7777 storage
 
 
but
 
 
# /tmp/o2cb -V
o2cb 1.8.0
 
# /tmp/o2cb list-nodes --oneline storage
node: 1 tsc-hv01 10.251.2.11:7777 storage
node: 2 tsc-hv02 10.251.2.12:7777 storage
 
 
 
 

On Thursday 14 March 2013 14:20:53 Sunil Mushran wrote:

strace is hard to read.


list-nodes --oneline prints the nodes that have been registered. If a node
lists fewer nodes than its config file contains, then the cluster needs to be
(re)registered on that node.
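
The comparison suggested here can be sketched in Python: parse the "list-nodes --oneline" output and report which expected node names are missing. The line format is taken from the output quoted in this thread; the function names and the sample constant are hypothetical, and the expected names are supplied by the caller rather than read from the config.

```python
# Hypothetical helper: finds node names that are expected (for example, taken
# from cluster.conf) but absent from "o2cb list-nodes --oneline" output.

def parse_oneline(output):
    """Map node name -> (number, ip, port, cluster) for each 'node:' line."""
    nodes = {}
    for line in output.splitlines():
        fields = line.split()
        # Expected shape: node: <number> <name> <ip>:<port> <cluster>
        if len(fields) == 5 and fields[0] == "node:":
            number, name, endpoint, cluster = fields[1:]
            ip, _, port = endpoint.rpartition(":")
            nodes[name] = (int(number), ip, int(port), cluster)
    return nodes


def missing_nodes(expected_names, output):
    """Return the expected names that the listing does not contain, sorted."""
    return sorted(set(expected_names) - set(parse_oneline(output)))


# Output of /sbin/o2cb (1.8.2) as quoted in this thread.
LISTED_BY_1_8_2 = "node: 1 tsc-hv01 10.251.2.11:7777 storage\n"

print(missing_nodes(["tsc-hv01", "tsc-hv02"], LISTED_BY_1_8_2))
# prints ['tsc-hv02']
```

An empty result would mean the listing and the expected set agree.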




On Thu, Mar 14, 2013 at 12:21 PM, Eugene Istomin <E.Istomin at edss.ee> wrote:

Hello Sunil,
 
We have a critical issue in the o2cb part of ocfs2 1.8.2: listing nodes does
not reflect /etc/ocfs2/cluster.conf, and adding a node does not update it
correctly.

We see this issue on 3 different Linux systems (kernels 3.2 to 3.8), so I think
it may be a general o2cb problem.
 
 
#####
 
Here is some debug info
 
# cat /etc/ocfs2/cluster.conf
cluster:
        heartbeat_mode = global
        node_count = 2
        name = storage

node:
        number = 1
        cluster = storage
        ip_port = 7777
        ip_address = 10.251.2.11
        name = tsc-hv01

node:
        number = 2
        cluster = storage
        ip_port = 7777
        ip_address = 10.251.2.12
        name = tsc-hv02
 
 
 
With ocfs2 1.8.0 (returns 2 nodes):
# strace -s 2048 ./o2cb list-nodes --oneline storage
 
 
stat("/etc/ocfs2/cluster.conf", {st_mode=S_IFREG|0644, st_size=261, ...}) = 0
open("/etc/ocfs2/cluster.conf", O_RDONLY) = 3
read(3, "cluster:\n\theartbeat_mode = global\n\tnode_count = 2\n\tname = storage\n\nnode:\n\tnumber = 1\n\tcluster = storage\n\tip_port = 7777\n\tip_address = 10.251.2.11\n\tname = tsc-hv01\n\nnode:\n\tnumber = 2\n\tcluster = storage\n\tip_port = 7777\n\tip_address = 10.251.2.12\n\tname = tsc-hv02\n\n", 4000) = 261
read(3, "", 4000) = 0
close(3) = 0
write(1, "node: 1 tsc-hv01 10.251.2.11:7777 storage\n", 42node: 1 tsc-hv01 10.251.2.11:7777 storage
) = 42
write(1, "node: 2 tsc-hv02 10.251.2.12:7777 storage\n", 42node: 2 tsc-hv02 10.251.2.12:7777 storage
) = 42
exit_group(0) = ?
 
 
 
With ocfs2 1.8.2 (returns 1 node, but the config has 2 nodes):
# strace -s 2048 /sbin/o2cb list-nodes --oneline storage
 
stat("/etc/ocfs2/cluster.conf", {st_mode=S_IFREG|0644, st_size=261, ...}) = 0
open("/etc/ocfs2/cluster.conf", O_RDONLY) = 3
read(3, "cluster:\n\theartbeat_mode = global\n\tnode_count = 2\n\tname = storage\n\nnode:\n\tnumber = 1\n\tcluster = storage\n\tip_port = 7777\n\tip_address = 10.251.2.11\n\tname = tsc-hv01\n\nnode:\n\tnumber = 2\n\tcluster = storage\n\tip_port = 7777\n\tip_address = 10.251.2.12\n\tname = tsc-hv02\n\n", 4000) = 261
read(3, "", 4000) = 0
close(3) = 0
write(1, "node: 1 tsc-hv01 10.251.2.11:7777 storage\n", 42node: 1 tsc-hv01 10.251.2.11:7777 storage
) = 42
exit_group(0) = ?
 
 
 
 
I can send you any information you need. Please help us resolve this issue.

_______________________________________________
Ocfs2-tools-devel mailing list
Ocfs2-tools-devel at oss.oracle.com
https://oss.oracle.com/mailman/listinfo/ocfs2-tools-devel









