[Ocfs2-tools-devel] ocfs2-tools-o2cb-1.8.2 - critical issue o2cb

Sunil Mushran sunil.mushran at gmail.com
Fri Mar 15 08:30:55 PDT 2013


This is a tools issue, not a kernel issue.

Did you try building 1.8.2 from oss.oracle.com/git? It was working fine
when I last worked on it.

Maybe someone else on this list can assist you further. Specifically,
someone needs to put a breakpoint in o2cb_config_store(), as that is
where we issue the write. It could be that it is not being called. The
flow is simple enough, and it looks correct in the git tree.
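
Before attaching a debugger, it may help to confirm the symptom
mechanically: in the reports quoted below, node_count keeps growing while
the node: stanzas disappear. The following is a minimal sketch in Python,
assuming the stanza layout seen in the quoted cluster.conf dumps (a
"cluster:" or "node:" header followed by indented "key = value" lines).
The helper names parse_cluster_conf and check_node_count are made up for
illustration and are not part of ocfs2-tools.

```python
def parse_cluster_conf(text):
    """Split cluster.conf text into a list of (stanza_type, attrs) pairs."""
    stanzas = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines only separate stanzas
        if line.endswith(":") and "=" not in line:
            # Stanza header such as "cluster:" or "node:"
            current = (line[:-1], {})
            stanzas.append(current)
        elif "=" in line and current is not None:
            key, _, value = line.partition("=")
            current[1][key.strip()] = value.strip()
    return stanzas


def check_node_count(text):
    """Return (declared node_count, names of the node stanzas found)."""
    stanzas = parse_cluster_conf(text)
    declared = None
    for kind, attrs in stanzas:
        if kind == "cluster":
            declared = int(attrs["node_count"])
            break
    names = [attrs.get("name") for kind, attrs in stanzas if kind == "node"]
    return declared, names


# Sample input copied from the broken config reported in this thread
# (node_count = 5, but only the first node stanza survives).
BROKEN = """cluster:
\theartbeat_mode = global
\tnode_count = 5
\tname = storage

node:
\tnumber = 1
\tcluster = storage
\tip_port = 7777
\tip_address = 10.251.2.11
\tname = tsc-hv01
"""

declared, names = check_node_count(BROKEN)
if declared != len(names):
    print(f"mismatch: node_count = {declared}, "
          f"but {len(names)} node stanza(s): {names}")
```

Running something like this against the file left behind by each o2cb
build could help show whether the write path dropped stanzas or whether
the file was never rewritten at all.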


On Fri, Mar 15, 2013 at 1:24 AM, Eugene Istomin <E.Istomin at edss.ee> wrote:

> Sunil,
>
>
>
> I use packages from
> https://build.opensuse.org/package/show?project=network%3Aha-clustering&package=ocfs2-tools
>
>
>
> Here is the compile log:
> https://build.opensuse.org/package/rawlog?arch=x86_64&package=ocfs2-tools&project=network%3Aha-clustering&repository=openSUSE_12.2
>
>
>
> I spent 4 hours yesterday trying different compilation variants to rule
> out problems with Linux kernel and package versions; all results are the
> same.
>
> --
>
> Best regards,
>
> Eugene Istomin
>
> Senior System Administrator
>
> EDS Systems
>
> E.Istomin at edss.ee
>
> Work: +372-640-96-01
>
>
>
>
> On Friday 15 March 2013 10:14:23 Eugene Istomin wrote:
>
> Hello Sunil,
>
>
>
> here are step-by-step instructions to reproduce this issue:
>
>
>
> 1) Delete current conf
>
> # rm /etc/ocfs2/cluster.conf
>
>
>
> 2) Create cluster & autocreate conf
>
> # /tmp/o2cb-1.8.2 -vvv add-cluster storage
>
>
>
> 3) # cat /etc/ocfs2/cluster.conf
>
> cluster:
>
> heartbeat_mode = local
>
> node_count = 0
>
> name = storage
>
>
>
>
>
>
>
> 4) Adding 2 nodes using o2cb 1.8.2
>
>
>
> # /tmp/o2cb-1.8.2 -vvv add-node storage tsc-hv01 --ip 10.251.2.11
>
> Using config file '/etc/ocfs2/cluster.conf'
>
> Add node 'tsc-hv01' in cluster 'storage' having ip '10.251.2.11', port '-1' and number '-1'
>
> Validated IP address '10.251.2.11'
>
> Validated node number '0'
>
> Added node 'tsc-hv01' in cluster 'storage' having ip '10.251.2.11', port '7777' and number '0'
>
>
>
> # /tmp/o2cb-1.8.2 -vvv add-node storage tsc-hv02 --ip 10.251.2.12
>
> Using config file '/etc/ocfs2/cluster.conf'
>
> Add node 'tsc-hv02' in cluster 'storage' having ip '10.251.2.12', port '-1' and number '-1'
>
> Validated IP address '10.251.2.12'
>
> Validated node number '1'
>
> Added node 'tsc-hv02' in cluster 'storage' having ip '10.251.2.12', port '7777' and number '1'
>
>
>
> 5) Checking conf for nodes
>
> # cat /etc/ocfs2/cluster.conf
>
> cluster:
>
> heartbeat_mode = local
>
> node_count = 2
>
> name = storage
>
>
>
> node:
>
> number = 0
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.11
>
> name = tsc-hv01
>
>
>
> 6) Let's try 1.8.0
>
> # /tmp/o2cb-1.8.0 -vvv add-node storage tsc-hv03 --ip 10.251.2.13
>
> Using config file '/etc/ocfs2/cluster.conf'
>
> Add node 'tsc-hv03' in cluster 'storage' having ip '10.251.2.13', port '-1' and number '-1'
>
> Validated IP address '10.251.2.13'
>
> Validated node number '1'
>
> Added node 'tsc-hv03' in cluster 'storage' having ip '10.251.2.13', port '7777' and number '1'
>
>
>
> # /tmp/o2cb-1.8.0 -vvv add-node storage tsc-hv04 --ip 10.251.2.14
>
> Using config file '/etc/ocfs2/cluster.conf'
>
> Add node 'tsc-hv04' in cluster 'storage' having ip '10.251.2.14', port '-1' and number '-1'
>
> Validated IP address '10.251.2.14'
>
> Validated node number '2'
>
> Added node 'tsc-hv04' in cluster 'storage' having ip '10.251.2.14', port '7777' and number '2'
>
>
>
> 7) Checking conf for nodes
>
> # cat /etc/ocfs2/cluster.conf
>
> cluster:
>
> heartbeat_mode = local
>
> node_count = 4
>
> name = storage
>
>
>
> node:
>
> number = 0
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.11
>
> name = tsc-hv01
>
>
>
> node:
>
> number = 1
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.13
>
> name = tsc-hv03
>
>
>
> node:
>
> number = 2
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.14
>
> name = tsc-hv04
>
>
>
>
>
>
>
>
>
> On Thursday 14 March 2013 20:22:42 Sunil Mushran wrote:
>
> So you are saying 1.8.2 is broken. Enable verbose tracing. That may tell
> us more.
>
> Do "o2cb -vvv add-node ..." to enable verbose tracing.
>
>
>
> On Thu, Mar 14, 2013 at 2:49 PM, Eugene Istomin <E.Istomin at edss.ee> wrote:
>
> OK, let me explain the problem.
>
>
>
>
>
> 1) # cat /etc/ocfs2/cluster.conf
>
> cluster:
>
> heartbeat_mode = global
>
> node_count = 1
>
> name = storage
>
>
>
> node:
>
> number = 1
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.11
>
> name = tsc-hv01
>
>
>
>
>
> 2) o2cb 1.8.0
>
>
>
> # ./o2cb-1.8.0 -V
>
> o2cb-1.8.0 1.8.0
>
>
>
> # ./o2cb-1.8.0 add-node storage tsc-hv02 --ip 10.251.2.12
>
> # ./o2cb-1.8.0 add-node storage tsc-hv03 --ip 10.251.2.13
>
> # cat /etc/ocfs2/cluster.conf
>
> cluster:
>
> heartbeat_mode = global
>
> node_count = 3
>
> name = storage
>
>
>
> node:
>
> number = 1
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.11
>
> name = tsc-hv01
>
>
>
> node:
>
> number = 0
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.12
>
> name = tsc-hv02
>
>
>
> node:
>
> number = 2
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.13
>
> name = tsc-hv03
>
>
>
>
>
> Seems ok
>
>
>
>
>
>
>
> 3) o2cb 1.8.2
>
> # ./o2cb-1.8.2 add-node storage tsc-hv04 --ip 10.251.2.14
>
> # ./o2cb-1.8.2 add-node storage tsc-hv05 --ip 10.251.2.15
>
> # cat /etc/ocfs2/cluster.conf
>
> cluster:
>
> heartbeat_mode = global
>
> node_count = 5
>
> name = storage
>
>
>
> node:
>
> number = 1
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.11
>
> name = tsc-hv01
>
>
>
>
>
>
>
> All of the other node records have disappeared. strace shows that the
> valid config is opened, but the new conf consists of only the first node
> in the list (although node_count is incremented).
>
>
>
> I tried the latest git; the results are the same.
>
>
>
>
>
>
>
> On Thursday 14 March 2013 14:41:18 Sunil Mushran wrote:
>
> add-node only adds to the local config file. You have to do add-node on
> all nodes... followed by register-cluster on all nodes.
>
> Until that is done, the cluster will refuse to mount new volumes on any
> node.
>
>
>
> On Thu, Mar 14, 2013 at 2:37 PM, Eugene Istomin <E.Istomin at edss.ee> wrote:
>
> Additional log:
>
>
>
> tsc-hv01:/tmp # o2cb add-node storage tsc-hv03 --ip 10.251.2.13
>
> tsc-hv01:/tmp # /tmp/o2cb list-nodes --oneline storage
>
> node: 1 tsc-hv01 10.251.2.11:7777 storage
>
> node: 2 tsc-hv02 10.251.2.12:7777 storage
>
> node: 0 tsc-hv03 10.251.2.13:7777 storage
>
>
>
> tsc-hv01:/tmp # o2cb add-node storage tsc-hv04 --ip 10.251.2.14
>
> tsc-hv01:/tmp # /tmp/o2cb list-nodes --oneline storage
>
> node: 1 tsc-hv01 10.251.2.11:7777 storage
>
> node: 2 tsc-hv02 10.251.2.12:7777 storage
>
> node: 0 tsc-hv03 10.251.2.13:7777 storage
>
> node: 3 tsc-hv04 10.251.2.14:7777 storage
>
>
>
>
>
> # cat /etc/ocfs2/cluster.conf
>
> cluster:
>
> heartbeat_mode = global
>
> node_count = 4
>
> name = storage
>
>
>
> node:
>
> number = 1
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.11
>
> name = tsc-hv01
>
>
>
> node:
>
> number = 2
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.12
>
> name = tsc-hv02
>
>
>
> node:
>
> number = 0
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.13
>
> name = tsc-hv03
>
>
>
> node:
>
> number = 3
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.14
>
> name = tsc-hv04
>
>
>
>
>
> but
>
>
>
>
>
> tsc-hv01:/tmp # /sbin/o2cb add-node storage tsc-hv05 --ip 10.251.2.15
>
> tsc-hv01:/tmp # cat /etc/ocfs2/cluster.conf
>
> cluster:
>
> heartbeat_mode = global
>
> node_count = 5
>
> name = storage
>
>
>
> node:
>
> number = 1
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.11
>
> name = tsc-hv01
>
>
>
>
>
>
>
>
>
>
>
>
>
> On Thursday 14 March 2013 23:33:58 Eugene Istomin wrote:
>
> Thanks for the answer,
>
>
>
>
>
> # /sbin/o2cb -V
>
> o2cb.old 1.8.2
>
>
>
> # /sbin/o2cb list-nodes --oneline storage
>
> node: 1 tsc-hv01 10.251.2.11:7777 storage
>
>
>
>
>
> but
>
>
>
>
>
> # /tmp/o2cb -V
>
> o2cb 1.8.0
>
>
>
> # /tmp/o2cb list-nodes --oneline storage
>
> node: 1 tsc-hv01 10.251.2.11:7777 storage
>
> node: 2 tsc-hv02 10.251.2.12:7777 storage
>
>
>
>
>
>
>
>
>
>
>
> On Thursday 14 March 2013 14:20:53 Sunil Mushran wrote:
>
> strace is hard to read.
>
> list-nodes --online prints the nodes that have been registered. If a node
> shows fewer than in the config file, then the cluster needs to be
> (re)registered on that node.
>
>
>
> On Thu, Mar 14, 2013 at 12:21 PM, Eugene Istomin <E.Istomin at edss.ee> wrote:
>
> Hello Sunil,
>
>
>
> we have a critical issue in the o2cb part of ocfs2 1.8.2: listing nodes
> or adding a node does not correctly read or update the ocfs2 config
> (/etc/ocfs2/cluster.conf).
>
>
>
> We have this issue on 3 different Linux systems (kernels 3.2 - 3.8), so I
> think this might be a general o2cb problem.
>
>
>
>
>
> #####
>
>
>
> Here is some debug info
>
>
>
> # cat /etc/ocfs2/cluster.conf
>
> cluster:
>
> heartbeat_mode = global
>
> node_count = 2
>
> name = storage
>
>
>
> node:
>
> number = 1
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.11
>
> name = tsc-hv01
>
>
>
> node:
>
> number = 2
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.12
>
> name = tsc-hv02
>
>
>
>
>
>
>
> In ocfs2 1.8.0 (returns 2 nodes):
>
> # strace -s 2048 ./o2cb list-nodes --oneline storage
>
>
>
>
>
> stat("/etc/ocfs2/cluster.conf", {st_mode=S_IFREG|0644, st_size=261, ...}) = 0
>
> open("/etc/ocfs2/cluster.conf", O_RDONLY) = 3
>
> read(3, "cluster:\n\theartbeat_mode = global\n\tnode_count = 2\n\tname = storage\n\nnode:\n\tnumber = 1\n\tcluster = storage\n\tip_port = 7777\n\tip_address = 10.251.2.11\n\tname = tsc-hv01\n\nnode:\n\tnumber = 2\n\tcluster = storage\n\tip_port = 7777\n\tip_address = 10.251.2.12\n\tname = tsc-hv02\n\n", 4000) = 261
>
> read(3, "", 4000) = 0
>
> close(3) = 0
>
> write(1, "node: 1 tsc-hv01 10.251.2.11:7777 storage\n", 42node: 1 tsc-hv01 10.251.2.11:7777 storage
> ) = 42
>
> write(1, "node: 2 tsc-hv02 10.251.2.12:7777 storage\n", 42node: 2 tsc-hv02 10.251.2.12:7777 storage
> ) = 42
>
> exit_group(0) = ?
>
>
>
>
>
>
>
> In ocfs2 1.8.2 (returns 1 node, but the config has 2 nodes):
>
> # strace -s 2048 /sbin/o2cb list-nodes --oneline storage
>
>
>
> stat("/etc/ocfs2/cluster.conf", {st_mode=S_IFREG|0644, st_size=261, ...}) = 0
>
> open("/etc/ocfs2/cluster.conf", O_RDONLY) = 3
>
> read(3, "cluster:\n\theartbeat_mode = global\n\tnode_count = 2\n\tname = storage\n\nnode:\n\tnumber = 1\n\tcluster = storage\n\tip_port = 7777\n\tip_address = 10.251.2.11\n\tname = tsc-hv01\n\nnode:\n\tnumber = 2\n\tcluster = storage\n\tip_port = 7777\n\tip_address = 10.251.2.12\n\tname = tsc-hv02\n\n", 4000) = 261
>
> read(3, "", 4000) = 0
>
> close(3) = 0
>
> write(1, "node: 1 tsc-hv01 10.251.2.11:7777 storage\n", 42node: 1 tsc-hv01 10.251.2.11:7777 storage
> ) = 42
>
> exit_group(0) = ?
>
>
>
>
>
>
>
>
>
> I can mail you any info you need; please help us resolve this issue.
>
>
>
> _______________________________________________
> Ocfs2-tools-devel mailing list
> Ocfs2-tools-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-tools-devel