[Ocfs2-tools-devel] ocfs2-tools-o2cb-1.8.2 - critical issue o2cb

Goldwyn Rodrigues rgoldwyn at gmail.com
Fri Mar 15 16:24:33 PDT 2013


Could you try the RPMs from
https://build.opensuse.org/project/show?project=home%3Agoldwynr%3Abranches%3Anetwork%3Aha-clustering%3AFactory

They have 12.2 repositories as well.

Patch deb5ade9145f8809f1fde19cf53bdfdf1fb7963e (yeah.. I wrote it :( )
was over-zealous in removing warnings, and g_list_append() got deleted.

Restore g_list_append() to fix the issue.

Signed-off-by: Goldwyn Rodrigues <rgoldwyn at suse.com>
---

diff --git a/o2cb_ctl/jconfig.c b/o2cb_ctl/jconfig.c
index bad34ba..79bbddb 100644
--- a/o2cb_ctl/jconfig.c
+++ b/o2cb_ctl/jconfig.c
@@ -1082,6 +1082,8 @@ JConfigStanza *j_config_add_stanza(JConfig *cf,
                             g_strdup(stanza_name),
                             elem);
     }
+    else
+	g_list_append(elem, cfs);

     return(cfs);
 }  /* j_config_add_stanza() */
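
For anyone reading the hunk above and wondering why the return value of
g_list_append() can be ignored in the else branch: GLib documents that
g_list_append() returns the new start of the list, and that start only
changes when the list was empty. In the else branch elem is already
non-NULL, so the head stored in the hash table stays valid. Below is a
minimal sketch of that behaviour. It does not use GLib; Node, Slot,
list_append(), list_length() and add_stanza() are hypothetical stand-ins
made up for illustration, and the control flow is only what I assume from
the hunk, not a copy of jconfig.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for GLib's GList, reduced to a singly linked node. */
typedef struct Node {
    const char *data;
    struct Node *next;
} Node;

/* Append data and return the head of the list. Like g_list_append(), the
 * returned head differs from the argument only when the list was empty. */
Node *list_append(Node *head, const char *data)
{
    Node *node = malloc(sizeof *node);
    assert(node != NULL);
    node->data = data;
    node->next = NULL;
    if (head == NULL)
        return node;
    Node *tail = head;
    while (tail->next != NULL)
        tail = tail->next;
    tail->next = node;
    return head;
}

/* Count the nodes reachable from head. */
size_t list_length(const Node *head)
{
    size_t n = 0;
    for (; head != NULL; head = head->next)
        n++;
    return n;
}

/* Hypothetical model of one hash-table slot holding stanzas of one type. */
typedef struct Slot {
    Node *head;
} Slot;

/* Model of the add path: with_else_branch == 0 mimics the regressed code,
 * with_else_branch != 0 mimics the code after this patch. */
void add_stanza(Slot *slot, const char *name, int with_else_branch)
{
    if (slot->head == NULL)
        slot->head = list_append(NULL, name);  /* first stanza creates the list */
    else if (with_else_branch)
        list_append(slot->head, name);         /* head unchanged, result ignorable */
}
```

Calling add_stanza() three times on an empty Slot with with_else_branch
set to 0 leaves a single node in the list, which is consistent with the
reported symptom of only the first node record surviving. With the else
branch enabled, all three entries are kept.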



On Fri, Mar 15, 2013 at 12:26 PM, Eugene Istomin <E.Istomin at edss.ee> wrote:
> Ok, thanks anyway.
>
>
>
>
>
> A couple of hours ago I cloned git:master and built RPMs with as few
> patches as needed.
>
>
>
> Here is the project:
>
> https://build.opensuse.org/package/show?package=ocfs2-tools&project=home%3Aedssvirt%3Abranches%3Anetwork%3Aha-clustering
>
>
>
> Here is the build log:
> https://build.opensuse.org/package/live_build_log?arch=x86_64&package=ocfs2-tools&project=home%3Aedssvirt%3Abranches%3Anetwork%3Aha-clustering&repository=openSUSE_12.2
>
>
>
> Here are rpms:
>
> http://download.opensuse.org/repositories/home:/edssvirt:/branches:/network:/ha-clustering/openSUSE_12.2/x86_64/
>
>
>
>
>
>
>
>
>
> The results stay the same:
>
>
>
> # cat /etc/ocfs2/cluster.conf
>
> cluster:
>
> heartbeat_mode = global
>
> node_count = 3
>
> name = storage2
>
>
>
> node:
>
> number = 0
>
> cluster = storage2
>
> ip_port = 7777
>
> ip_address = 10.251.2.11
>
> name = tsc-hv01
>
>
>
> node:
>
> number = 1
>
> cluster = storage2
>
> ip_port = 7777
>
> ip_address = 10.251.2.12
>
> name = tsc-hv02
>
> node:
>
> number = 2
>
> cluster = storage2
>
> ip_port = 7777
>
> ip_address = 10.251.2.13
>
> name = tsc-hv03
>
>
>
> heartbeat:
>
> cluster = storage2
>
> region = 68677F18B1654877BB92D78D400E7E51
>
>
>
>
>
>
>
>
>
> # o2cb -vvv add-node storage2 tsc-hv04 --ip 10.251.2.14
>
> Using config file '/etc/ocfs2/cluster.conf'
>
> Add node 'tsc-hv04' in cluster 'storage2' having ip '10.251.2.14', port '-1'
> and number '-1'
>
> Validated IP address '10.251.2.14'
>
> Validated node number '1' <--- so strange
>
> Added node 'tsc-hv04' in cluster 'storage2' having ip '10.251.2.14', port
> '7777' and number '1'
>
>
>
> # o2cb -vvv add-node storage2 tsc-hv05 --ip 10.251.2.15 --number 6
>
> Using config file '/etc/ocfs2/cluster.conf'
>
> Add node 'tsc-hv05' in cluster 'storage2' having ip '10.251.2.15', port '-1'
> and number '6'
>
> Validated IP address '10.251.2.15'
>
> Validated node number '6' <-- ok here
>
> Added node 'tsc-hv05' in cluster 'storage2' having ip '10.251.2.15', port
> '7777' and number '6'
>
>
>
>
>
>
>
> cat /etc/ocfs2/cluster.conf
>
> cluster:
>
> heartbeat_mode = global
>
> node_count = 5
>
> name = storage2
>
>
>
> node:
>
> number = 0
>
> cluster = storage2
>
> ip_port = 7777
>
> ip_address = 10.251.2.11
>
> name = tsc-hv01
>
>
>
> heartbeat:
>
> cluster = storage2
>
> region = 68677F18B1654877BB92D78D400E7E51
>
>
>
>
>
>
>
>
>
> Can anyone help us?
>
>
>
> --
>
> Best regards,
>
> Eugene Istomin
>
> Senior System Administrator
>
> EDS Systems
>
> E.Istomin at edss.ee
>
> Work: +372-640-96-01
>
>
>
>
> On Friday 15 March 2013 08:30:55 Sunil Mushran wrote:
>
> This is a tool issue. Not kernel.
>
> Did you try building 1.8.2 from oss.oracle.com/git? It was working fine when
> I last worked on it.
> Maybe someone else on this list can assist you further. Specifically someone
> needs to put a
> breakpoint in o2cb_config_store() as that is where we issue the write. Could
> be it is not being
> called. The flow is simple enough.... and looks correct on the git tree.
>
>
>
> On Fri, Mar 15, 2013 at 1:24 AM, Eugene Istomin <E.Istomin at edss.ee> wrote:
>
> Sunil,
>
>
>
> I use packages from
> https://build.opensuse.org/package/show?project=network%3Aha-clustering&package=ocfs2-tools
>
>
>
> Here is the compile log:
> https://build.opensuse.org/package/rawlog?arch=x86_64&package=ocfs2-tools&project=network%3Aha-clustering&repository=openSUSE_12.2
>
>
>
> I spent 4 hours yesterday trying different compilation variants to rule
> out Linux kernel and package version problems; all results are the
> same.
>
> --
>
> Best regards,
>
> Eugene Istomin
>
> Senior System Administrator
>
> EDS Systems
>
> E.Istomin at edss.ee
>
> Work: +372-640-96-01
>
>
>
>
> On Friday 15 March 2013 10:14:23 Eugene Istomin wrote:
>
> Hello Sunil,
>
>
>
> here are step-by-step instructions to reproduce this issue:
>
>
>
> 1) Delete current conf
>
> # rm /etc/ocfs2/cluster.conf
>
>
>
> 2) Create cluster & autocreate conf
>
> # /tmp/o2cb-1.8.2 -vvv add-cluster storage
>
>
>
> 3) # cat /etc/ocfs2/cluster.conf
>
> cluster:
>
> heartbeat_mode = local
>
> node_count = 0
>
> name = storage
>
>
>
>
>
>
>
> 4) Adding 2 nodes using o2cb 1.8.2
>
>
>
> # /tmp/o2cb-1.8.2 -vvv add-node storage tsc-hv01 --ip 10.251.2.11
>
> Using config file '/etc/ocfs2/cluster.conf'
>
> Add node 'tsc-hv01' in cluster 'storage' having ip '10.251.2.11', port '-1'
> and number '-1'
>
> Validated IP address '10.251.2.11'
>
> Validated node number '0'
>
> Added node 'tsc-hv01' in cluster 'storage' having ip '10.251.2.11', port
> '7777' and number '0'
>
>
>
> # /tmp/o2cb-1.8.2 -vvv add-node storage tsc-hv02 --ip 10.251.2.12
>
> Using config file '/etc/ocfs2/cluster.conf'
>
> Add node 'tsc-hv02' in cluster 'storage' having ip '10.251.2.12', port '-1'
> and number '-1'
>
> Validated IP address '10.251.2.12'
>
> Validated node number '1'
>
> Added node 'tsc-hv02' in cluster 'storage' having ip '10.251.2.12', port
> '7777' and number '1'
>
>
>
> 5) Checking conf for nodes
>
> # cat /etc/ocfs2/cluster.conf
>
> cluster:
>
> heartbeat_mode = local
>
> node_count = 2
>
> name = storage
>
>
>
> node:
>
> number = 0
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.11
>
> name = tsc-hv01
>
>
>
> 6) Let's try 1.8.0
>
> # /tmp/o2cb-1.8.0 -vvv add-node storage tsc-hv03 --ip 10.251.2.13
>
> Using config file '/etc/ocfs2/cluster.conf'
>
> Add node 'tsc-hv03' in cluster 'storage' having ip '10.251.2.13', port '-1'
> and number '-1'
>
> Validated IP address '10.251.2.13'
>
> Validated node number '1'
>
> Added node 'tsc-hv03' in cluster 'storage' having ip '10.251.2.13', port
> '7777' and number '1'
>
>
>
> # /tmp/o2cb-1.8.0 -vvv add-node storage tsc-hv04 --ip 10.251.2.14
>
> Using config file '/etc/ocfs2/cluster.conf'
>
> Add node 'tsc-hv04' in cluster 'storage' having ip '10.251.2.14', port '-1'
> and number '-1'
>
> Validated IP address '10.251.2.14'
>
> Validated node number '2'
>
> Added node 'tsc-hv04' in cluster 'storage' having ip '10.251.2.14', port
> '7777' and number '2'
>
>
>
> 7) Checking conf for nodes
>
> # cat /etc/ocfs2/cluster.conf
>
> cluster:
>
> heartbeat_mode = local
>
> node_count = 4
>
> name = storage
>
>
>
> node:
>
> number = 0
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.11
>
> name = tsc-hv01
>
>
>
> node:
>
> number = 1
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.13
>
> name = tsc-hv03
>
>
>
> node:
>
> number = 2
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.14
>
> name = tsc-hv04
>
>
>
>
>
> --
>
> Best regards,
>
> Eugene Istomin
>
> Senior System Administrator
>
> EDS Systems
>
> E.Istomin at edss.ee
>
> Work: +372-640-96-01
>
>
>
>
> On Thursday 14 March 2013 20:22:42 Sunil Mushran wrote:
>
> So you are saying 1.8.2 is broken. Enable verbose tracing. That may tell us
> more.
>
> Do "o2cb -vvv add-node ..." to enable verbose tracing.
>
>
>
> On Thu, Mar 14, 2013 at 2:49 PM, Eugene Istomin <E.Istomin at edss.ee> wrote:
>
> OK, let me explain the problem.
>
>
>
>
>
> 1) # cat /etc/ocfs2/cluster.conf
>
> cluster:
>
> heartbeat_mode = global
>
> node_count = 1
>
> name = storage
>
>
>
> node:
>
> number = 1
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.11
>
> name = tsc-hv01
>
>
>
>
>
> 2)o2cb - 1.8.0
>
>
>
> #./o2cb-1.8.0 -V
>
> o2cb-1.8.0 1.8.0
>
>
>
> # ./o2cb-1.8.0 add-node storage tsc-hv02 --ip 10.251.2.12
>
> # ./o2cb-1.8.0 add-node storage tsc-hv03 --ip 10.251.2.13
>
> # cat /etc/ocfs2/cluster.conf
>
> cluster:
>
> heartbeat_mode = global
>
> node_count = 3
>
> name = storage
>
>
>
> node:
>
> number = 1
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.11
>
> name = tsc-hv01
>
>
>
> node:
>
> number = 0
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.12
>
> name = tsc-hv02
>
>
>
> node:
>
> number = 2
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.13
>
> name = tsc-hv03
>
>
>
>
>
> Seems ok
>
>
>
>
>
>
>
> 3) o2cb 1.8.2
>
> # ./o2cb-1.8.2 add-node storage tsc-hv04 --ip 10.251.2.14
>
> # ./o2cb-1.8.2 add-node storage tsc-hv05 --ip 10.251.2.15
>
> # cat /etc/ocfs2/cluster.conf
>
> cluster:
>
> heartbeat_mode = global
>
> node_count = 5
>
> name = storage
>
>
>
> node:
>
> number = 1
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.11
>
> name = tsc-hv01
>
>
>
>
>
>
>
> All of the other node records have disappeared. strace shows that the valid
> config is opened, but the new conf consists of only the first node in the
> list (although node_count is incremented).
>
>
>
> I tried the latest git; the results are the same.
>
>
>
> --
>
> Best regards,
>
> Eugene Istomin
>
> Senior System Administrator
>
> EDS Systems
>
> E.Istomin at edss.ee
>
> Work: +372-640-96-01
>
>
>
>
> On Thursday 14 March 2013 14:41:18 Sunil Mushran wrote:
>
> add-node only adds to the local config file. You have to do add-node on all
> nodes... followed by register-cluster on all nodes.
>
> Until that is done, the cluster will refuse to mount new volumes on any
> node.
>
>
>
> On Thu, Mar 14, 2013 at 2:37 PM, Eugene Istomin <E.Istomin at edss.ee> wrote:
>
> Additional log:
>
>
>
> tsc-hv01:/tmp # o2cb add-node storage tsc-hv03 --ip 10.251.2.13
>
> tsc-hv01:/tmp # /tmp/o2cb list-nodes --oneline storage
>
> node: 1 tsc-hv01 10.251.2.11:7777 storage
>
> node: 2 tsc-hv02 10.251.2.12:7777 storage
>
> node: 0 tsc-hv03 10.251.2.13:7777 storage
>
>
>
> tsc-hv01:/tmp # o2cb add-node storage tsc-hv04 --ip 10.251.2.14
>
> tsc-hv01:/tmp # /tmp/o2cb list-nodes --oneline storage
>
> node: 1 tsc-hv01 10.251.2.11:7777 storage
>
> node: 2 tsc-hv02 10.251.2.12:7777 storage
>
> node: 0 tsc-hv03 10.251.2.13:7777 storage
>
> node: 3 tsc-hv04 10.251.2.14:7777 storage
>
>
>
>
>
> # cat /etc/ocfs2/cluster.conf
>
> cluster:
>
> heartbeat_mode = global
>
> node_count = 4
>
> name = storage
>
>
>
> node:
>
> number = 1
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.11
>
> name = tsc-hv01
>
>
>
> node:
>
> number = 2
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.12
>
> name = tsc-hv02
>
>
>
> node:
>
> number = 0
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.13
>
> name = tsc-hv03
>
>
>
> node:
>
> number = 3
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.14
>
> name = tsc-hv04
>
>
>
>
>
> but
>
>
>
>
>
> tsc-hv01:/tmp # /sbin/o2cb add-node storage tsc-hv05 --ip 10.251.2.15
>
> tsc-hv01:/tmp # cat /etc/ocfs2/cluster.conf
>
> cluster:
>
> heartbeat_mode = global
>
> node_count = 5
>
> name = storage
>
>
>
> node:
>
> number = 1
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.11
>
> name = tsc-hv01
>
>
>
>
>
>
>
> --
>
> Best regards,
>
> Eugene Istomin
>
> Senior System Administrator
>
> EDS Systems
>
> E.Istomin at edss.ee
>
> Work: +372-640-96-01
>
>
>
>
>
>
> On Thursday 14 March 2013 23:33:58 Eugene Istomin wrote:
>
> Thanks for the answer,
>
>
>
>
>
> # /sbin/o2cb -V
>
> o2cb.old 1.8.2
>
>
>
> # /sbin/o2cb list-nodes --oneline storage
>
> node: 1 tsc-hv01 10.251.2.11:7777 storage
>
>
>
>
>
> but
>
>
>
>
>
> #/tmp/o2cb -V
>
> o2cb 1.8.0
>
>
>
> # /tmp/o2cb list-nodes --oneline storage
>
> node: 1 tsc-hv01 10.251.2.11:7777 storage
>
> node: 2 tsc-hv02 10.251.2.12:7777 storage
>
>
>
>
>
> --
>
> Best regards,
>
> Eugene Istomin
>
> Senior System Administrator
>
> EDS Systems
>
> E.Istomin at edss.ee
>
> Work: +372-640-96-01
>
>
>
>
>
>
> On Thursday 14 March 2013 14:20:53 Sunil Mushran wrote:
>
> strace is hard to read.
>
> list-nodes --oneline prints the nodes that have been registered. If a node
> shows fewer nodes than are in the config file, then the cluster needs to be
> (re)registered on that node.
>
>
>
> On Thu, Mar 14, 2013 at 12:21 PM, Eugene Istomin <E.Istomin at edss.ee> wrote:
>
> Hello Sunil,
>
>
>
> we have a critical issue in the o2cb part of ocfs2 1.8.2: listing nodes or
> adding a node does not work correctly against cluster.conf.
>
>
>
> We have this issue on 3 different Linux systems (kernels 3.2 - 3.8), so I think
> this might be a general o2cb problem.
>
>
>
>
>
> #####
>
>
>
> Here is some debug info
>
>
>
> # cat /etc/ocfs2/cluster.conf
>
> cluster:
>
> heartbeat_mode = global
>
> node_count = 2
>
> name = storage
>
>
>
> node:
>
> number = 1
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.11
>
> name = tsc-hv01
>
>
>
> node:
>
> number = 2
>
> cluster = storage
>
> ip_port = 7777
>
> ip_address = 10.251.2.12
>
> name = tsc-hv02
>
>
>
>
>
>
>
> In ocfs2 1.8.0 (returns 2 nodes):
>
> # strace -s 2048 ./o2cb list-nodes --oneline storage
>
>
>
>
>
> stat("/etc/ocfs2/cluster.conf", {st_mode=S_IFREG|0644, st_size=261, ...}) =
> 0
>
> open("/etc/ocfs2/cluster.conf", O_RDONLY) = 3
>
> read(3, "cluster:\n\theartbeat_mode = global\n\tnode_count = 2\n\tname =
> storage\n\nnode:\n\tnumber = 1\n\tcluster = storage\n\tip_port =
> 7777\n\tip_address = 10.251.2.11\n\tname = tsc-hv01\n\nnode:\n\tnumber =
> 2\n\tcluster = storage\n\tip_port = 7777\n\tip_address = 10.251.2.12\n\tname
> = tsc-hv02\n\n", 4000) = 261
>
> read(3, "", 4000) = 0
>
> close(3) = 0
>
> write(1, "node: 1 tsc-hv01 10.251.2.11:7777 storage\n", 42node: 1 tsc-hv01
> 10.251.2.11:7777 storage
>
> ) = 42
>
> write(1, "node: 2 tsc-hv02 10.251.2.12:7777 storage\n", 42node: 2 tsc-hv02
> 10.251.2.12:7777 storage
>
> ) = 42
>
> exit_group(0) = ?
>
>
>
>
>
>
>
> In ocfs2 1.8.2 (returns 1 node, but the config has 2 nodes):
>
> #strace -s 2048 /sbin/o2cb list-nodes --oneline storage
>
>
>
> stat("/etc/ocfs2/cluster.conf", {st_mode=S_IFREG|0644, st_size=261, ...}) =
> 0
>
> open("/etc/ocfs2/cluster.conf", O_RDONLY) = 3
>
> read(3, "cluster:\n\theartbeat_mode = global\n\tnode_count = 2\n\tname =
> storage\n\nnode:\n\tnumber = 1\n\tcluster = storage\n\tip_port =
> 7777\n\tip_address = 10.251.2.11\n\tname = tsc-hv01\n\nnode:\n\tnumber =
> 2\n\tcluster = storage\n\tip_port = 7777\n\tip_address = 10.251.2.12\n\tname
> = tsc-hv02\n\n", 4000) = 261
>
> read(3, "", 4000) = 0
>
> close(3) = 0
>
> write(1, "node: 1 tsc-hv01 10.251.2.11:7777 storage\n", 42node: 1 tsc-hv01
> 10.251.2.11:7777 storage
>
> ) = 42
>
> exit_group(0) = ?
>
>
>
>
>
>
>
>
>
> I can mail you any info you need, please help to resolve this issue.
>
> --
>
> Best regards,
>
> Eugene Istomin
>
> Senior System Administrator
>
> EDS Systems
>
> E.Istomin at edss.ee
>
> Work: +372-640-96-01
>
>
> _______________________________________________
> Ocfs2-tools-devel mailing list
> Ocfs2-tools-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/ocfs2-tools-devel



-- 
Goldwyn


