[Ocfs2-users] Question about eth communication for ocfs2

Alain.Moulle Alain.Moulle at bull.net
Fri Feb 11 06:38:42 PST 2011


In fact, when I trace the o2cb OCF script, I get:
ocfs2_controld[9265]: 2011/02/11_15:40:15 info: get_cluster_type: 
Cluster type is: 'openais'.
ocfs2_controld[9265]: 2011/02/11_15:40:15 info: 
init_ais_connection_classic: Creating connection to our Corosync plugin
ocfs2_controld[9265]: 2011/02/11_15:40:15 info: 
init_ais_connection_classic: AIS connection established
ocfs2_controld[9265]: 2011/02/11_15:40:15 info: get_ais_nodeid: Server 
details: id=16777483 uname=chili0 cname=pcmk
ocfs2_controld[9265]: 2011/02/11_15:40:15 info: 
init_ais_connection_once: Connection to 'classic openais (with plugin)': 
established
ocfs2_controld[9265]: 2011/02/11_15:40:15 debug: crm_new_peer: Creating 
entry for node chili0/16777483
ocfs2_controld[9265]: 2011/02/11_15:40:15 info: crm_new_peer: Node 
chili0 now has id: 16777483
ocfs2_controld[9265]: 2011/02/11_15:40:15 info: crm_new_peer: Node 
16777483 is now known as chili0
1297435215 setup_stack at 169: Cluster connection established.  Local node 
id: 16777483
1297435215 setup_stack at 173: Added Pacemaker as client 1 with fd 6
1297435215 setup_ckpt at 609: Initializing CKPT service (try 1)
1297435215 setup_ckpt at 615: Connected to CKPT service with handle 
0x327b23c600000000
1297435215 call_ckpt_open at 160: Opening checkpoint 
"ocfs2:controld:0100010b" (try 1)
1297435215 call_ckpt_open at 170: Opened checkpoint 
"ocfs2:controld:0100010b" with handle 0x6633487300000000
1297435215 call_section_write at 340: Writing to section 
"daemon_max_protocol" on checkpoint "ocfs2:controld:0100010b" (try 1)
1297435215 call_section_create at 292: Creating section 
"daemon_max_protocol" on checkpoint "ocfs2:controld:0100010b" (try 1)
1297435215 call_section_create at 300: Created section 
"daemon_max_protocol" on checkpoint "ocfs2:controld:0100010b"
1297435215 call_section_write at 340: Writing to section 
"ocfs2_max_protocol" on checkpoint "ocfs2:controld:0100010b" (try 1)
1297435215 call_section_create at 292: Creating section 
"ocfs2_max_protocol" on checkpoint "ocfs2:controld:0100010b" (try 1)
1297435215 call_section_create at 300: Created section "ocfs2_max_protocol" 
on checkpoint "ocfs2:controld:0100010b"
1297435215 start_join at 588: Starting join for group "ocfs2:controld"
1297435215 start_join at 592: cpg_join succeeded
1297435215 loop at 975: setup done
ocfs2_controld[9265]: 2011/02/11_15:40:15 notice: ais_dispatch_message: 
Membership 85304: quorum acquired
ocfs2_controld[9265]: 2011/02/11_15:40:15 info: crm_update_peer: Node 
chili0: id=16777483 state=member (new) addr=r(0) ip(11.1.0.1)  (new) 
votes=1 (new) born=85304 seen=85304 
proc=00000000000000000000000000111312 (new)
ocfs2_controld[9265]: 2011/02/11_15:40:15 debug: crm_new_peer: Creating 
entry for node chili1/33554699
ocfs2_controld[9265]: 2011/02/11_15:40:15 info: crm_new_peer: Node 
chili1 now has id: 33554699
ocfs2_controld[9265]: 2011/02/11_15:40:15 info: crm_new_peer: Node 
33554699 is now known as chili1
ocfs2_controld[9265]: 2011/02/11_15:40:15 info: crm_update_peer: Node 
chili1: id=33554699 state=member (new) addr=r(0) ip(11.1.0.2)  votes=1 
born=85304 seen=85304 proc=00000000000000000000000000111312
1297435215 confchg_cb at 495: confchg called
1297435215 daemon_change at 398: ocfs2_controld (group "ocfs2:controld") 
confchg: members 1, left 0, joined 1
1297435215 cpg_joined at 909: CPG is live, we are the first daemon
1297435215 call_ckpt_open at 160: Opening checkpoint "ocfs2:controld" (try 1)
1297435215 call_ckpt_open at 170: Opened checkpoint "ocfs2:controld" with 
handle 0x2ae8944a00000001
1297435215 call_section_write at 340: Writing to section "daemon_protocol" 
on checkpoint "ocfs2:controld" (try 1)
1297435215 call_section_create at 292: Creating section "daemon_protocol" 
on checkpoint "ocfs2:controld" (try 1)
1297435215 call_section_create at 300: Created section "daemon_protocol" on 
checkpoint "ocfs2:controld"
1297435215 call_section_write at 340: Writing to section "ocfs2_protocol" 
on checkpoint "ocfs2:controld" (try 1)
1297435215 call_section_create at 292: Creating section "ocfs2_protocol" on 
checkpoint "ocfs2:controld" (try 1)
1297435215 call_section_create at 300: Created section "ocfs2_protocol" on 
checkpoint "ocfs2:controld"
1297435215 cpg_joined at 923: Daemon protocol is 1.0
1297435215 cpg_joined at 925: fs protocol is 1.0
1297435215 cpg_joined at 927: Connecting to dlm_controld
1297435215 cpg_joined at 934: Opening control device
1297435215 cpg_joined at 938: Error opening control device: Unable to 
access cluster service
1297435215 exit_dlmcontrol at 363: Closing dlm_controld connection
1297435215 start_leave at 613: leaving group "ocfs2:controld"
1297435215 start_leave at 626: cpg_leave succeeded
1297435215 exit_cpg at 760: closing cpg connection
1297435215 call_ckpt_close at 240: Closing checkpoint 
"ocfs2:controld:0100010b" (try 1)
1297435215 call_ckpt_close at 246: Closed checkpoint "ocfs2:controld:0100010b"
1297435215 exit_ckpt at 643: Disconnecting from CKPT service (try 1)
1297435215 exit_ckpt at 647: Disconnected from CKPT service
1297435215 exit_stack at 145: closing pacemaker connection
ocfs2_controld[9265]: 2011/02/11_15:40:15 notice: 
terminate_ais_connection: Disconnecting from AIS
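
A side note for reading the trace above: the node ids look like they are derived from the ring0 IPv4 addresses (corosync is known to do this when no nodeid is configured), and the checkpoint suffix 0100010b is the same id written in hex. Below is a minimal Python sketch of that arithmetic. It assumes the id is the four address bytes read as a little-endian 32-bit integer, which matches the values in this trace. The helper names are illustrative only and are not part of any ocfs2 or corosync tool.

```python
import socket
import struct


def nodeid_from_ipv4(addr: str) -> int:
    """Interpret the 4 network-order address bytes as a little-endian uint32.

    Assumption: this mirrors how the ids in the trace were generated.
    """
    return struct.unpack("<I", socket.inet_aton(addr))[0]


def ipv4_from_nodeid(nodeid: int) -> str:
    """Inverse mapping: recover the dotted-quad address from a node id."""
    return socket.inet_ntoa(struct.pack("<I", nodeid))


# Addresses and node names taken from the crm_update_peer lines in the trace.
for name, addr in (("chili0", "11.1.0.1"), ("chili1", "11.1.0.2")):
    nodeid = nodeid_from_ipv4(addr)
    print(f"{name}: ip={addr} id={nodeid} hex={nodeid:08x}")
```

For chili0 this gives id 16777483 (hex 0100010b), and for chili1 id 33554699, the same values as in the log, so the checkpoint "ocfs2:controld:0100010b" is the one belonging to the local node chili0.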


Hi,

I've tried to configure ocfs2.pcmk again with the ocfs2 1.6.3.1 rpms
and the pacemaker-1.1.2-7.el6 and corosync-1.2.3-21.el6 rpms.
I configured the clone-dlm and the o2cb-dlm the same way I did
with 1.4.3-3, including the export in /etc/init.d/corosync:
export COROSYNC_DEFAULT_CONFIG_IFACE="openaisserviceenableexperimental:corosync_parser"
When corosync starts, clone-dlm starts successfully,
but on clone-o2cb the bringup_daemon fails, so I tried to start the daemon
manually with -D and got:
 ocfs2_controld.pcmk -D
ocfs2_controld[8533]: 2011/02/11_15:09:39 CRIT: get_cluster_type: This 
installation of Pacemaker does not support the '(null)' cluster 
infrastructure.  Terminating.
whereas it started fine with the ocfs2 1.4.3-3 rpms and the same releases
of pacemaker and corosync.

Any ideas?
Thanks a lot.
Alain



Sunil Mushran wrote:
> On 02/08/2011 01:32 AM, Alain.Moulle wrote:
>   
>> OK, but what I wonder now is:
>> is OCFS2 really capable of fencing an adjacent node?
>> Or is it only capable of "node self-fencing"?
>> I thought that ocfs2 was only capable of "node self-fencing" because
>> there is no configuration of any fencing device (e.g. ipmi, etc.) in the
>> ocfs2 configuration, so a node can't fence another node as in HA
>> software (like Cluster Suite, Pacemaker, etc.).
>>
>> So "node self-fencing" is the only possibility, right?
>>
>> So, in case of a communication link failure, you seem to be telling me that
>> only the slave node will self-fence, and that we can't really know
>> which of the two nodes will self-fence, since we can't know which
>> one is the slave when the link failure happens, right?
>>     
>
> Note that while ocfs2 comes with a default cluster stack called o2cb
> it can also be configured to work with pacemaker and cman.
>
> The previous answers were all relating to o2cb. And o2cb does not
> support configurable fencing agents.
>
> For that you'll have to use pcmk/cman. sles11, opensuse, fedora,
> debian, ubuntu ship ocfs2 with pcmk/cman.
