[Ocfs2-users] Is it one issue. Do you have some good ideas, thanks a lot.
Srinivas Eeda
srinivas.eeda at oracle.com
Sun Apr 28 11:50:30 PDT 2013
On 04/28/2013 03:54 AM, Guozhonghua wrote:
>
> Hi, everyone
>
> I have some questions about OCFS2 when using it as a VM store.
>
> The system runs Ubuntu 12.04 with kernel 3.2.40 and ocfs2-tools
> 1.6.4.
>
> After a network configuration change, the issues in the log below appeared.
>
> Why is there the information of "Node 255 (he) is the Recovery Master
> for the dead node 255" in the syslog?
>
This appears to be a bug. I am not sure how big the cluster is, but since
this node lost its connection to at least a few nodes, it should have
evicted itself (unless it's a very big cluster).
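
To check how many peers a node lost, the o2net disconnect messages in its
syslog can be summarized. The following is a minimal sketch, not part of the
original reply. It assumes an unwrapped syslog file whose lines use the same
"o2net: No longer connected to node NAME (num N)" wording as the log quoted
in this thread; the function name lost_peers is hypothetical.

```python
import re

# Matches o2net disconnect lines in the format seen in this thread, e.g.
#   "... o2net: No longer connected to node ZHJD-VM5 (num 5) at 185.200.1.16:7100"
LOST_RE = re.compile(
    r"o2net: No longer connected to node (?P<name>\S+) \(num (?P<num>\d+)\)"
)


def lost_peers(log_text):
    """Return {node_number: node_name} for each peer reported as disconnected."""
    return {int(m.group("num")): m.group("name") for m in LOST_RE.finditer(log_text)}


# Two lines taken from the quoted log, joined back into single lines.
sample = (
    "Apr 27 17:35:59 ZHJD-VM6 kernel: [ 3734.057359] o2net: No longer "
    "connected to node ZHJD-VM5 (num 5) at 185.200.1.16:7100\n"
    "Apr 27 17:43:32 ZHJD-VM6 kernel: [ 4186.125688] o2net: No longer "
    "connected to node 2013-SRV06 (num 1) at 185.200.1.13:7100\n"
)
peers = lost_peers(sample)
```

Comparing the number of lost peers against the total node count in the
cluster configuration gives a rough idea of whether the node was still in
contact with a majority.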
>
> Why was the host ZHJD-VM6 blocked until it rebooted a day later,
> and what was it still waiting for?
>
The last message seems to be at Apr 27 17:44:52, so the node was effectively
dead from that time on, even though it restarted much later. What are the
timeouts set to? What is fence_method set to? Please forward me the
messages files from the other nodes; I would like to see how they
behaved.
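
One way to collect the timeout and fence_method values on each node is to
read them from configfs. This is a minimal sketch under stated assumptions,
not an official tool: it assumes configfs is mounted at /sys/kernel/config
and that the o2cb stack exposes idle_timeout_ms, keepalive_delay_ms,
reconnect_delay_ms, fence_method and heartbeat/dead_threshold under the
cluster directory. The function name read_o2cb_settings is hypothetical.

```python
from pathlib import Path

# Per-cluster attributes, relative to the cluster directory (assumed layout).
O2CB_ATTRS = (
    "idle_timeout_ms",
    "keepalive_delay_ms",
    "reconnect_delay_ms",
    "fence_method",
    "heartbeat/dead_threshold",
)


def read_o2cb_settings(root="/sys/kernel/config/cluster"):
    """Return {cluster_name: {attribute: value}} for each cluster under root."""
    settings = {}
    root_path = Path(root)
    if not root_path.is_dir():
        # configfs is not mounted here, or the o2cb modules are not loaded.
        return settings
    for cluster in sorted(p for p in root_path.iterdir() if p.is_dir()):
        values = {}
        for rel in O2CB_ATTRS:
            attr = cluster / rel
            if attr.is_file():
                values[rel] = attr.read_text().strip()
        settings[cluster.name] = values
    return settings
```

Calling read_o2cb_settings() on each node and comparing the results shows
whether all nodes agree on the timeouts and the fence method; an empty
result means the assumed configfs path was not found.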
>
> Thanks a lot.
>
> Apr 27 17:35:59 ZHJD-VM6 kernel: [ 3734.057330] o2net: Connection to
> node ZHJD-VM5 (num 5) at 185.200.1.16:7100 has been idle for 30.100
> secs, shutting it down.
>
> Apr 27 17:35:59 ZHJD-VM6 kernel: [ 3734.057359] o2net: No longer
> connected to node ZHJD-VM5 (num 5) at 185.200.1.16:7100
>
> Apr 27 17:35:59 ZHJD-VM6 kernel: [ 3734.058212] o2net: Connected to
> node ZHJD-VM5 (num 5) at 185.200.1.16:7100
>
> Apr 27 17:36:01 ZHJD-VM6 CRON[17869]: (root) CMD (
> /opt/bin/ocfs2_iscsi_conf_chg_timer.sh)
>
> Apr 27 17:36:01 ZHJD-VM6 CRON[17868]: (root) CMD (
> /opt/bin/libvirtd_check.sh)
>
> Apr 27 17:37:01 ZHJD-VM6 CRON[18199]: (root) CMD (
> /opt/bin/libvirtd_check.sh)
>
> Apr 27 17:37:01 ZHJD-VM6 CRON[18198]: (root) CMD (
> /opt/bin/ocfs2_iscsi_conf_chg_timer.sh)
>
> Apr 27 17:38:01 ZHJD-VM6 CRON[18536]: (root) CMD (
> /opt/bin/libvirtd_check.sh)
>
> Apr 27 17:38:01 ZHJD-VM6 CRON[18535]: (root) CMD (
> /opt/bin/ocfs2_iscsi_conf_chg_timer.sh)
>
> Apr 27 17:39:01 ZHJD-VM6 CRON[18798]: (root) CMD (
> /opt/bin/ocfs2_iscsi_conf_chg_timer.sh)
>
> Apr 27 17:39:01 ZHJD-VM6 CRON[18799]: (root) CMD (
> /opt/bin/libvirtd_check.sh)
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.123993] INFO: task
> kworker/u:0:5 blocked for more than 120 seconds.
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124000] "echo 0 >
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124006] kworker/u:0 D
> ffffffff81806240 0 5 2 0x00000000
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124011] ffff8805f792ba70
> 0000000000000046 ffff8805f792ba60 ffff8805d7cf5000
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124017] ffff8805f792bfd8
> ffff8805f792bfd8 ffff8805f792bfd8 0000000000013780
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124022] ffff8805f74696f0
> ffff8805f7905bc0 ffff8805f2a25200 7fffffffffffffff
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124027] Call Trace:
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124035] [<ffffffff8165a55f>]
> schedule+0x3f/0x60
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124039] [<ffffffff8165aba5>]
> schedule_timeout+0x2a5/0x320
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124046] [<ffffffffa036e020>]
> ? o2dlm_lock_ast_wrapper+0x20/0x20 [ocfs2_stack_o2cb]
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124051] [<ffffffff8158346a>]
> ? do_tcp_sendpages+0x5ba/0x6e0
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124055] [<ffffffff8165a39f>]
> wait_for_common+0xdf/0x180
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124061] [<ffffffff8105f990>]
> ? try_to_wake_up+0x200/0x200
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124065] [<ffffffff8165a51d>]
> wait_for_completion+0x1d/0x20
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124092] [<ffffffffa053beb3>]
> __ocfs2_cluster_lock.isra.34+0x1f3/0x810 [ocfs2]
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124116] [<ffffffffa05543e0>]
> ? ocfs2_queue_orphan_scan+0x270/0x270 [ocfs2]
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124136] [<ffffffffa053d0a9>]
> ocfs2_orphan_scan_lock+0x99/0xf0 [ocfs2]
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124159] [<ffffffffa05543e0>]
> ? ocfs2_queue_orphan_scan+0x270/0x270 [ocfs2]
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124181] [<ffffffffa05541c5>]
> ocfs2_queue_orphan_scan+0x55/0x270 [ocfs2]
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124204] [<ffffffffa05543e0>]
> ? ocfs2_queue_orphan_scan+0x270/0x270 [ocfs2]
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124226] [<ffffffffa055441a>]
> ocfs2_orphan_scan_work+0x3a/0xb0 [ocfs2]
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124232] [<ffffffff81084e2a>]
> process_one_work+0x11a/0x480
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124236] [<ffffffff81085bd4>]
> worker_thread+0x164/0x370
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124241] [<ffffffff81085a70>]
> ? manage_workers.isra.29+0x130/0x130
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124246] [<ffffffff8108a42c>]
> kthread+0x8c/0xa0
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124251] [<ffffffff81666bf4>]
> kernel_thread_helper+0x4/0x10
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124255] [<ffffffff8108a3a0>]
> ? flush_kthread_worker+0xa0/0xa0
>
> Apr 27 17:39:45 ZHJD-VM6 kernel: [ 3959.124259] [<ffffffff81666bf0>]
> ? gs_change+0x13/0x13
>
> Apr 27 17:40:01 ZHJD-VM6 CRON[19062]: (root) CMD (
> /opt/bin/ha_check_resource.sh)
>
> Apr 27 17:40:01 ZHJD-VM6 CRON[19061]: (root) CMD (
> /opt/bin/libvirtd_check.sh)
>
> Apr 27 17:40:01 ZHJD-VM6 CRON[19063]: (root) CMD (
> /opt/bin/ha_cleanup.sh)
>
> Apr 27 17:40:01 ZHJD-VM6 CRON[19064]: (root) CMD (
> /opt/bin/ocfs2_iscsi_conf_chg_timer.sh)
>
> Apr 27 17:41:01 ZHJD-VM6 CRON[19360]: (root) CMD (
> /opt/bin/libvirtd_check.sh)
>
> Apr 27 17:41:01 ZHJD-VM6 CRON[19359]: (root) CMD (
> /opt/bin/ocfs2_iscsi_conf_chg_timer.sh)
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065702] INFO: task
> kworker/u:0:5 blocked for more than 120 seconds.
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065709] "echo 0 >
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065715] kworker/u:0 D
> ffffffff81806240 0 5 2 0x00000000
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065720] ffff8805f792ba70
> 0000000000000046 ffff8805f792ba60 ffff8805d7cf5000
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065726] ffff8805f792bfd8
> ffff8805f792bfd8 ffff8805f792bfd8 0000000000013780
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065731] ffff8805f74696f0
> ffff8805f7905bc0 ffff8805f2a25200 7fffffffffffffff
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065736] Call Trace:
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065744] [<ffffffff8165a55f>]
> schedule+0x3f/0x60
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065748] [<ffffffff8165aba5>]
> schedule_timeout+0x2a5/0x320
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065755] [<ffffffffa036e020>]
> ? o2dlm_lock_ast_wrapper+0x20/0x20 [ocfs2_stack_o2cb]
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065761] [<ffffffff8158346a>]
> ? do_tcp_sendpages+0x5ba/0x6e0
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065764] [<ffffffff8165a39f>]
> wait_for_common+0xdf/0x180
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065770] [<ffffffff8105f990>]
> ? try_to_wake_up+0x200/0x200
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065774] [<ffffffff8165a51d>]
> wait_for_completion+0x1d/0x20
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065800] [<ffffffffa053beb3>]
> __ocfs2_cluster_lock.isra.34+0x1f3/0x810 [ocfs2]
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065824] [<ffffffffa05543e0>]
> ? ocfs2_queue_orphan_scan+0x270/0x270 [ocfs2]
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065845] [<ffffffffa053d0a9>]
> ocfs2_orphan_scan_lock+0x99/0xf0 [ocfs2]
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065867] [<ffffffffa05543e0>]
> ? ocfs2_queue_orphan_scan+0x270/0x270 [ocfs2]
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065890] [<ffffffffa05541c5>]
> ocfs2_queue_orphan_scan+0x55/0x270 [ocfs2]
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065913] [<ffffffffa05543e0>]
> ? ocfs2_queue_orphan_scan+0x270/0x270 [ocfs2]
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065935] [<ffffffffa055441a>]
> ocfs2_orphan_scan_work+0x3a/0xb0 [ocfs2]
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065941] [<ffffffff81084e2a>]
> process_one_work+0x11a/0x480
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065945] [<ffffffff81085bd4>]
> worker_thread+0x164/0x370
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065949] [<ffffffff81085a70>]
> ? manage_workers.isra.29+0x130/0x130
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065954] [<ffffffff8108a42c>]
> kthread+0x8c/0xa0
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065960] [<ffffffff81666bf4>]
> kernel_thread_helper+0x4/0x10
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065963] [<ffffffff8108a3a0>]
> ? flush_kthread_worker+0xa0/0xa0
>
> Apr 27 17:41:45 ZHJD-VM6 kernel: [ 4079.065967] [<ffffffff81666bf0>]
> ? gs_change+0x13/0x13
>
> Apr 27 17:42:01 ZHJD-VM6 CRON[19620]: (root) CMD (
> /opt/bin/libvirtd_check.sh)
>
> Apr 27 17:42:01 ZHJD-VM6 CRON[19621]: (root) CMD (
> /opt/bin/ocfs2_iscsi_conf_chg_timer.sh)
>
> Apr 27 17:43:01 ZHJD-VM6 CRON[19881]: (root) CMD (
> /opt/bin/libvirtd_check.sh)
>
> Apr 27 17:43:01 ZHJD-VM6 CRON[19882]: (root) CMD (
> /opt/bin/ocfs2_iscsi_conf_chg_timer.sh)
>
> Apr 27 17:43:32 ZHJD-VM6 kernel: [ 4186.125655] o2net: Connection to
> node 2013-SRV06 (num 1) at 185.200.1.13:7100 has been idle for 30.60
> secs, shutting it down.
>
> Apr 27 17:43:32 ZHJD-VM6 kernel: [ 4186.125688] o2net: No longer
> connected to node 2013-SRV06 (num 1) at 185.200.1.13:7100
>
> Apr 27 17:43:41 ZHJD-VM6 kernel: [ 4195.912900] o2net: Connection to
> node 2013-SRV09 (num 2) at 185.200.1.14:7100 has been idle for 30.81
> secs, shutting it down.
>
> Apr 27 17:43:41 ZHJD-VM6 kernel: [ 4195.912937] o2net: No longer
> connected to node 2013-SRV09 (num 2) at 185.200.1.14:7100
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007406] INFO: task
> kworker/u:0:5 blocked for more than 120 seconds.
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007412] "echo 0 >
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007418] kworker/u:0 D
> ffffffff81806240 0 5 2 0x00000000
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007423] ffff8805f792ba70
> 0000000000000046 ffff8805f792ba60 ffff8805d7cf5000
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007429] ffff8805f792bfd8
> ffff8805f792bfd8 ffff8805f792bfd8 0000000000013780
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007434] ffff8805f74696f0
> ffff8805f7905bc0 ffff8805f2a25200 7fffffffffffffff
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007439] Call Trace:
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007447] [<ffffffff8165a55f>]
> schedule+0x3f/0x60
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007451] [<ffffffff8165aba5>]
> schedule_timeout+0x2a5/0x320
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007457] [<ffffffffa036e020>]
> ? o2dlm_lock_ast_wrapper+0x20/0x20 [ocfs2_stack_o2cb]
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007463] [<ffffffff8158346a>]
> ? do_tcp_sendpages+0x5ba/0x6e0
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007466] [<ffffffff8165a39f>]
> wait_for_common+0xdf/0x180
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007473] [<ffffffff8105f990>]
> ? try_to_wake_up+0x200/0x200
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007476] [<ffffffff8165a51d>]
> wait_for_completion+0x1d/0x20
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007503] [<ffffffffa053beb3>]
> __ocfs2_cluster_lock.isra.34+0x1f3/0x810 [ocfs2]
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007527] [<ffffffffa05543e0>]
> ? ocfs2_queue_orphan_scan+0x270/0x270 [ocfs2]
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007547] [<ffffffffa053d0a9>]
> ocfs2_orphan_scan_lock+0x99/0xf0 [ocfs2]
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007570] [<ffffffffa05543e0>]
> ? ocfs2_queue_orphan_scan+0x270/0x270 [ocfs2]
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007593] [<ffffffffa05541c5>]
> ocfs2_queue_orphan_scan+0x55/0x270 [ocfs2]
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007615] [<ffffffffa05543e0>]
> ? ocfs2_queue_orphan_scan+0x270/0x270 [ocfs2]
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007638] [<ffffffffa055441a>]
> ocfs2_orphan_scan_work+0x3a/0xb0 [ocfs2]
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007644] [<ffffffff81084e2a>]
> process_one_work+0x11a/0x480
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007648] [<ffffffff81085bd4>]
> worker_thread+0x164/0x370
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007652] [<ffffffff81085a70>]
> ? manage_workers.isra.29+0x130/0x130
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007657] [<ffffffff8108a42c>]
> kthread+0x8c/0xa0
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007662] [<ffffffff81666bf4>]
> kernel_thread_helper+0x4/0x10
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007666] [<ffffffff8108a3a0>]
> ? flush_kthread_worker+0xa0/0xa0
>
> Apr 27 17:43:45 ZHJD-VM6 kernel: [ 4199.007670] [<ffffffff81666bf0>]
> ? gs_change+0x13/0x13
>
> Apr 27 17:43:47 ZHJD-VM6 kernel: [ 4201.925965] o2net: Connection to
> node 2013-SRV10 (num 3) at 185.200.1.15:7100 has been idle for 30.109
> secs, shutting it down.
>
> Apr 27 17:43:47 ZHJD-VM6 kernel: [ 4201.926000] o2net: No longer
> connected to node 2013-SRV10 (num 3) at 185.200.1.15:7100
>
> Apr 27 17:43:50 ZHJD-VM6 kernel: [ 4204.140932] o2net: Connection to
> node 2013-SRV06 (num 1) at 185.200.1.13:7100 shutdown, state 7
>
> Apr 27 17:43:53 ZHJD-VM6 kernel: [ 4207.139488] o2net: Connection to
> node 2013-SRV06 (num 1) at 185.200.1.13:7100 shutdown, state 7
>
> Apr 27 17:43:56 ZHJD-VM6 kernel: [ 4210.138028] o2net: Connection to
> node 2013-SRV06 (num 1) at 185.200.1.13:7100 shutdown, state 7
>
> Apr 27 17:43:59 ZHJD-VM6 kernel: [ 4213.136565] o2net: Connection to
> node 2013-SRV06 (num 1) at 185.200.1.13:7100 shutdown, state 7
>
> Apr 27 17:43:59 ZHJD-VM6 kernel: [ 4213.928171] o2net: Connection to
> node 2013-SRV09 (num 2) at 185.200.1.14:7100 shutdown, state 7
>
> Apr 27 17:44:01 ZHJD-VM6 CRON[20049]: (root) CMD (
> /opt/bin/ocfs2_iscsi_conf_chg_timer.sh)
>
> Apr 27 17:44:01 ZHJD-VM6 CRON[20050]: (root) CMD (
> /opt/bin/libvirtd_check.sh)
>
> Apr 27 17:44:02 ZHJD-VM6 kernel: [ 4216.127082] o2net: No connection
> established with node 1 after 30.0 seconds, giving up.
>
> Apr 27 17:44:02 ZHJD-VM6 kernel: [ 4216.135116] o2net: Connection to
> node 2013-SRV06 (num 1) at 185.200.1.13:7100 shutdown, state 7
>
> Apr 27 17:44:02 ZHJD-VM6 kernel: [ 4216.926731] o2net: Connection to
> node 2013-SRV09 (num 2) at 185.200.1.14:7100 shutdown, state 7
>
> Apr 27 17:44:05 ZHJD-VM6 kernel: [ 4219.925271] o2net: Connection to
> node 2013-SRV09 (num 2) at 185.200.1.14:7100 shutdown, state 7
>
> Apr 27 17:44:05 ZHJD-VM6 kernel: [ 4219.941252] o2net: Connection to
> node 2013-SRV10 (num 3) at 185.200.1.15:7100 shutdown, state 7
>
> Apr 27 17:44:07 ZHJD-VM6 kernel: [ 4221.101584] o2cb: o2dlm has
> evicted node 1 from domain AB92EF420A5A475ABD6C139B0C7DDD1C
>
> Apr 27 17:44:07 ZHJD-VM6 kernel: [ 4221.101891] o2cb: o2dlm has
> evicted node 1 from domain AE16636E1B83497A88D6A50178172ECA
>
> Apr 27 17:44:08 ZHJD-VM6 kernel: [ 4222.400030] o2dlm: Begin recovery
> on domain AE16636E1B83497A88D6A50178172ECA for node 1
>
> Apr 27 17:44:08 ZHJD-VM6 kernel: [ 4222.923814] o2net: Connection to
> node 2013-SRV09 (num 2) at 185.200.1.14:7100 shutdown, state 7
>
> Apr 27 17:44:08 ZHJD-VM6 kernel: [ 4222.939801] o2net: Connection to
> node 2013-SRV10 (num 3) at 185.200.1.15:7100 shutdown, state 7
>
> Apr 27 17:44:09 ZHJD-VM6 kernel: [ 4222.959757] o2dlm: Begin recovery
> on domain AB92EF420A5A475ABD6C139B0C7DDD1C for node 1
>
> Apr 27 17:44:11 ZHJD-VM6 kernel: [ 4225.922350] o2net: Connection to
> node 2013-SRV09 (num 2) at 185.200.1.14:7100 shutdown, state 7
>
> Apr 27 17:44:11 ZHJD-VM6 kernel: [ 4225.938346] o2net: Connection to
> node 2013-SRV10 (num 3) at 185.200.1.15:7100 shutdown, state 7
>
> Apr 27 17:44:12 ZHJD-VM6 kernel: [ 4225.978308] o2net: No connection
> established with node 2 after 30.0 seconds, giving up.
>
> Apr 27 17:44:12 ZHJD-VM6 kernel: [ 4225.978415]
> (dlm_reco_thread,13736,2):dlm_do_master_requery:1656 ERROR: Error -107
> when sending message 514 (key 0xe00bcbbe) to node 2
>
> Apr 27 17:44:12 ZHJD-VM6 kernel: [ 4225.978427]
> (dlm_reco_thread,14227,1):dlm_do_master_requery:1656 ERROR: Error -107
> when sending message 514 (key 0x77c0b1d1) to node 2
>
> Apr 27 17:44:12 ZHJD-VM6 kernel: [ 4225.978434]
> (dlm_reco_thread,14227,1):dlm_pre_master_reco_lockres:2151 ERROR:
> status = -107
>
> Apr 27 17:44:12 ZHJD-VM6 kernel: [ 4225.978441]
> (dlm_reco_thread,13736,2):dlm_pre_master_reco_lockres:2151 ERROR:
> status = -107
>
> Apr 27 17:44:14 ZHJD-VM6 kernel: [ 4228.920883] o2net: Connection to
> node 2013-SRV09 (num 2) at 185.200.1.14:7100 shutdown, state 7
>
> Apr 27 17:44:14 ZHJD-VM6 kernel: [ 4228.936893] o2net: Connection to
> node 2013-SRV10 (num 3) at 185.200.1.15:7100 shutdown, state 7
>
> Apr 27 17:44:15 ZHJD-VM6 kernel: [ 4229.113560] o2cb: o2dlm has
> evicted node 2 from domain AB92EF420A5A475ABD6C139B0C7DDD1C
>
> Apr 27 17:44:15 ZHJD-VM6 kernel: [ 4229.113700] o2cb: o2dlm has
> evicted node 2 from domain AE16636E1B83497A88D6A50178172ECA
>
> Apr 27 17:44:17 ZHJD-VM6 kernel: [ 4231.935407] o2net: Connection to
> node 2013-SRV10 (num 3) at 185.200.1.15:7100 shutdown, state 7
>
> Apr 27 17:44:18 ZHJD-VM6 kernel: [ 4231.991421] o2net: No connection
> established with node 3 after 30.0 seconds, giving up.
>
> Apr 27 17:44:18 ZHJD-VM6 kernel: [ 4231.991522]
> (dlm_reco_thread,13736,2):dlm_do_master_requery:1656 ERROR: Error -107
> when sending message 514 (key 0xe00bcbbe) to node 3
>
> Apr 27 17:44:18 ZHJD-VM6 kernel: [ 4231.991534]
> (dlm_reco_thread,14227,3):dlm_do_master_requery:1656 ERROR: Error -107
> when sending message 514 (key 0x77c0b1d1) to node 3
>
> Apr 27 17:44:18 ZHJD-VM6 kernel: [ 4231.991541]
> (dlm_reco_thread,13736,2):dlm_pre_master_reco_lockres:2151 ERROR:
> status = -107
>
> Apr 27 17:44:18 ZHJD-VM6 kernel: [ 4231.991549]
> (dlm_reco_thread,14227,3):dlm_pre_master_reco_lockres:2151 ERROR:
> status = -107
>
> Apr 27 17:44:18 ZHJD-VM6 kernel: [ 4231.992415]
> (dlm_reco_thread,13736,2):dlm_do_master_request:1332 ERROR: link to 2
> went down!
>
> Apr 27 17:44:18 ZHJD-VM6 kernel: [ 4231.992425]
> (dlm_reco_thread,13736,2):dlm_get_lock_resource:917 ERROR: status = -107
>
> Apr 27 17:44:18 ZHJD-VM6 kernel: [ 4231.992433]
> (dlm_reco_thread,13736,2):dlm_do_master_request:1332 ERROR: link to 3
> went down!
>
> Apr 27 17:44:18 ZHJD-VM6 kernel: [ 4231.992440]
> (dlm_reco_thread,13736,2):dlm_get_lock_resource:917 ERROR: status = -107
>
> Apr 27 17:44:18 ZHJD-VM6 kernel: [ 4231.992472]
> (dlm_reco_thread,14227,3):dlm_do_master_request:1332 ERROR: link to 2
> went down!
>
> Apr 27 17:44:18 ZHJD-VM6 kernel: [ 4231.992482]
> (dlm_reco_thread,14227,3):dlm_get_lock_resource:917 ERROR: status = -107
>
> Apr 27 17:44:18 ZHJD-VM6 kernel: [ 4231.992489]
> (dlm_reco_thread,14227,3):dlm_do_master_request:1332 ERROR: link to 3
> went down!
>
> Apr 27 17:44:18 ZHJD-VM6 kernel: [ 4231.992497]
> (dlm_reco_thread,14227,3):dlm_get_lock_resource:917 ERROR: status = -107
>
> Apr 27 17:44:18 ZHJD-VM6 kernel: [ 4231.993204]
> (dlm_reco_thread,13736,2):dlm_restart_lock_mastery:1221 ERROR: node
> down! 2
>
> Apr 27 17:44:18 ZHJD-VM6 kernel: [ 4231.993214]
> (dlm_reco_thread,13736,2):dlm_wait_for_lock_mastery:1038 ERROR: status
> = -11
>
> Apr 27 17:44:18 ZHJD-VM6 kernel: [ 4231.993223]
> (dlm_reco_thread,13736,2):dlm_do_master_requery:1656 ERROR: Error -107
> when sending message 514 (key 0xe00bcbbe) to node 3
>
> Apr 27 17:44:18 ZHJD-VM6 kernel: [ 4231.993232]
> (dlm_reco_thread,13736,2):dlm_pre_master_reco_lockres:2151 ERROR:
> status = -107
>
> Apr 27 17:44:18 ZHJD-VM6 kernel: [ 4231.993258]
> (dlm_reco_thread,14227,3):dlm_restart_lock_mastery:1221 ERROR: node
> down! 2
>
> Apr 27 17:44:18 ZHJD-VM6 kernel: [ 4231.993273]
> (dlm_reco_thread,14227,3):dlm_wait_for_lock_mastery:1038 ERROR: status
> = -11
>
> Apr 27 17:44:18 ZHJD-VM6 kernel: [ 4231.993283]
> (dlm_reco_thread,14227,3):dlm_do_master_requery:1656 ERROR: Error -107
> when sending message 514 (key 0x77c0b1d1) to node 3
>
> Apr 27 17:44:18 ZHJD-VM6 kernel: [ 4231.993291]
> (dlm_reco_thread,14227,3):dlm_pre_master_reco_lockres:2151 ERROR:
> status = -107
>
> Apr 27 17:44:18 ZHJD-VM6 kernel: [ 4231.993876]
> (dlm_reco_thread,13736,2):dlm_do_master_request:1332 ERROR: link to 3
> went down!
>
> Apr 27 17:44:18 ZHJD-VM6 kernel: [ 4231.993885]
> (dlm_reco_thread,13736,2):dlm_get_lock_resource:917 ERROR: status = -107
>
> Apr 27 17:44:18 ZHJD-VM6 kernel: [ 4231.993901]
> (dlm_reco_thread,14227,3):dlm_do_master_request:1332 ERROR: link to 3
> went down!
>
> Apr 27 17:44:18 ZHJD-VM6 kernel: [ 4231.993910]
> (dlm_reco_thread,14227,3):dlm_get_lock_resource:917 ERROR: status = -107
>
> Apr 27 17:44:20 ZHJD-VM6 kernel: [ 4234.933965] o2net: Connection to
> node 2013-SRV10 (num 3) at 185.200.1.15:7100 shutdown, state 7
>
> Apr 27 17:44:22 ZHJD-VM6 kernel: [ 4236.214606] o2cb: o2dlm has
> evicted node 1 from domain AB92EF420A5A475ABD6C139B0C7DDD1C
>
> Apr 27 17:44:22 ZHJD-VM6 kernel: [ 4236.214613]
> (kworker/u:2,19288,6):dlm_begin_reco_handler:2728
> AB92EF420A5A475ABD6C139B0C7DDD1C: dead_node previously set to 1, node
> 4 changing it to 1
>
> Apr 27 17:44:22 ZHJD-VM6 kernel: [ 4236.317544] o2dlm: Node 4 (he) is
> the Recovery Master for the dead node 1 in domain
> AB92EF420A5A475ABD6C139B0C7DDD1C
>
> Apr 27 17:44:22 ZHJD-VM6 kernel: [ 4236.317548] o2dlm: End recovery on
> domain AB92EF420A5A475ABD6C139B0C7DDD1C
>
> Apr 27 17:44:22 ZHJD-VM6 kernel: [ 4236.739293] o2cb: o2dlm has
> evicted node 1 from domain AE16636E1B83497A88D6A50178172ECA
>
> Apr 27 17:44:22 ZHJD-VM6 kernel: [ 4236.739300]
> (kworker/u:2,19288,6):dlm_begin_reco_handler:2728
> AE16636E1B83497A88D6A50178172ECA: dead_node previously set to 1, node
> 4 changing it to 1
>
> Apr 27 17:44:22 ZHJD-VM6 kernel: [ 4236.760743] o2cb: o2dlm has
> evicted node 2 from domain AE16636E1B83497A88D6A50178172ECA
>
> Apr 27 17:44:22 ZHJD-VM6 kernel: [ 4236.790445] o2cb: o2dlm has
> evicted node 3 from domain AE16636E1B83497A88D6A50178172ECA
>
> Apr 27 17:44:22 ZHJD-VM6 kernel: [ 4236.833635] o2dlm: Node 255 (he)
> is the Recovery Master for the dead node 255 in domain
> AE16636E1B83497A88D6A50178172ECA
>
> Apr 27 17:44:22 ZHJD-VM6 kernel: [ 4236.833639] o2dlm: End recovery on
> domain AE16636E1B83497A88D6A50178172ECA
>
> Apr 27 17:44:23 ZHJD-VM6 kernel: [ 4237.125780] o2cb: o2dlm has
> evicted node 3 from domain AB92EF420A5A475ABD6C139B0C7DDD1C
>
> Apr 27 17:44:27 ZHJD-VM6 kernel: [ 4241.314898] o2dlm: Begin recovery
> on domain AB92EF420A5A475ABD6C139B0C7DDD1C for node 2
>
> Apr 27 17:44:27 ZHJD-VM6 kernel: [ 4241.319727] o2cb: o2dlm has
> evicted node 2 from domain AB92EF420A5A475ABD6C139B0C7DDD1C
>
> Apr 27 17:44:27 ZHJD-VM6 kernel: [ 4241.319734]
> (kworker/u:3,14232,0):dlm_begin_reco_handler:2728
> AB92EF420A5A475ABD6C139B0C7DDD1C: dead_node previously set to 2, node
> 5 changing it to 2
>
> Apr 27 17:44:27 ZHJD-VM6 kernel: [ 4241.319793] o2dlm: Node 5 (he) is
> the Recovery Master for the dead node 2 in domain
> AB92EF420A5A475ABD6C139B0C7DDD1C
>
> Apr 27 17:44:27 ZHJD-VM6 kernel: [ 4241.319797] o2dlm: End recovery on
> domain AB92EF420A5A475ABD6C139B0C7DDD1C
>
> Apr 27 17:44:27 ZHJD-VM6 kernel: [ 4241.544905] o2cb: o2dlm has
> evicted node 2 from domain AB92EF420A5A475ABD6C139B0C7DDD1C
>
> Apr 27 17:44:27 ZHJD-VM6 kernel: [ 4241.544912]
> (kworker/u:3,14232,0):dlm_begin_reco_handler:2728
> AB92EF420A5A475ABD6C139B0C7DDD1C: dead_node previously set to 2, node
> 7 changing it to 2
>
> Apr 27 17:44:32 ZHJD-VM6 kernel: [ 4246.316421] o2dlm: Begin recovery
> on domain AB92EF420A5A475ABD6C139B0C7DDD1C for node 2
>
> Apr 27 17:44:32 ZHJD-VM6 kernel: [ 4246.316426] o2dlm: Node 7 (he) is
> the Recovery Master for the dead node 2 in domain
> AB92EF420A5A475ABD6C139B0C7DDD1C
>
> Apr 27 17:44:32 ZHJD-VM6 kernel: [ 4246.316429] o2dlm: End recovery on
> domain AB92EF420A5A475ABD6C139B0C7DDD1C
>
> Apr 27 17:44:37 ZHJD-VM6 kernel: [ 4251.313976] o2dlm: Begin recovery
> on domain AB92EF420A5A475ABD6C139B0C7DDD1C for node 2
>
> Apr 27 17:44:37 ZHJD-VM6 kernel: [ 4251.313980] o2dlm: Node 7 (he) is
> the Recovery Master for the dead node 2 in domain
> AB92EF420A5A475ABD6C139B0C7DDD1C
>
> Apr 27 17:44:37 ZHJD-VM6 kernel: [ 4251.313982] o2dlm: End recovery on
> domain AB92EF420A5A475ABD6C139B0C7DDD1C
>
> Apr 27 17:44:42 ZHJD-VM6 kernel: [ 4256.311550] o2dlm: Begin recovery
> on domain AB92EF420A5A475ABD6C139B0C7DDD1C for node 2
>
> Apr 27 17:44:42 ZHJD-VM6 kernel: [ 4256.311555] o2dlm: Node 7 (he) is
> the Recovery Master for the dead node 2 in domain
> AB92EF420A5A475ABD6C139B0C7DDD1C
>
> Apr 27 17:44:42 ZHJD-VM6 kernel: [ 4256.311558] o2dlm: End recovery on
> domain AB92EF420A5A475ABD6C139B0C7DDD1C
>
> Apr 27 17:44:47 ZHJD-VM6 kernel: [ 4261.309120] o2dlm: Begin recovery
> on domain AB92EF420A5A475ABD6C139B0C7DDD1C for node 2
>
> Apr 27 17:44:47 ZHJD-VM6 kernel: [ 4261.309125] o2dlm: Node 7 (he) is
> the Recovery Master for the dead node 2 in domain
> AB92EF420A5A475ABD6C139B0C7DDD1C
>
> Apr 27 17:44:47 ZHJD-VM6 kernel: [ 4261.309128] o2dlm: End recovery on
> domain AB92EF420A5A475ABD6C139B0C7DDD1C
>
> Apr 27 17:44:52 ZHJD-VM6 kernel: [ 4266.306711] o2dlm: Begin recovery
> on domain AB92EF420A5A475ABD6C139B0C7DDD1C for node 2
>
> Apr 27 17:44:52 ZHJD-VM6 kernel: [ 4266.306715] o2dlm: Node 7 (he) is
> the Recovery Master for the dead node 2 in domain
> AB92EF420A5A475ABD6C139B0C7DDD1C
>
> Apr 27 17:44:52 ZHJD-VM6 kernel: [ 4266.306719] o2dlm: End recovery on
> domain AB92EF420A5A475ABD6C139B0C7DDD1C
>
> Apr 28 10:49:45 ZHJD-VM6 kernel: imklog 5.8.6, log source = /proc/kmsg
> started.
>
> Apr 28 10:49:45 ZHJD-VM6 rsyslogd: [origin software="rsyslogd"
> swVersion="5.8.6" x-pid="1313" x-info="http://www.rsyslog.com"] start
>
> Apr 28 10:49:45 ZHJD-VM6 rsyslogd: rsyslogd's groupid changed to 103
>
> Apr 28 10:49:45 ZHJD-VM6 rsyslogd: rsyslogd's userid changed to 101
>
> Apr 28 10:49:45 ZHJD-VM6 rsyslogd-2039: Could not open output pipe
> '/dev/xconsole' [try http://www.rsyslog.com/e/2039 ]
>
> Apr 28 10:49:45 ZHJD-VM6 kernel: [ 0.000000] Initializing cgroup
> subsys cpuset
>
> Apr 28 10:49:45 ZHJD-VM6 kernel: [ 0.000000] Initializing cgroup
> subsys cpu
>