[Ocfs2-users] Reservation conflicts
Sunil Mushran
sunil.mushran at oracle.com
Wed Dec 15 10:17:11 PST 2010
Ideally the SCSI reservation error should be trapped by the hypervisor/mgmt
domain and should not bubble up to the guest. That is, if VMFS is doing the
reservation. Have you looked into the logs on all machines? See if there
is a way to get VMFS to log that info.
As far as RDMs go, that's how I believe people use it. But you'll have to
get confirmation from actual VMware users.
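
On the 60 secs question from earlier in the thread: a minimal sketch, assuming syslog lines in the same format as the ones you posted, for checking whether the reservation conflicts ever run for 60 secs at a stretch. The function names, the 5 second max_gap used to decide that two conflicts belong to the same stretch, and the hard-coded year are all assumptions for illustration, not anything o2hb itself uses.

```python
import re
from datetime import datetime, timedelta

# Matches the syslog prefix of lines such as
# "Dec 14 07:37:52 mdcvmsmes02 kernel: [351952.113847] sd 1:0:0:0: reservation conflict"
CONFLICT_RE = re.compile(
    r"([A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:.*reservation conflict"
)


def conflict_times(lines, year=2010):
    """Return the sorted timestamps of all 'reservation conflict' lines.

    Syslog omits the year, so it is passed in (assumption: the log covers
    a single calendar year). Parsing %b assumes an English/C locale.
    """
    times = []
    for line in lines:
        m = CONFLICT_RE.search(line)
        if m:
            stamp = f"{year} {m.group(1)}"
            times.append(datetime.strptime(stamp, "%Y %b %d %H:%M:%S"))
    return sorted(times)


def longest_stretch(times, max_gap=timedelta(seconds=5)):
    """Duration of the longest run of conflicts spaced at most max_gap apart.

    max_gap is an assumed grouping threshold, chosen only for this sketch.
    """
    longest = timedelta(0)
    if not times:
        return longest
    start = prev = times[0]
    for t in times[1:]:
        if t - prev > max_gap:
            start = t  # gap too large: a new stretch begins here
        prev = t
        longest = max(longest, prev - start)
    return longest


# A few lines taken from the log posted in this thread.
SAMPLE = [
    "Dec 14 07:37:52 mdcvmsmes02 kernel: [351952.113847] sd 1:0:0:0: reservation conflict",
    "Dec 14 07:37:52 mdcvmsmes02 kernel: [351952.113868] end_request: I/O error, dev sdb, sector 1735",
    "Dec 14 07:51:01 mdcvmsmes02 kernel: [352762.233764] sd 1:0:0:0: reservation conflict",
    "Dec 14 07:51:01 mdcvmsmes02 kernel: [352762.234789] sd 1:0:0:0: reservation conflict",
]

if __name__ == "__main__":
    stretch = longest_stretch(conflict_times(SAMPLE))
    verdict = "sustained" if stretch >= timedelta(seconds=60) else "isolated"
    print(f"longest stretch: {stretch.total_seconds():.0f}s ({verdict})")
```

To run it on a real log, pass an open file instead of SAMPLE, e.g. conflict_times(open("/var/log/messages")), where the path is whatever your distribution uses. Run it on every node, since the conflicts are logged per guest.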
On 12/15/2010 09:59 AM, brad hancock wrote:
> We have never used RDM in the past, for backup reasons among others, and so that VM admins do not have to deal with the SAN admins. Do you think this would resolve the issue?
>
>
> On Tue, Dec 14, 2010 at 3:25 PM, Sunil Mushran <sunil.mushran at oracle.com <mailto:sunil.mushran at oracle.com>> wrote:
>
> I meant whether it repeats for 60 secs at a stretch. If not, as seems to be
> the case, then the messages should be only annoying.
>
> VMFS uses SCSI Reservation to perform disk based locking. See if they have
> some logging in ESX that shows when VMFS performs a reserve/unreserve
> on a SCSI device. You'll have to look at the logs of all nodes. As in, that log
> will be on a different node than the one that got this error.
>
> BTW, any reason you are not using RDM?
>
>
> On 12/14/2010 12:51 PM, brad hancock wrote:
>> The issue does repeat.
>>
>> I looked through the vsphere 4.1, and the host logs and didn't see anything weird that corresponds with these times.
>>
>> What is a reservation conflict? Can this issue cause the nodes to see different data?
>>
>>
>> Dec 14 07:37:52 mdcvmsmes02 kernel: [351952.113847] sd 1:0:0:0: reservation conflict
>> Dec 14 07:37:52 mdcvmsmes02 kernel: [351952.113859] sd 1:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
>> Dec 14 07:37:52 mdcvmsmes02 kernel: [351952.113868] end_request: I/O error, dev sdb, sector 1735
>> Dec 14 07:37:52 mdcvmsmes02 kernel: [351952.114134] (0,0):o2hb_bio_end_io:225 ERROR: IO Error -5
>> Dec 14 07:37:52 mdcvmsmes02 kernel: [351952.114379] (1882,0):o2hb_do_disk_heartbeat:753 ERROR: status = -5
>> Dec 14 07:51:01 mdcvmsmes02 kernel: [352762.233764] sd 1:0:0:0: reservation conflict
>> Dec 14 07:51:01 mdcvmsmes02 kernel: [352762.233775] sd 1:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
>> Dec 14 07:51:01 mdcvmsmes02 kernel: [352762.233855] end_request: I/O error, dev sdb, sector 1735
>> Dec 14 07:51:01 mdcvmsmes02 kernel: [352762.234112] (0,0):o2hb_bio_end_io:225 ERROR: IO Error -5
>> Dec 14 07:51:01 mdcvmsmes02 kernel: [352762.234365] (1882,0):o2hb_do_disk_heartbeat:753 ERROR: status = -5
>> Dec 14 07:51:01 mdcvmsmes02 kernel: [352762.234789] sd 1:0:0:0: reservation conflict
>> Dec 14 07:51:01 mdcvmsmes02 kernel: [352762.234793] sd 1:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
>> Dec 14 07:51:01 mdcvmsmes02 kernel: [352762.234796] end_request: I/O error, dev sdb, sector 1735
>> Dec 14 07:51:01 mdcvmsmes02 kernel: [352762.235033] (0,0):o2hb_bio_end_io:225 ERROR: IO Error -5
>> Dec 14 07:51:01 mdcvmsmes02 kernel: [352762.235273] (1882,0):o2hb_do_disk_heartbeat:753 ERROR: status = -5
>> Dec 14 09:23:15 mdcvmsmes02 kernel: [358423.734356] sd 1:0:0:0: reservation conflict
>> Dec 14 09:23:15 mdcvmsmes02 kernel: [358423.734366] sd 1:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
>> Dec 14 09:23:15 mdcvmsmes02 kernel: [358423.734370] end_request: I/O error, dev sdb, sector 1735
>> Dec 14 09:23:15 mdcvmsmes02 kernel: [358423.734620] (0,0):o2hb_bio_end_io:225 ERROR: IO Error -5
>> Dec 14 09:23:15 mdcvmsmes02 kernel: [358423.734882] (1882,0):o2hb_do_disk_heartbeat:753 ERROR: status = -5
>> Dec 14 10:25:27 mdcvmsmes02 kernel: [362254.184302] sd 1:0:0:0: reservation conflict
>> Dec 14 10:25:27 mdcvmsmes02 kernel: [362254.184312] sd 1:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
>> Dec 14 10:25:27 mdcvmsmes02 kernel: [362254.184316] end_request: I/O error, dev sdb, sector 1735
>> Dec 14 10:25:27 mdcvmsmes02 kernel: [362254.184565] (0,0):o2hb_bio_end_io:225 ERROR: IO Error -5
>> Dec 14 10:25:27 mdcvmsmes02 kernel: [362254.184809] (1882,0):o2hb_do_disk_heartbeat:753 ERROR: status = -5
>> Dec 14 10:25:27 mdcvmsmes02 kernel: [362254.188045] sd 1:0:0:0: reservation conflict
>> Dec 14 10:25:27 mdcvmsmes02 kernel: [362254.188045] sd 1:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
>> Dec 14 10:25:27 mdcvmsmes02 kernel: [362254.188045] end_request: I/O error, dev sdb, sector 1735
>> Dec 14 10:25:27 mdcvmsmes02 kernel: [362254.188045] (0,0):o2hb_bio_end_io:225 ERROR: IO Error -5
>> Dec 14 10:25:27 mdcvmsmes02 kernel: [362254.188045] (1882,0):o2hb_do_disk_heartbeat:753 ERROR: status = -5
>> Dec 14 10:33:08 mdcvmsmes02 kernel: [362727.621062] sd 1:0:0:0: reservation conflict
>> Dec 14 10:33:08 mdcvmsmes02 kernel: [362727.621062] sd 1:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
>> Dec 14 10:33:08 mdcvmsmes02 kernel: [362727.621062] end_request: I/O error, dev sdb, sector 1735
>> Dec 14 10:33:08 mdcvmsmes02 kernel: [362727.621062] (0,0):o2hb_bio_end_io:225 ERROR: IO Error -5
>> Dec 14 10:33:08 mdcvmsmes02 kernel: [362727.621062] (1882,0):o2hb_do_disk_heartbeat:753 ERROR: status = -5
>> Dec 14 10:33:08 mdcvmsmes02 kernel: [362727.621062] sd 1:0:0:0: reservation conflict
>> Dec 14 10:33:08 mdcvmsmes02 kernel: [362727.621062] sd 1:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
>> Dec 14 10:33:08 mdcvmsmes02 kernel: [362727.621062] end_request: I/O error, dev sdb, sector 1735
>> Dec 14 10:33:08 mdcvmsmes02 kernel: [362727.621062] (0,0):o2hb_bio_end_io:225 ERROR: IO Error -5
>> Dec 14 10:33:08 mdcvmsmes02 kernel: [362727.621062] (1882,0):o2hb_do_disk_heartbeat:753 ERROR: status = -5
>>
>> On Tue, Dec 14, 2010 at 11:38 AM, Sunil Mushran <sunil.mushran at oracle.com <mailto:sunil.mushran at oracle.com>> wrote:
>>
>> sd 1:0:0:0: reservation conflict
>>
>> That's the cause of the error in the guest. You'll have to track the error
>> to ESX's management domain. See the logs.
>>
>> Does this error come repeatedly? This error is only a problem for o2hb
>> if it continues for the next 60 secs. Else it can be ignored.
>>
>>
>> On 12/14/2010 07:20 AM, brad hancock wrote:
>>> The issue is starting to come up again. Both machines are logging the error a couple of minutes apart from each other.
>>>
>>> sd 1:0:0:0: reservation conflict
>>> Dec 13 16:40:07 mdcvmsmes01 kernel: [295051.378262] sd 1:0:0:0: [sdb] Result: hostbyte=DID_OK d
>>> Dec 13 16:40:07 mdcvmsmes01 kernel: [295051.378347] end_request: I/O error, dev sdb, sector 173
>>> Dec 13 16:40:07 mdcvmsmes01 kernel: [295051.378694] (0,1):o2hb_bio_end_io:225 ERROR: IO Error -
>>> Dec 13 16:40:07 mdcvmsmes01 kernel: [295051.379055] (1897,1):o2hb_do_disk_heartbeat:753 ERROR:
>>>
>>> Should I open a bug report? Who with, VMware or Oracle?
>>>
>>>
>>>
>>> On Sun, Dec 12, 2010 at 9:25 AM, brad hancock <braddhancock at gmail.com <mailto:braddhancock at gmail.com>> wrote:
>>>
>>> Kevin,
>>> I modified the VMFS virtual disk to Independent, and I haven't seen the issue since the change Friday morning. I noticed this didn't work for you. I will continue to watch it and let the list know. The issue I saw after several weeks was the data was not in sync. Two nodes saw different data on the same OCFS2 drive.
>>>
>>> We have Vsphere 4.1, and HP EVA 3000 SAN.
>>>
>>> Thanks,
>>>
>>>
>>>
>>> On Sat, Dec 11, 2010 at 10:41 AM, <kevin at utahsysadmin.com <mailto:kevin at utahsysadmin.com>> wrote:
>>>
>>> On Fri, 10 Dec 2010 06:26:06 -0800, ocfs2-users-request at oracle.com <mailto:ocfs2-users-request at oracle.com> wrote:
>>> >
>>> > My setup has the SCSI controller set to Physical so the guest can be on
>>> > different hosts, but I do not have the disk set up as Independent. I am
>>> > going to change that setting in VMware and see if it makes a difference.
>>> >
>>> > > [2037805.922718] end_request: I/O error, dev sdb, sector 1735
>>> > > [2037805.922974] (0,0):o2hb_bio_end_io:225 ERROR: IO Error -5
>>> > > [2037805.923370] (27506,0):o2hb_do_disk_heartbeat:753 ERROR: status = -5
>>>
>>> Brad,
>>>
>>> I have had the same issue for over a year on ESX 3.5 as well as on vSphere
>>> 4.0. I have not tried yet on 4.1. The error occurs when I put the shared
>>> disk on either SATA or FC LUNs on our SAN. It also doesn't matter if the
>>> virtual machines are on the same physical host or not (with independent
>>> disks). The only problem that has come from it is the occasional reboot of
>>> one of the VMs, which for me is tolerable. I keep hoping to upgrade to a
>>> new SAN thinking that might fix it. The vSphere 4.0 release IOPS
>>> capability is higher than the SAN (it's 5 years old) so I didn't think it
>>> was VMware's fault. If you have fairly new hardware, maybe there is a real
>>> bug somewhere. I don't get I/O errors in any of my other implementations
>>> on this SAN. I sent a post like yours to the list when I first built it,
>>> but never opened a bug report with either OCFS or VMware. If you create a
>>> bug report I could add information from my implementation as well. (I
>>> actually have two of these setups and they both have the same errors.)
>>>
>>> Of course, if you find a solution, please post that as well.
>>>
>>> Thanks,
>>> Kevin
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Ocfs2-users mailing list
>>> Ocfs2-users at oss.oracle.com <mailto:Ocfs2-users at oss.oracle.com>
>>> http://oss.oracle.com/mailman/listinfo/ocfs2-users
>>
>>
>
>