[Oraclevm-errata] OVMBA-2015-0052 Oracle VM 3.3 xen bug fix update
Errata Announcements for Oracle VM
oraclevm-errata at oss.oracle.com
Mon Apr 20 10:00:10 PDT 2015
Oracle VM Bug Fix Advisory OVMBA-2015-0052
The following updated rpms for Oracle VM 3.3 have been uploaded to the
Unbreakable Linux Network:
x86_64:
xen-4.3.0-55.el6.22.18.x86_64.rpm
xen-tools-4.3.0-55.el6.22.18.x86_64.rpm
SRPMS:
http://oss.oracle.com/oraclevm/server/3.3/SRPMS-updates/xen-4.3.0-55.el6.22.18.src.rpm
Description of changes:
[4.3.0-55.el6.22.18]
- x86/paging: make log-dirty operations preemptible
Both the freeing and the inspection of the bitmap get done in (nested)
loops which - besides having a rather high iteration count in general,
albeit that would be covered by XSA-77 - have the number of non-trivial
iterations they need to perform (indirectly) controllable by both the
guest they are for and any domain controlling the guest (including the
one running qemu for it).
Note that the tying of the continuations to the invoking domain (which
previously [wrongly] used the invoking vCPU instead) implies that the
tools requesting such operations have to make sure they don't issue
multiple similar operations in parallel.
Note further that this breaks supervisor-mode kernel assumptions in
hypercall_create_continuation() (where regs->eip gets rewound to the
current hypercall stub beginning), but otoh
hypercall_cancel_continuation() doesn't work in that mode either.
Perhaps time to rip out all the remains of that feature?
This is part of CVE-2014-5146 / XSA-97.
Signed-off-by: Jan Beulich <jbeulich at suse.com>
Reviewed-by: Tim Deegan <tim at xen.org>
Tested-by: Andrew Cooper <andrew.cooper3 at citrix.com>
master commit: 070493dfd2788e061b53f074b7ba97507fbcbf65
master date: 2014-10-06 11:22:04 +0200
Port of Xen stable-4.3 425e9039b6532a8c5884e7fe4b6676f6cd493770
"x86/paging: make log-dirty operations preemptible"
Conflicts:
xen/common/domain.c
Acked-by: Chuck Anderson <chuck.anderson at oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk at oracle.com> [bug
20759718] {CVE-2014-5146}
[4.3.0-55.el6.22.17]
- Revert "x86/paging: make log-dirty operations preemptible"
Both the freeing and the inspection of the bitmap get done in (nested)
loops which - besides having a rather high iteration count in general,
albeit that would be covered by XSA-77 - have the number of non-trivial
iterations they need to perform (indirectly) controllable by both the
guest they are for and any domain controlling the guest (including the
one running qemu for it).
This is XSA-97.
Signed-off-by: Jan Beulich <jbeulich at suse.com>
Reviewed-by: Tim Deegan <tim at xen.org>
Conflicts:
xen/common/domain.c
Acked-by: Chuck Anderson <chuck.anderson at oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk at oracle.com> [bug
20759718] {CVE-2014-5146}
[4.3.0-55.el6.22.16]
- Xen: Fix migration issue from ovm3.2.8 to ovm3.3.2
This patch is a newer fix for the PV-HVM migration failure from
Xen4.1 (ovm3.2.x) to Xen4.3 (ovm3.3.x); this issue exists in
upstream Xen too. The original fix causes problems for released OVM
versions when a user wants to do live migration with no downtime,
since that fix requires rebooting the migration source server too.
This patch keeps the xenstore event-channel allocation mechanism of
Xen4.3 the same as the one in Xen4.1, so migration works from
Xen4.1 to later Xen with no need to reboot the migration source server.
The patch that introduced this migration issue is:
http://lists.xen.org/archives/html/xen-devel/2011-11/msg01046.html
Signed-off-by: Annie Li <annie.li at oracle.com>
Acked-by: Adnan Misherfi <adnan.misherfi at oracle.com> [bug 19517860]
[4.3.0-55.el6.22.15]
- switch internal hypercall restart indication from -EAGAIN to -ERESTART
-EAGAIN being a return value we want to return to the actual caller in
a couple of cases makes this unsuitable for restart indication, and x86
already developed two cases where -EAGAIN could not be returned as
intended due to this (which is being fixed here at once).
Signed-off-by: Jan Beulich <jbeulich at suse.com>
Acked-by: Ian Campbell <ian.campbell at citrix.com>
Acked-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan at amd.com>
Reviewed-by: Tim Deegan <tim at xen.org>
(cherry-pick from f5118cae0a7f7748c6f08f557e2cfbbae686434a)
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk at oracle.com>
Conflicts:
A LOT
[There are a lot of changes in this commit. We only care about the
one in the domain destruction path. We need the value -EAGAIN to be
passed to the toolstack so that it will retry the destruction; any
other value (-ERESTART) would make it stop. Unlike some of the other
backports, here we convert -ERESTART to -EAGAIN only.]
Acked-by: Chuck Anderson <chuck.anderson at oracle.com>
Reviewed-by: John Haxby <john.haxby at oracle.com> [bug 20664695]
[4.3.0-55.el6.22.14]
- rc/xendomains: 'stop' - also take care of stuck guests.
When we are done shutting down the guests (xm shutdown --all),
they are at that point not running at all. They might still have
QEMU or backend drivers set up due to the asynchronous nature
of the 'shutdown' process. As such, doing a 'destroy' on all
the guests will assure us that the backend drivers and QEMU
are indeed stopped.
The mechanism by which 'shutdown' works is quite complex. There
are three actors at play:
a) xm client (Which connects to the XML RPC),
b) Xend Xenstore watch thread,
c) XML RPC server thread
The way shutdown starts is:
xm client              | XML RPC                    | watch thread
shutdown.py            |                            |
- server....shutdown --|--> XenDomainInfo:shutdown  |
                       |    Sets "control/shutdown" |
                       |    calls xc.domain_shutdown|
                       |    returns                 |
- loops calling:       |                            |
  domains_with_state --|--> XendDomain:list_names   |
  gets active          |                            |
  and inactive         |                            | watchMain
  list                 |                            |  _on_domains_changed
                       |                            |   - _refresh
                       |                            |     -> _refreshTxn
                       |                            |        -> update [sets to
                       |                            |           DOM_STATE_SHUTDOWN]
                       |                            |     -> refreshShutdown
                       |                            |        [spawns a new thread
                       |                            |         calling _maybeRestart]
                       |                            | [_maybeRestart thread]:
                       |                            |   destroy
                       |                            |   [sets it to DOM_STATE_HALTED]
                       |                            |   - cleanupDomain
                       |                            |     - _releaseDevices
                       |                            |     - ..
Four threads total.
There is a race between 'watchMain' being executed and
'domains_with_state' calling 'list_names'. Guests that are in
DOM_STATE_UNKNOWN or DOM_STATE_PAUSED might not be updated to
DOM_STATE_SHUTDOWN, as list_names can be called _before_ watchMain
triggers. There is a lock acquisition to call 'refresh' in
list_names - but if it fails, it will just use the stale list.
As such the process works great for guests that are in STATE_SHUTDOWN,
STATE_HALT, or STATE_RUNNING - which 'domains_with_state' will present
to the shutdown process.
For the other states (the more troublesome ones) we might have guests
still lying around.
As such this patch calls 'xm destroy' on all those remaining guests
to do cleanup.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk at oracle.com>
Acked-by: Chuck Anderson <chuck.anderson at oracle.com>
Reviewed-by: John Haxby <john.haxby at oracle.com> [bug 20663454]
[4.3.0-55.el6.22.13]
- xend: Fix race between shutdown and cleanup.
When we invoke 'xm shutdown --wait --all' we will exit the moment
the guest has stopped executing. That is when xcinfo returns
shutdown=1. However that does not mean that all the infrastructure
around the guest has been torn down - QEMU can be still running,
Netback and Blkback as well. In the past the time between
the shutdown and qemu being disposed of was quick - however
the race was still present there.
With our usage of PCIe passthrough we MUST unbind those devices
from a guest before we can continue on with the reboot of
the system. That is due to the complex interaction SR-IOV
devices have between VFs and PFs - you cannot unload the PF driver
before the VF drivers have been unbound from the guest.
If you try to reboot the machine at this point the PF driver
will not unload.
The VF drivers are bound to Xen pciback - and they are unbound
when QEMU is stopped and XenStore keys are torn down - which
is done _after_ the 'shutdown' xcinfo is set (in the cleanup
stage). Worse, the Xen blkback is still active - which means
we cannot unmount the storage until said cleanup has finished.
But as mentioned - 'xm shutdown --wait --all' would happily
exit before the cleanup finished and the shutdown (or reboot)
of the initial domain would continue on. It would eventually
get wedged when trying to unmount the storage which still
had a refcount from Xen block driver - which was not cleaned up
as Xend was killed earlier.
This patch solves this by delaying 'xm shutdown --wait --all'
to wait until the guest has transitioned from RUNNING ->
SHUTDOWN -> HALTED stage. The SHUTDOWN means it has ceased
to execute. The HALTED is that the cleanup is being performed.
We will cycle through all of the guests in that state until
they have moved out of those states (removed completely from
the system).
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk at oracle.com>
Acked-by: Chuck Anderson <chuck.anderson at oracle.com>
Reviewed-by: John Haxby <john.haxby at oracle.com> [bug 20661802]