[Tmem-devel] [RFC] Transcendent Memory ("tmem"): a new approach to physical memory management

Anthony Liguori anthony at codemonkey.ws
Thu Jan 8 15:16:55 PST 2009


Dan Magenheimer wrote:
>> from Xen's.  A guest balloons memory effectively by issuing a hypercall
>> that tells the VMM that the guest no longer cares about the memory's
>> contents.  A guest is free to use that memory whenever it wants, but
>> the page will be all zeros.
>>     
>
> If the kernel is absolutely certain that a page will not be used
> again, certainly it doesn't care, but then it wouldn't put the
> page into tmem either.  However a kernel that has been aggressively
> reduced in memory size (via ballooning or equivalent) will
> usually only regretfully evict a page and will often have
> to go fetch that page from disk again (a "false negative
> eviction").
>   

The s390 ballooner used a shrinker callback (or perhaps an OOM callback, 
but the theory is the same) to automatically unballoon memory under 
shrinker pressure.  We don't do that in KVM today, but it's a trivial 
change to add.  Note that there are two concepts in KVM: the "ballooned" 
size of the guest and the RSS size.  The RSS size is the actual memory 
allocation.  When you increase the "ballooned" size, the RSS size 
doesn't automatically increase until the guest actually uses the memory.

Likewise, we can also do RSS limiting, which forcibly caps how large the 
RSS size can get.  If the guest exceeds the RSS limit, it will swap 
regardless of host memory pressure.

> On the reverse side, how quickly can KVM feed more memory to
> a needy VM, e.g. when it is on the verge of swapping?  (Or does
> only the host swap?)  Tmem handles this very efficiently.
>   

The guest decides when it wants more memory and it can get it instantly 
by just touching it.  The host may end up swapping the guest if overall 
memory pressure is too high.

The shrinker callback for the balloon driver basically gives you these 
semantics: don't swap within the guest until the balloon driver's 
allocation is exhausted.  If the host doesn't have enough memory to back 
all of that ballooned memory, it is going to start swapping the guest.

> After I post the Linux patch and you've had a chance to look at
> some of the other materials, please let me know if you still
> feel KVM won't benefit.
>   

Yup, I will.

Regards,

Anthony Liguori

> Looking forward to more discussion...
>
> Thanks,
> Dan
>
>   
>> -----Original Message-----
>> From: Anthony Liguori [mailto:anthony at codemonkey.ws]
>> Sent: Thursday, January 08, 2009 3:03 PM
>> To: Dan Magenheimer
>> Subject: Re: [RFC] Transcendent Memory ("tmem"): a new approach to
>> physical memory management
>>
>>
>> Hi Dan,
>>
>> Dan Magenheimer wrote:
>>     
>>> At last year's Xen North America Summit in Boston, I gave a talk
>>> about memory overcommitment in Xen.  I showed that the basic
>>> mechanisms for moving memory between domains were already present
>>> in Xen and that, with a few scripts, it was possible to roughly
>>> load-balance memory between domains.  During this effort, I
>>> discovered that "ballooning" had a lot of weaknesses, even
>>> though it is the foundation for "time-sharing" physical
>>> memory in every major virtualization system existing today.
>>> These weaknesses have led in many cases to unacceptable performance
>>> issues when VMs are densely packed; as a result, memory is becoming
>>> the bottleneck in many deployments of virtualization.
>>>
>>> Transcendent Memory -- or "tmem" for short -- is phase II of that
>>> work and it essentially augments ballooning and "fixes" many of
>>> its weaknesses.  It requires paravirtualization, but the changes
>>> (to Linux) are fairly small and minimally-invasive.  The changes
>>> to Xen are larger, but also fairly non-invasive.  (No shell scripts
>>> this time! :-)  The concept and code is modular and may easily
>>> port to Windows, as well as KVM.  It may even be useful in
>>> containers and in a native physical operating system. And,
>>> yes, it is machine-independent so should be easily portable
>>> to ia64 too!
>>>
>>>       
>> I didn't want to pollute xen-devel with this since it's totally KVM
>> specific, but I took a look at the info you have and believe that tmem
>> is not really applicable to KVM.
>>
>> In KVM, guests don't "own" their memory.  The model is the same for
>> s390 too.  Since the guest never knows the real physical page, the VMM
>> can remove memory from the guest at any time, as long as it can
>> guarantee that it can recreate the page when the guest needs it.
>>
>> Right now, KVM feeds information from the guest's shadow paging to the
>> host's mm LRU, which allows the Linux mm to effectively determine
>> which portions of memory are not in use and swap them to disk.
>>
>> Additionally, our "ballooning" mechanism behaves totally differently
>> from Xen's.  A guest balloons memory effectively by issuing a hypercall
>> that tells the VMM that the guest no longer cares about the memory's
>> contents.  A guest is free to use that memory whenever it wants, but
>> the page will be all zeros.  In the VMM, we blow away the page and
>> replace it with a CoW reference to the zero page.
>>
>> The s390 guys took it a lot further.  They actually updated the mm to
>> give the VMM a ton of information about guest pages, including which
>> pages were resident in memory but also present on disk.  That means
>> the VMM could blow away such a page and deliver a special fault to the
>> guest when it tried to access it, which would then result in the guest
>> pulling it in again from disk.
>>
>> Unfortunately, the CMM2 work was deemed too invasive, so it was
>> abandoned.  However, if you're not familiar with it, you should take a
>> look at it.
>>
>> Regards,
>>
>> Anthony Liguori
>>
>>
>>     



