[Tmem-devel] tmem and KVM

Fri Jan 16 11:00:47 PST 2009

Previous tmem-devel posts on the topic of tmem+KVM:
http://oss.oracle.com/pipermail/tmem-devel/2009-January/000001.html
http://oss.oracle.com/pipermail/tmem-devel/2009-January/000013.html
http://oss.oracle.com/pipermail/tmem-devel/2009-January/000011.html

Hi Anthony --

Thanks for the reply.

> So the concept of "precache" is that you would store a portion of the
> clean page cache in precache.  The hypervisor could then forcibly
> remove page cache pages.  This is quite similar to what CMM2 does.

Yes.  But the guest has evicted the pages so "forcibly remove"
is a bit of an overstatement.  The guest has already bid them
goodbye.  If you think of precache as a very fast synchronous
disk cache populated only from "above" (i.e. by guest evictions)
as opposed to from "below" (by the disk itself), that's closer.
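
To make that model concrete, here is a minimal sketch in Python.
All names (Precache, put, get, reclaim, read_page) are hypothetical
and chosen for illustration; this is not the actual tmem interface,
and the oldest-first discard policy is only an assumption of the toy.

```python
class Precache:
    """Toy precache: filled only by guest evictions, never from the disk."""

    def __init__(self, capacity):
        self.capacity = capacity   # pages the hypervisor currently spares
        self.pages = {}            # (file, index) -> copy; dict order = age

    def put(self, key, data):
        """Guest evicts a clean page: keep a copy, dropping the oldest if full."""
        if self.capacity <= 0:
            return
        self.pages.pop(key, None)
        while len(self.pages) >= self.capacity:
            del self.pages[next(iter(self.pages))]
        self.pages[key] = bytes(data)   # explicit copy, no mapping tricks

    def get(self, key):
        """Synchronous lookup on a page cache miss; None means not here."""
        return self.pages.get(key)

    def reclaim(self, n):
        """Hypervisor takes back up to n pages, oldest first, without asking."""
        for key in list(self.pages)[:n]:
            del self.pages[key]


def read_page(key, precache, disk):
    """Guest read path: try precache first, fall back to the (slow) disk."""
    data = precache.get(key)
    return data if data is not None else disk[key]
```

Because only clean, already-evicted pages go in, reclaim() can never
lose data: the worst case is that read_page() goes to the disk.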

CMM2 is very similar in spirit, but I confess I've never been
able to fully understand everything in the very complex state
machine CMM2 requires.   Another big difference is that tmem
explicitly uses copying and I think CMM2 does mapping magic.
One can argue the relative benefits but neither is clearly
superior.

> An alternative model would be to give a guest X+Y ram from the start, 
> and for the hypervisor to balloon the guest down to X ram when 
> necessary.

This isn't an alternative; tmem works best if this is
exactly what is done.  And tmem works especially well when
ram has been reduced to a fraction of X, i.e. on a temporarily
idle guest (call it Z).  If the guest suddenly needs a lot
more than Z, without hswap it would need to swap to disk;
with hswap it need not.
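
As a rough sketch of that swap path in Python: the names HSwap and
swap_out are hypothetical, and the sketch assumes that hswap may
refuse a page when the hypervisor has no memory to spare, in which
case the guest falls back to its swap disk as it would have anyway.

```python
class HSwap:
    """Toy hswap: hypervisor memory that accepts swap pages while it has room."""

    def __init__(self, free_pages):
        self.free_pages = free_pages   # hypervisor pages available right now
        self.pages = {}                # swap slot -> copy of the page

    def put(self, slot, data):
        """Try to store a page; False means the hypervisor declined it."""
        if slot not in self.pages:
            if self.free_pages == 0:
                return False
            self.free_pages -= 1
        self.pages[slot] = bytes(data)
        return True

    def get(self, slot):
        """Return the stored page, or None if it was never accepted."""
        return self.pages.get(slot)


def swap_out(slot, data, hswap, disk):
    """Guest swap-out: prefer hswap, otherwise write to the swap disk."""
    if hswap is not None and hswap.put(slot, data):
        return "hswap"
    disk[slot] = bytes(data)
    return "disk"
```

Passing hswap=None models the "without hswap" case, where every
page beyond Z goes to disk.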

> In KVM, memory can always be reclaimed instantaneously so 
> there isn't the same benefit of being able to reclaim Y 
> memory instantly like you do on Xen.

I still quibble with your use of instantaneous.  If I understand
your previous post correctly, this is true only if KVM "can
guarantee that it can recreate the page when the guest needs it."
This means that the container (pageframe, address) can
be instantly available, but the contents of the container
need to be retained in many cases.

> The issue with ballooning is that the guest lacks the ability to
> reclaim the Y memory from the VMM as it needs it.  You can accomplish
> this though by adding a shrinker callback to the balloon driver

That's one issue with ballooning and your proposed solution
is intriguing.  However, tmem solves the more general problem
for >1 guests: With ballooning, guest A lacks the ability to
reclaim the Y memory from guest B... unless/until guest B's
balloon driver has surrendered it.  With tmem, the Y memory
is instantly available to ANY guest that needs it.
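
A toy illustration of the multi-guest point, in Python.  SharedPool
and its methods are hypothetical names, and the oldest-first
discard policy is an assumption made for the sketch, not a
description of tmem's real policy.

```python
class SharedPool:
    """Toy hypervisor-owned pool of Y pages shared by every guest."""

    def __init__(self, total_pages):
        self.total = total_pages
        self.pages = {}   # (guest, key) -> data; dict order = age

    def put(self, guest, key, data):
        """Store a page for a guest, discarding the oldest page of ANY guest."""
        if self.total <= 0:
            return
        self.pages.pop((guest, key), None)
        while len(self.pages) >= self.total:
            del self.pages[next(iter(self.pages))]
        self.pages[(guest, key)] = bytes(data)

    def get(self, guest, key):
        """A guest can only look up pages it stored itself."""
        return self.pages.get((guest, key))

    def used_by(self, guest):
        """Number of pool pages currently holding this guest's data."""
        return sum(1 for g, _ in self.pages if g == guest)
```

If guest B has filled the pool and guest A then starts putting
pages, A's puts succeed immediately at the expense of B's oldest
pages; no balloon driver in B has to run first.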

So am I still misunderstanding, or might tmem potentially
have SOME use for KVM?

Thanks,
Dan

> -----Original Message-----
> From: Anthony Liguori [mailto:anthony at codemonkey.ws]
> Sent: Friday, January 16, 2009 10:27 AM
> To: Dan Magenheimer
> Subject: Re: [RFC] Transcendent Memory ("tmem"): a new approach to
> physical memory management
>
> So the concept of "precache" is that you would store a portion of the
> clean page cache in precache.  The hypervisor could then forcibly
> remove page cache pages.  This is quite similar to what CMM2 does.
> So a guest may have X amount of ram, and then Y amount of precache
> for a total of X+Y memory.  The Y memory can be removed at any time
> if the VMM needs it.
>
> An alternative model would be to give a guest X+Y ram from the start,
> and for the hypervisor to balloon the guest down to X ram when
> necessary.  In KVM, memory can always be reclaimed instantaneously so
> there isn't the same benefit of being able to reclaim Y memory
> instantly like you do on Xen.
>
> The issue with ballooning is that the guest lacks the ability to
> reclaim the Y memory from the VMM as it needs it.  You can accomplish
> this though by adding a shrinker callback to the balloon driver with
> a lower priority than the page cache.  That way, the balloon driver
> will attempt to be shrunk in order to expand the page cache when
> necessary.
> 
> Regards,
> 
> Anthony Liguori