This is great, but lately I just can't keep up with the virtualisation changes. I got really excited about Google's novm - that's dead now. kvmtool never got included in the kernel, AFAIK. qboot will be great, but it looks like it doesn't really support the light IO solutions (like novm's file, rather than block, access). Now there's Clear Containers, which is actually KVM and looks like docker-meets-qboot. And many others in between.
This doesn't seem right, honestly. Half of the projects have the same approach or goals (quick boot times and no legacy support). So why do they die and get reinvented every single time?
Something is wrong...
stefanha
> it doesn't really support the light IO solutions (like novm's file, rather than block access)
That is incorrect. novm just implements the virtio-9p device that QEMU has supported for years.
Clear Linux does add something new: a pmem (NVDIMM) device that bypasses the guest kernel's page cache. This involves host kernel, guest kernel, and kvmtool changes.
The advantage of pmem is that short-lived VMs can directly access data from the host instead of copying in. But this feature needs to be added to QEMU/KVM anyway to support new persistent memory hardware (see http://pmem.io/) so it won't be unique for long.
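For reference, the guest side of a virtio-9p share is just an ordinary 9p mount, same as with QEMU. A minimal sketch in C using mount(2), where the "hostshare" tag and the /mnt mount point are made-up names for illustration:

    #include <stdio.h>
    #include <sys/mount.h>

    int main(void)
    {
        /* "hostshare" is the mount_tag the host configured for the
         * virtio-9p device (hypothetical name). Needs root in the guest. */
        if (mount("hostshare", "/mnt", "9p", 0, "trans=virtio") != 0) {
            perror("mount 9p");
            return 1;
        }
        printf("host files visible under /mnt\n");
        return 0;
    }

The shell equivalent is just "mount -t 9p -o trans=virtio hostshare /mnt".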
sporkenfang
> The advantage of pmem is that short-lived VMs can directly access data from the host instead of copying in.
Is it just me or does this sound really, really exploitable from the VM-to-host direction? I'm hoping there's some way to safeguard such a process.
antocv
Also virtio-9p is slow as hell.
throwaway7767
> Also virtio-9p is slow as hell.
I've heard that from others, but I wonder why that is. I would expect a file-level abstraction to be faster due to fewer I/O round trips for transfers and the removal of the block abstraction. Is it just an inefficient protocol, or are there some inherent bottlenecks in a file-level sharing protocol that I'm missing?
How slow is it really? Slower than NFS for example?
dezgeg
If you want both the host and the VM to have a coherent view of the filesystem, the VM can't really do efficient caching.
throwaway7767
> If you want both the host and the VM to have a coherent view of the filesystem, the VM can't really do efficient caching.
That makes sense, though it's surprising then that you can't explicitly opt into caching when that's the bottleneck and a coherent view from the host side isn't a necessity for a given workload.
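For what it's worth, the Linux 9p client does expose that knob at mount time: cache=loose trades the coherent host view for guest-side caching, while the default keeps the views coherent but slow. A sketch, reusing the hypothetical "hostshare" tag from the mount example above:

    #include <stdio.h>
    #include <sys/mount.h>

    int main(void)
    {
        /* cache=loose: the guest may cache data and metadata, so host-side
         * changes can go unseen for a while; fine for read-mostly shares. */
        if (mount("hostshare", "/mnt", "9p", 0,
                  "trans=virtio,cache=loose") != 0) {
            perror("mount 9p");
            return 1;
        }
        return 0;
    }

Whether that actually closes the gap to NFS presumably depends on the workload.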
antocv
It was slower than NFS for me.
I blame it on a poor implementation; there must be a bug somewhere, but I don't have the skills to find out.
Xen started with HVM and PV, and has since evolved HVM toward PV by removing legacy support and software emulation; it has now settled (for now?) on doing everything the PV way except where hardware virtualization assistance is faster on modern hardware. Some of this shifting has been due to changes in hardware capabilities, and some of it has been due to earlier efforts being built on an incomplete understanding of which techniques are faster.
signa11
Since you brought up Xen, I have an honest question: why would you consider using Xen in the presence of alternatives like VirtualBox/VMware? More importantly, in say 2-3 years, wouldn't something like this edge them out?
4ad
VirtualBox is irrelevant.
VMware is closed source. The real Xen alternative is KVM. KVM is better than Xen in pretty much every way. There's a very big cost for big Xen shops to switch to KVM, but if you're not tied to Xen I can't imagine why you'd use it when KVM is better in every way (kernel integration, tooling, performance, etc).
colin_mccabe
I think the main argument in favor of Xen was that it would have a smaller attack surface for hackers than KVM. After all, Xen is a hypervisor-based solution, whereas with KVM you are running the full Linux kernel plus qemu as your host.
With that being said, there have been exploits in the Xen hypervisor. As more hardware integration gets added, dom0 starts to look a lot more like a traditional kernel.
Personally, I use kvm for all my virtual machines, since I don't want to run everything under dom0.
Except for every single "prepackaged developer's workstation" solution I've seen so far. Seriously, it works on all systems more or less the same, so I see it used all over the place.
kvmtool may not be in the mainline kernel source tree, but that doesn't mean it's dead or that it's failed. It had no need to be in the kernel source tree in the first place -- it's a standalone userspace tool that uses the publicly documented and stable KVM syscalls and ioctls, exactly like QEMU. QEMU isn't in the kernel source tree either, and nobody's ever suggested that putting it there would be a good idea...
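To make that concrete: the stable interface in question is a handful of ioctls on /dev/kvm, and any userspace program can drive them, which is exactly what QEMU and kvmtool do. A minimal sketch that checks the API version and creates an empty VM:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <unistd.h>
    #include <linux/kvm.h>

    int main(void)
    {
        int kvm = open("/dev/kvm", O_RDWR | O_CLOEXEC);
        if (kvm < 0) { perror("open /dev/kvm"); return 1; }

        /* The KVM API version has been pinned at 12 for years; every
         * userspace VMM starts by checking it like this. */
        printf("KVM API version: %d\n", ioctl(kvm, KVM_GET_API_VERSION, 0));

        /* An empty VM; the returned fd takes further ioctls to add
         * memory regions, vCPUs, and so on. */
        int vmfd = ioctl(kvm, KVM_CREATE_VM, 0);
        if (vmfd < 0) { perror("KVM_CREATE_VM"); return 1; }

        close(vmfd);
        close(kvm);
        return 0;
    }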
justincormack
I wish its repo was not still a fork of the Linux tree, though. It is very hard to use. It made sense when it was going to be merged, but not now.
pm215
Putting it in a standalone repo has been proposed recently:
https://lkml.org/lkml/2015/2/13/116
and the idea doesn't seem to have been rejected, at least.
justincormack
Ah, thanks. That repo seems to build, anyway; I'll have a look.
krakensden
"Google"'s novm was pretty clearly labeled as a personal experiment, by a person who worked at Google. Clear Containers uses kvmtool, and iirc is the same group of developers.
JoshTriplett
> Clear Containers uses kvmtool, and iirc is the same group of developers.
Completely different developer, at a different company.
krakensden
thanks
sengork
Keeping up with virtualisation before the late 1990s was quite simple: mainframes.
zobzu
welcome to earth
viraptor
Thanks for the sarcastic comment. I know there's a lot of duplication and NIH in open source, but I've never seen so much unexplained change anywhere else. Even in the landscape of web browsers, where there seems to be a new fork every few weeks, at least people post explanations of why they did it.
bronson
It's not just open source. How many proprietary virtualization solutions can you think of?
Isn't it good that so many smart people are trying to solve these problems in so many different ways?
teacup50
Only if the ways are actually different, and the problems are the right problems to be solving.
For example: containerization is a bit like taking a boat with a hole in its hull, and building a new boat to carry the old boat.
When instead, the real problem is that people's applications should be able to run in-place without having to take control of the entire operating system.
vidarh
> When instead, the real problem is that people's applications should be able to run in-place without having to take control of the entire operating system.
We have plenty of mechanisms - including cgroups - that allow you to achieve that.
What containerisation solutions actually provide is a convenient build and packaging workflow with a decent level of isolation, including preventing state from polluting the surrounding system.
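As a sketch of how low-level those mechanisms are: the cgroup interface is just a filesystem. Assuming a cgroup v1 hierarchy mounted at /sys/fs/cgroup and a made-up group name "demo", capping a process's memory looks like this:

    #include <stdio.h>
    #include <sys/stat.h>
    #include <unistd.h>

    /* Tiny helper: write a numeric value into a cgroup control file. */
    static void cg_write(const char *path, long value)
    {
        FILE *f = fopen(path, "w");
        if (!f) { perror(path); return; }
        fprintf(f, "%ld", value);
        fclose(f);
    }

    int main(void)
    {
        mkdir("/sys/fs/cgroup/memory/demo", 0755);
        /* Cap the group at 64 MiB... */
        cg_write("/sys/fs/cgroup/memory/demo/memory.limit_in_bytes",
                 64L * 1024 * 1024);
        /* ...and move ourselves into it. */
        cg_write("/sys/fs/cgroup/memory/demo/cgroup.procs", getpid());
        return 0;
    }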
The biggest problem is not lack of isolation mechanisms, but that most developers have no clue they even exist.
Try to get the average Linux developer to tell you what seccomp is, for example, and if they know what it is, try to get them to tell you how to use it [1]. There's plenty of room for innovation here, and plenty of room for more different solutions, but the biggest problem they will need to solve is how to make these mechanisms easy enough to use.
[1] An example here: http://blog.viraptor.info/post/seccomp-sandboxes-and-memcach...
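To illustrate both how little code the simplest variant takes and why it trips people up: seccomp's strict mode is a single prctl() call, after which only read, write, exit and sigreturn are allowed. A minimal sketch:

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/prctl.h>
    #include <sys/syscall.h>
    #include <linux/seccomp.h>

    int main(void)
    {
        if (prctl(PR_SET_SECCOMP, SECCOMP_MODE_STRICT) != 0) {
            perror("prctl");   /* seccomp not active yet, stdio is safe */
            return 1;
        }

        /* From here on, any syscall other than read/write/exit/sigreturn
         * kills the process with SIGKILL. */
        const char msg[] = "still alive inside the sandbox\n";
        write(STDOUT_FILENO, msg, sizeof(msg) - 1);

        /* Classic gotcha: glibc's _exit() uses exit_group(), which strict
         * mode forbids, so make the raw exit syscall ourselves. */
        syscall(SYS_exit, 0);
    }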
There most likely are explanations. On mailing lists. Where the developers are. Because absolutely none of this is in any way end-user-oriented software. Nobody takes the time to package them up and summarize them like Linux Weekly News does for Linux kernel comings and goings, because there's the distinct possibility that absolutely nobody would care.
ghshephard
New industries (EC2 and friends) in hosting/cloud services were created with the advent of convenient virtualization technologies.
I wonder what types of services/industries will be created when you can reliably spin up a hosted instance in 40-60 ms.
vidarh
I don't want to spin up instances. I want to spin up functions.
ghshephard
Essentially, that's what a call out to Amazon's varied AWS services is, though, right?
vidarh
Except that means you're calling out to fixed functions, Lambda aside; and the overheads with Lambda are still orders of magnitude above what you'd want for it to be practical to just decompose your app into separately run functions like that.
To be fair, making a system like that efficient is an unsolved problem even in far more tightly integrated supercomputer systems - IO quickly becomes a massive bottleneck and current hardware is CPU rich and IO poor.