Published on October 21st, 2014 | by Brian Suhr
Hyper-Converged Infrastructure comparison, Nutanix vs SimpliVity vs EVO:RAIL
With the hyper-converged (HC) market gaining momentum in each of the last few years, more and more customers are taking notice. The number of HC vendors and models is increasing quickly to satisfy different customer sizes and use cases. I have seen a lot of confusion in the market about how these offerings compare to each other, so I thought a thorough comparison was needed. It will complement the in-depth reviews of each of these products as we are able to complete them.
- Nutanix review
- SimpliVity review
- EVO:RAIL review
Note: Additional details available for some cells by hovering over text.
I have been keeping this post updated since there are a lot of readers looking for data on SimpliVity vs Nutanix, Nutanix vs VSAN, Nutanix vs EVO:RAIL, and Nutanix vs SimpliVity.
The early days of hyper-converged infrastructure (HC) were all about "our software on our hardware." It was the easiest way to create a predictable experience that the vendors could control. But as offerings mature and new vendors enter the market, things are loosening up. We are seeing Nutanix and SimpliVity strike deals with major server vendors, while VMware has created a product offering and is allowing hardware vendors to certify their hardware into the EVO:RAIL program.
As products mature and these new hardware options become available, I am seeing additional customers take interest. Questions about why a vendor built on a particular server product should fade, with options from Dell, Cisco and others likely to come soon. Even the pickiest hardware snobs will likely soon be able to find an HC solution on a hardware platform that meets their standardized choice.
The discussion of scaling is both a hardware and a platform discussion, and I will cover it in the hardware section here. For scaling you need to consider both ends of the spectrum: what is the minimum number of nodes, and is there a maximum number of nodes? This discussion will vary greatly based on customers and use cases. When considering an HC vendor for a customer, I would like to know whether it will be used just in the data center, as a ROBO (remote office/branch office) option, or both. This could sway my decision a little in one direction.
Don’t be fooled by some vendors having a limit while others claim no theoretical limit on the number of nodes in a storage cluster. In reality, your design will drive how many nodes you require in a storage cluster and whether you need multiple storage clusters. You will need to consider whether you need separation for workload types or for security reasons, and also how to control your failure domain size. An example would be building a single 100-node storage cluster that can sustain two node failures versus building three separate storage clusters that can each sustain two node failures. The second option is likely to be more attractive to most, since the failure domain is smaller.
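To make the failure domain point concrete, here is a minimal Python sketch of the 100-node example. The 34/33/33 split, the `StorageCluster` class and the helper names are my own assumptions for illustration; they do not come from any vendor's sizing tool.

```python
from dataclasses import dataclass


@dataclass
class StorageCluster:
    nodes: int               # number of nodes in this storage cluster
    tolerated_failures: int  # node failures the cluster can sustain


def blast_radius(clusters, total_nodes):
    """Largest share of all nodes put at risk when one cluster exceeds its tolerance."""
    return max(c.nodes for c in clusters) / total_nodes


def guaranteed_tolerance(clusters):
    """Failures survivable even if they all land in the same cluster."""
    return min(c.tolerated_failures for c in clusters)


def best_case_tolerance(clusters):
    """Failures survivable if they happen to spread evenly across clusters."""
    return sum(c.tolerated_failures for c in clusters)


# One big cluster vs. three smaller ones (hypothetical 34/33/33 split).
single = [StorageCluster(nodes=100, tolerated_failures=2)]
split = [
    StorageCluster(nodes=34, tolerated_failures=2),
    StorageCluster(nodes=33, tolerated_failures=2),
    StorageCluster(nodes=33, tolerated_failures=2),
]

for name, design in (("single", single), ("split", split)):
    print(name, blast_radius(design, 100),
          guaranteed_tolerance(design), best_case_tolerance(design))
```

Both designs are only guaranteed to survive two node failures, but in the split design a third failure in one cluster puts roughly a third of the environment at risk instead of all of it.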
If you are looking at the minimum size of the offerings, there are a few things to consider as well. The vendors have taken different approaches here. Both Nutanix and EVO:RAIL require a minimum number of nodes to build a storage cluster, ensuring availability. SimpliVity takes a different approach and allows a single-node install, which sacrifices local HA while still benefiting from the global storage layer for backups and replication if desired.
As for scaling, another important topic is how granularly you can scale as the solution grows. Can nodes be added one at a time, or is there a minimum number of nodes that must be added for each scaling event? Depending on your use case, this may or may not matter in the decision.
When it comes to the hypervisor discussion with hyper-converged solutions, most tend to focus on two main points: first, which hypervisor platforms are supported, and second, which versions are supported. These carry different levels of importance to customers.
As the hypervisor space becomes more competitive and customers implement non-VMware products to virtualize workloads, the flexibility to support more than VMware is quickly becoming important. While VMware does still rule the world, there are certain verticals and customers that are adopting KVM or Hyper-V over VMware. Today this likely already plays into the buying decision for some customers when looking at hyper-converged products. Nutanix already supports multiple hypervisors, SimpliVity has discussed its desire to do so in the future, while VMware’s EVO:RAIL will always be a 100% vSphere-based solution. Both stances can benefit customers: one provides more flexibility, the other a tightly controlled integration.
The hypervisor versions that vendors support are also a big factor in my eyes. There are a few things for customers to consider here. If you are new to a solution, will it support the hypervisor version you are currently using, or will it force an upgrade or downgrade? This could be a big deal for some. You might have the option to use an older build of the vendor’s software to gain support for your version, but you then might miss out on the features that attracted you. Another thing to watch is how quickly vendors update their software to support new hypervisor versions. I don’t think people usually need day-one support, but the question becomes how long you will have to wait. I think support within three months is very acceptable; if it typically runs longer than that, it could be an issue for some.
Having a good management story helps put the hyper in hyper-converged infrastructure. Along with converging the storage and compute, I think that HC products should equally converge the management. In my eyes, this means a single interface for managing the product from a hardware, performance and upgrade perspective. After all, these are products built from the ground up for one purpose, and the management should reflect this.
There are different ways that these vendors built their interface. I have personal feelings about them, but will leave my feelings out of this one. You should look deeply at the different interfaces, find out what features they offer and the level of detail they make available to admins. There are three different approaches to the management of these products.
- vCenter plug-in
- vCenter web client plug-in
- Separate web based portal
The management interfaces for these HC products all focus on managing on a per-VM basis. I think this is the right approach, and while the vendors are all taking different paths, they all agree that the VM is what admins care about. Given this approach, I expect to see all data in a per-VM view as well. So details about performance, capacity, replication, backups and all other features should be present with VM granularity.
Because these are modern storage, compute and virtualization platforms, I have come to expect extra from them. There should be some method of automating the product. These HC products make great platforms on which to build cloud solutions, VDI and server virtualization environments. To support these use cases, the increasing demand for automation and orchestration should not be ignored in a modern product. What methods do the vendors provide for control and configuration? There are several options for this type of automation, which I have listed below.
- REST API
- Command Line
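As a rough idea of what REST-based automation with per-VM granularity might look like, here is a minimal Python sketch. The endpoint path `/api/v1/vms`, the bearer-token header and the JSON field names are all hypothetical; none of them is taken from a Nutanix, SimpliVity or EVO:RAIL API, so check each vendor's documentation for the real calls.

```python
import json
import urllib.request


def build_vm_list_request(base_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for a hypothetical per-VM endpoint."""
    url = base_url.rstrip("/") + "/api/v1/vms"  # hypothetical path
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Accept": "application/json",
        },
        method="GET",
    )


def summarize_vms(payload: str) -> dict:
    """Map each VM name to its used capacity, from a hypothetical JSON response."""
    return {vm["name"]: vm["used_gb"] for vm in json.loads(payload)["vms"]}


# Example with a made-up management address and a made-up response body.
request = build_vm_list_request("https://hc.example.local/", "secret")
sample_response = '{"vms": [{"name": "web01", "used_gb": 40}]}'
print(request.get_method(), request.full_url)
print(summarize_vms(sample_response))
```

The request is only constructed here, never sent, so the sketch runs without a real management endpoint.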
The connectivity into these hyper-converged products has been pretty stable. Options are increasing as the products mature and new models continue to appear. The standard for the majority of models from these vendors is a pair of 10GbE connections. These two network connections provide the storage interconnectivity, VM traffic and sometimes management. Many of the models also still come with a pair of 1GbE connections. These are typically used for management traffic, but some vendors offer the ability to use them for storage or VM traffic depending on the size of the environment.
The storage part of this comparison covers some of the most talked-about details, and it often leads to FUD and fighting between the vendors over differences in features and approaches. I don’t intend to bring peace to the storage industry, but I will try to explain the different offerings for easier consumption.
When it comes to storage, the performance, availability and data services are some of the most important and most covered topics. This comparison is not going to cover performance, but I will be comparing the availability and data services that all three of the hyper-converged platforms offer.
A hot topic between the vendors is how they enable their HC storage layer. Both SimpliVity and Nutanix use a Virtual Storage Appliance (VSA). There is a VSA on each node in the storage cluster, and together they act like scale-out storage controllers. VMware, in contrast, has taken the approach of building VSAN as a module in the vSphere kernel. Each approach has its benefits and drawbacks. The VSA model will use more host resources to provide storage services, but it allows vendors to offer deduplication, compression, backup and replication, among other services. While VMware’s integrated approach uses far fewer resources, it currently lags in the data services it can offer.
The vendors are working hard on providing a highly available and resilient storage layer that can handle multiple failures. The very nature of hyper-converged is the blending of compute and storage, so to architect for these failures you must ensure you have capacity available at all layers. Each vendor will offer guidance on how to architect for different failure levels in the storage layer, ensuring you have the right capacity for the number of copies of data. But do not forget about the compute discussion: the storage might not be affected by a dual node failure, but will you have enough compute resources available to handle a double failure? Will you even care about compute in a double failure as long as storage is still online and has no data loss?
The availability discussion is too complex a topic for this level of comparison. But some important items to research and understand for different failure events are the following.
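The compute side of that question is simple arithmetic, and a minimal Python sketch shows it. The node count, per-node memory and workload demand below are made-up numbers for illustration only, and memory stands in for whichever compute resource is your constraint.

```python
def compute_survives(nodes: int, capacity_per_node: float,
                     demand: float, failures: int) -> bool:
    """Return True if the surviving nodes can still carry the total demand."""
    remaining_capacity = (nodes - failures) * capacity_per_node
    return remaining_capacity >= demand


# Hypothetical cluster: 4 nodes with 256 GB of RAM each, 600 GB of VM demand.
for failed in (1, 2):
    ok = compute_survives(nodes=4, capacity_per_node=256, demand=600, failures=failed)
    print(f"{failed} node failure(s): enough compute = {ok}")
```

In this made-up case the cluster rides through one node failure on compute but not two, even if the storage layer itself would stay online through both.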
- At what point could there be data loss
- Will all or part of the VMs be unavailable during different failures
To me, one of the founding principles of hyper-converged infrastructure should be a simplified deployment process. This could mean a number of things to different people, so I will lay out my take on it. There are three areas of the install process that can be simplified.
- Storage deployment
- Hypervisor deployment
- Hypervisor configuration
The storage deployment process is something that should be super easy and as automated as possible. This is the part that the vendors are creating, and it should be an important part of their solution. By automated and simple, I envision a process that deploys the VSAs if the product uses them, or enables the storage layer if it is built in. Ideally this would be accomplished from a management portal or deployment tool. The process should allow the person deploying to enter details such as network addresses, host names, NTP servers, a service account and any other settings. This information should then be used by the automated install process.
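As a sketch of the kind of input such a deployment tool could collect and check before starting, here is a small Python example. The `NodeConfig` and `DeploymentPlan` names and their fields are hypothetical and simply mirror the details listed above; they are not any vendor's actual deployment format.

```python
import ipaddress
from dataclasses import dataclass


@dataclass
class NodeConfig:
    hostname: str
    management_ip: str


@dataclass
class DeploymentPlan:
    nodes: list            # list of NodeConfig
    ntp_servers: list      # NTP server names or addresses
    service_account: str   # account the install process runs as


def validate(plan: DeploymentPlan) -> list:
    """Return a list of human-readable problems; an empty list means the plan looks sane."""
    errors = []
    seen_hostnames = set()
    for node in plan.nodes:
        try:
            ipaddress.ip_address(node.management_ip)
        except ValueError:
            errors.append(f"{node.hostname}: invalid IP address {node.management_ip!r}")
        if node.hostname in seen_hostnames:
            errors.append(f"duplicate host name {node.hostname!r}")
        seen_hostnames.add(node.hostname)
    if not plan.ntp_servers:
        errors.append("at least one NTP server is required")
    if not plan.service_account:
        errors.append("a service account is required")
    return errors


plan = DeploymentPlan(
    nodes=[NodeConfig("hc-node-01", "10.0.0.11"), NodeConfig("hc-node-02", "10.0.0.12")],
    ntp_servers=["ntp.example.local"],
    service_account="svc-hc-install",
)
print(validate(plan))
```

Catching a bad address or a duplicate host name up front is the sort of check that keeps an automated install from failing halfway through.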
To continue with my idea, the deployment of the hypervisor should also be automated as part of the greater HC install process. The thought of having to mount an ISO and go through the hypervisor install seems old school and a waste of time. It’s good to see this become more of a standard with offerings.
Last up is the post-install configuration of the hypervisor. I think installing the hypervisor just so that the virtual storage controller can also be installed is a bare minimum, and it is not enough. The process should continue and configure a set of common best practices that apply to the majority of customer environments. This saves time and also ensures that the product is deployed in an optimal manner. Today only VMware is taking it this far, and I look forward to the other vendors maturing their process to include this phase.
While a simplified and automated install is important, the fact is you only install a node once, but you will upgrade and patch it many times over the course of its lifetime. This is why the upgrade process and method can be a key differentiator in hyper-converged products.
Just like the install function, I think there are a few layers of the solution that companies can help customers with. While I think that having an easy, polished upgrade for the storage layer is table stakes, any help vendors can provide for the other layers is added value today, but it will likely become an expected set of features in the coming year. These are the layers that I think are targets for upgrade simplification.
- Storage Layer
- Hypervisor
- Hardware Firmware
Hyper-converged solutions have been on the market for several years now, and I personally think that a storage software upgrade should be easy. It should be so easy that a junior admin can do it, and without any interruptions. This should not be a huge ask these days, since pretty much every modern flash or hybrid storage array offers this same level of upgrade simplicity.
For the hypervisor layer, this is a nice-to-have feature at this time. We all know that upgrading hosts is not a glamorous job and no one likes to spend hours doing it. When upgrading the storage software in HC solutions, it may also be necessary to upgrade the hypervisor at some intervals. The market and time will determine how much of a differentiator this becomes.
The last layer is the firmware layer of a hyper-converged solution. To me, this is the second most important of the three layers. Since all of the HC solutions have pretty tight control over the hardware they are implemented on, they should also be able to provide proper management of the firmware. They should be able to help with monitoring the firmware and provide better assistance with determining drive failures. Ideally the firmware upgrade should be wrapped into the solution in some way, and hopefully vendors will start to recommend which firmware upgrades should be performed when other layers in the stack are upgraded. This will help avoid performance or stability issues that arise because, for example, a customer is running two-year-old firmware on the built-in network adapters.