
As promised, in part 2 I would like to talk a little about the hardware we need to start the open source data center. The amount of hardware is based on the software we would like to run and might look a bit oversized. That is true: you can build the data center with less hardware, but scaling it later then takes much more work, because you have to reinstall several parts of your infrastructure, which can be complicated once the data center is in production. So here is what we need to start:

1 rack
2 switches
4 storage nodes
1 controller node
2 host nodes

I will not recommend a vendor or a specific product. In principle, I like the Open Compute approach https://www.opencompute.org/. Unfortunately, there are not many products in the wild that use this new technology, so if you start building your data center now, I would stick to traditional server hardware.

The rack should be a full-height one with 40 rack units (U) or more. You can start with a smaller one or even without a rack at all, but again, when it comes to scaling you will have to replace most of your components. The switches should have 48 ports and operate at 1 Gbit/s or faster. It should be possible to manage the switches remotely. Since Puppet is used for configuration management, I would recommend a switch that can be configured with Puppet https://puppetlabs.com/blog/puppet-network-device-management.
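To get a feeling for how much headroom this starting setup leaves, here is a small sketch that adds up rack space and switch ports. Only the node counts, the 48-port switches and the 40-unit rack come from the list above. The height of each device and the number of network ports per node are my own assumptions for illustration, so replace them with the figures of the hardware you actually buy.

```python
# Figures taken from the text: node counts, 48-port switches, 40-unit rack.
RACK_UNITS = 40
SWITCH_PORTS = 48

# name: (count, ASSUMED height in rack units, ASSUMED network ports per node)
# The heights and port counts are hypothetical placeholders.
inventory = {
    "switch": (2, 1, 0),
    "storage node": (4, 2, 4),
    "controller node": (1, 1, 2),
    "host node": (2, 1, 4),
}


def rack_units_used(items):
    """Sum the rack units occupied by all devices."""
    return sum(count * height for count, height, _ in items.values())


def ports_used(items):
    """Sum the switch ports consumed by all nodes."""
    return sum(count * ports for count, _, ports in items.values())


used_units = rack_units_used(inventory)
used_ports = ports_used(inventory)
total_ports = inventory["switch"][0] * SWITCH_PORTS

print(f"rack units: {used_units} of {RACK_UNITS}")
print(f"switch ports: {used_ports} of {total_ports}")
```

With these assumed numbers the starting setup fills roughly a third of the rack and well under half of the switch ports, which is the kind of room you want for adding nodes later without replacing components.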

The storage nodes are standard hardware as well. They should hold 8 hard disks each, which gives you the flexibility to set up a hybrid storage pool. It is also possible to start with 2 instead of 4 storage nodes, but then you cannot start with a cloud RAID 10.
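To illustrate why 4 nodes are the minimum for a cloud RAID 10, here is a rough capacity sketch. It assumes that "cloud RAID 10" means mirroring the storage nodes in pairs and striping the data across those pairs, and it ignores any RAID inside a single node. The function name and the 2 TB disk size are made up for the example.

```python
def usable_capacity_tb(nodes, disks_per_node, disk_tb):
    """Usable capacity when nodes are mirrored in pairs and data is
    striped across the pairs (an assumed reading of "cloud RAID 10")."""
    if nodes < 4 or nodes % 2:
        # Two nodes only give one mirrored pair, so there is nothing to stripe over.
        raise ValueError("need an even number of nodes, at least 4")
    raw_tb = nodes * disks_per_node * disk_tb
    # Every block is stored on two nodes, so half of the raw capacity is usable.
    return raw_tb / 2


# 4 nodes with 8 disks each, as in the text; 2 TB per disk is a hypothetical size.
print(usable_capacity_tb(nodes=4, disks_per_node=8, disk_tb=2))
```

With only 2 nodes the function raises an error, which mirrors the point above: a single pair can be mirrored, but there is no second pair to stripe across.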

The controller node will run all the services needed to manage the data center software. You can run many of these services in virtual machines as well, but that complicates things when one of those controller virtual machines fails, so moving them to a separate machine helps. Keep in mind that this node is a single point of failure in the current layout, so you may have to take additional measures to make it redundant.

Last but not least, you need 2 or more nodes to serve the virtual machines. The most important things are a fast CPU and a lot of RAM. The disk is not so important on a clustered host node. If all goes well, the hypervisor is loaded into RAM and the storage is on the SAN, so the local disk is only used for booting.