At Vapor IO, our team has been working on what we feel to be dramatic improvements for both hardware and software in the data center space. As the VP of Software Engineering, it’s my job to lead the team designing and building the software that will modernize data center operations north and south of the rack.
Currently, our software landscape is divided into two main parts: the API (OpenDCRE/Vapor CORE/Vapor EDGE) and the OS (Open Mist OS/Vapor OS). In this series of posts, I’ll begin to discuss our API and OS strategies, as well as how these components can be integrated into the mythical “Single Pane of Glass” currently touted and misguidedly coveted in the data center world.
Getting Started – The Hardware
Before we talk about software, it makes sense to briefly discuss the hardware universe we are operating in today. By now, you’ve probably already seen renderings of the Vapor Chamber, and our contribution of a Bus Bar-Based Monitoring and Control system to the Open Compute Project. So where does the Vapor software come in?
-Vapor Edge Controller
First of all, whether your Vapor-enabled deployment takes place in one of the wedges of a Vapor Chamber or in a standard 19-inch rack, some common components and resources make up any rack. These include the CPU/Memory/Storage (aka servers), power distribution, networking components, and auxiliary devices and sensors. To tie all of these components together we have created the Vapor Edge Controller (VEC), a new Edge Zone Controller that supersedes the traditional notion of the “top-of-rack” controller by managing a dense physical and logical compute zone in the data center. The VEC runs a Linux distribution, either our open-source operating system Open Mist OS (OMOS) or Vapor OS (both described in a later post), and, among other things, hosts the APIs used to monitor and manage the rack components.
-BMC Replacement
Monitor and manage rack components, you say? Yes, that is where our API comes into the picture. Digging into the details of the hardware, you will find that the VEC offers a unique communications bus that supports digital and analog sensors, power control, fan speed, door lock, and other device control. It also provides serial/TTY access to devices in the rack, exposure of OCP debug header information, and full support for legacy IPMB commands.
If you’ve been following the industry for a while, this may sound like IPMI on steroids. In the Vapor model, however, there is no need for a BMC (Baseboard Management Controller), costly cabling, or an expansive and expensive management network. Removing the BMC also gives us the opportunity to provide a RESTful API that exposes all of the data and commands needed, without the headaches of crafting IPMI packets over the network.
The API
Our API is broken into two pieces. The first is the “southbound” API, a lower-level REST API that provides deep, granular access to every individual readable, writeable and controllable component in the rack. I like to think of this as “IPMI the way it should have been”, as is hopefully evident from the following discussion.
In addition to the “southbound” API, we also provide a “northbound” API, a pluggable aggregation framework that can bring together disparate data from multiple sources (our southbound API, your building management system, other APIs like Redfish, etc.) and expose key high- and low-level data points that may be used for critical environment live migrations, dashboard visualization and heatmaps, performance per watt per dollar computation, and cost per cloud realization. In a future post, I’ll discuss the intricate details of our northbound API, as this is where a lot of our “magic” will be exposed.
OpenDCRE / Vapor CORE
The “southbound” API is also referred to as Open DCRE (for our open-source contribution), and Vapor CORE (for our commercial offering). Both share a common base and set of semantics – the main difference is the functionality supported by the API and its corresponding firmware (see Table 1, below, for differences between versions).
Open DCRE | Vapor CORE |
OCP-Contributed Hardware | OCP-Contributed Hardware |
Open Mist OS | Vapor OS |
Power Control (status, on, off, cycle) | Full Firmware Support |
Onboard Analog Sensor Support | All Open DCRE Features |
Secure RESTful API | Real-time Analog & Digital Sensor Support |
| Critical Environment Assisted Live Migrations |
| “Northbound” API for Aggregate/Rollup Metrics (Performance/W/$; Cost/Cloud) |
| Analytics Engine & Dashboard for DC Operations |
Table 1. Feature Availability for Open DCRE and Vapor CORE.
Our API runs on the VEC and exposes a secure REST API that supports a variety of authentication methods. It is simple enough that every command can be issued with curl, and both GET and POST are supported for all operations.
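Because every command is a plain HTTP request, the same calls are easy to script. The sketch below is one hedged way to do it in Python: `fetch_json` and `fake_opener` are hypothetical helper names, the URL is the scan example used later in this post, and authentication and TLS certificate handling are left out because they depend on how a deployment is configured. The fake opener stands in for the network so the example runs without any hardware.

```python
import io
import json
import urllib.request


def fetch_json(url, opener=urllib.request.urlopen, timeout=5.0):
    """Issue a GET request to `url` and decode the JSON body of the response."""
    with opener(url, timeout=timeout) as response:
        return json.load(response)


def fake_opener(url, timeout):
    """Stand-in for urlopen: returns a canned JSON body instead of using the network."""
    return io.BytesIO(json.dumps({"requested": url}).encode("utf-8"))


# Against real hardware, drop the `opener` argument to use urllib's urlopen.
result = fetch_json("https://192.168.2.2:5000/vaporcore/0.6/scan/1", opener=fake_opener)
print(result)
```

Passing the opener as a parameter keeps the request logic testable on a laptop and leaves room to swap in an opener that adds whatever authentication a deployment requires.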
A key design goal of the southbound API has been to completely eliminate extraneous commands and operations, and to cut to the chase when it comes to common devops and management/monitoring tasks. As a result, we’ve distilled the API down to a compact set of “verbs”, with a flexible and intuitive set of components that the API may communicate with. Below are a few examples:
-Scan
The scan command enumerates all of the boards, ports and devices attached to the bus. In a typical deployment, one would call the scan command occasionally (e.g. at startup) and use the resulting map of devices however the application requires. For IPMI aficionados, think of this as reading and interpreting the FRU and the SDR in a single command that results in a tidy JSON document.
Example Request:
https://192.168.2.2:5000/vaporcore/0.6/scan/1
The above request tells the hardware to scan the bus and responds with a list of boards, ports and devices (as well as device types).
Example Response:
{ "boards": [ { "board_id": 1, "ports": [ { "device_id": 255, "device_type": "thermistor", "port_index": 1 }, { "device_id": 255, "device_type": "none", "port_index": 2 }, { "device_id": 255, "device_type": "thermistor", "port_index": 3 }, { "device_id": 255, "device_type": "none", "port_index": 4 }, { "device_id": 255, "device_type": "none", "port_index": 5 }, { "device_id": 255, "device_type": "none", "port_index": 6 }, { "device_id": 255, "device_type": "none", "port_index": 7 }, { "device_id": 255, "device_type": "thermistor", "port_index": 8 }, { "device_id": 255, "device_type": "thermistor", "port_index": 9 }, { "device_id": 255, "device_type": "thermistor", "port_index": 10 }, { "device_id": 255, "device_type": "none", "port_index": 11 }, { "device_id": 255, "device_type": "humidity", "port_index": 12 }, { "device_id": 255, "device_type": "power", "port_index": 13 } ] } ] }
In the example response above, we see that there is one board available, with 13 ports. There are several thermistors attached to the board (ports 1,3,8,9,10), as well as a humidity sensor on port 12. Additionally, port 13 can be used to control device power. First, let’s read the thermistor to see the temperature.
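As a rough illustration of how a client might turn the scan result into a usable device map, here is a small Python sketch. The JSON is the example response above; the helper name `devices_by_type` is hypothetical and not part of the API.

```python
import json
from collections import defaultdict

# Scan response copied from the example above.
SCAN_RESPONSE = """
{"boards": [{"board_id": 1, "ports": [
  {"device_id": 255, "device_type": "thermistor", "port_index": 1},
  {"device_id": 255, "device_type": "none", "port_index": 2},
  {"device_id": 255, "device_type": "thermistor", "port_index": 3},
  {"device_id": 255, "device_type": "none", "port_index": 4},
  {"device_id": 255, "device_type": "none", "port_index": 5},
  {"device_id": 255, "device_type": "none", "port_index": 6},
  {"device_id": 255, "device_type": "none", "port_index": 7},
  {"device_id": 255, "device_type": "thermistor", "port_index": 8},
  {"device_id": 255, "device_type": "thermistor", "port_index": 9},
  {"device_id": 255, "device_type": "thermistor", "port_index": 10},
  {"device_id": 255, "device_type": "none", "port_index": 11},
  {"device_id": 255, "device_type": "humidity", "port_index": 12},
  {"device_id": 255, "device_type": "power", "port_index": 13}
]}]}
"""


def devices_by_type(scan):
    """Map each device type to a list of (board_id, port_index), skipping empty ports."""
    found = defaultdict(list)
    for board in scan["boards"]:
        for port in board["ports"]:
            if port["device_type"] != "none":
                found[port["device_type"]].append((board["board_id"], port["port_index"]))
    return dict(found)


device_map = devices_by_type(json.loads(SCAN_RESPONSE))
for device_type, locations in sorted(device_map.items()):
    print(device_type, locations)
```

A map like this can be built once at startup and then used to decide which board and port to address in later requests.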
-Read
In Open DCRE / Vapor CORE, devices on the bus may be read. This may range from reading a temperature observed by a thermistor, to reporting the observed voltage or current from a PMBUS-based power supply. The southbound API makes this process dead simple, and eliminates the need to execute a baker’s dozen commands to retrieve and convert a simple temperature and airflow reading.
Example Request:
http://192.168.2.2:5000/vaporcore/0.6/read/thermistor/1/5
The above command is used to obtain a thermistor reading. The schema of the URL is the command (“read”) followed by the device type (“thermistor”), and the board id (“1”) and port id (“5”). A device ID may also be specified if there are multiple devices on a single port, but in this case the default device is used and is not specified.
Example Response:
{ "device_raw": 724, "temperature_c": 22.08 }
In the response above, we see that our thermistor reads 22.08 degrees Celsius, a pleasant indoor temperature. The raw reading is included as well, but the conversion from the raw value to Celsius takes place automatically within the southbound API endpoint.
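The URL schema and the response above can be sketched in a few lines of Python. This is only an illustration: `read_url` and `parse_thermistor` are hypothetical helper names, and placing an explicit device ID as a trailing path segment is an assumption, since the post only says that one may be specified.

```python
import json

# Host, port and API version taken from the example request above.
BASE_URL = "http://192.168.2.2:5000/vaporcore/0.6"


def read_url(device_type, board_id, port_id, device_id=None, base_url=BASE_URL):
    """Build a read URL: <base>/read/<device type>/<board id>/<port id>."""
    parts = [base_url, "read", device_type, str(board_id), str(port_id)]
    if device_id is not None:
        # Assumption: an explicit device ID is appended as one more path segment.
        parts.append(str(device_id))
    return "/".join(parts)


def parse_thermistor(body):
    """Return (temperature in Celsius, raw reading) from a read response body."""
    reading = json.loads(body)
    return reading["temperature_c"], reading["device_raw"]


url = read_url("thermistor", 1, 5)
temperature_c, raw = parse_thermistor('{ "device_raw": 724, "temperature_c": 22.08 }')
print(url)
print(f"{temperature_c:.2f} C (raw {raw})")
```

Keeping URL construction separate from response parsing makes it straightforward to reuse the same builder for other device types reported by a scan.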
-Power
Finally, Vapor CORE / Open DCRE provides unprecedentedly easy access to power data and control, including the ability to observe and control power for OCP equipment.
Remember from the Scan example earlier that, in addition to the thermistors and humidity sensors, there was also power control supported from this board on port 13. Let’s first check the power status on that port.
Example Request:
http://192.168.2.2:5000/vaporcore/0.6/power/status/1/13
This command retrieves the power state of the device on board id 1, port 13.
Example Response:
{ "control_raw": 0, "power_status": "on", "power_ok": true, "over_current": false, "under_voltage": false }
As we can see, the device is “on”. Let’s shut it off:
http://192.168.2.2:5000/vaporcore/0.6/power/off/1/13
That’s all there is to it!
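For scripted use, a client would typically check the status before changing it. The Python sketch below works from the status response shown above. The helper names `power_url` and `url_for_desired_state` are hypothetical, and the URL paths for "on" and "cycle" are assumed to follow the same pattern as the "status" and "off" examples, since only those two are shown in this post.

```python
import json

# Host, port and API version taken from the example requests above.
BASE_URL = "http://192.168.2.2:5000/vaporcore/0.6"

# Actions listed under Power Control in Table 1.
POWER_ACTIONS = ("status", "on", "off", "cycle")


def power_url(action, board_id, port_id, base_url=BASE_URL):
    """Build a power URL: <base>/power/<action>/<board id>/<port id>."""
    if action not in POWER_ACTIONS:
        raise ValueError(f"unsupported power action: {action}")
    return f"{base_url}/power/{action}/{board_id}/{port_id}"


def url_for_desired_state(desired, status, board_id, port_id):
    """Return the URL that moves the port to `desired`, or None if it is already there."""
    if desired not in ("on", "off"):
        raise ValueError(f"desired state must be 'on' or 'off', got: {desired}")
    if status["power_status"] == desired:
        return None
    return power_url(desired, board_id, port_id)


# Status response copied from the example above.
status = json.loads(
    '{ "control_raw": 0, "power_status": "on", "power_ok": true, '
    '"over_current": false, "under_voltage": false }'
)
print(power_url("status", 1, 13))
print(url_for_desired_state("off", status, 1, 13))
print(url_for_desired_state("on", status, 1, 13))
```

Returning None when the port is already in the desired state avoids sending a redundant power command to the hardware.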
Conclusion
Hopefully this article has provided a brief primer on how the Vapor CORE / Open DCRE API fits into the larger Vapor story, and how it might be of use in your own data center. Obviously, there is far more to the API than discussed in the few brief examples above. If you’re interested in being part of the beta testing of our API, contact us for a development kit, including a configurable emulator that may be used to simulate the Vapor hardware.
In my next article, I’ll introduce you to our OS development, where we are making some exciting progress in building the industry-leading edge controller operating system distribution. Stay tuned!