At the time of this writing, 30 new enterprise IoT devices are connecting worldwide every second. By the end of 2020, it’s estimated that over 5 billion enterprise IoT devices will be in service. These devices are in our environmental controls, our power plants and our process controls, all controlling or responding to the world around them.
With the proliferation of IoT devices, aggregating data from multiple devices is essential to properly analyzing what they provide, and doing so gives us more control than ever before.
However, for all of the positives this data provides, it’s hard to get to clear analysis. Data from one device can be hard enough, let alone the hundreds or thousands that are used in complex systems. Uniting this noise of disparate data into actionable information that clearly speaks to high-level goals and objectives, particularly as the number and variety of devices increase, can cause a lot of anxiety.
First, you need to establish your objectives.
These are high-level, non-technical statements regarding what you intend to accomplish through the use of the data you are collecting. Examples might include “reduce energy consumption while maintaining a cool temperature during the summer” or “support higher product velocity by improving throughput in the warehouse.” These are the what and where parts of the problem you are looking to solve.
You also need the how part of the problem, which has more to do with the data. For the objective of reducing energy consumption, the how might be “by monitoring room temperatures, air flows and air handler energy consumption.” Together these two statement types provide a succinct picture of both the objective and how to reach it using data.
There are a few principles to keep in mind to help reduce both complexity and anxiety. The first is that you have control over the devices. We often panic and race to build a system that ingests all of the data that a device produces. Systems that are built quickly often do not scale well, which leads to additional complexity later. Many IoT devices output data at a rate that has nothing to do with what you need in order to accomplish the objectives you laid out. This is because they exist to accomplish their goals, not yours. The sooner you realize that, the better.
A device that emits data a thousand times a second may be doing so simply because that is a matter of convenience given the way it was designed. Or perhaps, internally, it needs to process at that rate to sufficiently control what it was designed to control. It absolutely does not mean that you have to stockpile and interpret at that rate. You must make technical choices that ladder up to your objectives, and get in control of the devices you are using to accomplish this.
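To make the point about rates concrete, here is a minimal sketch of taking a 1,000-readings-per-second stream down to one value per second by averaging. The `downsample` function, the `(timestamp, value)` reading shape, and the one-second window are illustrative assumptions, not part of any particular device's interface, and the sketch assumes readings arrive in timestamp order.

```python
from statistics import fmean
from typing import Iterable, Iterator, List, Optional, Tuple

Reading = Tuple[float, float]  # (timestamp in seconds, value); an assumed shape


def downsample(readings: Iterable[Reading], window_s: float = 1.0) -> Iterator[Reading]:
    """Average time-ordered readings into fixed windows.

    Yields one (window_start, mean_value) pair per window that has data.
    """
    bucket: List[float] = []
    current: Optional[int] = None
    for ts, value in readings:
        window = int(ts // window_s)
        if current is not None and window != current:
            # The window just closed: emit its average and start a new one.
            yield (current * window_s, fmean(bucket))
            bucket = []
        current = window
        bucket.append(value)
    if bucket:
        # Emit the final, possibly partial, window.
        yield (current * window_s, fmean(bucket))


# A simulated device emitting 1,000 readings per second for three seconds.
raw = [(i / 1000.0, 20.0 + (i % 10) * 0.1) for i in range(3000)]
summary = list(downsample(raw, window_s=1.0))
print(len(raw), "readings in,", len(summary), "readings out")
```

Averaging is only one possible reduction. Depending on the objective, keeping the minimum, the maximum, or the last value per window may be the better fit.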
Positioning yourself in control of your devices generally has some implications. If you are a large enough customer, or you design the hardware yourself, you may have the clout to change the device interface, but in all but the most exceptional circumstances this isn’t the best path forward. Exerting control over hardware design is expensive and time-consuming. Just think of the work you would face attempting to roll out new hardware every time you want to change something.
Instead, you want to create aggregators that listen to the IoT devices and translate the data they emit into a cohesive stream of data laddering up to your objectives at the rate that makes sense for you. The aggregator can help remove sources of anxiety because it reduces complexity, answering questions like “how do I get the temperatures all calibrated to the same scale,” “how can the outbound connection meet a level of security that none of the devices can achieve,” or “how do I reject data from a faulty sensor?”
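As a rough illustration of two of those questions, the sketch below puts temperature readings on a single scale and ignores a sensor that disagrees sharply with its neighbors. The `TempReading` type, the unit codes, and the 5-degree threshold are all assumptions made for the example; a real deployment would choose its own rejection rule.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional


@dataclass
class TempReading:
    sensor_id: str
    value: float
    unit: str  # "C" or "F"; an assumed convention for this sketch


def to_celsius(reading: TempReading) -> float:
    """Put every reading on the same scale."""
    if reading.unit == "C":
        return reading.value
    if reading.unit == "F":
        return (reading.value - 32.0) * 5.0 / 9.0
    raise ValueError(f"unknown unit: {reading.unit!r}")


def room_temperature(readings: List[TempReading],
                     max_deviation_c: float = 5.0) -> Optional[float]:
    """Mean temperature in Celsius, ignoring sensors far from the median.

    Returns None when there is no usable data.
    """
    celsius = [to_celsius(r) for r in readings]
    if not celsius:
        return None
    mid = median(celsius)
    # Treat anything too far from the median as a faulty sensor.
    trusted = [c for c in celsius if abs(c - mid) <= max_deviation_c]
    if not trusted:
        return None
    return sum(trusted) / len(trusted)


readings = [
    TempReading("north-wall", 21.0, "C"),
    TempReading("south-wall", 70.7, "F"),   # about 21.5 C
    TempReading("ceiling", 85.0, "C"),      # clearly faulty
]
print(room_temperature(readings))
```

A median-based check is simple and tolerates a single bad sensor, but it is a swapped-in heuristic rather than a recommendation from this article. It needs several sensors measuring the same thing to work.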
Before we get to some design principles for the aggregator, notice that nothing has been said about where this solution lives. That is because it depends on your problem and the details of what it must solve. If you need to solve bandwidth or security problems, chances are it will be an on-prem solution. Depending on the infrastructure of the rest of your solution, it may be something in the cloud or in your data center. In each of these cases there are abundant options to meet your needs. If you are pushing the envelope with data volume and latency, perhaps it is an edge computing solution.
Sometimes you must build it, sometimes you can buy it, but you need to think through what makes the most sense to control your infrastructure complexity as well as the data itself. This is an important element to get right the first time, as it is costly and time-consuming to change.
There are some common patterns of software design that will help you reduce complexity and consequently reduce data anxiety. First, admittedly rather obvious, is that you need to be in complete control of the aggregator.
This is your tool to interpret many incoming data sources and translate them to one data stream that has dense meaning related to your objective. You need to be able to update the software, perform diagnostics, and adjust parameters and settings.
Depending on your line of work, and where and how the aggregator is deployed, you might exercise this control in person or remotely. In either case, you need to run through the pros and cons of the various control strategies and decide which reduces complexity the most with the least risk.
You need to create a solution that allows you to process each type of input data and combine them to form your output. The design depends greatly on both the technical embodiment of your high-level objectives and the details of the input data you are working with. In general terms, it has two distinct parts. The first is a modular framework organized around easily replaceable components that focus on the incoming data. This is where the APIs for specific data sources are created, and where steps like averaging high-frequency data, rejecting outliers, and converting temperature scales are carried out. The second is a bus architecture that aggregates and formats the final output: it aligns timestamps, sets the output cadence, combines data from two or more dissimilar inputs, and applies the output encoding. The goal here is to compartmentalize complexity and risk through modularization.
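One way to picture the two parts is sketched below: small per-device adapter functions on the input side, and a `Bus` class that combines their latest values into one record on a fixed cadence. Everything here is hypothetical, including the payload fields (`ts_ms`, `temp_f`, `time`, `watts`), the channel names, the 60-second cadence, and the choice of JSON as the output encoding.

```python
import json
from typing import Callable, Dict, Optional, Tuple

Sample = Tuple[float, float]        # (timestamp in seconds, value in shared units)
Adapter = Callable[[dict], Sample]  # one replaceable component per device type


def thermostat_adapter(payload: dict) -> Sample:
    # Hypothetical device: epoch milliseconds and degrees Fahrenheit.
    return payload["ts_ms"] / 1000.0, (payload["temp_f"] - 32.0) * 5.0 / 9.0


def power_meter_adapter(payload: dict) -> Sample:
    # Hypothetical device: epoch seconds and watts, converted to kilowatts.
    return float(payload["time"]), payload["watts"] / 1000.0


class Bus:
    """Combines the newest sample from each channel into one output record."""

    def __init__(self, cadence_s: float) -> None:
        self.cadence_s = cadence_s
        self.adapters: Dict[str, Adapter] = {}
        self.latest: Dict[str, Sample] = {}

    def register(self, channel: str, adapter: Adapter) -> None:
        self.adapters[channel] = adapter

    def ingest(self, channel: str, payload: dict) -> None:
        # All device-specific knowledge stays inside the adapter.
        self.latest[channel] = self.adapters[channel](payload)

    def emit(self, now_s: float) -> Dict[str, Optional[float]]:
        # Align the record's timestamp to the start of the current cadence interval.
        tick = (now_s // self.cadence_s) * self.cadence_s
        record: Dict[str, Optional[float]] = {"timestamp": tick}
        for channel in self.adapters:
            sample = self.latest.get(channel)
            # A sample older than one cadence interval is reported as missing.
            stale = sample is None or now_s - sample[0] > self.cadence_s
            record[channel] = None if stale else sample[1]
        return record


bus = Bus(cadence_s=60.0)
bus.register("room_temp_c", thermostat_adapter)
bus.register("air_handler_kw", power_meter_adapter)
bus.ingest("room_temp_c", {"ts_ms": 120_500, "temp_f": 71.6})
bus.ingest("air_handler_kw", {"time": 118, "watts": 1500})
record = bus.emit(now_s=125.0)
print(json.dumps(record))  # output encoding is applied last
```

Supporting a new device type in this sketch means writing one more adapter and registering it. The bus and the output format do not change, which is the compartmentalization the design is after.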
Data anxiety in today’s IoT world is real — especially within the enterprise. To overcome this anxiety, you need to take control of your data rather than letting the sheer mass of it control you.
Setting clear objectives, aggregating data based on those objectives, and designing a solution that forms a clear output will put you back in control of your data, and ultimately allow your enterprise to make better decisions.