
How to Build a Powerful IoT Solution in Azure?

Today, we will talk about the way in which IoT can transform a business model, from traditional sales to a subscription-based service. We’ll go over a case study from the manufacturing industry.

I will also show you how to use event-driven architecture to build a scalable global platform, allowing customers to manage their devices securely from anywhere in the world.

Key points:

  • What is event-driven architecture and what are its benefits?
  • What is IoT and how can it be useful?
  • How to use Azure services to build a scalable IoT solution?

Our scenario

One of our clients is a water pump producer. They provide millions of devices for customers around the globe. The client wanted to extend their offering and launch a subscription model as another aspect of their business.

The aim was to build a modern IoT platform that would allow all their devices to be connected and managed via a single website. Their customers could then open the portal and monitor their devices, send commands, or even receive alerts and predictions about water usage or mechanical failure before anything happens.

Each device already includes the electronics to control it, so reaching the client’s goal did not require complex restructuring; it merely required using the data that was already available.

Solution overview

The previous service was created as a sequential system, so its performance and scalability were insufficient. It was also hosted as an on-premises solution, which caused difficulties with maintaining the infrastructure. Additionally, that platform was only prepared for one type of device, without any way of extending it.

Together with the client, we created a cloud-based platform that replaced the existing solution. Now, each device can be connected under a different offering type (depending on the business model, e.g. capabilities and limits) and can use a different byte protocol.

Our client had already developed their own device communication protocol, in two versions that are not compatible with each other. The first version runs on the existing devices, and the new one is installed on the new line of products.

You might wonder why version 2 is not compatible with version 1. The answer is simple: the old devices use an entirely different processor architecture, and version 1 was created only for that device type. The second version is more generic and flexible, which means it can be added to other products.

The goal of the new platform was to eliminate the old platform’s weaknesses and move the solution to the cloud. Together with the client, we decided to use the event-driven approach to handle the load from all devices.

What is the event-driven approach?

The event-driven approach, together with event-driven architecture, is a way to build a scalable system based on distributed, asynchronous communication between microservices. How does it work?

Each device or service sends a piece of data via an event bus, thus defining an event. Another service (say, service A) listens for information from the event bus, which puts events through to its listener. When service A is ready, it takes the event and processes it.

Upon completion, service A should generate another event that contains data about what happened and send it via the event bus.

That approach reduces coupling between services and allows you to scale the solution out (you can find out more about scaling in one of my previous articles).

Definition: coupling. (1) The manner and degree of interdependence between software modules. … (3) A measure of how closely connected two routines or modules are. (ISO/IEC/IEEE 24765:2017, Systems and software engineering: Vocabulary)


Services only communicate using contracts (events). Keep in mind that this is asynchronous communication, so we have to make sure there is consistency among our services.
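
To make this concrete, here is a minimal sketch of the pattern using the Azure Event Hubs SDK for Python. The hub names, connection string, and the shape of the event payload are illustrative assumptions, not the actual contracts from our client’s platform.

```python
import json
from azure.eventhub import EventData, EventHubConsumerClient, EventHubProducerClient

# Illustrative resource names -- not the client's real infrastructure.
CONNECTION_STR = "<event-hub-namespace-connection-string>"

def publish(hub_name: str, payload: dict) -> None:
    """Send a piece of data to the event bus, thus defining an event."""
    producer = EventHubProducerClient.from_connection_string(
        CONNECTION_STR, eventhub_name=hub_name
    )
    with producer:
        batch = producer.create_batch()
        batch.add(EventData(json.dumps(payload)))
        producer.send_batch(batch)

def on_event(partition_context, event):
    """Service A: take the event when ready, process it, emit a follow-up event."""
    data = json.loads(event.body_as_str())
    publish("processed", {"source": "service-a", "result": data})
    partition_context.update_checkpoint(event)

# Service A listens to the 'incoming' hub via the event bus.
consumer = EventHubConsumerClient.from_connection_string(
    CONNECTION_STR, consumer_group="$Default", eventhub_name="incoming"
)
with consumer:
    consumer.receive(on_event=on_event, starting_position="-1")
```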

[Diagram: example of an event-driven architecture]

What is IoT?

IoT stands for the Internet of Things. It’s based on the internet connectivity of modern physical devices: vehicles, buildings, home electronics, and so on. The devices continuously send their data to their respective systems via the internet, and those systems store, analyze, and react to this data.

Usually, IoT devices contain many sensors that gather data and have software that allows them to combine this information and send it to the destination system. Each piece of data from a sensor is called a datapoint.

What is possible with IoT?

Imagine that you are going away for a few weeks but the plants in your home and garden need to be watered. You can take care of it with IoT devices, even if you’re not home. All you need is a phone or a web portal to manage it.

You can set up a schedule for when to run the water pumps to water your plants, and control the water temperature. Everything will be done automatically. You can focus on other things, and just check in on your devices from time to time.

What’s also worth noting is the monitoring and alerting capability. IoT systems can inform you when something is wrong, or warn you that something might go wrong soon. That’s possible because everything is stored on servers and monitored. As an IoT device owner, you will receive a notification and be able to check the dashboards for information about the condition of your devices.

Azure services for an event-driven IoT architecture

Let’s talk about the role the Azure cloud plays in building an event-driven IoT solution. Microsoft Azure has several useful services available to make the solution work smoothly and effectively. Here is what we have used for this project.

Azure Kubernetes Service (AKS)

Kubernetes is a system for managing clusters of containers. It is designed to be extensible and fault-tolerant, allowing application components to restart and move across machines as needed. AKS is that system fully managed by Microsoft and hosted on Azure.

Kubernetes automatically manages service discovery, incorporates load balancing, tracks resource allocation, and scales based on compute utilization (you can find more details here). Autoscaling is also faster than with Azure PaaS offerings.

Azure Container Registry (ACR)

A managed container image registry. All services and Kubernetes deployments are stored and versioned in ACR as Docker images and Helm charts. Using this service, we can keep many versions of our microservices with the proper deployment definitions.

ACR can scan all the images pushed there. It allows you to discover any known vulnerabilities in packages or other dependencies defined in the container image file. You can also receive vulnerability assessments and recommendations, including specific guidance on remediation.

Azure DevOps

To manage all changes and connect ACR and AKS, we use Azure DevOps. Any code change merged to the master/main branch triggers a pipeline that builds the code, runs the unit tests, builds the Docker image and Helm chart, and pushes them to ACR. When an image lands in ACR, Azure DevOps runs the deployment scripts, i.e. it sets up the Azure infrastructure and deploys the microservice’s Helm chart to AKS.

The important thing in this process is that everything is written as one pipeline using the Infrastructure as Code approach, so any change to the pipeline is tracked.

Azure IoT Hub

One of the most heavily used services in our solution. The IoT Hub is a service that allows us to connect devices to Azure. Using the IoT Hub, we can manage IoT devices, securely send commands, and receive data from a device. Part of this offering is the Device Provisioning Service, which registers and provisions devices.
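
As a simple illustration, this is roughly what the device side looks like with the Azure IoT Hub device SDK for Python; the connection string, payload fields, and the `messageType` property are hypothetical.

```python
from azure.iot.device import IoTHubDeviceClient, Message

# Hypothetical connection string -- in production this would come from the
# provisioning process, never from source code.
conn_str = "HostName=<hub>.azure-devices.net;DeviceId=<id>;SharedAccessKey=<key>"

client = IoTHubDeviceClient.create_from_connection_string(conn_str)
client.connect()

# A telemetry message with an application property the platform can route on.
msg = Message('{"flowRate": 12.5, "pressure": 2.1}')
msg.content_type = "application/json"
msg.content_encoding = "utf-8"
msg.custom_properties["messageType"] = "telemetry"  # illustrative property name

client.send_message(msg)
client.shutdown()
```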

Azure Event Hub

A fully managed, real-time data ingestion service. It can stream millions of events from multiple sources and ensures that even a massive load of events is delivered to downstream services. I wrote a dedicated article about the Event Hub, so you can find out more about it here.

Azure Service Bus

A message broker, fully managed by Azure. It is used to send messages from one service to another asynchronously to exchange information. I demonstrated the possibilities of this service in previous articles.

Azure PostgreSQL

A standard PostgreSQL database managed by Azure. We use the Timescale extension to manage time-series data; it combines the power of the SQL language with efficient time-series storage. Datapoints from devices are stored as a document collection in JSONB format, a native PostgreSQL binary JSON format that can be queried efficiently.
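
Here is a rough sketch of that storage pattern, using `psycopg2` against a Timescale-enabled PostgreSQL; the table name, columns, and payload fields are illustrative, not our client’s actual schema.

```python
import psycopg2
from psycopg2.extras import Json

conn = psycopg2.connect("<postgres-connection-string>")  # illustrative DSN
cur = conn.cursor()

# One hypertable of device datapoints, with the payload stored as JSONB.
cur.execute("""
    CREATE TABLE IF NOT EXISTS datapoints (
        time      TIMESTAMPTZ NOT NULL,
        device_id TEXT        NOT NULL,
        payload   JSONB       NOT NULL
    );
""")
# Timescale turns the plain table into a time-partitioned hypertable.
cur.execute("SELECT create_hypertable('datapoints', 'time', if_not_exists => TRUE);")

cur.execute(
    "INSERT INTO datapoints (time, device_id, payload) VALUES (now(), %s, %s);",
    ("pump-42", Json({"flowRate": 12.5, "pressure": 2.1})),
)

# JSONB fields can be addressed directly in SQL.
cur.execute(
    "SELECT time, payload->>'flowRate' FROM datapoints WHERE device_id = %s;",
    ("pump-42",),
)
conn.commit()
```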

Azure storage account

We use it to store raw data from devices and sometimes as a store for the internal service state.


Connecting the dots

Some regions of our client’s operation have dedicated offerings or limits. To make sure the solution matches these individual needs, we decided to create a separate Kubernetes infrastructure for each location. Each region has its own cluster and Azure infrastructure. This approach allows us to gradually open up the solution to new customers while maintaining stability and managing the device load.

A note about offerings

I mentioned dedicated offerings, so what are they? Essentially, they are groups of devices that solve certain problems. For example, they might detect a fire, run sprinklers, or mix hot and cold water in cities. Based on the device type, they might have completely different sets of datapoints or even data protocols.

The infrastructure of our event-driven solution

One IoT Hub manages many kinds of devices, which know nothing about the offerings or the protocols that other devices use. We decided to use device twin tags to mark this information.

How are the tags added? During the device provisioning process, our solution automatically adds tags to the device twin based on its configuration.

This data is necessary for processing: the IoT Hub message routing feature forwards each message from a device to the proper event hub. Additionally, we use message enrichment to add the twin tags during forwarding. This approach allows us to sort events by type and protocol.

Other development teams and even partner companies can also connect, consume, and react to data from the proper feed.
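
The routing and enrichment themselves are configured on the IoT Hub, so a consuming service only needs to read the enriched application properties from each event. Below is a sketch of that consuming side; `protocolVersion` stands in for whatever tag name the enrichment actually adds.

```python
from azure.eventhub import EventHubConsumerClient

CONNECTION_STR = "<event-hub-connection-string>"  # illustrative

def decode(value):
    """Application properties may arrive as bytes over AMQP."""
    return value.decode() if isinstance(value, bytes) else value

def handle_v2(raw: str) -> None:
    print("v2 message:", raw)  # placeholder for real processing

def on_event(partition_context, event):
    # Twin tags added by IoT Hub message enrichment show up as
    # application properties; 'protocolVersion' is an illustrative key.
    props = {decode(k): decode(v) for k, v in (event.properties or {}).items()}
    if props.get("protocolVersion") == "2":
        handle_v2(event.body_as_str())
    partition_context.update_checkpoint(event)

consumer = EventHubConsumerClient.from_connection_string(
    CONNECTION_STR, consumer_group="$Default", eventhub_name="telemetry-v2"
)
with consumer:
    consumer.receive(on_event=on_event, starting_position="-1")
```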

[Diagram: sorting data by event type]

The diagram above presents how the IoT Hub propagates device events to the proper event hub. Each event hub streams the correct data based on the message header type (telemetry, event, command) and the protocol. As you can see, at this level we don’t sort by offering, because it would be too granular.

[Diagram: sorting data by offering]

The diagram above shows what sorting by a specific offering looks like. Devices send messages in a custom protocol to a microservice which decodes (translates) each message and forwards it to the proper event hub. We look for telemetry: each telemetry event is translated from bytes into human-readable objects and then, based on the offering, sent to the proper telemetry handler.

We define a microservice per offering and event type, so we have entirely separate telemetry services for the first and second offerings. Each service is hosted on AKS in its own namespace.
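
The client’s byte protocols are proprietary, so as an illustration only, here is what the decode (translate) step might look like for a made-up fixed-layout frame:

```python
import struct
from dataclasses import dataclass

@dataclass
class Telemetry:
    device_id: int
    flow_rate: float
    pressure: float

# Hypothetical v2 frame layout: a 4-byte device id followed by two 4-byte
# floats, big-endian. The real protocol differs and is not public.
FRAME = ">Iff"

def decode_frame(raw: bytes) -> Telemetry:
    """Translate raw bytes into a human-readable telemetry object."""
    device_id, flow_rate, pressure = struct.unpack(FRAME, raw)
    return Telemetry(device_id, flow_rate, pressure)

frame = struct.pack(FRAME, 42, 12.5, 2.1)  # simulate a device message
print(decode_frame(frame))                 # Telemetry(device_id=42, ...)
```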

I mentioned that we also use Azure Service Bus, but I haven’t yet shown where it fits: internal communication within a microservice. Sometimes we need FIFO (first in, first out) ordering, and we achieve it using a service bus queue and its session feature. The best example of where FIFO order matters is a device firmware update, where the firmware is split into small pieces that must be sent to the device in sequence.
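
A minimal sketch of that pattern with the Azure Service Bus SDK for Python, assuming a session-enabled queue named `firmware-updates` (an illustrative name): all chunks for one device share a session id, which gives us FIFO delivery.

```python
from azure.servicebus import ServiceBusClient, ServiceBusMessage

CONNECTION_STR = "<service-bus-connection-string>"  # illustrative
QUEUE = "firmware-updates"                          # session-enabled queue

client = ServiceBusClient.from_connection_string(CONNECTION_STR)

# Sender: split the firmware into chunks; every chunk for this device
# carries the same session id, so the broker preserves their order.
with client.get_queue_sender(QUEUE) as sender:
    firmware = b"...firmware bytes..."
    chunk_size = 1024
    for i in range(0, len(firmware), chunk_size):
        sender.send_messages(
            ServiceBusMessage(firmware[i : i + chunk_size], session_id="pump-42")
        )

# Receiver: lock the session and read the chunks in FIFO order.
with client.get_queue_receiver(QUEUE, session_id="pump-42", max_wait_time=5) as receiver:
    for msg in receiver:
        receiver.complete_message(msg)
```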

A note on microservices

Each of our microservices can be a combination of several smaller services, e.g. one for the latest datapoints and one for historical datapoints. Both listen to the same event hub, but through different consumer groups.

Why have we done that? Because the logic is entirely different. The latest-datapoints service stores only the most recent value of each datapoint, while the historical one inserts every value as a new record and uses Timescale to build a time series.
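
In Event Hubs terms, this simply means two clients reading the same hub through different consumer groups, each keeping its own position in the stream. A sketch, with hypothetical consumer group names:

```python
from azure.eventhub import EventHubConsumerClient

CONNECTION_STR = "<event-hub-connection-string>"  # illustrative
HUB = "telemetry"

def on_latest(partition_context, event):
    """Upsert: keep only the most recent value per device."""
    ...

def on_historical(partition_context, event):
    """Append: insert every value as a new time-series record."""
    ...

def run(consumer_group: str, handler) -> None:
    # Each small service runs in its own pod with its own consumer group,
    # so both read the full stream independently at their own pace.
    client = EventHubConsumerClient.from_connection_string(
        CONNECTION_STR, consumer_group=consumer_group, eventhub_name=HUB
    )
    with client:
        client.receive(on_event=handler, starting_position="-1")

# run("latest-datapoints", on_latest)          # service 1
# run("historical-datapoints", on_historical)  # service 2
```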

Measure all your service data

It is clear by now that we use a lot of services and process many events in this solution. As a result, it can be hard to detect a bottleneck. The only way to find one is to measure everything we have: how many events each service takes in and sends out. We do this inside AKS; more specifically, we decided to use Prometheus.

Prometheus is a monitoring system, commonly run on Kubernetes, that scrapes metrics from services. When a service receives a new message, it first records that fact for Prometheus, then invokes the business logic, and finally notes how long the processing took and how many events were generated.
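
With the `prometheus_client` library, that measuring pattern looks roughly like this; the metric names and the `process` function are illustrative:

```python
import time
from prometheus_client import Counter, Histogram, start_http_server

# Illustrative metric names -- the real solution defines its own.
EVENTS_IN = Counter("events_received_total", "Events consumed by this service")
EVENTS_OUT = Counter("events_emitted_total", "Events produced by this service")
LATENCY = Histogram("event_processing_seconds", "Time spent in business logic")

start_http_server(8000)  # Prometheus scrapes metrics from :8000/metrics

def process(event) -> int:
    ...        # placeholder business logic
    return 1   # number of events generated

def handle(event) -> None:
    EVENTS_IN.inc()                            # note the incoming event first
    start = time.perf_counter()
    emitted = process(event)                   # invoke the business logic
    LATENCY.observe(time.perf_counter() - start)
    EVENTS_OUT.inc(emitted)                    # note how many events went out
```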

[Figure: example event metrics]

Having that data, we know how to scale our services and whether we have any problems. The data is processed in real time, so if anything happens, we know about it immediately. We are also well prepared for scaling, as our services use the same metrics for autoscaling. Without these metrics, we couldn’t estimate the load on our services or their capacity.

Summary

Now you know how an event-driven architecture can be created to manage high volumes of IoT data. If you’re interested in creating a solution such as this or would like support with your project, just get in touch!

Key takeaways:

  1. The event-driven approach makes the system more scalable and extensible.
  2. Azure provides services like IoT Hub and Event Hub which make it relatively easy to build an event-driven solution for IoT.
  3. An event-driven system needs to be measurable in terms of events, and the instances need to be adjustable dynamically using autoscaling.