Flatcar as the operating system of the Industrial IoT
Discover why we chose Flatcar as the operating system for the Industrial IoT. We walk you through our thought process, from our requirements to the final selection of the operating system.
Operating systems (OS) play a crucial role in the Industrial IoT, serving as the foundation for all applications and processes. Whatever the intended use, be it data extraction, data visualization, or the deployment of microservices and Kubernetes clusters, an operating system is needed to run the software, including the United Manufacturing Hub (UMH).
In this article, we provide guidance on selecting an operating system for the UMH. By walking through our own thought process and requirements, we hope to offer a useful resource for those facing similar challenges. Note that this is not a comprehensive evaluation of all available options, but rather an insight into our own journey to find the best fit for our needs.
TLDR: We use Flatcar for its strong maintainability, even though the initial setup may be more complex. This allows us to efficiently deploy hundreds to thousands of machines in production while keeping maintenance overhead low.
Requirements and Use Cases
Selecting an operating system for the deployment of the UMH or any other Industrial IoT application requires a thorough understanding of the system's requirements and use cases. An explanation of our general requirements (reliability, scalability, and maintainability) can be found in our blog posts about MQTT brokers. Here, we focus only on the requirements specific to the operating system.
The operating system must meet the following key requirements:
- Lightweight to run efficiently on edge devices
- Stable and supported long-term to minimize maintenance efforts
- Immutable with a declarative configuration to ensure maintainability
The UMH operating system must be installable on both edge devices and on-premise servers (lightweight), and able to run for prolonged periods with limited physical access for maintenance (stable). This is especially important because our customer base is global, and frequent travel for software updates or support requests must be avoided.
Furthermore, the UMH follows an immutable approach to minimize the risk of unintended changes or side effects. The next section explains what that means.
What is immutability
Immutability means that we do not allow quick, ad hoc changes to the system, for example logging in via SSH and installing a new application with `apt-get`. This might seem restrictive at first, but it is crucial for keeping the system maintainable.
In large enterprises, when the person who set up a system is no longer available, it can be difficult to manage a system that has been carelessly modified without updating the documentation. This often leads to "ghost" servers with unpatched vulnerabilities, left untouched for fear of breaking production. To address this, the UMH requires documentation in the form of automated installation scripts, such as a Dockerfile for containers, a Helm Chart for microservice architectures, or, in the case of operating systems, a declarative configuration file. These scripts automate the deployment process, ensuring that the system is set up consistently and can be easily maintained. Years later, someone can look at the configuration and easily determine what the system is supposed to do and whether an update is likely to succeed.
One disadvantage of the immutable approach is that it is not widely adopted yet. As a result, many online tutorials and guides rely on SSH’ing into a running server and installing software using `apt-get` or similar methods. However, we are confident that this will change. We recall the resistance to Docker containers that we once encountered, as the initial learning curve seemed steeper than going without them. Ultimately, we found that containers reduced the overall maintenance burden, and most of our peers shifted from running raw Python scripts to using Docker containers.
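To make the idea of a declarative configuration file more concrete, here is a minimal sketch in Python that describes the desired state of a node as plain data and renders it into an Ignition-style JSON document. The hostname, the unit `hello.service`, and the helper names are hypothetical, and the field layout follows the Ignition v3 specification as we understand it, so validate the output against the official documentation before relying on it.

```python
import json
from urllib.parse import quote

# Hypothetical desired state of one edge node; all values are illustrative.
DESIRED_STATE = {
    "hostname": "umh-edge-01",
    "units": {
        "hello.service": (
            "[Unit]\nDescription=Example unit\n\n"
            "[Service]\nType=oneshot\nExecStart=/usr/bin/echo hello\n\n"
            "[Install]\nWantedBy=multi-user.target\n"
        ),
    },
}


def render_ignition(state: dict) -> str:
    """Render the desired state as an Ignition-style JSON document."""
    config = {
        # The spec version is an assumption; use the one your release supports.
        "ignition": {"version": "3.3.0"},
        "storage": {
            "files": [
                {
                    "path": "/etc/hostname",
                    "mode": 0o644,  # serialized as decimal 420 in JSON
                    # File contents are embedded as a URL-encoded data URL.
                    "contents": {"source": "data:," + quote(state["hostname"])},
                }
            ]
        },
        "systemd": {
            "units": [
                {"name": name, "enabled": True, "contents": body}
                for name, body in state["units"].items()
            ]
        },
    }
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    print(render_ignition(DESIRED_STATE))
```

The point of the sketch is that the whole node is described in one reviewable data structure: changing the system means changing `DESIRED_STATE` and redeploying, not logging in and editing files by hand.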
To summarize: the operating system for our edge devices and servers must be lightweight, stable, and immutable to ensure reliability and maintainability.
Our journey started with non-immutable systems such as Ubuntu, but maintaining them turned into a nightmare. People changed configurations without documenting them, and our support team first had to track down those changes before it could fix the actual problem.
We then turned to k3os, which promised to be an operating system exclusively for running k3s, a lightweight Kubernetes distribution. However, k3os was eventually deprecated, and after some research, we selected Flatcar based on a recommendation (thank you Marc :) ). Although other options such as Bottlerocket from AWS or Talos Linux exist, we were drawn to Flatcar's reliability and enterprise-readiness.
Compared to k3os, Flatcar offers several key advantages:
- Increased logging options. In the past, we experienced challenges with edge devices running on an unreliable power supply. k3os would simply hang and fail to log any kernel information, making it difficult to diagnose issues.
- Increased customization options. With Flatcar, the Ignition files provide more configuration options, including the ability to set up multiple network interfaces with DHCP, a firewall, and more. As a result, we no longer need a separate router or a VM like OPNsense, and can instead define all of the routing in a single file.
- The option to use docker-compose on the base OS. Many of our customers are still not fully familiar with Kubernetes, so having the option to deploy using docker-compose is beneficial in emergency situations. While this deviates from the immutability approach, customers are aware that they will not receive enterprise support for the entire node if they choose this option.
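As an illustration of the network customization mentioned above, the following Python sketch renders systemd-networkd `.network` files for several interfaces and wraps them as file entries that could be placed in an Ignition config. It assumes the node's network is managed by systemd-networkd; the interface names, the static address, and the helper names are hypothetical, and firewall rules are deliberately left out.

```python
from typing import Dict, List, Optional
from urllib.parse import quote

# Hypothetical interface layout: one uplink using DHCP, one static LAN port.
INTERFACES = [
    {"name": "eth0", "dhcp": True, "address": None},
    {"name": "eth1", "dhcp": False, "address": "192.168.10.1/24"},
]


def render_network_unit(interface: str, dhcp: bool,
                        address: Optional[str] = None) -> str:
    """Return the text of a systemd-networkd .network file for one interface."""
    lines = ["[Match]", f"Name={interface}", "", "[Network]"]
    lines.append(f"DHCP={'yes' if dhcp else 'no'}")
    if address is not None:
        lines.append(f"Address={address}")
    return "\n".join(lines) + "\n"


def network_files(interfaces: List[Dict]) -> List[Dict]:
    """Wrap each rendered unit as an Ignition-style file entry."""
    entries = []
    for index, iface in enumerate(interfaces):
        body = render_network_unit(iface["name"], iface["dhcp"], iface["address"])
        entries.append({
            # Numeric prefixes control the order in which networkd reads files.
            "path": f"/etc/systemd/network/{10 + index}-{iface['name']}.network",
            "mode": 0o644,
            "contents": {"source": "data:," + quote(body)},
        })
    return entries


if __name__ == "__main__":
    for entry in network_files(INTERFACES):
        print(entry["path"])
```

Because every interface is declared in the same list, the complete network layout of a node can be reviewed in one place.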
In summary, after facing challenges with non-immutable systems like Ubuntu and the eventual deprecation of k3os, our journey led us to Flatcar, which offers increased logging and customization options, as well as the option to use docker-compose in emergency situations.
How we use it
For those looking to install Flatcar with k3s and the UMH, we have a tutorial available for reference. This article provides additional background information on the process.
The whole process is chainloaded, so that a single USB image or ISO file can be used to boot and set up various machines.
The image, which you download and flash to a USB drive, contains a modified version of iPXE. It prompts the user for the hard disk on which Flatcar should be installed (1), or for a customer-specific token (used when we set up systems; not yet available to the general public). It then gathers system information such as the MAC address and device serial number, and sends it to our matchbox instance running on deploy.umh.app (2). From there, it requests the configuration file, also called an “Ignition file” (3).
Next, the latest long-term support (LTS) version of Flatcar is downloaded and booted using the Ignition file. The Ignition file itself contains instructions to download (again…) and install Flatcar on a local disk (4). Once installed, a second Ignition file is fetched from deploy.umh.app (5, 6) and loaded during the second boot. It contains the applications and system settings needed to set up the UMH (7).
This second Ignition file also includes the UMH management console for remote monitoring and management of the system, including a Tailscale VPN connection for remote access (if requested by the customer). This is particularly useful for troubleshooting PLCs and accessing the device and surrounding networks.
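The two-stage flow described above can be sketched as a small label-matching routine in Python. This is only an illustration of the idea, not matchbox's implementation and not our production setup: the group definitions, the profile names, the `os=installed` label, the `/ignition` path, and the example host are all assumptions made for the sketch.

```python
from urllib.parse import urlencode

# Hypothetical groups: each maps a set of required labels to a profile.
GROUPS = [
    # Default group: any unknown machine gets the "install to disk" config.
    {"profile": "install-to-disk", "selector": {}},
    # Machines that report an installed OS get the provisioning config.
    {"profile": "umh-provision", "selector": {"os": "installed"}},
]


def ignition_url(base: str, labels: dict) -> str:
    """Build the URL a booting machine would request, with labels as query."""
    return f"{base}/ignition?{urlencode(labels)}"


def select_profile(labels: dict) -> str:
    """Pick the matching group with the most selectors (most specific wins)."""
    matching = [
        group for group in GROUPS
        if all(labels.get(key) == value
               for key, value in group["selector"].items())
    ]
    best = max(matching, key=lambda group: len(group["selector"]))
    return best["profile"]


if __name__ == "__main__":
    first_boot = {"mac": "52-54-00-12-34-56", "serial": "ABC123"}
    second_boot = dict(first_boot, os="installed")
    print(ignition_url("https://deploy.example.invalid", first_boot))
    print(select_profile(first_boot))
    print(select_profile(second_boot))
```

On the first boot the machine only sends its hardware labels and therefore receives the installation config; after installation it additionally identifies itself as installed and receives the config that sets up the UMH.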
This article described how we evaluated operating systems for use in the Industrial IoT and why we concluded that Flatcar best fits the deployment of the United Manufacturing Hub (UMH). The operating system must be lightweight, stable, and immutable to ensure reliability and maintainability. We chose Flatcar for its increased logging options, its customization options, and the option to use docker-compose. Its greater reliability and enterprise-readiness compared to k3os, together with its alignment with the immutable approach, made it the best fit as the UMH operating system.