Software developers on hardware teams don’t usually have infrastructure to build their custom software on. They have to work with dozens of data acquisition devices to get data into a single database. Oftentimes, they have to use several different solutions to communicate with different devices: LabVIEW for National Instruments hardware, ROS for robotics, and SCADA software for PLCs.
There’s no good “glue” that lets you coordinate all of these devices. Developers end up using tools like Apache Kafka for streaming and InfluxDB for storage, but they’re hard to get working with hardware, and it’s a pain to configure them to also record and stream data whenever commands are sent to control hardware.
This forces developers to repeatedly build adapters to get data off hardware devices and manage separate systems for real-time streaming, data storage, and hardware control.
I (Emiliano) discovered this problem while working as a test engineer at an aerospace company. We used old control software that spit out data in massive 10 GB CSV or TDMS files. After a long day and night of testing, no one wanted to go through all the work to review the data.
One day, I was operating the system, and a very expensive component failed, causing a multi-million dollar test stand to explode. After many days of data review, we found a small anomaly that indicated a component defect.
I then got fascinated by this problem and moved into a software engineering role to improve the company’s data pipeline. After I left this job and went back to school, I spent most of my time skipping classes to build Synnax, eventually meeting Patrick and Elham.
Synnax has several main parts. We have a custom time-series database designed to be horizontally scalable and fault-tolerant. The database can be deployed directly on a host OS or in a container. Every sensor and actuator maps to a “channel” that can be written to using our client libraries in C++, Python, and TypeScript.
When a client writes to a channel, the server both persists the data and streams it for real-time consumption by any type of service, such as a GUI, automated post-processing tools, or supervisory control sequences.
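As a rough sketch, writing a sample from Python looks something like the snippet below. The connection details and channel names are placeholders, and the exact method names and signatures are illustrative assumptions rather than the finalized client API:

```python
import synnax as sy

# Placeholder connection details for a locally running Synnax cluster.
client = sy.Synnax(host="localhost", port=9090, username="synnax", password="seldon")

# An index channel holds timestamps; a data channel holds the sensor values.
# (Names and arguments here are illustrative and may differ from the real client.)
time_ch = client.channels.create(name="daq_time", data_type=sy.DataType.TIMESTAMP, is_index=True)
pressure_ch = client.channels.create(name="tank_pressure", data_type=sy.DataType.FLOAT32, index=time_ch.key)

# A single write both persists the sample and streams it to live consumers.
with client.open_writer(sy.TimeStamp.now(), [time_ch.key, pressure_ch.key]) as writer:
    writer.write({time_ch.key: [sy.TimeStamp.now()], pressure_ch.key: [42.1]})
```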
Finally, we’ve built a React component library that simplifies building GUIs, a desktop application for visualization and control, and pre-built device integrations for National Instruments hardware and for PLCs via OPC UA.
We think Synnax is unique in that it provides a bridge between existing solutions like ROS, LabVIEW, and SCADA software and general purpose tools like Apache Kafka or InfluxDB.
Synnax is source-available under a BSL 1.1 license (GitHub: https://github.com/synnaxlabs/synnax, documentation: https://docs.synnaxlabs.com). Usage of the software is free for up to 50 channels. We aren’t yet sure what pricing to settle on; we’ve talked about usage-based pricing alone or also adding an implementation cost.
If this sounds interesting to you, please check us out! You can follow the guides on our documentation website (https://docs.synnaxlabs.com) to deploy a database and download the desktop visualization dashboard.
We’d really love to hear your feedback and look forward to all of the comments!
How are you going to interface with the big boys like Rockwell? I see you have drivers, but what about partnerships? I know a lot of companies tend to only work with toolsets their provider "blesses", so having them on "your team" can help. You may have to pick favorites to win early deals/"synergy" (and it may help with acquisition?)
I've worked with industrial automation in the past and have always enjoyed the technical constraints within it. I would be interested in helping you with pre or post-sales support/training/implementation for your customers if you need it. Email is in my profile.
Our plan so far has been to try to interface with the bigger companies through the drivers we make for their hardware. We haven't reached out about partnerships yet, but that is a really good idea.
Thank you for the offer - will definitely reach out if and when we need more help on the implementation side.
Second question. The main platform in this space is Ignition. Do you consider yourself a competitor to Ignition or are you aiming for a different use case?
2. We see a lot of value in providing essentially a universal adapter to these protocols and hardware interfaces. Decoupling the data communication/device infrastructure from the control and acquisition workflows is big for us, and this seems essential to that. It's a big endeavor on its own, but our existing integrations have already been really helpful to our users, and we intend to keep expanding them as the platform matures!
Hopefully 1 & 2 address your first question!
3. Addressing the second question: we've mostly been focusing on test & operations use cases (e.g. running real-time control and data acquisition for engine tests). We see a lot of ways we can eventually serve the industrial controls/automation space, similar to Ignition. However, we're also aware of many reasons people in that space will want to stick with tried-and-true tools that have a larger community and ecosystem.
We're still figuring out how we fit into that space and how to communicate that we can provide the breadth of functionality and support it needs. Posts like this, and the users who already see the value and are willing to try something newer and less mature like us, have been huge in making progress toward that.
Some questions I have!
Ended up being a long message but I appreciate your insights on any of what I just said!
1. It's basic networking tasks such as running a network drop, assigning IPs, making sure the PLCs are on the right subnet, etc. In many cases the PLCs aren't on a network at all, and the IT team doesn't really know how to work with the PLCs while the OT team doesn't really know how to work with networks. Sometimes it's been easier to just add external sensors, go over a cellular network, and skip the PLC altogether.
2. We use one of Ignition's modules to interface with the control systems directly. They have drivers for Allen-Bradley, Siemens S7, Omron, Modbus, and a few others. The downside is Ignition doesn't have an API, so we have to configure things using a GUI. Beyond Ignition, the other big provider of drivers is Kepware - they probably have a driver for everything, but again, they aren't really set up for use by developers trying to deploy to a Linux box. If the customer has an OPC-UA server set up, we can connect to that using an open source library (rough sketch after this list).
3. What we've learned is that many customers rely on third parties (e.g. the machine manufacturer or a system integrator) to configure their system, so when it comes to extracting the data they want, you're kind of on your own. We're not industrial system experts, so this creates a unique challenge. Larger and more sophisticated customers will have a much deeper understanding of their systems, but these folks are usually going to be using something like Ignition and will already have the dashboards and reports so it's more a matter of integrating with Ignition.
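Re: the open-source OPC UA route in (2): reading a tag from an existing OPC UA server with a library such as asyncua is roughly the sketch below; the endpoint URL and node id are placeholders for a real server.

```python
import asyncio
from asyncua import Client  # open-source OPC UA client (pip install asyncua)

async def read_tag() -> None:
    # Placeholder endpoint and node id; substitute the real server's values.
    async with Client("opc.tcp://192.168.1.10:4840") as client:
        node = client.get_node("ns=2;s=Line1.Pressure")
        value = await node.read_value()
        print(f"Line1.Pressure = {value}")

asyncio.run(read_tag())
```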
> We used old control software that spit out data in massive 10 GB CSV or TDMS files. After a long day and night of testing, no one wanted to go through all the work to review the data.
> We think Synnax is unique in that it provides a bridge between <lab/automation DAQ systems>
On the surface it seems like anomaly detection is still the hard problem, but you’re not setting out to solve it?
Time-series databases are state of the art in finance generally, not in the industrial/InfluxDB world, so I don’t think saying you’re 5x InfluxDB on writes is going to persuade too many people, especially given the cost now for a terabyte of RAM. I’ll just move all of it to an in-memory database before I’ll take on the switching costs.
The thing I wanted was one solution for something that was always two: a properties/metadata database, and a separate time series database.
It seems to me like you are maybe building a level too low and could get a lot more value working on the problem that you say motivated you in the first place. It is hard because of all the context required to automatically detect anomalies, but I think that is why it is valuable to solve.
The value we had was we rolled in the data/cellular connection all the way down to the endpoint, so they could avoid IT integration, which was a big hurdle at the time. I don’t know if IT integration is still a hang up for your customers.
We found that visualization layers tended to reach down just far enough into the data intake world that it was really hard to sell just another tsdb.
I definitely agree with this. Our early prototype of Synnax actually sat on top of a combined Redis/S3/SQL stack and focused on those high level features. We found that it was challenging to deploy, manage, and synchronize data across these services, especially when you're running everything on prem.
We've come to believe that a re-architecture of the underlying infrastructure can actually unlock the high-level workflows. For example, to compare a real-time telemetry stream with a historical data set you'd need to query across tools like Kafka and Influx at the same time. For an experienced software engineer this isn't too hard of a task, but they don't tend to be the people who understand the physics/mechanics of the hardware. We want it to be possible for, say, a turbomachinery expert to translate a Python script they wrote for post-processing a CSV into something Synnax-compatible without a huge amount of work.
In short, we're working on finding a way for subject matter experts in hardware to implement the anomaly detection mechanisms they already have in their head, but don't have the software expertise to implement.
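To make that concrete, the status-quo version of "compare live data against a historical baseline" means stitching two clients together, roughly like the sketch below (broker, bucket, topic, measurement names, and the threshold are all placeholders):

```python
from confluent_kafka import Consumer
from influxdb_client import InfluxDBClient

# Placeholder connection details for the two separate systems.
influx = InfluxDBClient(url="http://localhost:8086", token="my-token", org="my-org")
consumer = Consumer({"bootstrap.servers": "localhost:9092", "group.id": "compare-demo",
                     "auto.offset.reset": "latest"})

# 1. Pull the historical baseline out of InfluxDB with a Flux query.
baseline = influx.query_api().query(
    'from(bucket: "telemetry") |> range(start: -30d) '
    '|> filter(fn: (r) => r._measurement == "tank_pressure") |> mean()'
)
baseline_mean = baseline[0].records[0].get_value()

# 2. Separately, subscribe to the live stream coming through Kafka.
consumer.subscribe(["tank_pressure"])
while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    live_value = float(msg.value())
    if abs(live_value - baseline_mean) > 50.0:  # hypothetical anomaly threshold
        print(f"possible anomaly: live={live_value}, 30-day mean={baseline_mean}")
```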
> The thing I wanted was one solution for something that was always two: a properties/metadata database, and a separate time series database.
What do you think about TimeScale for this sort of use case? Haven't run it in production myself, but having your time-series data in the same place as a SQL table seems pretty nice.
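Roughly, the appeal is something like this sketch (schema, names, and connection string are made up): device metadata lives in a normal Postgres table and readings live in a TimescaleDB hypertable you can join against it.

```python
import psycopg2  # plain Postgres driver; TimescaleDB is a Postgres extension

# Placeholder connection string for a TimescaleDB instance.
conn = psycopg2.connect("dbname=telemetry user=postgres host=localhost")
cur = conn.cursor()

# Metadata/properties live in an ordinary relational table...
cur.execute("""
    CREATE TABLE IF NOT EXISTS sensors (
        id       SERIAL PRIMARY KEY,
        name     TEXT NOT NULL,
        units    TEXT,
        location TEXT
    );
""")

# ...and the time series lives in a hypertable in the same database.
cur.execute("""
    CREATE TABLE IF NOT EXISTS readings (
        time      TIMESTAMPTZ NOT NULL,
        sensor_id INTEGER REFERENCES sensors (id),
        value     DOUBLE PRECISION
    );
""")
cur.execute("SELECT create_hypertable('readings', 'time', if_not_exists => TRUE);")
conn.commit()

# One query can then mix metadata and time-series data.
cur.execute("""
    SELECT s.name, s.units, avg(r.value)
    FROM readings r JOIN sensors s ON s.id = r.sensor_id
    WHERE r.time > now() - interval '1 day'
    GROUP BY s.name, s.units;
""")
print(cur.fetchall())
```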
> We found that visualization layers tended to reach down just far enough into the data intake world that it was really hard to sell just another tsdb.
This is a good point. We think that focusing exclusively on the DB is probably not the right approach. Most of our focus nowadays is on building out higher level workflows on top of the new database.
I found TimescaleDB after I wrote this — it does look like the answer to my problems from a decade ago. I don’t do that anymore but I’m glad someone brought it to market.
If you can describe with clarity how a scientist/hardware engineer using your tool is going to implement their anomaly detection, or whether your software will somehow shadow and assist/learn from what they try to do, I think that would be a much more compelling pitch.
In general, having built out the core DB has been valuable in allowing us to expand to other useful features, such as writing commands out to hardware at sufficient control-loop frequencies or creating smooth real-time visualizations.
The other thing we think is really powerful is having a more integrated tool for acquiring, storing, and processing sensor data and actuating hardware. One common issue we experienced was trying to cobble together several tools that weren't fully compatible, which created a lot of friction in the overall control and acquisition workflow. We want to provide a platform for creating a more cohesive but extensible system, and the data storage aspect was a good base to build that on.
For other metadata, such as channels, users, permissions, etc., we rely on CockroachDB's Pebble, which is a RocksDB-compatible KV store implemented in pure Go.
We also value enabling developers to build on top of Synnax or integrate the parts they're most interested in into their existing systems. We've tried to serve that end by building out Python, TypeScript, and C++ SDKs plus device integrations. We're continuing to look into how we can better support developers building and expanding their systems with Synnax, so if there are any integrations you think are important, I would appreciate your take.
For hardware deeply integrated with proprietary systems, our current approach is to build middleware that lets those systems integrate with Synnax, or to support a protocol that can be used to communicate with the devices. We also try to keep Synnax developer-friendly, so people already familiar with these systems who want a quick way to connect them to Synnax can do so using our Python, TypeScript, or C++ libraries.
Our goal would be to replace these cobbled-together systems with a more uniform and airtight one. We hope to do that by starting with sub-scale preliminary integrations on setups/expansions that are already in development, which can then be a base for expanding out to the rest of the system.
A sensor is physically wired to an input on a PLC, which collects data; the historian software communicates with the PLC/DCS and saves instrument/sensor data for further review.
https://asimov.fandom.com/wiki/Synnax
Synnax also follows a pub-sub model, which enables functionality like having multiple clients/consoles view and monitor their physical systems.
I'd say we try to reach closer to the edge to help directly facilitate sensor abstractions. In this vein, another way Synnax seems to differ is that it caters more to the hardware control aspect. For example, we have a panel where users can create schematics of propulsion/electrical systems, which they can then link with automated scripts to command the actual system and override with manual control when necessary.
Running multiple Synnax nodes in a distributed fashion is still a WIP, but it's a goal of ours as well!
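As a hedged sketch of what linking a script to the live system can look like (the channel names, threshold, and client method names below are illustrative assumptions, not the finalized API):

```python
import synnax as sy

# Placeholder connection details for a running Synnax cluster.
client = sy.Synnax(host="localhost", port=9090, username="synnax", password="seldon")

REDLINE_PSI = 500.0  # hypothetical abort threshold

# Subscribe to a sensor channel and publish to a command channel.
# (Method names are assumptions; real sequences also handle timestamps,
# write authority, and manual-override hand-off.)
with client.open_streamer(["tank_pressure"]) as streamer, \
        client.open_writer(sy.TimeStamp.now(), ["vent_valve_cmd"]) as writer:
    for frame in streamer:
        latest = frame["tank_pressure"][-1]
        if latest > REDLINE_PSI:
            # The command is itself a channel write, so it gets persisted
            # and streamed to every other console watching the system.
            writer.write({"vent_valve_cmd": [1]})
            break
```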
If you use Zenoh, I'd love to hear how you use it and whether my impressions of it are correct.
https://www.cs.dartmouth.edu/~kotz/research/project/solar/
https://www.cs.dartmouth.edu/~kotz/cmc/projects/solar.html