SEDIMARK focuses on Artificial Intelligence/Machine Learning to ensure data quality

SEDIMARK · February 24, 2023

Machine Learning introduction

Machine Learning (ML) is a branch of Artificial Intelligence (AI) that specialises in recognising patterns in data. It applies statistical methods to extract insights from large data sets; one family of ML models, artificial neural networks, is loosely inspired by the networks of neurons in the human brain. Before an ML system can provide reliable information, it must learn patterns from historical data and have its predictions compared against real observations. That is why ML systems are trained with as much representative data as possible.
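As a minimal sketch of the "learn from historical data, then compare predictions with real data" loop described above, the following Python example fits a straight line to past readings and measures its error on readings it has not seen. The sensor values and the function names (`fit_line`, `mean_absolute_error`) are hypothetical and chosen only for illustration; real systems would typically use far richer models and data.

```python
from statistics import mean

def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept on historical data."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return slope, y_bar - slope * x_bar

def mean_absolute_error(model, xs, ys):
    """Average absolute gap between the model's predictions and real values."""
    slope, intercept = model
    return mean(abs((slope * x + intercept) - y) for x, y in zip(xs, ys))

# Hypothetical readings: hour of day -> temperature in degrees Celsius.
history_x = [0, 1, 2, 3, 4, 5]
history_y = [10.1, 11.9, 14.2, 15.8, 18.1, 20.0]

# Newer readings the model did not see during fitting.
recent_x = [6, 7]
recent_y = [22.1, 23.9]

model = fit_line(history_x, history_y)
error = mean_absolute_error(model, recent_x, recent_y)
print(f"slope={model[0]:.2f}, held-out error={error:.2f}")
```

A low error on the unseen readings suggests the learned pattern generalises; a high error suggests the model needs more data or a different form.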

On many tasks, ML algorithms outperform traditional modelling methods, and on some narrow tasks they can exceed human performance thanks to their computational power. Image recognition and time series analysis are well-known, widespread application domains for ML in real-world settings, and ML also plays a central role in the EU-funded SEDIMARK project. Over several years, SEDIMARK aims to build a secure, trusted and intelligent decentralised marketplace for data and services, using ML to automate data quality management. As its data sources grow, the project's results are expected to become increasingly accurate and precise.

ML can also run directly on edge systems to ensure data quality. Some algorithms are designed for this purpose, with low power consumption and a small memory footprint. For instance, EdgeML and TinyML are open-source libraries and toolchains that target such resource-constrained devices.
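To illustrate what a low-memory data quality check on an edge device might look like, here is a small Python sketch that flags outlying sensor readings using running statistics (Welford's online algorithm). It does not use EdgeML or TinyML; the class name `StreamingOutlierDetector`, the threshold and the sample readings are all assumptions made for this example. The point is that it stores only three numbers, however long the stream is.

```python
import math

class StreamingOutlierDetector:
    """Flags readings far from the running mean, using constant memory."""

    def __init__(self, threshold=3.0, warmup=5):
        self.threshold = threshold  # allowed distance, in standard deviations
        self.warmup = warmup        # readings to accept before flagging starts
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def update(self, value):
        """Return True if value looks like an outlier.

        Only accepted (non-outlier) values update the running statistics.
        """
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0 and abs(value - self.mean) > self.threshold * std:
                return True
        # Welford's update: one pass, no stored history.
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)
        return False

# Hypothetical temperature stream with one faulty reading.
readings = [20.1, 20.3, 19.9, 20.0, 20.2, 20.1, 85.0, 20.0]
detector = StreamingOutlierDetector()
flags = [detector.update(r) for r in readings]
print(flags)
```

Skipping flagged values when updating the statistics keeps a single faulty reading from widening the accepted range, at the cost of adapting more slowly if the signal genuinely shifts.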

ML embedded on edge systems

The IoT platform from EGM (the EdgeSpot) is compatible with both and could manage and distribute FAIR (Findable, Accessible, Interoperable, Reusable) data in an energy-efficient way. ONNX (Open Neural Network Exchange), an open format for representing ML models, may help in selecting the right combination of tools, because a model exported to ONNX can be run by different frameworks and runtimes. Finally, with the help of the use cases provided within SEDIMARK, the project intends to develop a concrete strategy to automate and manage data quality.

SEDIMARK plans to build a distributed registry of resources stored on edge systems, close to where the data is generated. The purpose is to clean, label, validate and anonymise data at the source.
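The four steps named above (clean, label, validate, anonymise) could be chained roughly as in the following Python sketch. This is not SEDIMARK's implementation: the record fields (`device_id`, `temperature_c`), the plausible temperature range and the salt are hypothetical. Note also that a salted hash is strictly a form of pseudonymisation; stronger anonymisation techniques may be required in practice.

```python
import hashlib

def clean(record):
    """Strip whitespace from string fields and drop empty values."""
    cleaned = {}
    for key, value in record.items():
        if isinstance(value, str):
            value = value.strip()
        if value not in ("", None):
            cleaned[key] = value
    return cleaned

def validate(record):
    """Accept only a numeric temperature within an assumed plausible range."""
    temperature = record.get("temperature_c")
    return isinstance(temperature, (int, float)) and -50.0 <= temperature <= 60.0

def label(record):
    """Attach a quality label based on the validation result."""
    return {**record, "quality": "valid" if validate(record) else "rejected"}

def anonymise(record, salt):
    """Replace the device identifier with a truncated salted SHA-256 digest."""
    out = dict(record)
    if "device_id" in out:
        raw = (salt + str(out["device_id"])).encode("utf-8")
        out["device_id"] = hashlib.sha256(raw).hexdigest()[:16]
    return out

def process(record, salt="demo-salt"):
    """Run the steps in order: clean, then label (which validates), then anonymise."""
    return anonymise(label(clean(record)), salt)

# Hypothetical raw record as it might arrive from a sensor.
raw = {"device_id": " sensor-42 ", "temperature_c": 21.5, "note": "  "}
result = process(raw)
print(result)
```

Running the steps close to where the data is produced means that only cleaned, labelled and de-identified records need to leave the device.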
