
SEDIMARK focuses on Artificial Intelligence/Machine Learning to ensure data quality

SEDIMARK · February 24, 2023

Machine Learning introduction

Machine Learning (ML) is a branch of Artificial Intelligence (AI) that specialises in recognising patterns in data streams. It uses statistical methods to extract insights from large data sets, and some of its models, artificial neural networks, are loosely inspired by the networks of neurons in the human brain. Before an ML system can provide reliable information, it must learn patterns from historical data and have its predictions compared with real data. That is why such systems are trained with as much data as possible.
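The idea of learning from historical data and then comparing predictions with real data can be illustrated with a minimal sketch. It assumes a simple straight-line model and made-up temperature readings; the function names and values are hypothetical and are not taken from SEDIMARK.

```python
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = a * x + b on historical data."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    a = sxy / sxx
    b = y_bar - a * x_bar
    return a, b


def mean_absolute_error(model, xs, ys):
    """Average gap between the model's predictions and the real values."""
    a, b = model
    return mean(abs((a * x + b) - y) for x, y in zip(xs, ys))


# Hypothetical readings: hour of the day vs. temperature in degrees Celsius.
history_x = [0, 1, 2, 3, 4, 5]
history_y = [10.0, 12.1, 13.9, 16.0, 18.1, 19.9]

# Newer readings the model has not seen, used only to check its predictions.
recent_x = [6, 7]
recent_y = [22.0, 24.1]

model = fit_line(history_x, history_y)
error = mean_absolute_error(model, recent_x, recent_y)
print(f"slope={model[0]:.2f}, intercept={model[1]:.2f}, error={error:.2f}")
```

A small error on the unseen readings suggests the learned pattern generalises; a large one would indicate that more, or better, historical data is needed.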

ML algorithms are often more efficient than traditional modelling methods and, thanks to their computational power, can outperform humans on specific tasks. Image recognition and time series analysis, for instance, are well-known and widespread application domains of ML in real-world cases such as the EU-funded SEDIMARK project. SEDIMARK aims to build, over several years, a secure, trusted and intelligent decentralised data and services marketplace, using ML to automate data quality management. As its data sources grow, the project's results are expected to become increasingly accurate and precise.

ML can also be used directly on edge systems to ensure data quality. Some algorithms are designed for this purpose, with low power consumption and a modest memory footprint. For instance, EdgeML and TinyML provide open-source tooling for this kind of deployment.
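To give a sense of what a low-memory data quality check on an edge device might look like, here is a minimal sketch. It does not use EdgeML or TinyML; it assumes a plain running-statistics approach (Welford's online algorithm) and hypothetical sensor readings, with a class name, threshold and warm-up length chosen only for illustration.

```python
import math


class StreamingOutlierCheck:
    """Flags readings far from the running mean, using constant memory."""

    def __init__(self, threshold=3.0, warmup=5):
        self.threshold = threshold  # allowed distance, in standard deviations
        self.warmup = warmup        # readings to observe before flagging
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0               # running sum of squared deviations

    def is_suspect(self, value):
        suspect = False
        if self.count >= self.warmup:
            std = math.sqrt(self.m2 / (self.count - 1))
            if std > 0 and abs(value - self.mean) > self.threshold * std:
                suspect = True
        if not suspect:
            # Welford update: only accepted readings shape the statistics.
            self.count += 1
            delta = value - self.mean
            self.mean += delta / self.count
            self.m2 += delta * (value - self.mean)
        return suspect


# Hypothetical temperature readings with one obvious glitch.
readings = [20.1, 20.3, 19.9, 20.0, 20.2, 20.1, 85.0, 20.0]
check = StreamingOutlierCheck()
flags = [check.is_suspect(r) for r in readings]
print(flags)
```

The detector keeps only three numbers between readings, which is the kind of footprint that suits constrained devices; a real deployment would likely need a more robust model than a fixed threshold.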

ML embedded in edge systems

EGM's IoT platform, the EdgeSpot, is compatible with both EdgeML and TinyML and could manage and distribute FAIR (Findable, Accessible, Interoperable, Reusable) data in an energy-efficient way. ONNX (Open Neural Network Exchange), an open format for representing ML models, may help in selecting the right combination of tools. Finally, with the help of the use cases provided within SEDIMARK, the project could develop a concrete strategy to automate and manage data quality.

SEDIMARK plans to build a distributed registry of resources stored on edge systems, close to where data is generated. The purpose is to clean, label, validate and anonymise data.
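The four steps named above (clean, label, validate, anonymise) could be chained roughly as follows. This is only a sketch under stated assumptions: the record fields, the plausible temperature range and the salt are hypothetical, and the salted hash shown here is pseudonymisation, a weaker guarantee than full anonymisation. It is not SEDIMARK's actual pipeline.

```python
import hashlib

SALT = b"replace-with-a-secret-salt"  # hypothetical; keep secret in practice


def clean(record):
    """Drop empty fields and trim stray whitespace from text values."""
    return {
        key: (value.strip() if isinstance(value, str) else value)
        for key, value in record.items()
        if value is not None
    }


def validate(record):
    """Check that the temperature is a number in an assumed plausible range."""
    temperature = record.get("temperature_c")
    return isinstance(temperature, (int, float)) and -50.0 <= temperature <= 60.0


def label(record):
    """Attach a quality label based on the validation result."""
    quality = "valid" if validate(record) else "rejected"
    return {**record, "quality": quality}


def pseudonymise(record):
    """Replace the device identifier with a truncated salted hash."""
    result = dict(record)
    if "device_id" in result:
        digest = hashlib.sha256(SALT + result["device_id"].encode("utf-8"))
        result["device_id"] = digest.hexdigest()[:16]
    return result


def process(record):
    return pseudonymise(label(clean(record)))


raw = {"device_id": " sensor-42 ", "temperature_c": 21.5, "note": None}
result = process(raw)
print(result)
```

Running each step close to where the data is generated means that only cleaned, labelled and pseudonymised records need to leave the device.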
