News Archives - SEDIMARK

This document is a deliverable of the SEDIMARK project, funded by the European Commission under its Horizon Europe Framework Programme. It presents the deliverable “D6.4 Dissemination and Impact creation activities. Final version”, covering all the activities carried out and the communication and dissemination material produced, along with the final status of the Key Performance Indicators (KPIs) for those activities. It also describes the efforts made to cooperate with other projects and associations. Special emphasis is placed on the dissemination and communication activities carried out during the last eighteen months of the project, as earlier actions were already reported in SEDIMARK_D6.3. The target audience for this document is manifold: it includes the scientific community, which will find a compendium of dissemination activities as an entry point to the project’s main outcomes, and the EC, which can use it to assess the impact-raising activities that the project has executed.

During the project lifetime, a number of dissemination and communication activities have been carried out, reaching a large and varied audience that includes users, citizens, other research projects and the scientific community. These events range from online webinars to large exhibition venues, such as the Smart City Expo or the IEEE CSCN 2025 conference. In addition, project members have participated in several cooperation activities, including workshops and panel discussions, to explore future means of collaboration. This includes participation in Working Groups and events of different associations, such as the Data Spaces Support Centre (DSSC) Technical Working Group, the International Data Spaces Association (IDSA) or the Big Data Value Association (BDVA), as well as membership of important clusters, such as the ICT4WATER initiative. Finally, during the project, and mainly in its second half, which is the main focus of this deliverable, scientific communication and dissemination have benefitted from the results obtained in the technical activities, with several papers submitted and published in conferences and journals.

It is worth highlighting that the online presence of SEDIMARK has been pushed forward, following a collaborative dissemination plan among all the partners, leading to the creation of multiple articles, multimedia content and social media posts to engage with the general public and disseminate project results. Furthermore, different dissemination materials have been produced, including brochures, videos and leaflets, facilitating the dissemination of the project at online and offline events.

Last but not least, a hackathon has been organised to open the tools and services developed in the project to external participants, in order to gather valuable insights and feedback for evaluating the usability of the SEDIMARK Marketplace and to better understand the demands of those who might make use of the integrated project results.

D6.4 deliverable can be downloaded from here.

This document corresponds to the Deliverable D4.6 of the SEDIMARK project, named “Data sharing platform and incentives – Final version”. Its goal is to describe the SEDIMARK marketplace, a web frontend application constituting the entry point for users to the SEDIMARK ecosystem and all the functionalities it offers. It updates a previous Deliverable, SEDIMARK_D4.5 “Data sharing platform and incentives – First version”, submitted in December 2023, which described the Marketplace in an early development stage. This final version provides an up-to-date description of it, focusing on how users can navigate the web interface to access all features of SEDIMARK.

This document revolves around the graphical user interfaces of the SEDIMARK platform. It complements Deliverables SEDIMARK_D4.2 “Decentralized Infrastructure and Access Management – Final version” and SEDIMARK_D4.4 “Edge data processing and service certification – Final version”, which explain in greater detail how the functionalities exposed in these interfaces work internally.

D4.6 deliverable can be downloaded from here.

In response to the growing demand for secure and transparent data exchange, the infrastructure of SEDIMARK Marketplace leverages cutting-edge technologies to establish a resilient network.

The decentralization approach ensures increased security, transparency, and user-centric control over both the different types of assets and user identity information. The SEDIMARK Marketplace leverages distributed ledger technologies to establish a resilient and scalable infrastructure. The decentralized architecture of the marketplace is built on a robust distributed ledger employed for user identity management, as well as on a blockchain foundation that fosters tamper-resistant contracts.

This deliverable presents the final version of the Decentralized Infrastructure employed for the SEDIMARK Marketplace and the APIs that enable the functionalities satisfying the requirements and the objectives for the Project. It represents an important capstone for every stakeholder that plans to use the Marketplace for offering assets and consuming data and services. Here, after a careful examination and design process across all the partners of the project, the final version of the SEDIMARK Marketplace is shaped and consolidated. This document stems from the previous capstone realized with Deliverable SEDIMARK_D4.1 and Deliverable SEDIMARK_D3.4 in December 2023. The current deliverable builds upon the previous version, extending its scope and functionality. This document highlights the key differences, changes, and additions made since the last iteration, providing a clear overview of the enhancements and improvements implemented in this version. Additionally, each section of the current deliverable includes context and rationale for the updates, ensuring a comprehensive understanding of the project's evolution.

D4.2 deliverable can be downloaded from here.

This report details the final architecture and tools developed within the SEDIMARK project for edge processing and services certification. The work focuses on providing the foundational components for managing the entire lifecycle of data and AI assets, from their creation and processing at the edge to their certification and exchange in the marketplace. The key contributions establish a framework for AI-driven modules that can be deployed at the data source, adhering to MLOps principles while managing complex edge-cloud interactions.

A primary achievement of this work is the development of an edge processing framework designed for resource-constrained environments. Key innovations include:

WebAssembly (WASM) on MCUs: A secure and sandboxed architecture was implemented to allow user-defined code to run on low-power microcontrollers, enabling flexible and frequent updates without compromising the core firmware's stability.
Fine Timestamping and Energy Optimization: A novel algorithm was developed to calibrate the on-device Real-Time Clock (RTC) by compensating for temperature-induced drift. This significantly improves data timestamp accuracy and dramatically reduces the need for energy-intensive network synchronizations, extending the operational battery life of edge devices to meet a target of over four years.
Edge-Cloud Orchestration: An analysis of open-source tools led to the selection and deployment of platforms like Mage.ai and Apache NiFi to manage distributed data processing flows between edge devices and the cloud.
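
The timestamping item above can be illustrated with a small sketch. The project's calibration algorithm is not described on this page, so the code below is not that algorithm: it uses the textbook parabolic drift model for 32.768 kHz tuning-fork crystals, and the coefficient, turnover temperature and temperature log are assumed values chosen for illustration.

```python
"""Sketch: correcting RTC timestamps for temperature-induced crystal drift."""

# Assumed typical datasheet values for a 32.768 kHz tuning-fork crystal;
# a real device would use per-unit calibration constants.
TURNOVER_TEMP_C = 25.0    # temperature at which the drift is minimal
PARABOLIC_COEFF = -0.034  # ppm per degree Celsius squared


def drift_ppm(temp_c: float) -> float:
    """Frequency error in parts per million at a given temperature."""
    return PARABOLIC_COEFF * (temp_c - TURNOVER_TEMP_C) ** 2


def accumulated_error_s(samples: list[tuple[float, float]]) -> float:
    """Clock error in seconds over (duration_s, temp_c) intervals.

    A negative result means the RTC ran slow and is behind true time.
    """
    return sum(duration * drift_ppm(temp) * 1e-6 for duration, temp in samples)


def corrected_timestamp(rtc_seconds: float, samples: list[tuple[float, float]]) -> float:
    """Remove the estimated accumulated error from a raw RTC reading."""
    return rtc_seconds - accumulated_error_s(samples)


# Hypothetical log: one day split between a cold night and a hot afternoon.
day_log = [(43_200.0, 5.0), (43_200.0, 45.0)]
error = accumulated_error_s(day_log)
```

With these assumed constants the uncompensated clock loses a little over one second per day, which gives a sense of why on-device compensation can replace frequent network synchronisations.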

To address the challenges of managing AI models in a diverse ecosystem, the project has established a comprehensive MLOps strategy and a solution for model interoperability. By adopting MLflow, SEDIMARK provides a standardized framework for the entire machine learning lifecycle. A critical innovation is the use of Keras Core to create framework-agnostic model descriptions. This allows models to be defined once and then seamlessly trained or used for inference across different backends like TensorFlow, PyTorch, and JAX, which is essential for fostering collaboration in federated learning scenarios where participants may use different tools.

Finally, to build a foundation of trust within the marketplace, a multi-faceted conformity evaluation service has been designed. This service provides validation for all marketplace assets:

Data Assets are certified for conformance with standards like NGSI-LD and Smart Data Models, ensuring interoperability.
Service Assets are validated for API compliance, leveraging and contributing to the official ETSI NGSI-LD Test Suite.
AI Model Assets undergo a two-fold assessment, verifying not only their performance against quantitative KPIs but also their trustworthiness based on principles of fairness, transparency, and security, in alignment with emerging regulations like the EU AI Act.
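
As an illustration of what a conformance check on a data asset might involve, the sketch below performs a minimal structural check of an NGSI-LD entity. It is neither the project's conformity evaluation service nor the ETSI NGSI-LD Test Suite; the function names are hypothetical, and only the three core attribute types (Property, GeoProperty, Relationship) are covered.

```python
"""Sketch: a minimal structural check of an NGSI-LD entity."""
from urllib.parse import urlparse

# The member each core attribute type must carry in the NGSI-LD information model.
REQUIRED_MEMBER = {"Property": "value", "GeoProperty": "value", "Relationship": "object"}
# Entity-level keys that are not attributes.
RESERVED_KEYS = {"id", "type", "@context", "scope", "createdAt", "modifiedAt"}


def is_uri(text: object) -> bool:
    """True when text is a string with a URI scheme, e.g. 'urn:ngsi-ld:...'."""
    return isinstance(text, str) and bool(urlparse(text).scheme)


def check_entity(entity: dict) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    if not is_uri(entity.get("id")):
        problems.append("'id' must be a URI")
    if not entity.get("type"):
        problems.append("'type' is missing")
    for name, attr in entity.items():
        if name in RESERVED_KEYS:
            continue
        kind = attr.get("type") if isinstance(attr, dict) else None
        member = REQUIRED_MEMBER.get(kind) if isinstance(kind, str) else None
        if member is None:
            problems.append(f"attribute '{name}' has no recognised NGSI-LD type")
        elif member not in attr:
            problems.append(f"attribute '{name}' lacks '{member}'")
    return problems


# Hypothetical entities used only to exercise the check.
good = {
    "id": "urn:ngsi-ld:WaterSensor:001",
    "type": "WaterSensor",
    "temperature": {"type": "Property", "value": 17.4, "unitCode": "CEL"},
    "locatedIn": {"type": "Relationship", "object": "urn:ngsi-ld:Reservoir:7"},
}
bad = {"id": "sensor-001", "type": "WaterSensor", "temperature": {"type": "Property"}}
```

A full conformance check would also resolve the `@context` and compare attributes against the relevant Smart Data Model schema, which this sketch leaves out.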

Together, these advancements in edge computing, MLOps, and certification provide the core technical infrastructure for a robust, transparent, and efficient decentralized data marketplace.

Deliverable D4.4 can be downloaded from here.

This report presents the design and implementation of the core technical enablers for the SEcure Decentralised Intelligent Data MARKetplace (SEDIMARK) platform. The project aims to address the limitations of centralized data markets by fostering a secure, trusted, and intelligent ecosystem based on Distributed Ledger Technology (DLT) and Artificial Intelligence (AI). The work detailed in this document establishes the foundational components for interoperability, AI-driven services, DLT-based trust, and distributed storage, advancing the platform from a Technology Readiness Level (TRL) of 5 toward demonstration in real-world scenarios.

A key contribution of this work is a comprehensive interoperability framework. At its core, the framework uses the NGSI-LD specification to create a common semantic language for data assets. This is supplemented by a Marketplace Information Model, which defines crucial marketplace concepts such as Self-Description, Offering, and Asset. This model builds upon existing standards like DCAT and ODRL by introducing the "Offering" concept, which allows multiple diverse assets—such as datasets, AI models, and services—to be bundled and transacted together. A suite of software components within the Interoperability Enabler handles data formatting, curation, quality annotation, and validation to ensure data adheres to FAIR principles.
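
The bundling idea behind the "Offering" concept can be sketched with plain data classes. The class and field names below are hypothetical and much simpler than the actual Marketplace Information Model; they only show how one offering can group assets of different kinds under a single set of usage terms.

```python
"""Sketch: an offering that bundles heterogeneous assets (hypothetical model)."""
from dataclasses import dataclass, field
from enum import Enum


class AssetKind(Enum):
    DATASET = "dataset"
    AI_MODEL = "ai_model"
    SERVICE = "service"


@dataclass(frozen=True)
class Asset:
    identifier: str
    kind: AssetKind
    title: str


@dataclass
class Offering:
    identifier: str
    provider: str  # identifier of the provider's self-description (assumed field)
    title: str
    assets: list[Asset] = field(default_factory=list)
    permitted_actions: list[str] = field(default_factory=list)  # ODRL-style actions

    def add(self, asset: Asset) -> None:
        """Attach an asset, refusing duplicates within the same offering."""
        if any(a.identifier == asset.identifier for a in self.assets):
            raise ValueError(f"asset {asset.identifier} is already in the offering")
        self.assets.append(asset)

    def kinds(self) -> set[AssetKind]:
        """The distinct kinds of asset bundled in this offering."""
        return {a.kind for a in self.assets}


# Hypothetical offering bundling a dataset, a model trained on it and an API.
offering = Offering("urn:example:offering:1", "urn:example:participant:42",
                    "Reservoir monitoring bundle", permitted_actions=["use"])
offering.add(Asset("urn:example:asset:a", AssetKind.DATASET, "Water level readings"))
offering.add(Asset("urn:example:asset:b", AssetKind.AI_MODEL, "Level forecaster"))
offering.add(Asset("urn:example:asset:c", AssetKind.SERVICE, "Forecast API"))
```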

The platform's intelligence is powered by a multifaceted AI Enabler. This component supports advanced local model training with techniques like the transformer-based CrossFormer for multivariate time-series forecasting and model optimization methods like pruning. For collaborative scenarios, the project introduces two frameworks for distributed training: deFLight, a dynamic and fully decentralized framework supporting gossip and federated learning, and Fleviden, a tool for orchestrating complex federated workflows. A significant innovation is the Offering Generator, which uses Large Language Models (LLMs) to automatically create standards-compliant, semantically rich marketplace offerings from unstructured metadata, lowering the barrier to entry for data providers.
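
To make the difference between the two collaboration styles concrete, the following sketch contrasts a federated averaging step (a coordinator combines client models) with a gossip round (peers average pairwise, with no coordinator). It uses plain Python lists as model weights and does not reflect the deFLight or Fleviden APIs; all function names are illustrative.

```python
"""Sketch: federated averaging versus pairwise gossip averaging."""


def federated_average(updates: list[tuple[list[float], int]]) -> list[float]:
    """Weighted mean of client weight vectors, weighted by local sample count."""
    total = sum(count for _, count in updates)
    size = len(updates[0][0])
    return [sum(w[i] * count for w, count in updates) / total for i in range(size)]


def gossip_round(models: list[list[float]], pairs: list[tuple[int, int]]) -> list[list[float]]:
    """Each listed pair of peers replaces both of its models with their mean."""
    models = [list(m) for m in models]  # work on copies
    for a, b in pairs:
        mean = [(x + y) / 2 for x, y in zip(models[a], models[b])]
        models[a], models[b] = mean, list(mean)
    return models


# Two clients holding 1 and 3 samples respectively.
global_model = federated_average([([1.0, 2.0], 1), ([3.0, 4.0], 3)])

# Three peers, each holding a one-parameter model; peers 0-1 then 1-2 exchange.
peers = gossip_round([[0.0], [4.0], [8.0]], [(0, 1), (1, 2)])
```

Pairwise averaging keeps the sum of the peers' parameters unchanged, so repeated gossip rounds over a connected set of peers drift toward the same mean a coordinator would compute with equal weights.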

Trust and security are ensured by a DLT infrastructure built on a private instance of the IOTA Tangle (Layer 1) and IOTA Smart Contracts (Layer 2). This two-layer architecture provides a non-repudiable ledger for managing participant identities, cataloguing offering metadata, and facilitating secure asset trading. To support the platform's digital assets, a robust Storage Enabler provides a distributed architecture using MinIO for AI model storage and NGSI-LD brokers for scalable, interoperable storage of linked data.

D3.4 deliverable can be downloaded from here.

SEDIMARK recognises the importance of regulating data management within a context such as the one posed by the project. A solution will be considered in which consortium partners deposit, clearly and transparently, all underlying information on the data-related business processes (data storage, data provisioning, processing, etc.) of the SEDIMARK solution.

The purpose of the Data Management Action Plan (DMAP) is to identify the main data management elements that apply to the SEDIMARK project and the consortium. This document is the first version of the DMAP and will be reviewed as soon as there is a clearer understanding of the types of data that will be collected.

Given the wide range of sources from which data will be collected or become available within the project, this document outlines that, as information about the data to be collected becomes clearer, the consortium partners will consider embracing and applying the Guidelines on FAIR Data Management in Horizon 2020 and Horizon Europe (HE): “In general terms your data should be ‘FAIR’, that is Findable, Accessible, Interoperable and Re-usable” [1].

Open access is defined as the practice of providing on-line access to scientific information that is free of charge to the reader and that is reusable. In the context of research and innovation, scientific information can refer to peer-reviewed scientific research articles or research data.

The SEDIMARK consortium strongly believes in the concept of open science, and in the benefits that the European innovation ecosystem and economy can draw from allowing the reuse of data at a larger scale.
Hence, as described in this report, the consortium will deposit the public data produced within or collected for the purposes of the project in an open data repository once the exploitation rights are safeguarded. The repository will permit users to access, mine, exploit, reproduce and disseminate the data free of charge. Furthermore, whenever possible, it will also provide information about the tools and instruments needed to use the project results.

As Data Security and Data Privacy are of particular concern, SEDIMARK will consider the GDPR (General Data Protection Regulation) privacy principles from conception to design to build a novel privacy-preserving architecture.

Moreover, this report addresses the ethics issues detected in SEDIMARK with the aim of:

Providing the procedures and criteria that will be used to identify/recruit research participants when involving people (external to the project consortium) in the research activities (including dissemination) needed for the implementation of the project tasks and / or elaboration of the project deliverables.
Creating the templates of the informed consent/assent forms and information sheets (in language and terms intelligible to the participants) that will be needed to use the information provided in the research activities or for using the personal data for dissemination activities.
Ensuring that all AI-related research conforms to the Ethical AI guidelines for Trustworthy Artificial Intelligence.

This deliverable takes as a reference the updated guidelines for Horizon Europe Programme regarding open research and innovation, data management and ethics support.

All in all, this report represents a second snapshot taken mid-way through the execution of the project (already in its eighteenth month of activity). The final update on the data management plan will be reported at M36 (September 2025), as part of a dedicated section for data management plan updates.

Because this report is iterative and updated annually, the new content provided in this iteration is written in dark blue, in line with the project's stylistic guidelines.

The D1.3 deliverable can be downloaded from here.

Data quality is of the highest importance for companies to improve their decision-making systems and the efficiency of their products. In this current data-driven era, it is important to understand the effect that “dirty” or low-quality data can have on a business. Manual data cleaning is the common way to process data, accounting for more than 50% of the time of knowledge workers. SEDIMARK acknowledges the importance of data quality for both sharing and using data to extract knowledge and information for decision-making processes. Thus, one of the main goals of SEDIMARK is to develop a data processing pipeline that assesses and improves the quality of data generated and shared by the SEDIMARK data providers.

This deliverable presents the final version of the methods and techniques developed within SEDIMARK for processing data and improving their quality, extending the first version which was delivered in SEDIMARK Deliverable D3.1 [74]. The focus in this deliverable is to present the final version of the key techniques that are used for quality improvement of datasets, based on the requirements of the SEDIMARK platform so that they all work together smoothly.

SEDIMARK considers two main types of data generated and shared within the marketplace: (i) static/offline datasets and (ii) dynamic/streaming datasets. The project acknowledges that it is important to cater to both types of datasets equally, thus in most scenarios, separate and customised versions of the tools have been developed for static and streaming datasets. Techniques for outlier detection, noise removal, deduplication and imputation of missing values are important for improving the quality of datasets. These techniques aim to remove abnormal values or noise from the dataset, remove duplicate values, and fill gaps in individual entries or add entries that are missing entirely. Techniques for feature engineering such as feature extraction and selection have also been developed to enrich the datasets. Synthetic dataset creation is important in scenarios where data providers don’t want to share their real datasets (e.g. for privacy reasons) but want to share synthetic versions that mimic the real ones.
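
The cleaning steps named above can be chained as in the following minimal sketch for a small static dataset. It is not the SEDIMARK tooling: the choice of a median-based (MAD) outlier rule, the 3.5 threshold, linear interpolation for imputation and the sample readings are all assumptions made for illustration.

```python
"""Sketch: deduplication, outlier flagging and imputation on a static series."""
from statistics import median


def deduplicate(rows: list[tuple]) -> list[tuple]:
    """Drop exact repeats while keeping the original order."""
    seen, unique = set(), []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


def flag_outliers(values: list, threshold: float = 3.5) -> list[bool]:
    """Flag values whose modified z-score (based on the MAD) exceeds threshold."""
    present = [v for v in values if v is not None]
    if not present:
        return [False] * len(values)
    centre = median(present)
    mad = median(abs(v - centre) for v in present)
    if mad == 0:
        return [False] * len(values)
    return [v is not None and abs(0.6745 * (v - centre) / mad) > threshold
            for v in values]


def impute_linear(values: list) -> list:
    """Fill None gaps by linear interpolation; edge gaps copy the nearest value."""
    known = [i for i, v in enumerate(values) if v is not None]
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None or not known:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            share = (i - left) / (right - left)
            filled[i] = values[left] + share * (values[right] - values[left])
    return filled


# Hypothetical (timestamp, value) readings: one repeat, one gap and one spike.
readings = [(0, 10.0), (1, 10.2), (1, 10.2), (2, None), (3, 10.6), (4, 55.0), (5, 10.4)]
rows = deduplicate(readings)
values = [value for _, value in rows]
flags = flag_outliers(values)
cleaned = impute_linear([None if flag else v for v, flag in zip(values, flags)])
```

A streaming variant would need incremental versions of each step, for example a sliding window for the median, which is one reason separate tools for static and streaming data are useful.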

This deliverable also presents the framework to orchestrate the whole functionality of the data processing pipeline using a Data Processing Orchestration. This component enables end users to interact with the built-in data processing solutions through a simplified dashboard interface. This deliverable also presents the final version of the key quality metrics that SEDIMARK has defined for assessing the quality of datasets, both per data point and as a whole, as well as the key techniques for dataset quality improvement, designed to meet SEDIMARK platform requirements and ensure seamless integration.
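
The distinction between per-data-point and whole-dataset quality metrics can be sketched as follows. The two metrics shown, completeness and uniqueness, are common generic examples and are assumptions here; the metrics actually defined by SEDIMARK are specified in the deliverable itself.

```python
"""Sketch: quality scores per record and for a dataset as a whole."""


def record_completeness(record: dict) -> float:
    """Share of fields in one record that are not missing (per data point)."""
    if not record:
        return 0.0
    return sum(value is not None for value in record.values()) / len(record)


def dataset_quality(records: list[dict]) -> dict[str, float]:
    """Mean completeness and share of distinct records (dataset as a whole)."""
    if not records:
        return {"completeness": 0.0, "uniqueness": 0.0}
    distinct = {tuple(sorted(r.items(), key=lambda item: item[0])) for r in records}
    return {
        "completeness": sum(map(record_completeness, records)) / len(records),
        "uniqueness": len(distinct) / len(records),
    }


# Hypothetical water-quality records with missing fields and one repeat.
records = [
    {"t": 0, "ph": 7.1, "temp": 18.0},
    {"t": 1, "ph": None, "temp": 18.2},
    {"t": 1, "ph": None, "temp": 18.2},
    {"t": 2, "ph": 7.3, "temp": None},
]
report = dataset_quality(records)
```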

Another important part is the description of techniques towards reducing the energy consumption of the components of the data processing pipeline and optimizing data efficiency, i.e. using techniques for data distillation, coreset selection and dimension reduction. Minimising the communication cost in distributed machine learning scenarios is also important for SEDIMARK, because communication can increase energy consumption. Techniques to optimise the Artificial Intelligence (AI) models both during training and inference are also presented, focusing on quantisation, pruning, low rank factorisation and knowledge distillation.
Finally, considering that minimising energy consumption can influence performance or communication, the deliverable presents the final analysis on these trade-offs, aiming to provide insights to data providers on how to better configure the pipeline or what models they should select in order to achieve their targets (energy efficiency/performance/communication).
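
Two of the model optimisation techniques mentioned above, pruning and quantisation, can be illustrated on a plain list of weights. This is a generic sketch (magnitude pruning and symmetric 8-bit quantisation), not the specific methods evaluated in the deliverable, and the weight values are made up.

```python
"""Sketch: magnitude pruning followed by symmetric 8-bit quantisation."""


def magnitude_prune(weights: list[float], sparsity: float) -> list[float]:
    """Zero out the given fraction of weights with the smallest magnitudes."""
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must lie between 0 and 1")
    n_drop = int(len(weights) * sparsity)
    by_magnitude = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    dropped = set(by_magnitude[:n_drop])
    return [0.0 if i in dropped else w for i, w in enumerate(weights)]


def quantise_int8(weights: list[float]) -> tuple[list[int], float]:
    """Map weights to integer codes in [-127, 127] plus one shared scale."""
    peak = max((abs(w) for w in weights), default=0.0)
    scale = peak / 127 if peak else 1.0
    return [round(w / scale) for w in weights], scale


def dequantise(codes: list[int], scale: float) -> list[float]:
    """Recover approximate weights from integer codes."""
    return [code * scale for code in codes]


weights = [0.02, -0.9, 0.4, -0.05, 1.27, 0.0]
pruned = magnitude_prune(weights, sparsity=0.5)
codes, scale = quantise_int8(pruned)
restored = dequantise(codes, scale)
```

The trade-off discussed in the text shows up directly: pruning removes half of the multiplications and quantisation stores each weight in one byte, at the price of a reconstruction error of at most half a quantisation step per remaining weight.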

D3.2 deliverable can be downloaded from here.