In today’s fast-paced digital landscape, effective data processing is a critical component for any organization looking to derive insights and drive innovation. However, setting up data pipelines - from extraction to transformation and loading (ETL) - has traditionally required a high level of expertise. We’re happy to announce a solution that could democratize data orchestration for users of all experience levels: the SEDIMARK Data Orchestrator powered by Mage.ai and enhanced with Generative AI.
Our platform integrates the power of Large Language Models (LLMs) to automatically generate Mage.ai pipeline blocks, helping even those with minimal technical background create robust data workflows. Instead of spending hours - or even days - writing code and configuring pipelines, users now simply need to upload their dataset into the Orchestrator GUI.
Once the dataset is in place, the system, with a bit of guidance through a helpful prompt, takes over the heavy lifting. Using generative AI, the platform produces custom Mage.ai templates and workflows specifically tailored to your data. This eliminates the need for users to dive deep into code or ETL specifics.
Whether you’re dealing with traffic data, weather records, or IoT data streams, the process starts with uploading your dataset into the Orchestrator GUI.
With the help of generative AI and LLMs, the platform processes the data structure and requirements, and instantly generates Mage.ai pipeline blocks.
These blocks are based on pre-defined templates for tasks like data cleaning, transformation, anomaly detection, and prediction, while remaining flexible enough to adapt to any dataset. You no longer have to start from scratch.
As a less experienced user, all you need is a brief guiding prompt. The system understands the context of the data and the desired outcome, and it provides a workflow that’s ready to run.
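To make the idea of a "guiding prompt" more concrete, here is a minimal sketch of how a dataset's structure could be summarised and combined with a user's goal before being sent to an LLM. It is an illustration only: the function names (infer_column_types, build_prompt), the prompt wording, and the sample data are assumptions, not the actual SEDIMARK or Mage.ai implementation, and no LLM call is made.

```python
import csv
import io


def infer_column_types(csv_text: str, sample_rows: int = 50) -> dict[str, str]:
    """Guess 'number' or 'text' for each column from the first rows (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = {name: "number" for name in (reader.fieldnames or [])}
    for i, row in enumerate(reader):
        if i >= sample_rows:
            break
        for name, value in row.items():
            if name not in columns:
                continue  # ignore overflow fields from malformed rows
            try:
                float(value)
            except (TypeError, ValueError):
                # Any missing or non-numeric value marks the column as text.
                columns[name] = "text"
    return columns


def build_prompt(csv_text: str, goal: str) -> str:
    """Combine the inferred schema with the user's goal (hypothetical prompt format)."""
    schema = infer_column_types(csv_text)
    lines = [f"- {name}: {kind}" for name, kind in schema.items()]
    return (
        "Generate a data pipeline block for the dataset described below.\n"
        "Columns:\n" + "\n".join(lines) + "\n"
        f"Goal: {goal}\n"
    )


# Made-up sample data, for illustration only.
sample = (
    "timestamp,road_id,vehicle_count\n"
    "2024-06-01T08:00,A1,42\n"
    "2024-06-01T08:05,A1,57\n"
)
prompt = build_prompt(sample, "flag unusually high vehicle counts")
print(prompt)
```

The point of the sketch is that the user supplies only the goal; everything else in the prompt is derived from the uploaded data.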
Data engineering has often been a domain reserved for those with extensive technical know-how. With the introduction of our Generative AI-powered Data Orchestrator, this is no longer the case. By reducing the complexity and time involved in configuring ETL pipelines, we’re empowering organizations to:
With the launch of this new capability, we’re excited to see how businesses will leverage it to innovate. Whether you’re building predictive models, automating anomaly detection, or simply making data-driven decisions faster, the SEDIMARK Data Orchestrator simplifies every step of the process.
Last week, we had the privilege of meeting with our partners at LINKS Foundation in Torino for a productive General Assembly meeting.
Over the course of two days, we engaged in in-depth discussions and collaborative efforts to define the next steps as we approach the final year of the SEDIMARK project.
Thanks to the hard work and dedication of the entire team, we’ve made significant progress and are excited about what’s to come! 💡 Stay tuned for more updates as we continue working towards a decentralized AI-enabled marketplace!
Data marketplaces and data spaces are a growing trend, providing platforms where companies and researchers can exchange datasets securely. Data has become a new currency, and it's crucial to have access to high-quality datasets in order to gain knowledge quickly and meet your objectives. However, most existing platforms are centralised: they gather all datasets in the cloud or on central servers, offer limited privacy, and do not give providers or consumers the tools they need to assess or improve the quality of the datasets they exchange.
SEDIMARK aims to change the domain of data marketplaces by contributing to the decentralisation of data exchange platforms and by providing their users with the tools they need to improve data quality and build knowledge upon the data.
This document presents the final complete version of the SEDIMARK functional and system architecture, aiming to provide details on what functionalities the SEDIMARK platform will provide and how these will interact to meet the main objectives of the project. SEDIMARK builds upon the concepts of trust, decentralisation, interoperability, data quality and intelligence to provide a fully decentralised data and services marketplace, where providers and consumers will be able to share their data and build knowledge upon them.
Before presenting the system architecture in detail, this document provides a complete list of the terms and concepts defined and used within the project, so that readers can understand how these terms are used within the context of SEDIMARK. The main actors are also defined, mainly split into (i) providers, who provide data, ML models, or services, and (ii) consumers, who consume the assets that are provided.
This deliverable leverages the results of the SEDIMARK Deliverable D2.1 [1], in particular the work done in Task 2.1 (Use Cases definition), focusing on project use cases, and Task 2.2 (Requirements Elicitation), focusing on system requirements. This deliverable also builds upon the SEDIMARK Deliverable D2.2 [2], which presented the initial version of the architecture. A refined list of requirements is also provided, based on the feedback from the above tasks and deliverables, with several new requirements added in some categories.
This document also presents the final versions of internal and external interfaces of the SEDIMARK platform, providing a view on how the components will interact with each other and how the main services will be provided. For the latter, example data flows are also presented, showing the messages exchanged between key components, towards providing the service. A final system view of the architecture also shows how the platform can be instantiated in the real world.
This final version of the SEDIMARK architecture aims to give researchers and engineers an architecture for designing decentralised data spaces and marketplaces, and to contribute ideas and concepts to EU initiatives (i.e. the DSSC) so that common reference architectures can be built.
The D2.3 deliverable can be downloaded from here.
In today’s data-driven world, finding relevant datasets is crucial for researchers, data scientists and businesses. This has led to the development of dataset recommendation systems. Just as the movie recommendation system used by Netflix guides users to the most relevant movies, a dataset recommender system aims to help users navigate the complex landscape of dataset discovery efficiently.
Recommender systems learn to analyse user behaviour to make intelligent suggestions. This can be achieved with a variety of techniques, ranging from content-based filtering approaches that focus on the item descriptions and recommend similar items to those that the user has previously interacted with; through collaborative filtering approaches that recommend items based on interactions of similar users; to hybrid approaches combining the content and collaborative filtering.
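As an illustration of the content-based approach described above, the following minimal sketch recommends datasets whose descriptions are most similar to those a user has already interacted with. It uses plain word counts and cosine similarity as a simple stand-in technique; the catalogue entries, dataset identifiers, and function names are made up for the example and are not the SEDIMARK recommender.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> Counter:
    """Turn a description into a bag-of-words vector (word -> count)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def recommend(history: list[str], catalogue: dict[str, str], top_k: int = 2) -> list[str]:
    """Rank unseen datasets by similarity to the descriptions in the user's history."""
    profile = Counter()
    for dataset_id in history:
        profile += tokenize(catalogue[dataset_id])
    scores = {
        dataset_id: cosine(profile, tokenize(description))
        for dataset_id, description in catalogue.items()
        if dataset_id not in history
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Hypothetical catalogue: dataset id -> metadata description.
catalogue = {
    "traffic-helsinki": "hourly road traffic counts from city sensors",
    "traffic-santander": "road traffic speed and counts from city sensors",
    "weather-daily": "daily temperature and rainfall records",
    "energy-meters": "smart meter electricity consumption readings",
}
top = recommend(["traffic-helsinki"], catalogue, top_k=1)
print(top)
```

Because the ranking depends entirely on the text of each description, a dataset with an empty or vague description would score close to zero against every user profile, however relevant its contents.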
High-quality recommender systems provide many benefits to users: efficiency, by automating the search for relevant items; personalisation, which improves user satisfaction; and enhanced discovery, which exposes users to items they may not be aware of.
It is clear that an efficient recommender system has many advantages in various domains, and dataset recommendation is no different. However, with the exponential growth of data, often residing at various locations, finding the right dataset for a specific task or project has become increasingly challenging [1]. Moreover, numerous datasets lack high-quality descriptions, making discovery even harder [2]. This matters most for content-based recommender systems, which rely on item metadata: when dataset metadata is insufficient, the quality of the recommendations suffers.
In SEDIMARK, we aim to address the challenge of poor quality metadata in dataset recommendation with the development of novel techniques for dataset metadata enrichment. With automatic and efficient metadata enrichment, SEDIMARK can improve the overall user experience and dataset discoverability and drive better decision-making for the future.
[1] Chapman, Adriane, et al. "Dataset search: a survey." The VLDB Journal 29.1 (2020): 251-272.
[2] Reis, Juan Ribeiro, Flavia Bernadini, and Jose Viterbo. "A new approach for assessing metadata completeness in open data portals." International Journal of Electronic Government Research (IJEGR) 18.1 (2022): 1-20.
WINGS ICT Solutions participated in the conference “Energy Efficiency in Manufacturing”, Hands-on Innovation and Technologies for a Net Zero Future, on Thursday, 13th of June 2024.
At WINGS ICT Solutions, we are committed to pioneering innovative solutions that drive energy efficiency and sustainability in the manufacturing sector. Our team showcased our latest advancements in smart manufacturing technologies that not only enhance productivity but also significantly reduce energy consumption.
ARTEMIS over WINGSChariot is the WINGS product for the proactive management of water, energy, and gas infrastructures.
Based on the WINGS approach, it combines advanced technologies (IoT, AI, advanced networks and visualizations) with domain knowledge, to address diverse use cases. As a management system it delivers the following functionalities:
Commercial traction has been achieved, while further interest is stimulated in various areas and with various tentative partners.
In that respect, WINGS participates in SEDIMARK and exploits its developments there. SEDIMARK enables the creation of a secure, decentralized data marketplace for the energy sector, with high-quality data at the edge, based on blockchain and AI.
SEDIMARK will be in Budapest from 2nd to 4th October at the European Big Data Value Forum (EBDVF). We will be presenting the project and participating in a thrilling session together with five other projects from the 2021-Data-01-03 "Technologies for Data Management" Call.
The most recent project results will be presented, and the discussions held in the session aim to provide insights and recommendations for further developing and integrating building blocks into vertical Data Spaces. The discussion will highlight the complementarities and differences between the projects involved, to let domain- or sector-specific (e.g. water, energy, green deal, mobility, agrifood) data space building blocks emerge, with a specific focus on Data Interoperability, Data Sovereignty, Data Value Creation and Data Governance.
Join us at Crowne Plaza Budapest (Room Gold) on October 4th, from 10:15 to 11:45 for “Leveraging Technologies for Data Management to Implement Data Spaces”.
Have you ever wondered how a smart city manages to keep everything from urban planning to environmental monitoring running smoothly? The answer lies in something called Spatial Data Infrastructure (SDI). While it might sound technical, the SDI framework plays a crucial role in making geographic information accessible and integrated, benefiting everyone.
Imagine a world where data about locations – from urban planning maps to environmental monitoring systems – is at your fingertips. SDI turns this vision into reality. By connecting data, technology, and people, SDI helps improve decision-making and efficiency in numerous areas of our lives.
The SEDIMARK Helsinki pilot aims to demonstrate how Digital Twin technology can revolutionize urban mobility with spatial data as the backbone. SEDIMARK's context broker (NGSI-LD) handles linked data, property graphs, and semantics using three main constructs: Entities, Properties, and Relationships. This integration opens up opportunities for new services and the development of a functional city, aiming to enhance geospatial data integration within urban digital twins. In Helsinki, the approach focuses on transitioning from a monolithic architecture to a modular, API-driven approach, developing Digital Twin viewers and tools, and collaborating on a city-wide Geospatial Data.
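For readers unfamiliar with NGSI-LD, the three constructs mentioned above can be illustrated with a small example entity. This is a sketch only: the identifiers, the entity type, and the attribute names (intensity, refRoadSegment) are illustrative assumptions rather than data from the Helsinki pilot, and the helper function is hypothetical. The overall shape, an entity with an id and type whose attributes are tagged as "Property" (with a value) or "Relationship" (with an object pointing at another entity), follows the NGSI-LD data model.

```python
import json

# An Entity: identified by a URN and a type.
traffic_observation = {
    "id": "urn:ngsi-ld:TrafficFlowObserved:example-001",  # illustrative identifier
    "type": "TrafficFlowObserved",
    # A Property: carries a value.
    "intensity": {"type": "Property", "value": 42},
    # A Relationship: points at another entity by its identifier.
    "refRoadSegment": {
        "type": "Relationship",
        "object": "urn:ngsi-ld:RoadSegment:example-001",
    },
    "@context": ["https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context.jsonld"],
}


def split_attributes(entity: dict) -> tuple[dict, dict]:
    """Separate an entity's Properties from its Relationships (hypothetical helper)."""
    properties, relationships = {}, {}
    for name, attr in entity.items():
        if not isinstance(attr, dict):
            continue  # skip id, type and @context
        if attr.get("type") == "Property":
            properties[name] = attr["value"]
        elif attr.get("type") == "Relationship":
            relationships[name] = attr["object"]
    return properties, relationships


properties, relationships = split_attributes(traffic_observation)
print(json.dumps(traffic_observation, indent=2))
```

Relationships are what turn isolated records into linked data: following refRoadSegment leads from a traffic observation to the road segment entity it describes.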
Join us on this journey as we dive into the world of Spatial Data Infrastructure and see how it's making our city smarter, more efficient, and better prepared for the future.
Photo credit. https://materialbank.myhelsinki.fi/images/attraction?sort=popularity&openMediaId=6614
When we think of data, especially from diverse traffic sources, beauty isn't typically the first thing that comes to mind. Instead, we imagine numbers, graphs, and charts, all designed to convey information quickly and efficiently. However, what if we could see data not just as a tool for analysis, but as a source of inspiration, capable of producing visuals as captivating as a masterpiece by Vincent van Gogh? Just like van Gogh's "Starry Night" finds beauty in complexity and chaos, we can render data into beautiful, meaningful visualizations.
Traffic data is inherently complex. It comes from a variety of interoperable systems and devices. Each source provides a different perspective, capturing the flow of vehicles, the density of traffic, and the speed of travel at any given time. When combined, these data points create a comprehensive picture of urban movement.
Much like the seemingly chaotic yet harmonious art, raw traffic data can appear overwhelming. However, through careful visualization and simulation, patterns and insights emerge. Advanced algorithms process the data, identifying trends and correlations that aren't immediately apparent. For instance, heat maps can show areas of high congestion, while flow diagrams can illustrate the movement of vehicles through a city over time.
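The data behind a congestion heat map can be sketched in a few lines: observations are grouped into grid cells, and the count per cell becomes the colour intensity. The coordinates, cell size, and function name below are assumptions made for illustration, not the processing used in the Helsinki pilot.

```python
from collections import Counter


def congestion_grid(observations, cell_size=0.01):
    """Count observations per grid cell; cells are keyed by (lat index, lon index)."""
    grid = Counter()
    for lat, lon in observations:
        cell = (int(lat // cell_size), int(lon // cell_size))
        grid[cell] += 1
    return grid


# Made-up vehicle positions (latitude, longitude), for illustration only.
observations = [
    (60.1712, 24.9412),
    (60.1718, 24.9431),
    (60.1755, 24.9468),
    (60.1832, 24.9215),
]
grid = congestion_grid(observations)
hottest = grid.most_common(1)[0]  # the busiest cell and its count
print(hottest)
```

A plotting library would then map each cell's count to a colour, which is where the visual design choices come in.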
Data visualization is an art form in its own right. The choice of colors, shapes, and lines can transform a simple graph into a work of art. For example, a time-lapse visualization of traffic flow can capture the dynamic motion of a city, with streams of vehicles flowing through its streets.
The Helsinki mobility digital twin paves the way for a future where cities leverage data. This data-driven revolution, fueled by powerful data visualization, holds immense potential for creating a more efficient, sustainable, and safer urban transportation landscape.
So, can traffic data be beautiful? Absolutely. All it takes is the right perspective and a touch of creativity to turn numbers into a work of art.
Photo credit: Kuva.
Last Thursday, we had the pleasure of hosting Javier Valiño, Program Manager of the Data Space Working Group (Data Space WG) from the Eclipse Foundation, at Universidad de Cantabria.
During the meeting, Javier presented the latest advancements made by the Data Space WG. Their mission is to promote the global use of dataspace technologies, supporting the development and maintenance of secure data-sharing ecosystems. We also discussed SEDIMARK's initiatives in creating technologies for decentralized and secure data exchange.
The Data Space WG's goals are closely aligned with SEDIMARK's innovative proposals for a decentralized marketplace, so we will keep exploring future collaborations in the Data Space world.