
In today’s fast-paced digital landscape, effective data processing is a critical component for any organization looking to derive insights and drive innovation. However, setting up data pipelines - from extraction to transformation and loading (ETL) - has traditionally required a high level of expertise. We’re happy to announce a solution that could democratize data orchestration for users of all experience levels: the SEDIMARK Data Orchestrator powered by Mage.ai and enhanced with Generative AI.

Simplified Data Processing with AI Assistance

Our platform integrates the power of Large Language Models (LLMs) to automatically generate Mage.ai pipeline blocks, helping even those with minimal technical background create robust data workflows. Instead of spending hours - or even days - writing code and configuring pipelines, users now simply need to upload their dataset into the Orchestrator GUI.

Once the dataset is in place, the system, with a bit of guidance through a helpful prompt, takes over the heavy lifting. Using generative AI, the platform produces custom Mage.ai templates and workflows specifically tailored to your data. This eliminates the need for users to dive deep into code or ETL specifics.

How It Works

Whether you’re dealing with provider-supplied traffic data, weather records, or IoT data streams, the process starts with uploading your dataset into the Orchestrator GUI.

With the help of generative AI and LLMs, the platform analyses the dataset’s structure and requirements, and instantly generates Mage.ai pipeline blocks.

These blocks are based on pre-defined templates for tasks like data cleaning, transformation, anomaly detection, and prediction, all while allowing the flexibility to adapt to any dataset. You no longer have to start from scratch.
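For a sense of what a generated block can look like, here is a minimal sketch of a data-cleaning transformer written in Mage.ai’s standard block style. The column name and cleaning steps are hypothetical, not the Orchestrator’s exact output:

```python
import pandas as pd

# Mage.ai injects these decorators at runtime; the guards mirror
# the standard block template.
if 'transformer' not in globals():
    from mage_ai.data_preparation.decorators import transformer
if 'test' not in globals():
    from mage_ai.data_preparation.decorators import test


@transformer
def clean_data(df: pd.DataFrame, *args, **kwargs) -> pd.DataFrame:
    # Drop exact duplicates and rows missing the (hypothetical) timestamp column.
    df = df.drop_duplicates()
    df = df.dropna(subset=['timestamp'])
    # Interpolate remaining gaps in numeric columns.
    numeric_cols = df.select_dtypes('number').columns
    df[numeric_cols] = df[numeric_cols].interpolate()
    return df


@test
def test_output(df, *args) -> None:
    assert df is not None, 'The output is undefined'
    assert not df.duplicated().any(), 'Duplicates remain after cleaning'
```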

For less experienced users, a brief guiding prompt is all that’s needed. The system understands the context of the data and the desired outcome, and it provides a workflow that’s ready to run.

Democratizing Data Engineering

Data engineering has often been a domain reserved for those with extensive technical know-how. With the introduction of our Generative AI-powered Data Orchestrator, this is no longer the case. By reducing the complexity and time involved in configuring ETL pipelines, we’re empowering organizations to:

  1. Accelerate time-to-value. With AI doing most of the setup work, teams can focus on what truly matters: extracting insights from their data, not configuring workflows.
  2. Reduce the learning curve. No more spending weeks learning the intricacies of ETL processes. With our platform, even inexperienced users can be up and running in no time.
  3. Produce customizable workflows. While the platform provides default templates, advanced users still have the flexibility to customize their pipelines to meet more complex or specific requirements.

What’s Next?

With the launch of this new capability, we’re excited to see how businesses will leverage it to innovate. Whether you’re building predictive models, automating anomaly detection, or simply making data-driven decisions faster, the SEDIMARK Data Orchestrator simplifies every step of the process.

Last week, we had the privilege of meeting with our partners at LINKS Foundation in Torino for a productive General Assembly meeting.

Over the course of two days, we engaged in in-depth discussions and collaborative efforts to define the next steps as we approach the final year of the SEDIMARK project.

Thanks to the hard work and dedication of the entire team, we’ve made significant progress and are excited about what’s to come! 💡 Stay tuned for more updates as we continue working towards a decentralized AI-enabled marketplace!

In today’s data-driven world, finding relevant datasets is crucial for researchers, data scientists, and businesses. This has led to the development of dataset recommendation systems. Much like the movie recommendation system Netflix uses to guide users towards the most relevant films, a dataset recommender system aims to help users navigate the complex landscape of dataset discovery efficiently.

Recommender systems learn to analyse user behaviour to make intelligent suggestions. This can be achieved with a variety of techniques: content-based filtering approaches, which focus on item descriptions and recommend items similar to those the user has previously interacted with; collaborative filtering approaches, which recommend items based on the interactions of similar users; and hybrid approaches combining the two.
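As a minimal sketch of the content-based approach (an illustration, not SEDIMARK’s implementation), dataset descriptions can be vectorized with TF-IDF and ranked by cosine similarity; the toy descriptions below are hypothetical:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical dataset descriptions (metadata).
descriptions = [
    'Hourly traffic flow counts for Helsinki city centre',
    'Air quality measurements from urban IoT sensors',
    'Daily weather records: temperature, wind and rainfall',
]

vectorizer = TfidfVectorizer(stop_words='english')
tfidf = vectorizer.fit_transform(descriptions)

# Rank datasets by similarity to the one the user last interacted with
# (index 0), skipping the dataset itself.
similarities = cosine_similarity(tfidf[0], tfidf).ravel()
ranking = similarities.argsort()[::-1][1:]
print('Recommended dataset indices:', ranking)
```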

High-quality recommender systems provide many benefits to users, such as efficiency (automating the search for relevant items), personalisation (improving user satisfaction), and enhanced discovery (exposing users to items they may not otherwise encounter).

It is clear that an efficient recommender system has many advantages across domains, and dataset recommendation is no different. However, with the exponential growth of data, often residing in various locations, finding the right dataset for a specific task or project has become increasingly challenging [1]. Moreover, numerous datasets lack high-quality descriptions, making discovery even harder [2]. This is particularly problematic for content-based recommender systems, since high-quality recommendations rely on high-quality metadata.

In SEDIMARK, we aim to address the challenge of poor quality metadata in dataset recommendation with the development of novel techniques for dataset metadata enrichment. With automatic and efficient metadata enrichment, SEDIMARK can improve the overall user experience and dataset discoverability and drive better decision-making for the future.
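As a toy illustration of what automatic enrichment can mean in practice (the function and tagging scheme are hypothetical, not SEDIMARK’s actual technique), candidate metadata tags can be derived directly from a dataset’s own structure:

```python
import pandas as pd

def suggest_tags(df: pd.DataFrame, max_tags: int = 5) -> list[str]:
    """Derive candidate metadata tags from column names and types (illustrative only)."""
    tags = []
    for col, dtype in df.dtypes.items():
        kind = 'temporal' if 'datetime' in str(dtype) else str(dtype)
        tags.append(f'{col} ({kind})')
    return tags[:max_tags]

df = pd.DataFrame({'timestamp': pd.to_datetime(['2024-06-13']),
                   'vehicle_count': [42]})
print(suggest_tags(df))  # ['timestamp (temporal)', 'vehicle_count (int64)']
```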

[1] Chapman, Adriane, et al. "Dataset search: a survey." The VLDB Journal 29.1 (2020): 251-272.

[2] Reis, Juan Ribeiro, Flavia Bernardini, and Jose Viterbo. "A new approach for assessing metadata completeness in open data portals." International Journal of Electronic Government Research (IJEGR) 18.1 (2022): 1-20.

WINGS ICT Solutions participated in the conference “Energy Efficiency in Manufacturing: Hands-on Innovation and Technologies for a Net Zero Future” on Thursday, 13 June 2024.

At WINGS ICT Solutions, we are committed to pioneering innovative solutions that drive energy efficiency and sustainability in the manufacturing sector. Our team showcased our latest advancements in smart manufacturing technologies that not only enhance productivity but also significantly reduce energy consumption.

ARTEMIS over WINGSChariot is a WINGS product oriented towards the proactive management of water, energy, and gas infrastructures.

Based on the WINGS approach, it combines advanced technologies (IoT, AI, advanced networks and visualizations) with domain knowledge to address diverse use cases. As a management system it delivers the following functionalities:

  • Efficient metering: optimized information flow and cost, 24/7 capability, and prediction of demand and capabilities;
  • Fault management: detection of faulty meters, predictive maintenance, outage handling (energy), and leakage or flood avoidance (water);
  • Performance optimization: optimization of water quality, maximization of revenue water, optimal deployment of renewables and storage components, and optimization for residences, businesses, and factories;
  • Configuration and security aspects.

The product has achieved commercial traction, and further interest is being stimulated in various areas and with prospective partners.

In that respect, WINGS participates in SEDIMARK and exploits its developments there. SEDIMARK enables the creation of a secure, decentralized data marketplace for the energy sector, with high-quality data at the edge, based on blockchain and AI.

SEDIMARK will be in Budapest from 2nd to 4th October at the European Big Data Value Forum (EBDVF). We will be presenting the project and participating in a thrilling session together with five other projects from the 2021-Data-01-03 "Technologies for Data Management" call.

The project’s most recent results will be presented, and the discussions held in the session aim to provide insights and recommendations for further developing and integrating building blocks into vertical Data Spaces. The discussion will highlight the complementarities and variations between the different projects involved, letting domain- and sector-specific (e.g. water, energy, green deal, mobility, agrifood) data space building blocks emerge, with particular focus on Data Interoperability, Data Sovereignty, Data Value Creation, and Data Governance.

Join us at Crowne Plaza Budapest (Room Gold) on October 4th, from 10:15 to 11:45 for “Leveraging Technologies for Data Management to Implement Data Spaces”.

Have you ever wondered how a smart city manages to keep everything from urban planning to environmental monitoring running smoothly? The answer lies in something called Spatial Data Infrastructure (SDI). While it might sound technical, the SDI framework plays a crucial role in making geographic information accessible and integrated, benefiting everyone.

Imagine a world where data about locations – from urban planning maps to environmental monitoring systems – is at your fingertips. SDI turns this vision into reality. By connecting data, technology, and people, SDI helps improve decision-making and efficiency in numerous areas of our lives.

Smart City: SEDIMARK Helsinki Pilot and Spatial Data

The SEDIMARK Helsinki pilot aims to demonstrate how Digital Twin technology can revolutionize urban mobility, with spatial data as the backbone. SEDIMARK's context broker (NGSI-LD) handles linked data, property graphs, and semantics using three main constructs: Entities, Properties, and Relationships. This integration opens up opportunities for new services and the development of a functional city, aiming to enhance geospatial data integration within urban digital twins. In Helsinki, the approach focuses on transitioning from a monolithic architecture to a modular, API-driven one, developing Digital Twin viewers and tools, and collaborating on city-wide geospatial data.
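To make the three constructs concrete, the sketch below creates a hypothetical traffic-observation entity through the standard NGSI-LD API; the broker URL, identifiers, and attribute values are assumptions for illustration:

```python
import requests

# A minimal NGSI-LD entity combining a Property, a GeoProperty and a Relationship.
entity = {
    'id': 'urn:ngsi-ld:TrafficFlowObserved:helsinki-001',  # hypothetical identifier
    'type': 'TrafficFlowObserved',
    'intensity': {'type': 'Property', 'value': 312},
    'location': {
        'type': 'GeoProperty',
        'value': {'type': 'Point', 'coordinates': [24.9384, 60.1699]},  # lon, lat
    },
    'refRoadSegment': {
        'type': 'Relationship',
        'object': 'urn:ngsi-ld:RoadSegment:mannerheimintie-01',
    },
    '@context': 'https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context.jsonld',
}

# Assumed local broker; /ngsi-ld/v1/entities is the standard NGSI-LD creation endpoint.
resp = requests.post(
    'http://localhost:1026/ngsi-ld/v1/entities',
    json=entity,
    headers={'Content-Type': 'application/ld+json'},
)
resp.raise_for_status()
```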

Join us on this journey as we dive into the world of Spatial Data Infrastructure and see how it's making our city smarter, more efficient, and better prepared for the future.

Photo credit: https://materialbank.myhelsinki.fi/images/attraction?sort=popularity&openMediaId=6614

When we think of data, especially from diverse traffic sources, beauty isn't typically the first thing that comes to mind. Instead, we imagine numbers, graphs, and charts, all designed to convey information quickly and efficiently. However, what if we could see data not just as a tool for analysis, but as a source of inspiration, capable of producing visuals as captivating as a masterpiece by Vincent van Gogh? Just like van Gogh's "Starry Night" finds beauty in complexity and chaos, we can render data into beautiful, meaningful visualizations.

The Complexity of Traffic Data

Traffic data is inherently complex. It comes from a variety of interoperable systems and devices. Each source provides a different perspective, capturing the flow of vehicles, the density of traffic, and the speed of travel at any given time. When combined, these data points create a comprehensive picture of urban movement.

From Chaos to Clarity

Much like van Gogh’s seemingly chaotic yet harmonious brushwork, raw traffic data can appear overwhelming. However, through careful visualization and simulation, patterns and insights emerge. Advanced algorithms process the data, identifying trends and correlations that aren't immediately apparent. For instance, heat maps can show areas of high congestion, while flow diagrams can illustrate the movement of vehicles through a city over time.
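A minimal sketch of such a heat map, using synthetic congestion scores in place of real traffic data:

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic congestion scores on a 7-day x 24-hour grid (illustrative data only).
rng = np.random.default_rng(42)
congestion = rng.random((7, 24))
congestion[:, 7:9] += 0.8    # simulated morning rush hour
congestion[:, 16:18] += 0.7  # simulated evening rush hour

fig, ax = plt.subplots(figsize=(10, 3))
im = ax.imshow(congestion, aspect='auto', cmap='inferno')
ax.set_xlabel('Hour of day')
ax.set_ylabel('Day of week')
fig.colorbar(im, label='Congestion score')
plt.show()
```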

The Beauty of Data

Data visualization is an art form in its own right. The choice of colors, shapes, and lines can transform a simple graph into a work of art. For example, a time-lapse visualization of traffic flow can resemble the dynamic motion of a city, with streams of vehicles tracing its streets.

Helsinki Mobility Digital Twin

The Helsinki mobility digital twin paves the way for a future where cities fully leverage their data. This data-driven revolution, fueled by powerful data visualization, holds immense potential for creating a more efficient, sustainable, and safer urban transportation landscape.

So, can traffic data be beautiful? Absolutely. All it takes is the right perspective and a touch of creativity to turn numbers into a work of art.


Last Thursday, we had the pleasure of hosting Javier Valiño, Program Manager of the Data Space Working Group (Data Space WG) from the Eclipse Foundation, at Universidad de Cantabria.

During the meeting, Javier presented the latest advancements made by the Data Space WG. Their mission is to promote the global use of dataspace technologies, supporting the development and maintenance of secure data-sharing ecosystems. We also discussed SEDIMARK's initiatives in creating technologies for decentralized and secure data exchange.

The Data Space WG's goals are closely aligned with SEDIMARK's innovative proposals for a decentralized marketplace, so we will keep exploring future collaborations in the data space world.

In the modern era of big data, the challenge of integrating and analyzing data from various sources has become increasingly complex. Different data providers often use diverse formats and structures, leading to significant challenges in achieving data interoperability. This complexity necessitates robust mechanisms to convert and harmonize data, ensuring they can be effectively used for analysis and decision-making. SEDIMARK has identified two critical components in this process and is actively working on them: a data formatter and a data mapper.

A data formatter is designed to convert data from various providers, each using different formats, into the standardized NGSI-LD format. This standardization is crucial because it allows data from disparate sources to be compared, combined, and analyzed in a consistent manner. Without a data formatter, the heterogeneity of data formats would pose a significant barrier to interoperability. For example, data from one provider might be in XLSX format, another's in JSON, and yet another's in CSV. A data formatter processes these different formats, transforming them into a unified format that can be easily managed and analyzed by SEDIMARK tools.
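A minimal sketch of what such a formatter could look like, assuming tabular inputs and a heavily simplified NGSI-LD output (the file name and entity type are hypothetical):

```python
import json
import pandas as pd

def load_any(path: str) -> pd.DataFrame:
    """Read XLSX, JSON or CSV input into a common tabular form."""
    if path.endswith('.xlsx'):
        return pd.read_excel(path)
    if path.endswith('.json'):
        return pd.read_json(path)
    return pd.read_csv(path)

def to_ngsi_ld(row: pd.Series, entity_type: str) -> dict:
    """Wrap one record as a minimal NGSI-LD entity (attribute handling is illustrative)."""
    return {
        'id': f'urn:ngsi-ld:{entity_type}:{row.name}',
        'type': entity_type,
        **{col: {'type': 'Property', 'value': val} for col, val in row.items()},
    }

df = load_any('measurements.csv')  # hypothetical input file
entities = [to_ngsi_ld(row, 'Observation') for _, row in df.iterrows()]
print(json.dumps(entities[0], indent=2, default=str))
```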

A data mapper comes into play after data processing: it stores the data and maps it to a specific data model. This involves not only aligning the data with the model but also enriching it with quality metrics and metadata. During this stage, the data mapper adds valuable information about data quality obtained during the processing step, such as identified outliers and their corresponding anomaly scores, along with flags for missing and redundant data. This enriched data model becomes a powerful asset for future analyses, giving a complete picture of the data.

By converting various data formats into a standard format and then mapping and enriching the data, SEDIMARK achieves a higher level of data integration. This ensures that data from multiple sources can be used together seamlessly, facilitating more accurate and comprehensive analyses. Moreover, the inclusion of data quality metrics during the mapping process adds a layer of reliability and trustworthiness: information about outliers, missing data, and redundancy is crucial for data scientists and analysts, allowing them to make informed decisions and apply appropriate processing techniques.
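As a toy illustration of the kind of quality metadata a mapper could attach (a naive z-score stands in for SEDIMARK's actual anomaly detection):

```python
import pandas as pd

def quality_metadata(df: pd.DataFrame, column: str) -> dict:
    """Compute simple quality metrics for one column; a stand-in for the
    richer metrics produced during SEDIMARK's processing step."""
    values = df[column]
    z = (values - values.mean()) / values.std()  # naive z-score anomaly score
    return {
        'missingCount': int(values.isna().sum()),
        'duplicateCount': int(df.duplicated().sum()),
        'outlierIndices': values.index[z.abs() > 1.5].tolist(),  # toy threshold
        'maxAnomalyScore': float(z.abs().max()),
    }

df = pd.DataFrame({'vehicle_count': [10, 12, 11, 400, 9, None]})
print(quality_metadata(df, 'vehicle_count'))
```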
