
In the context of climate change, water is a critical resource that must be managed very carefully. The water management ecosystem is full of actors, each with different responsibilities and their own datasets, which may be of value to other stakeholders. Currently, these datasets are shared poorly or not at all. To tackle this issue, EGM has developed a Water Data Valorization Platform that will enrich the SEDIMARK Marketplace ecosystem. It will be deployed in the municipality of Les Orres, where there is a need for an optimal way of handling all the data related to water, especially the data related to the Lac de Serre-Ponçon.

The platform revolves around the Stellio Context Information Broker, which allows connection and information sharing between all types of data and use cases. It is based on the European FIWARE open-source ecosystem, which uses the NGSI-LD specification produced by ETSI. The platform provides powerful visualization and business intelligence tools to view real-time data or analyse historical data, and includes a set of modules specifically designed to meet the needs of actors in the water domain. Each module and its purpose are presented below.
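To make the NGSI-LD data model more concrete, here is a minimal sketch of how a single water-level measurement might be expressed as an NGSI-LD entity. The entity type `WaterLevelSensor`, the attribute name `waterLevel`, the identifier and the values are all hypothetical and are not taken from the platform; only the general entity shape (an `id` URN, a `type`, attributes of type `Property`, and an `@context`) follows the NGSI-LD specification.

```python
import json


def make_entity(sensor_id, level_m, observed_at):
    """Build an NGSI-LD style entity as a plain dict (illustrative only)."""
    return {
        "id": f"urn:ngsi-ld:WaterLevelSensor:{sensor_id}",  # hypothetical URN
        "type": "WaterLevelSensor",                          # hypothetical type
        "waterLevel": {                                      # hypothetical attribute
            "type": "Property",
            "value": level_m,
            "unitCode": "MTR",          # UN/CEFACT code for metres
            "observedAt": observed_at,  # ISO 8601 timestamp of the measurement
        },
        "@context": [
            "https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context.jsonld"
        ],
    }


entity = make_entity("lake-001", 2.35, "2024-05-01T12:00:00Z")
print(json.dumps(entity, indent=2))
```

A context broker would typically receive such a document over HTTP; the transport is left out here because the deployment details are not described in this article.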

Data collection module: In many cases, data is collected fully automatically: the workflow is set up once and the user has nothing further to do. In some cases, however, the dataflow can only be automated up to a certain point and still needs some input from a user. For instance, a dataflow could be automated but still need the user to provide an input file path or a list of attributes to select. This module allows the user to perform the required actions from a user-friendly interface directly in the platform, without having to interact with the backend interface.

Data validation module: In the field of water data collection, validating measurements is a very important part of the process. Most sensors are installed outdoors, where they are subject to many potential disturbances that need to be addressed. The data can be automatically pre-validated, but most actors in the water domain will want to perform a final manual validation to ensure optimal data quality. This module provides a user-friendly interface to perform this manual validation as easily and quickly as possible. It includes optimized table and graph displays of the attributes to be validated and allows the user to perform multiple actions on the data (selecting multiple lines or columns, filtering, performing basic mathematical operations, …) in order to validate or invalidate them.

Calculation tool module: This module allows the user to launch all kinds of models available in the platform, such as Machine Learning and AI models, hydrological models, or any other kind of algorithm that takes one or more inputs and calculates one or more outputs.

Data export module: After retrieving and processing their data on the platform, users might want to export it in different formats. This module allows the user to export data in CSV format. Optionally, the user can perform temporal and/or geographical aggregation before exporting the data. In addition, the module can export the data into a report or summary sheet based on pre-registered report templates. The user selects the template and the data to export, and an Excel or Word file is generated automatically.
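As an illustration of temporal aggregation followed by a CSV export, the sketch below averages timestamped readings per day and serialises the result. It uses only the Python standard library; the function names, column names and sample readings are assumptions made for this example and do not describe the platform's actual implementation.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_means(readings):
    """Aggregate (ISO timestamp, value) pairs into one mean value per day."""
    by_day = defaultdict(list)
    for timestamp, value in readings:
        day = datetime.fromisoformat(timestamp).date().isoformat()
        by_day[day].append(value)
    return [(day, mean(values)) for day, values in sorted(by_day.items())]


def to_csv(rows, header=("date", "mean_value")):
    """Serialise aggregated rows to CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample readings: two on the first day, one on the second.
readings = [
    ("2024-05-01T08:00:00", 2.0),
    ("2024-05-01T20:00:00", 4.0),
    ("2024-05-02T08:00:00", 5.0),
]
aggregated = daily_means(readings)
csv_text = to_csv(aggregated)
print(csv_text)
```

Geographical aggregation could follow the same pattern by grouping on a station or zone identifier instead of the date.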

Risk and event management module: A common need in the water sector is to set up threshold breach detection to monitor the behavior of the data. This module allows the user to create Events that detect and record the occurrence of threshold breaches and missing data. In addition, it includes a risk management system that allows the user to create a risk, i.e., to define conditions for different severity levels based on multiple measured attributes, in order to monitor the risk and potentially be alerted whenever it occurs.
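The two kinds of event mentioned above, threshold breaches and missing data, can be sketched as follows. This is a simplified illustration under stated assumptions: readings arrive sorted by time, a breach means a value strictly above a fixed threshold, and data is considered missing when the gap between two consecutive readings exceeds `max_gap`. The function name and event labels are hypothetical.

```python
from datetime import datetime, timedelta


def detect_events(readings, threshold, max_gap):
    """Return a list of (kind, timestamp) events from time-ordered readings.

    readings:  list of (datetime, value) pairs sorted by time.
    threshold: a value strictly above this counts as a breach.
    max_gap:   timedelta; a longer silence between readings counts as missing data.
    """
    events = []
    previous_time = None
    for time, value in readings:
        if previous_time is not None and time - previous_time > max_gap:
            # The gap is reported at the last reading seen before the silence.
            events.append(("missing_data", previous_time))
        if value > threshold:
            events.append(("threshold_breach", time))
        previous_time = time
    return events


start = datetime(2024, 5, 1, 0, 0)
readings = [
    (start, 1.2),
    (start + timedelta(hours=1), 3.4),  # above the threshold
    (start + timedelta(hours=5), 1.1),  # arrives after a four-hour gap
]
events = detect_events(readings, threshold=3.0, max_gap=timedelta(hours=2))
for kind, when in events:
    print(kind, when.isoformat())
```

A risk with several severity levels could be built on the same idea by evaluating one such condition per level across multiple attributes.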

Alert creation module: Based on the risks and events defined in the previous module, the user can define alerts that trigger whenever a risk or event occurs. This module allows the user to set the condition under which the alert should trigger (e.g., risk ‘A’ occurred with severity ‘high’, or event ‘B’ occurred, …), the message to be delivered by the alert, and the mailing list that should receive it.
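One possible way to model such alert definitions is a small rule object that holds the trigger condition, the message and the mailing list. The class, field and function names below are hypothetical, and the e-mail addresses are placeholders; actual delivery of the message is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AlertRule:
    source: str                     # name of the watched risk or event
    message: str                    # text delivered by the alert
    recipients: List[str] = field(default_factory=list)  # mailing list
    severity: Optional[str] = None  # None means any severity (or a plain event)

    def matches(self, occurrence):
        """True if an occurrence dict with 'source' and optional 'severity' triggers this rule."""
        if occurrence["source"] != self.source:
            return False
        return self.severity is None or occurrence.get("severity") == self.severity


def alerts_to_send(rules, occurrence):
    """Return (recipient, message) pairs for every rule the occurrence triggers."""
    return [
        (recipient, rule.message)
        for rule in rules
        if rule.matches(occurrence)
        for recipient in rule.recipients
    ]


rules = [
    AlertRule(source="A", message="Risk A is high",
              recipients=["ops@example.org"], severity="high"),
    AlertRule(source="B", message="Event B occurred",
              recipients=["team@example.org"]),
]
high_a = alerts_to_send(rules, {"source": "A", "severity": "high"})
low_a = alerts_to_send(rules, {"source": "A", "severity": "low"})
event_b = alerts_to_send(rules, {"source": "B"})
print(high_a, low_a, event_b)
```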

Since the platform was designed specifically to meet the needs of actors in the water domain, it includes a fine-grained rights and authorization management system, which makes it possible to give each actor (developer, data validator, data scientist, policy maker, citizen, …) access only to the modules and/or dataset visualizations that are relevant to them. The Water Data Valorization Platform will be deployed as part of the SEDIMARK Marketplace. It will enrich the ecosystem with its modules, its services, and its datasets, to be used by many actors in the water domain but also by any user with data to be processed.

Image credit: Rémi Morel - OT Serre-Ponçon

This document serves as the first version of the SEDIMARK architecture, aiming to provide an in-depth description of the various architectural views and the roadmap for how the views were extracted. The main goal is to provide an architecture that supports the main concepts of SEDIMARK for full decentralisation, trustworthiness, intelligence, data quality and interoperability. This document is considered one of the main deliverables of the project, because the main technical and evaluation activities will be based on the description of the functional components of the architecture and their interactions. SEDIMARK follows an agile, innovation-driven methodology for the development of the decentralised marketplace. This means that the architecture document should be considered a “live” document that will be continuously updated and improved as the technical development and testing activities of the other work packages evolve, with the aim of identifying omissions or issues in the initial architecture draft, so that these can be fixed by adapting components, adding missing components, or removing components that are either not useful or duplicated. A new version of the SEDIMARK architecture will be provided in Month 24 (September 2024) in Deliverable D2.3.
Considering that this document does not provide a fully functional architectural framework, but rather only a high-level document presenting initial concepts and ideas (not tested), it is intended for a limited audience, primarily for the project consortium to use it for driving the technical activities of the project in the rest of the work packages. Additionally, other researchers and developers in the areas of interest of the project will also find interesting ideas about developing decentralised data and services marketplaces. Moreover, EU initiatives and other research projects should consider the contents of the deliverable in order to help derive common architectures and concepts for creating data spaces and building marketplaces on top of them focusing on improved trustworthiness, data quality and intelligence.

This document, along with all the public deliverables and documents produced by SEDIMARK, can be found in the Publications & Resources section.

MYTILINEOS S.A. participates in SEDIMARK through BU Protergia, the Energy Unit and the largest independent electricity provider in Greece. Protergia has an Applied R&D and Innovation division working on innovative products and services. Two of these are AI prediction models that analyze customer sales and behavior at a geospatial level.

The first model tries to predict customer segmentation in different regions by postal code, while the second attempts to quantify and predict customer churn, likewise by postal code. The results are used to manage business customers efficiently by analyzing complaints made to customer support, in order to avoid losing existing customers and to increase their loyalty by informing the local sales network.

The use case as a whole involves a combination of heterogeneous data where interoperability is critically important. The aforementioned AI models will be trained with decentralized data provided by SEDIMARK on the SEDIMARK Marketplace with low latency and under security and privacy preservation.

In this information age, data-driven business decision making has become a key factor for the success of organizations. If you are looking to access valuable information that allows you to make informed strategic decisions, Data Marketplaces are a powerful tool to consider.

What is a Data Marketplace?

Data Marketplaces are digital platforms designed to collect, manage and provide access to a wide range of relevant and up-to-date data. From demographics to market trends, these centralized spaces provide valuable information to help grow your ideas or support business decision making.

Advantages for your business

  • Accurate and Updated Data: Obtain reliable and up-to-date information that allows you to be aware of the latest market trends and behaviors.
  • Informed Decision Making: Base your strategies on solid data, allowing you to make better decisions and reduce risks.
  • Accurate Segmentation: Access segmented and personalized data to better understand your customers and tailor your products and services to their needs.
  • Campaign Optimization: Improve the effectiveness of your marketing and advertising campaigns by targeting specific audiences with relevant messages.
  • Fostering Innovation: Find new opportunities and growth areas thanks to the information gathered in the Data Marketplace.

How it works

  • Data Collection: A wide variety of sources, including surveys, public data and social networks, feed Marketplaces with up-to-date and relevant information.
  • Access to Information: Users can access the data sets that best fit their needs, enabling them to obtain information specific to their objectives.
  • Security and Privacy: The Data Marketplace must present means that guarantee the security and protection of information, respecting the privacy of users and complying with current regulations.

Conclusion

In an increasingly competitive business environment, informed decision making is critical. Data Marketplaces give you the opportunity to gain a competitive advantage, identify opportunities and improve the efficiency of your business.

Are you ready to take the leap into data intelligence? Discover the potential of Data Marketplace through SEDIMARK's approach and get ready for success in the data-driven economy.

#DataMarketplace #DataIntelligence #InformedDecisionMaking #BusinessGrowth #OperationalEfficiency #DigitalTransformation #DataForSuccess

This document provides the initial detailed description of the project use cases (UCs) and the initial requirements. For the use cases, the goal is to define them with specific details about their implementation and how they will use the project tools, with a particular focus on combining data from different data sources and platforms to show the potential for secure combination and sharing of data across sites. For the requirements, the aim is to gather requirements from various stakeholders, industrial applications, the UCs and the concept of EU Data Spaces, and to analyse them in order to extract functional and non-functional requirements for making the data marketplace decentralised, trustworthy, interoperable and open to new data (open data), with intelligent AI-based and energy-efficient data management tools capable of providing high-quality data and services to consumers.

All the public deliverables and documents produced by SEDIMARK can be found in the Publications & Resources section.

Automated Machine Learning (Auto-ML) is an emerging technology that automates the tasks involved in building, training, and deploying machine learning models [1]. With the increasing ubiquity of machine learning, there is an ever-growing demand for specialized data scientists and machine learning experts. However, not all organizations have the resources to hire these experts. Auto-ML software platforms address this issue by enabling organizations to utilize machine learning more easily, even without specialized experts. 

Auto-ML platforms can be obtained from third-party vendors, accessed through open-source repositories like GitHub, or developed internally. These platforms automate many of the tedious and error-prone tasks involved in machine learning, freeing up data scientists' time to focus on more complex tasks. Auto-ML uses advanced algorithms and techniques to optimize the model and improve its accuracy, leading to better results.

One of the key benefits of Auto-ML is that it reduces the risk of human error. Since many of the tasks involved in machine learning are tedious and repetitive, there is a high chance of error when performed manually. Auto-ML automates these tasks, reducing the risk of human error and improving the overall accuracy of the model. In addition to reducing errors, Auto-ML also provides transparency by documenting the entire process. This makes it easier for researchers to understand how the model was developed and to replicate the process. Auto-ML can also be used by teams of data scientists, enabling collaboration and sharing of insights.
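A very small sketch of one task that Auto-ML automates, model selection, is shown below. It fits each candidate model on training data, scores it on held-out validation data, and keeps the one with the lowest error while recording every score, which also illustrates the documentation aspect mentioned above. The two candidate models, the function names and the toy data are assumptions made for this example; real Auto-ML platforms search far larger spaces of models and hyperparameters.

```python
from statistics import mean


def fit_constant(xs, ys):
    """Baseline model: always predict the mean of the training targets."""
    average = mean(ys)
    return lambda x: average


def fit_linear(xs, ys):
    """Ordinary least-squares line through the training points."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_bar - slope * x_bar
    return lambda x: slope * x + intercept


def mse(model, xs, ys):
    """Mean squared error of a model on the given points."""
    return mean([(model(x) - y) ** 2 for x, y in zip(xs, ys)])


def auto_select(candidates, train, valid):
    """Fit every candidate on train; return the best name, its model and all scores."""
    models = {name: fit(*train) for name, fit in candidates.items()}
    scores = {name: mse(model, *valid) for name, model in models.items()}
    best = min(scores, key=scores.get)
    return best, models[best], scores


# Toy data following y = 2x + 1, split into training and validation parts.
train = ([0, 1, 2, 3], [1, 3, 5, 7])
valid = ([4, 5], [9, 11])
candidates = {"constant": fit_constant, "linear": fit_linear}
best, model, scores = auto_select(candidates, train, valid)
print(best, scores)
```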

Furthermore, Auto-GPT is a popular tool in the same spirit of automation. It is an open-source application built on GPT large language models, which use deep learning to generate human-like text. Such models can be used for a range of natural language processing tasks, including text classification, sentiment analysis, and language translation. By automating multi-step text generation tasks, Auto-GPT enables researchers to focus on more complex tasks, such as data analysis and model deployment. This is just one example of how automation is revolutionizing the field of machine learning and making it more accessible to organizations of all sizes.

SEDIMARK aims to enhance data quality and reduce the reliance on domain experts in the data curation process. To accomplish this objective, the SEDIMARK team in the Insight Centre for Data Analytics of University College Dublin (UCD) is actively exploring the use of Auto-ML techniques. By leveraging Auto-ML, SEDIMARK strives to optimize its data curation process and minimize the involvement of domain experts, leading to more efficient and accurate results.

[1] He, Xin, Kaiyong Zhao, and Xiaowen Chu. "AutoML: A survey of the state-of-the-art." Knowledge-Based Systems 212 (2021): 106622.

In the digital age, streaming data - information that is generated and processed in real-time - is abundant. Applying Artificial Intelligence (AI) to mine this data holds immense value.  It enables real-time decision-making and provides immediate insights, which is particularly beneficial for industries like finance, healthcare, and transportation, where instant responses can make a significant difference.

However, mining streaming data with AI is not without challenges [1]. The sheer volume and speed of the data make it difficult for conventional data mining methods to keep up. It demands high-speed processing and robust algorithms to handle real-time analysis. Furthermore, maintaining data quality and integrity is paramount, but challenging in a real-time context. Ensuring privacy and security of the data while mining it also poses significant obstacles. And, given the 'black box' nature of many AI systems, transparency and understanding of the data mining process can also be a concern.
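To illustrate what processing data as it arrives can look like, the sketch below uses Welford's online algorithm to keep a running mean and standard deviation in constant memory and flags values that lie far from the running mean. This is a generic textbook technique chosen for illustration, not a description of the methods developed in SEDIMARK; the class name, the `z_limit` and `warmup` parameters and the sample stream are assumptions.

```python
import math


class RunningStats:
    """Welford's online algorithm: mean and variance in one pass, O(1) memory."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value):
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def std(self):
        """Sample standard deviation (0.0 until at least two values are seen)."""
        return math.sqrt(self._m2 / (self.count - 1)) if self.count > 1 else 0.0


def flag_outliers(stream, z_limit=3.0, warmup=5):
    """Yield values more than z_limit standard deviations from the running mean."""
    stats = RunningStats()
    for value in stream:
        ready = stats.count >= warmup and stats.std > 0
        if ready and abs(value - stats.mean) > z_limit * stats.std:
            yield value
        stats.update(value)  # every value, flagged or not, updates the statistics


stream = [10.0, 10.2, 9.8, 10.1, 9.9, 10.0, 25.0, 10.1]
outliers = list(flag_outliers(stream))
print(outliers)
```

Because each value is seen only once and no history is stored, the same loop works on an unbounded stream. Note that in this simple version a flagged value still updates the statistics, so a production system would likely treat outliers more carefully.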

SEDIMARK, a secure decentralized and intelligent data and services marketplace, is making strides to address these issues. The Insight Centre for Data Analytics in University College Dublin contributes to the development of innovative AI technologies capable of efficiently handling and mining streaming data. By combining advanced distributed AI technologies with a strong commitment to ethical guidelines, SEDIMARK is paving the way for a future where AI-driven insights from streaming data can be harnessed effectively, reliably, and ethically. Our aim is to transform the challenges of real-time data processing into opportunities, enhancing decision-making capabilities and fostering a more data-driven world.

[1] Gomes, Heitor Murilo, et al. "Machine learning for streaming data: state of the art, challenges, and opportunities." ACM SIGKDD Explorations Newsletter 21.2 (2019): 6-22.

In today's world, Artificial Intelligence (AI) is widespread and used in many different areas, such as the tech industry, financial services, health care, retail and manufacturing to name just a few. The main drive behind the surge of AI applications is its ability to extract useful information from very large data.

Despite the incredible positives AI has brought in recent years, it has also sparked numerous doubts about its trustworthiness. Some of the issues flagged include the lack of understanding of the algorithms used, which in many cases are described as black boxes. Similarly, it is often unclear what sort of data is used in the training process of the AI system. Since AI systems learn from the data they are provided, it is crucial that this data does not contain biased human decisions or reflect unbalanced social biases.

To address these and many more trust issues in emerging AI systems, the European Commission appointed the High-Level Expert Group on AI, and in 2019 this group presented its Ethics Guidelines for Trustworthy AI. According to these guidelines, trustworthy AI should be lawful, ethical and robust, and this should be achieved by addressing the following seven key requirements:

  • Human Agency and Oversight - allowing humans to make informed decisions and fostering their fundamental human rights, while also ensuring proper human oversight of the AI system.
  • Technical Robustness and Safety - AI systems need to be safe, accurate, reliable and reproducible.
  • Privacy and Data Governance - respecting user privacy alongside ensuring the quality and integrity of the data.
  • Transparency - AI transparency is achieved through the explainability of the AI systems and their decisions.
  • Diversity, non-discrimination and fairness - The AI system must avoid unfair bias while being accessible to all.
  • Societal and Environmental well-being - it must be ensured that the AI system is sustainable and environmentally friendly.
  • Accountability - accountability and responsibility for AI systems as well as their outcomes must be ensured.

In SEDIMARK it is our goal to develop cutting-edge AI technology such as machine learning and deep learning to enhance the experience of its users. In our path to this discovery, we aim to follow Trustworthy AI guidelines throughout the lifecycle of our project and beyond so that the AI developed and used in this project can be fully trusted by its users.

The SEDIMARK team in the Insight Centre for Data Analytics of University College Dublin (UCD) aims to exploit Insight’s expertise to promote ethical AI research within SEDIMARK and help the rest of the partners towards ensuring that the AI modules developed within the project follow the Ethical AI requirements.

ARTEMIS is the WINGS product oriented to the proactive management of water, energy and gas infrastructures.

Based on the WINGS approach, it combines advanced technologies (IoT, AI, advanced networks and visualizations) with domain knowledge to address diverse use cases. As a management system, it delivers the following functionalities:

  • Efficient metering: optimized information flow and cost with 24/7 capability, prediction of demand and of capabilities.
  • Fault management: faulty meters, predictive maintenance, outage handling (energy), leakage or flood avoidance (water).
  • Performance optimizations: optimization of water quality, maximization of revenue water, optimization of the deployment of renewables and of storage components, optimization for residences, businesses and factories.
  • Configuration and security aspects.

Commercial traction has been achieved, while further interest is stimulated in various areas and with various tentative partners.

In parallel, WINGS strives to develop and integrate further advances. A wave of new projects related to ARTEMIS activities is being implemented. Among them, SEDIMARK aims to create a secure decentralised data marketplace based on distributed ledger technology and AI. Under this new approach:

  • Data will no longer be stored only on the “core cloud” but also on “edge systems”, close to where they are generated, thus avoiding security concerns.
  • According to diverse strategies, data will be “cleaned”, labelled and classified, in accordance with legal / ethical frameworks and FAIR (findable, accessible, interoperable and reusable) principles, for enabling easy linkage and efficient utilization.
  • Diverse analysis mechanisms can be powered.

Within SEDIMARK, WINGS contributes to the marketplace (leveraging its experience in other vertical sectors, like food security and safety) and to the AI strategies.

SEDIMARK will empower European stakeholders to set the proper foundation for the energy market, expand their competences, and compete and scale at a global level.
