SEDIMARK does not focus on one particular domain but intends to design and prototype a secure, decentralised and intelligent data and services marketplace that bridges remote data platforms and allows the efficient and privacy-preserving sharing of vast amounts of heterogeneous, high-quality, certified data and services supporting the common EU data spaces.

Four use cases are included in the project and one of them, driven by EGM, will focus on ‘water data’ exploitation. SEDIMARK will make use of the AI-based tools for data quality management, metadata management and semantic interoperability for the gathering of data. It will build upon the SEDIMARK decentralized infrastructure for handling of security and privacy policies and provide integrated services for validation, semantic enrichment and transformation of the data.

In our commitment to collaboratively advance water data management and utilization, SEDIMARK is proud to partner with the ICT4WATER cluster, comprising more than 60 pioneering highly digitized water-related projects. As part of this collaboration, SEDIMARK will actively contribute by sharing its latest advancements with the ICT4WATER ecosystem, fostering the creation of a decentralized and secure marketplace for data and services. Together, we aim to collectively drive innovation and sustainable solutions in the realm of water resource management supported by ICT tools.

In the modern digital age, ensuring seamless data management, storage, and retrieval is of utmost importance. Enter Distributed Storage Solutions (DSS), the backbone for businesses aiming for consistent data access, elasticity, and robustness. At the core of leveraging the full prowess of DSS is an element often overlooked - the orchestrator pipeline. Let’s dive deeper into why this component is the unsung hero of data management.

A Deep Dive into Distributed Storage

Rather than placing all eggs in one basket with a singular, centralized system, DSS prefers to spread them out. By scattering data over a multitude of devices, often geographically dispersed, DSS ensures data availability, even when individual systems falter. It's the vanguard of reliable storage for the modern enterprise.

Why the Orchestrator Pipeline Steals the Show

Imagine an orchestra without its conductor – chaotic, right? The orchestrator pipeline for DSS is much like that crucial conductor, ensuring every piece fits together in perfect harmony. Here's how it makes a difference in the realm of DSS:

  • The Automation Magic: Seamlessly manages data storage, retrieval, and flow across various nodes.
  • Master of Balancing: Channels data traffic efficiently, promoting top-tier performance with minimal lag.
  • Guardian Angel Protocols: Steps in to resurrect data during system failures, keeping business operations uninterrupted.
  • The Efficiency Maestro: Regularly gauges system efficiency, making on-the-fly tweaks for optimal functioning.
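
The four roles above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not SEDIMARK's orchestrator or any real DSS API: the `Orchestrator` class, its method names, the node names and the "fewest objects first" balancing rule are all hypothetical, and in-memory dictionaries stand in for storage nodes.

```python
class Orchestrator:
    """Toy orchestrator: places replicas, balances load, repairs after failure."""

    def __init__(self, node_names, replicas=2):
        self.replicas = replicas
        # node name -> {key: value}; a stand-in for real storage nodes
        self.nodes = {name: {} for name in node_names}

    def _least_loaded(self, exclude=()):
        # Balancing: prefer nodes holding the fewest objects;
        # ties are broken by node name so placement is deterministic.
        candidates = [n for n in self.nodes if n not in exclude]
        return sorted(candidates, key=lambda n: (len(self.nodes[n]), n))

    def put(self, key, value):
        # Automation: the caller never chooses a node.
        targets = self._least_loaded()[: self.replicas]
        for name in targets:
            self.nodes[name][key] = value
        return targets

    def get(self, key):
        for store in self.nodes.values():
            if key in store:
                return store[key]
        raise KeyError(key)

    def holders(self, key):
        return sorted(n for n, store in self.nodes.items() if key in store)

    def fail(self, name):
        # Recovery: drop the node, then re-replicate under-replicated keys.
        lost = self.nodes.pop(name)
        for key in lost:
            current = self.holders(key)
            if not current:
                continue  # every replica was lost; nothing to copy from
            value = self.nodes[current[0]][key]
            missing = self.replicas - len(current)
            for target in self._least_loaded(exclude=current)[:missing]:
                self.nodes[target][key] = value

    def load(self):
        # Monitoring: objects per node, the input for any further tuning.
        return {n: len(store) for n, store in self.nodes.items()}


orch = Orchestrator(["node-a", "node-b", "node-c"], replicas=2)
for i in range(6):
    orch.put(f"object-{i}", f"payload-{i}")
orch.fail("node-b")  # simulate a node outage
```

After the simulated outage, every object is still readable through `orch.get(...)` and is again held by two nodes. A production orchestrator would also have to deal with network partitions, consistency and capacity limits, which this sketch ignores.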

Why Combine the Orchestrator with DSS?

There are four main reasons to combine the orchestrator with DSS:

  1. Trustworthy Operations: By streamlining and fine-tuning data tasks, it minimizes the chance of human error.
  2. Effortless Scaling: As data reservoirs expand, the orchestrator ensures DSS stretches comfortably, dodging manual hiccups.
  3. Resource Utilization at its Best: Champions the cause of optimal resource use, optimizing costs in the long run.
  4. Silky-smooth Functioning: System updates or maintenance? The orchestrator ensures no hitches, keeping operations smooth.

Final Thoughts

While DSS paints a compelling picture of modern data storage, the orchestrator pipeline is the brush that brings out its true colors, crafting an efficient, harmonious data masterpiece. In a world where data stands tall as a business linchpin, it's not just about storing it – it's about managing it with flair.

In this information age, data-driven business decision making has become a key factor for the success of organizations. If you are looking to access valuable information that allows you to make informed strategic decisions, Data Marketplaces are a powerful tool to consider.

What is a Data Marketplace?

Data Marketplaces are digital platforms designed to collect, manage and provide access to a wide range of relevant and up-to-date data. From demographics to market trends, these centralized spaces provide valuable information to help grow your ideas or support your business decisions.

Advantages for your business

  • Accurate and Updated Data: Obtain reliable and up-to-date information that allows you to be aware of the latest market trends and behaviors.
  • Informed Decision Making: Base your strategies on solid data, allowing you to make better decisions and reduce risks.
  • Accurate Segmentation: Access segmented and personalized data to better understand your customers and tailor your products and services to their needs.
  • Campaign Optimization: Improve the effectiveness of your marketing and advertising campaigns by targeting specific audiences with relevant messages.
  • Fostering Innovation: Find new opportunities and growth areas thanks to the information gathered in the Data Marketplace.

How it works

  • Data Collection: A wide variety of sources, including surveys, public data and social networks, feed Marketplaces with up-to-date and relevant information.
  • Access to Information: Users can access the data sets that best fit their needs, enabling them to obtain information specific to their objectives.
  • Security and Privacy: The Data Marketplace must provide mechanisms that guarantee the security and protection of information, respecting the privacy of users and complying with current regulations.

Conclusion

In an increasingly competitive business environment, informed decision making is critical. Data Marketplaces give you the opportunity to gain a competitive advantage, identify opportunities and improve the efficiency of your business.

Are you ready to take the leap into data intelligence? Discover the potential of Data Marketplaces through SEDIMARK's approach and get ready for success in the data-driven economy.

#DataMarketplace #DataIntelligence #InformedDecisionMaking #BusinessGrowth #OperationalEfficiency #DigitalTransformation #DataForSuccess

SEDIMARK recently participated in Data Week 2023 in Luleå, Sweden, which was organised by the Big Data Value Association (BDVA), a European initiative promoting data-driven digital transformation of society and the economy. SEDIMARK presented its work at a session organised by the Data Spaces Business Alliance (DSBA), an organisation promoting business transformation in the data economy.

The session, entitled “Data Management and Data Sharing for trusted AI platforms”, saw SEDIMARK present its concept alongside a diverse group of EU Horizon-funded projects (Waterverse, STELAR, EnrichMyData and HPLT) that also focus on future tools for data management and quality control. The session considered how the tools and approaches developed within these projects would support the implementation and deployment of data-driven and trustworthy AI applications within data spaces.

A further aim of the session was to consider how the projects could make use of and contribute to a number of core common building blocks for data spaces outlined in a recent working document of the DSBA. The individual project presentations were followed by a lively panel discussion, in which these questions were further pursued.

SEDIMARK partners University of Surrey, INRIA and University College Dublin published new work at the 2022 IEEE 8th World Forum on IoT in Yokohama, Japan, on a privacy-preserving ontology inspired by GDPR requirements for semantically interoperable IoT data value chains.

Abstract

Testing and experimentation are crucial for promoting innovation and building systems that can evolve to meet high levels of service quality. IoT data that belong to users and from which their personal information can be inferred are frequently shared in the background of IoT systems with third parties for experimentation and building quality services. This sharing raises privacy concerns, as in most cases, the data are gathered and shared without the user's knowledge or explicit consent. With the introduction of GDPR, IoT systems and experimentation platforms that federate data from different deployments, testbeds, and data providers must be privacy-preserving. The wide adoption of IoT applications in scenarios ranging from smart cities to Industry 4.0 has raised concerns for the privacy of users' data collected using IoT devices. Inspired by the GDPR requirements, we propose an IoT ontology built using available standards that enhances privacy, enables semantic interoperability between IoT deployments, and supports the development of privacy-preserving experimental IoT applications. We also propose recommendations on how to efficiently use our ontology within an IoT testbed and federating platforms. Our ontology is validated for different quality assessment criteria using standard validation tools. We focus on “experimentation” without loss of generality because it covers scenarios from both research and industry that are directly linked with innovation.

This document provides the initial detailed description of the project use cases (UCs) and the initial requirements. For the use cases, the goal is to define them with specific details about their implementation and how they will use the project tools, with particular emphasis on combining data from different sources and platforms to show the potential for secure combination and sharing of data across sites. For the requirements, the aim is to gather input from various stakeholders, industrial applications, the UCs and the concept of EU Data Spaces, and to analyse it in order to extract functional and non-functional requirements for making the data marketplace decentralised, trustworthy, interoperable and open to new data (open data), with intelligent, AI-based and energy-efficient data management tools capable of providing high-quality data and services to consumers.

All the public deliverables and documents produced by SEDIMARK can be found in the Publications & Resources section.

Automated Machine Learning (Auto-ML) is an emerging technology that automates the tasks involved in building, training, and deploying machine learning models [1]. With the increasing ubiquity of machine learning, there is an ever-growing demand for specialized data scientists and machine learning experts. However, not all organizations have the resources to hire these experts. Auto-ML software platforms address this issue by enabling organizations to utilize machine learning more easily, even without specialized experts. 

Auto-ML platforms can be obtained from third-party vendors, accessed through open-source repositories like GitHub, or developed internally. These platforms automate many of the tedious and error-prone tasks involved in machine learning, freeing up data scientists' time to focus on more complex tasks. Auto-ML uses advanced algorithms and techniques to optimize the model and improve its accuracy, leading to better results.

One of the key benefits of Auto-ML is that it reduces the risk of human error. Since many of the tasks involved in machine learning are tedious and repetitive, there is a high chance of error when performed manually. Auto-ML automates these tasks, reducing the risk of human error and improving the overall accuracy of the model. In addition to reducing errors, Auto-ML also provides transparency by documenting the entire process. This makes it easier for researchers to understand how the model was developed and to replicate the process. Auto-ML can also be used by teams of data scientists, enabling collaboration and sharing of insights.
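
To make the idea concrete, the sketch below automates one small piece of the workflow: choosing a hyperparameter by cross-validation and recording every trial. It is a simplified illustration under stated assumptions, not a real Auto-ML platform and not the tooling used in SEDIMARK. The synthetic dataset, the k-nearest-neighbours model and all function names (`knn_predict`, `cross_val_accuracy`, `auto_select`, `make_data`) are hypothetical choices made for this example.

```python
import random
from collections import Counter


def knn_predict(train, point, k):
    # Majority label among the k nearest training points
    # (squared Euclidean distance).
    nearest = sorted(
        train,
        key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], point)),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cross_val_accuracy(data, k, folds=5):
    # Each fold is held out once; the remaining rows are the training set.
    correct = 0
    for f in range(folds):
        test = data[f::folds]
        train = [row for i, row in enumerate(data) if i % folds != f]
        correct += sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(data)


def auto_select(data, candidate_ks=(1, 3, 5, 7, 9)):
    # Try every candidate, keep a record of each trial, return the best.
    trials = [
        {"k": k, "accuracy": cross_val_accuracy(data, k)} for k in candidate_ks
    ]
    best = max(trials, key=lambda t: t["accuracy"])
    return best, trials


def make_data(n=100, seed=0):
    # Synthetic two-class data: two Gaussian clusters in the plane.
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        centre = 0.0 if label == 0 else 3.0
        data.append(((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)), label))
    return data


data = make_data()
best, trials = auto_select(data)
for trial in trials:
    print(trial)
print("selected:", best)
```

The `trials` list plays the role of the documented process mentioned above: it records every configuration that was tried and its score, so the selection can be inspected and repeated.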

A related example of automation is Auto-GPT. Strictly speaking, it is not an Auto-ML platform but an open-source application built on top of GPT language models, which use deep learning to generate human-like text. Such models can be applied to a range of natural language processing tasks, including text classification, sentiment analysis, and language translation. By automating multi-step, text-based tasks, tools of this kind let researchers focus on more complex work, such as data analysis and model deployment. This is one illustration of how automation is making machine learning more accessible to organizations of all sizes.

SEDIMARK aims to enhance data quality and reduce the reliance on domain experts in the data curation process. To accomplish this objective, the SEDIMARK team at the Insight Centre for Data Analytics of University College Dublin (UCD) is actively exploring the use of Auto-ML techniques. By leveraging Auto-ML, SEDIMARK strives to optimize its data curation process and minimize the involvement of domain experts, leading to more efficient and accurate results.

[1] He, Xin, Kaiyong Zhao, and Xiaowen Chu. "AutoML: A survey of the state-of-the-art." Knowledge-Based Systems 212 (2021): 106622.

In the digital age, streaming data - information that is generated and processed in real-time - is abundant. Applying Artificial Intelligence (AI) to mine this data holds immense value. It enables real-time decision-making and provides immediate insights, which is particularly beneficial for industries like finance, healthcare, and transportation, where instant responses can make a significant difference.

However, mining streaming data with AI is not without challenges [1]. The sheer volume and speed of the data make it difficult for conventional data mining methods to keep up. It demands high-speed processing and robust algorithms to handle real-time analysis. Furthermore, maintaining data quality and integrity is paramount, but challenging in a real-time context. Ensuring privacy and security of the data while mining it also poses significant obstacles. And, given the 'black box' nature of many AI systems, transparency and understanding of the data mining process can also be a concern.
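
One common response to the volume and speed problem is to use one-pass algorithms that see each value once and keep only a constant amount of state. The sketch below is an illustration under stated assumptions, not a SEDIMARK component: it uses Welford's online algorithm for the running mean and variance, and a simple three-sigma rule to flag unusual readings. The class and function names, the threshold, the warm-up length and the sample readings are all hypothetical.

```python
import math


class RunningStats:
    """One-pass mean and variance (Welford's algorithm), constant memory."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # sum of squared deviations from the current mean

    def update(self, x):
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    @property
    def std(self):
        # Sample standard deviation; undefined for fewer than two values.
        if self.count < 2:
            return 0.0
        return math.sqrt(self._m2 / (self.count - 1))


def flag_anomalies(stream, threshold=3.0, warmup=10):
    # Compare each value with the statistics of everything seen before it,
    # then fold the value into those statistics. Nothing is stored.
    stats = RunningStats()
    for index, value in enumerate(stream):
        if stats.count >= warmup and stats.std > 0:
            if abs(value - stats.mean) > threshold * stats.std:
                yield index, value
        stats.update(value)


readings = [10.0, 10.2, 9.9, 10.1, 9.8, 10.0, 10.3, 9.7, 10.1, 10.0,
            10.2, 25.0, 9.9, 10.1]
for index, value in flag_anomalies(readings):
    print(f"reading {index} looks unusual: {value}")
```

Because `flag_anomalies` is a generator, it can consume an unbounded stream. Note that a flagged value is still folded into the running statistics, so a real system might exclude outliers or use a sliding window to cope with drifting data.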

SEDIMARK, a secure, decentralized and intelligent data and services marketplace, is making strides to address these issues. The Insight Centre for Data Analytics at University College Dublin contributes to the development of innovative AI technologies capable of efficiently handling and mining streaming data. By combining advanced distributed AI technologies with a strong commitment to ethical guidelines, SEDIMARK is paving the way for a future where AI-driven insights from streaming data can be harnessed effectively, reliably, and ethically. Our aim is to transform the challenges of real-time data processing into opportunities, enhancing decision-making capabilities and fostering a more data-driven world.

[1] Gomes, Heitor Murilo, et al. "Machine learning for streaming data: state of the art, challenges, and opportunities." ACM SIGKDD Explorations Newsletter 21.2 (2019): 6-22.

In today's world, Artificial Intelligence (AI) is widespread and used in many different areas, such as the tech industry, financial services, health care, retail and manufacturing, to name just a few. The main driver behind the surge of AI applications is its ability to extract useful information from very large datasets.

Despite the incredible positives AI has brought in recent years, it has also sparked numerous doubts about its trustworthiness. Some of the issues flagged include the lack of understanding of the algorithms used, in many cases described as black boxes. Similarly, it is often unclear what sort of data is used in the training process of the AI system. Since AI systems learn from the data they are provided, it is crucial that this data does not contain biased human decisions or reflect unbalanced social biases.

To address these and many other trust issues in emerging AI systems, the European Commission appointed the High-Level Expert Group on AI, and in 2019 this group presented the Ethics Guidelines for Trustworthy AI. According to these guidelines, trustworthy AI should be lawful, ethical and robust, and this should be achieved by addressing the following seven key requirements:

  • Human Agency and Oversight - allowing humans to make informed decisions and fostering their fundamental rights, while also ensuring proper human oversight of the AI system.
  • Technical Robustness and Safety - AI systems need to be safe, accurate, reliable and reproducible.
  • Privacy and Data Governance - respecting user privacy alongside ensuring the quality and integrity of the data.
  • Transparency - AI transparency is achieved through the explainability of the AI systems and their decisions.
  • Diversity, Non-discrimination and Fairness - the AI system must avoid unfair bias while being accessible to all.
  • Societal and Environmental well-being - it must be ensured that the AI system is sustainable and environmentally friendly.
  • Accountability - accountability and responsibility for AI systems as well as their outcomes must be ensured.

In SEDIMARK, our goal is to develop cutting-edge AI technology, such as machine learning and deep learning, to enhance the experience of its users. Along the way, we aim to follow the Trustworthy AI guidelines throughout the lifecycle of our project and beyond, so that the AI developed and used in this project can be fully trusted by its users.

The SEDIMARK team at the Insight Centre for Data Analytics of University College Dublin (UCD) aims to exploit Insight’s expertise to promote ethical AI research within SEDIMARK and to help the other partners ensure that the AI modules developed within the project follow the Ethical AI requirements.
