Nowadays, users register with a service and, usually, the service itself stores the users' data: the identity. Today the majority of online services are centralized and rely, in some form, on a single authority for identity management. SEDIMARK instead aims to be a fully decentralized data Marketplace.
This architectural choice also has consequences for the management of the users belonging to the system. With decentralization in mind, SEDIMARK adopts a new model for identity, Self-Sovereign Identity (SSI).
SSI is a digital identity model that gives the users who create an identity full control over it and over the information to be shared.
The SSI model is rooted in the Decentralized Identity paradigm: it is the user – the Holder of the identity – who owns a unique identity composed of a set of attributes.
The attributes are released and associated with the identity by other entities – the Issuers of such claims.
These claims can be checked by other entities, called Verifiers. As an example, imagine a new graduate from a university. The graduate's digital identity may contain a claim “Graduated” issued by the University. A future employer who wants to check this information acts as the Verifier.
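To make the three roles more concrete, here is a minimal sketch of the graduate example. It is only an illustration under stated assumptions: the function names, the `did:example:` identifiers and the key are hypothetical, and an HMAC with a shared secret stands in for the asymmetric digital signature that a real SSI framework would use (with the Issuer's public key resolvable by any Verifier).

```python
import hashlib
import hmac
import json

# Toy stand-in for a digital signature. A real SSI stack signs with the
# Issuer's private key and verifies with its public key; HMAC with a shared
# secret is used here only to keep the sketch dependency-free.
def sign(secret: bytes, payload: dict) -> str:
    message = json.dumps(payload, sort_keys=True).encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()

def issue_credential(issuer_id: str, secret: bytes, holder_id: str, claim: str) -> dict:
    """Issuer: attach a signed claim to the Holder's identity."""
    payload = {"issuer": issuer_id, "holder": holder_id, "claim": claim}
    return {**payload, "proof": sign(secret, payload)}

def verify_credential(credential: dict, secret: bytes) -> bool:
    """Verifier: recompute the proof and compare it with the presented one."""
    payload = {k: v for k, v in credential.items() if k != "proof"}
    return hmac.compare_digest(sign(secret, payload), credential["proof"])

# Hypothetical identifiers and key, for illustration only.
university_secret = b"hypothetical-university-key"
credential = issue_credential("did:example:university", university_secret,
                              "did:example:graduate", "Graduated")

# The Holder presents the credential; the employer (Verifier) checks it.
print(verify_credential(credential, university_secret))  # True

# A claim altered after issuance no longer matches the proof.
tampered = {**credential, "claim": "Graduated with honours"}
print(verify_credential(tampered, university_secret))    # False
```

The point of the sketch is the division of labour: the Issuer signs, the Holder stores and presents, and the Verifier checks without having to ask the Holder to be trusted on their word.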
SSI in practice
SSI? Never heard of it!
Yes, SSI is a relatively new concept in the field of digital identity. It is an emerging technology relying on blockchain and other distributed ledgers, which are in turn still evolving. Embracing and implementing these new identity systems is a process that requires time…
…But things are moving forward!
Microsoft has recently released a new product called Microsoft Entra Verified ID that employs decentralized identity.
The European Union is also moving EU citizens' identities towards a model where users have full control of their data, with the European Digital Identity Wallet.
SSI in SEDIMARK
SEDIMARK will deploy its own custom SSI framework relying on the IOTA Tangle DLT.
The users of the marketplace will have full control over their digital identity, allowing them to preserve and maintain their privacy. Users can create and manage their own identities without relying on a central authority.
Moreover, thanks to SSI, authentication and authorization policies can also be enforced with more granular control. For example, a data provider can verify who is authorized to receive its data, limiting access to a certain group.
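As a rough illustration of this kind of granular control, the sketch below shows a data provider granting access only to users who present a given claim from an Issuer it trusts. The `Credential` class, the claim name and the identifiers are all assumptions made for the example, not part of the SEDIMARK framework, and the credentials are assumed to have already passed signature verification.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Credential:
    holder: str   # identity the claim is attached to
    claim: str    # attribute asserted about the holder
    issuer: str   # entity that issued the claim

def is_authorized(presented: list, required_claim: str, trusted_issuers: set) -> bool:
    """Grant access if any presented credential carries the required claim
    and comes from an issuer the data provider trusts."""
    return any(
        c.claim == required_claim and c.issuer in trusted_issuers
        for c in presented
    )

# Hypothetical policy: only members of a water research group may receive the data.
trusted = {"did:example:consortium"}
member = [Credential("did:example:alice", "WaterResearchGroupMember", "did:example:consortium")]
outsider = [Credential("did:example:bob", "WaterResearchGroupMember", "did:example:unknown")]

print(is_authorized(member, "WaterResearchGroupMember", trusted))    # True
print(is_authorized(outsider, "WaterResearchGroupMember", trusted))  # False
```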
Data and AI in action! On the 25th-27th of October the European Big Data Value Forum #EBDVF, organised by #BDVA - Big Data Value Association, took place in Valencia, Spain. We had an insightful time exploring the most recent developments and reflections in Data and AI alongside professionals, researchers, policymakers and other entities across Europe.
SEDIMARK, which is always on the cutting edge of innovation, was introduced to the community, and ideas from similar research projects were exchanged to create synergies. WINGS ICT Solutions presented the scope of the project which is to design and prototype a secure decentralised and intelligent #data and services #marketplace that bridges remote data platforms and allows the efficient and privacy-preserving sharing of vast amounts of heterogeneous, high-quality, certified data and services supporting the common #EU #data #spaces.
Several crucial insights emerged from the discussion, highlighting the pressing requirement for standardized data, the significance of responsible data #governance, and the importance of aligning technological endeavors with well-defined business goals to ensure a meaningful impact. Additionally, the enormous potential of #data #spaces in promoting #data #sharing, catalyzing business expansion, and generating tangible value was underscored.
#EBDVF is over, but SEDIMARK’s work on #data and #AI and future realities continues!
In the context of climate change, water is a critical resource that must be managed very carefully. The ecosystem of water management is full of actors, each having a different responsibility and their own datasets, which may be of value for other stakeholders. Currently, these datasets are shared poorly or not at all. To tackle this issue, EGM has developed a Water Data Valorization Platform, which will enrich the SEDIMARK Marketplace ecosystem. It will be deployed in the municipality of Les Orres, where there is a need for an optimal way of handling all the data related to water, especially the data related to the Lac de Serre-Ponçon.
The platform revolves around the Stellio Context Information Broker, which allows connection and information sharing between all types of data and use cases. It is based on the European FIWARE open-source ecosystem, which uses the NGSI-LD specification produced by ETSI. The platform provides powerful visualization and business intelligence tools to view real-time data or analyze historical data, and includes a set of modules specifically designed to meet the needs of actors in the water domain. Each module and its purpose are presented below.
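For readers unfamiliar with NGSI-LD, the sketch below builds what a water-level measurement could look like as an NGSI-LD entity. The overall shape (an `id` URN, a `type`, `Property` and `GeoProperty` attributes and an `@context`) follows the NGSI-LD information model; the entity type, attribute names, values and coordinates are invented for illustration and are not taken from the platform's actual data model.

```python
import json

# Illustrative NGSI-LD entity. The identifier, type, attribute names, values
# and coordinates are assumptions, not the platform's real data model.
entity = {
    "id": "urn:ngsi-ld:WaterLevelSensor:example-001",
    "type": "WaterLevelSensor",
    "waterLevel": {
        "type": "Property",
        "value": 778.4,
        "unitCode": "MTR",                      # UN/CEFACT code for metre
        "observedAt": "2023-10-01T08:00:00Z",
    },
    "location": {
        "type": "GeoProperty",
        "value": {"type": "Point", "coordinates": [6.35, 44.52]},
    },
    "@context": ["https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context.jsonld"],
}

# Serialized form, as it would be sent to a context broker.
payload = json.dumps(entity, indent=2)
print(payload)
```

Because every measurement is described in the same linked-data shape, data from different sensors and use cases can be stored and queried through a single broker.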
Data collection module: In many cases, data is collected fully automatically: the workflow is set up once and nothing else needs to be done by the user. However, in some cases, the dataflow can only be automated up to a certain point and still needs some input from a user. For instance, a dataflow could be automated but still need the user to provide an input file path or a list of attributes to select. This module allows the user to perform the required actions from a user-friendly interface directly in the platform, without having to interact with the backend interface.
Data validation module: In the field of water data collection, validating the measurements is a very important part of the process. Indeed, most sensors are installed outdoors and are subject to many potential disturbances, which need to be addressed. The data can be automatically pre-validated, but most actors in the water domain will want to perform a final manual validation to ensure optimal data quality. This module provides a user-friendly interface to perform this manual validation as easily and quickly as possible. It includes optimized table and graph displays of the attributes to be validated and lets the user perform multiple actions on the data (select multiple lines/columns, filter, perform basic mathematical operations, …) to validate or invalidate them.
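The automatic pre-validation step could look something like the sketch below. The two rules used here (a plausible range and a maximum jump compared with the last accepted value) and all parameter names are assumptions chosen for illustration; the platform's actual rules are not described in this post.

```python
def prevalidate(readings, low, high, max_jump):
    """Label each (timestamp, value) reading as 'pre-validated' or 'suspect'.

    A reading is suspect if it falls outside [low, high] or differs from the
    last non-suspect value by more than max_jump. Suspect readings are kept,
    not deleted, so a human can make the final decision.
    """
    results = []
    last_good = None
    for timestamp, value in readings:
        reasons = []
        if not (low <= value <= high):
            reasons.append("out_of_range")
        if last_good is not None and abs(value - last_good) > max_jump:
            reasons.append("sudden_jump")
        status = "suspect" if reasons else "pre-validated"
        results.append({"timestamp": timestamp, "value": value,
                        "status": status, "reasons": reasons})
        if not reasons:
            last_good = value
    return results

# Hypothetical water-level readings in metres; the third one is a spike.
readings = [("08:00", 2.1), ("09:00", 2.2), ("10:00", 9.7), ("11:00", 2.3)]
checked = prevalidate(readings, low=0.0, high=5.0, max_jump=1.0)
for row in checked:
    print(row["timestamp"], row["value"], row["status"], row["reasons"])
```

Only the readings marked as suspect would then need attention during the manual validation.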
Calculation tool module: This module allows the user to launch all kinds of models available in the platform, such as Machine Learning and AI models, hydrological models, or any other algorithm that takes one or more inputs to calculate one or more outputs.
Data Export module: After retrieving and processing their data on the platform, users might want to export the data in different formats. This module allows the user to export data in CSV format. Optionally, the user can perform some temporal and/or geographical aggregation before exporting the data. In addition, the module can export the data into a report or summary sheet, based on pre-registered report templates. The user selects the template and the data to export, and an Excel or Word file is generated automatically.
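A temporal aggregation followed by a CSV export can be sketched in a few lines. The daily mean, the column names and the sample readings are assumptions for the example; the module may offer other aggregations.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

def daily_means(readings):
    """Aggregate (ISO 8601 timestamp, value) readings into one mean per day."""
    by_day = defaultdict(list)
    for timestamp, value in readings:
        by_day[timestamp[:10]].append(value)   # first 10 characters: YYYY-MM-DD
    return [(day, round(mean(values), 2)) for day, values in sorted(by_day.items())]

def to_csv(rows, header):
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()

# Hypothetical readings: two on the first day, one on the second.
readings = [
    ("2023-10-01T08:00:00Z", 2.0),
    ("2023-10-01T20:00:00Z", 3.0),
    ("2023-10-02T08:00:00Z", 4.0),
]
csv_text = to_csv(daily_means(readings), header=["day", "mean_level"])
print(csv_text)
```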
Risk and Event management module: A common need in the water sector is to set up threshold breach detection to monitor the behavior of the data. This module allows the user to create Events that detect and record the occurrence of threshold breaches and missing data. In addition, it includes a risk management system, allowing the user to create a risk, i.e., to define conditions for different severity levels based on multiple measured attributes, in order to monitor the risk and potentially be alerted whenever it occurs.
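The two kinds of Event mentioned above, threshold breaches and missing data, can be illustrated with the following sketch. The function name, the event labels and the sample values are hypothetical, and "missing data" is interpreted here as a gap between consecutive readings that exceeds an expected interval.

```python
from datetime import datetime, timedelta

def detect_events(readings, threshold, max_gap):
    """Scan time-ordered (datetime, value) readings and record events.

    - 'missing_data' when two consecutive readings are more than max_gap apart
    - 'threshold_breach' when a value exceeds the threshold
    """
    events = []
    previous_time = None
    for time, value in readings:
        if previous_time is not None and time - previous_time > max_gap:
            events.append(("missing_data", previous_time, time))
        if value > threshold:
            events.append(("threshold_breach", time, value))
        previous_time = time
    return events

# Hypothetical hourly sensor with a three-hour hole, then a high value.
start = datetime(2023, 10, 1, 0, 0)
readings = [
    (start, 1.0),
    (start + timedelta(hours=1), 1.2),
    (start + timedelta(hours=4), 3.5),
]
events = detect_events(readings, threshold=3.0, max_gap=timedelta(hours=2))
for event in events:
    print(event)
```

Keeping the detected events in a list is what allows them to be recorded over time and inspected later.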
Alert creation module: Based on the risk and the event defined in the previous module, the user can define alerts that will trigger whenever the risk or event occurs. This module allows the user to set the condition on which the alert should trigger (e.g., risk ‘A’ occurred with severity ‘high’, or event ‘B’ occurred, …), the message to be delivered by the alert and the mailing list that should receive it.
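An alert, as described here, ties together a condition, a message and a mailing list. The sketch below models that with a small rule object; the class, field and function names and the e-mail addresses are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class AlertRule:
    source: str               # name of the risk or event to watch
    severity: Optional[str]   # required severity, or None to match any
    message: str              # text delivered by the alert
    mailing_list: list        # recipients of the alert

def alerts_to_send(rules, source, severity=None):
    """Return (mailing_list, message) pairs for every rule matching the occurrence."""
    return [
        (rule.mailing_list, rule.message)
        for rule in rules
        if rule.source == source and rule.severity in (None, severity)
    ]

# Hypothetical rules mirroring the examples in the text.
rules = [
    AlertRule("A", "high", "Risk A reached severity high", ["ops@example.org"]),
    AlertRule("B", None, "Event B occurred", ["team@example.org"]),
]

print(alerts_to_send(rules, "A", "high"))  # rule for risk A matches
print(alerts_to_send(rules, "A", "low"))   # no rule matches: empty list
print(alerts_to_send(rules, "B"))          # rule for event B matches
```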
Since the platform was designed specifically to answer the needs of the actors of the water domain, it includes a fine-grained rights and authorization management system, which gives each specific actor (developer, data validator, data scientist, policy maker, citizen, …) the rights to access only the modules and/or dataset visualizations that are relevant to them. The Water Data Valorization Platform will be deployed as a part of the SEDIMARK Marketplace. It will enrich the ecosystem with its modules, its services, and its datasets, to be used by many actors of the water domain but also by any user with data to be processed.
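In its simplest form, such a rights system is a mapping from each actor's role to the modules it may open. The assignment below is purely hypothetical, since the post does not say which role gets which module; it only illustrates the principle that anything not explicitly granted is denied.

```python
# Hypothetical role-to-module mapping; the real assignment is not given in the text.
ROLE_MODULES = {
    "data validator": {"data validation", "data export"},
    "data scientist": {"calculation tool", "data export"},
    "citizen": set(),   # in this example, citizens see no processing modules
}

def can_access(role: str, module: str) -> bool:
    """Allow access only if the module is explicitly granted to the role."""
    return module in ROLE_MODULES.get(role, set())

print(can_access("data validator", "data validation"))  # True
print(can_access("citizen", "calculation tool"))         # False
print(can_access("unknown role", "data export"))         # False
```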
MYTILINEOS S.A. participates in SEDIMARK through BU Protergia, the Energy Unit and the largest independent electricity provider in Greece. Protergia has an Applied R&D and Innovation division working on innovative products and services. Two of these are AI prediction models that analyze customer sales and behavior at a geospatial level.
The first model tries to predict customer segmentation in different regions by postal code, while the second model attempts to quantify customer churn and predict it, likewise by postal code. The results are used to manage business customers efficiently: complaints received by customer support are analyzed and the local sales network is informed, in order to avoid losing existing customers and to increase their loyalty.
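Before any prediction, quantifying churn per region comes down to a simple ratio. The sketch below computes the share of churned customers for each postal code; it is not Protergia's model, only an illustration of the kind of per-postal-code figure such a model works with, and the postal codes and records are made up.

```python
from collections import defaultdict

def churn_rate_by_postal_code(customers):
    """Return {postal_code: churned customers / all customers} for each code."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for postal_code, has_churned in customers:
        totals[postal_code] += 1
        churned[postal_code] += int(has_churned)
    return {code: churned[code] / totals[code] for code in totals}

# Made-up records: (postal code, whether the customer left).
customers = [
    ("10431", True), ("10431", False), ("10431", False), ("10431", False),
    ("54622", True), ("54622", True),
]
rates = churn_rate_by_postal_code(customers)
print(rates)
```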
The use case as a whole involves a combination of heterogeneous data where interoperability is critically important. The aforementioned AI models will be trained with decentralized data provided through the SEDIMARK Marketplace, with low latency and with security and privacy preserved.
SEDIMARK does not focus on one particular domain but intends to design and prototype a secure decentralised and intelligent data and services marketplace that bridges remote data platforms and allows the efficient and privacy-preserving sharing of vast amounts of heterogeneous, high-quality, certified data and services supporting the common EU data spaces.
Four use cases are included in the project and one of them, driven by EGM, will focus on ‘water data’ exploitation. SEDIMARK will make use of the AI-based tools for data quality management, metadata management and semantic interoperability for the gathering of data. It will build upon the SEDIMARK decentralized infrastructure for handling of security and privacy policies and provide integrated services for validation, semantic enrichment and transformation of the data.
In our commitment to collaboratively advance water data management and utilization, SEDIMARK is proud to partner with the ICT4WATER cluster, comprising more than 60 pioneering highly digitized water-related projects. As part of this collaboration, SEDIMARK will actively contribute by sharing its latest advancements with the ICT4WATER ecosystem, fostering the creation of a decentralized and secure marketplace for data and services. Together, we aim to collectively drive innovation and sustainable solutions in the realm of water resource management supported by ICT tools.
In the modern digital age, ensuring seamless data management, storage, and retrieval is of utmost importance. Enter Distributed Storage Solutions (DSS), the backbone for businesses aiming for consistent data access, elasticity, and robustness. At the core of leveraging the full prowess of DSS is an element often overlooked - the orchestrator pipeline. Let’s dive deeper into why this component is the unsung hero of data management.
A Deep Dive into Distributed Storage
Rather than placing all eggs in one basket with a singular, centralized system, DSS prefers to spread them out. By scattering data over a multitude of devices, often geographically dispersed, DSS ensures data availability, even when individual systems falter. It's the vanguard of reliable storage for the modern enterprise.
Why the Orchestrator Pipeline Steals the Show
Imagine an orchestra without its conductor – chaotic, right? The orchestrator pipeline for DSS is much like that crucial conductor, ensuring every piece fits together in perfect harmony. Here's how it makes a difference in the realm of DSS:
- The Automation Magic: Seamlessly manages data storage, retrieval, and flow across various nodes.
- Master of Balancing: Channels data traffic efficiently, promoting top-tier performance with minimal lag.
- Guardian Angel Protocols: Steps in to resurrect data during system failures, keeping business operations uninterrupted.
- The Efficiency Maestro: Regularly gauges system efficiency, making on-the-fly tweaks for optimal functioning.
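To give a feel for what these duties mean in practice, here is a toy, in-memory sketch of an orchestrator that replicates each item on the least-loaded nodes and keeps serving reads after a node fails. The `Orchestrator` class and its methods are hypothetical and heavily simplified; a production orchestrator would also handle networking, consistency and re-replication.

```python
class Orchestrator:
    """Toy orchestrator: balanced replica placement and failover reads."""

    def __init__(self, nodes, replicas=2):
        self.nodes = {name: {} for name in nodes}   # node name -> stored items
        self.healthy = set(nodes)
        self.replicas = replicas

    def put(self, key, value):
        # Balancing: write to the healthy nodes that currently hold the least
        # data (ties broken by name so the choice is deterministic).
        targets = sorted(self.healthy,
                         key=lambda n: (len(self.nodes[n]), n))[:self.replicas]
        for node in targets:
            self.nodes[node][key] = value
        return targets

    def mark_failed(self, node):
        self.healthy.discard(node)

    def get(self, key):
        # Failover: read from any healthy node that holds a replica.
        for node in sorted(self.healthy):
            if key in self.nodes[node]:
                return self.nodes[node][key]
        raise KeyError(key)

orchestrator = Orchestrator(["a", "b", "c"])
first = orchestrator.put("report", "v1")     # stored on nodes a and b
second = orchestrator.put("metrics", "v2")   # node c is emptiest, then a
orchestrator.mark_failed("a")
print(orchestrator.get("report"))   # still readable from node b
print(orchestrator.get("metrics"))  # still readable from node c
```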
Why Combine the Orchestrator with DSS?
There are four main reasons to combine the orchestrator with DSS:
- Trustworthy Operations: By streamlining and fine-tuning data tasks, it minimizes chances of human errors.
- Effortless Scaling: As data reservoirs expand, the orchestrator ensures DSS stretches comfortably, dodging manual hiccups.
- Resource Utilization at its Best: Champions the cause of optimal resource use, optimizing costs in the long run.
- Silky-smooth Functioning: System updates or maintenance? The orchestrator ensures no hitches, keeping operations smooth.
While DSS paints a compelling picture of modern data storage, the orchestrator pipeline is the brush that brings out its true colors, crafting an efficient, harmonious data masterpiece. In a world where data stands tall as a business linchpin, it's not just about storing it – it's about managing it with flair.
In this information age, data-driven business decision making has become a key factor for the success of organizations. If you are looking to access valuable information that allows you to make informed strategic decisions, Data Marketplaces are a powerful tool to consider.
What is a Data Marketplace?
Data Marketplaces are digital platforms designed to collect, manage and provide access to a wide range of relevant and up-to-date data. From demographics to market trends, these centralized spaces provide valuable information to drive the growth of your ideas or your business decision making.
Advantages for your business
- Accurate and Updated Data: Obtain reliable and up-to-date information that allows you to be aware of the latest market trends and behaviors.
- Informed Decision Making: Base your strategies on solid data, allowing you to make better decisions and reduce risks.
- Accurate Segmentation: Access segmented and personalized data to better understand your customers and tailor your products and services to their needs.
- Campaign Optimization: Improve the effectiveness of your marketing and advertising campaigns by targeting specific audiences with relevant messages.
- Fostering Innovation: Find new opportunities and growth areas thanks to the information gathered in the Data Marketplace.
How it works
- Data Collection: A wide variety of sources, including surveys, public data and social networks, feed Marketplaces with up-to-date and relevant information.
- Access to Information: Users can access the data sets that best fit their needs, enabling them to obtain information specific to their objectives.
- Security and Privacy: The Data Marketplace must present means that guarantee the security and protection of information, respecting the privacy of users and complying with current regulations.
In an increasingly competitive business environment, informed decision making is critical. Data Marketplaces give you the opportunity to gain a competitive advantage, identify opportunities and improve the efficiency of your business.
Are you ready to take the leap into data intelligence? Discover the potential of Data Marketplaces through SEDIMARK's approach and get ready for success in the data-driven economy.
#DataMarketplace #DataIntelligence #InformedDecisionMaking #BusinessGrowth #OperationalEfficiency #DigitalTransformation #DataForSuccess
SEDIMARK partners University of Surrey, INRIA and University College Dublin published a new work at the 2022 8th #IEEE World Forum on IoT Conference in Yokohama, Japan, on a privacy-preserving ontology inspired by #GDPR requirements, for semantically interoperable #IoT data value chains.
Testing and experimentation are crucial for promoting innovation and building systems that can evolve to meet high levels of service quality. IoT data that belong to users and from which their personal information can be inferred are frequently shared in the background of IoT systems with third parties for experimentation and building quality services. This sharing raises privacy concerns, as in most cases, the data are gathered and shared without the user's knowledge or explicit consent. With the introduction of GDPR, IoT systems and experimentation platforms that federate data from different deployments, testbeds, and data providers must be privacy-preserving. The wide adoption of IoT applications in scenarios ranging from smart cities to Industry 4.0 has raised concerns for the privacy of users' data collected using IoT devices. Inspired by the GDPR requirements, we propose an IoT ontology built using available standards that enhances privacy, enables semantic interoperability between IoT deployments, and supports the development of privacy-preserving experimental IoT applications. We also propose recommendations on how to efficiently use our ontology within an IoT testbed and federating platforms. Our ontology is validated for different quality assessment criteria using standard validation tools. We focus on “experimentation” without loss of generality because it covers scenarios from both research and industry that are directly linked with innovation.
Automated Machine Learning (Auto-ML) is an emerging technology that automates the tasks involved in building, training, and deploying machine learning models (He et al., 2021). With the increasing ubiquity of machine learning, there is an ever-growing demand for specialized data scientists and machine learning experts. However, not all organizations have the resources to hire these experts. Auto-ML software platforms address this issue by enabling organizations to utilize machine learning more easily, even without specialized experts.
Auto-ML platforms can be obtained from third-party vendors, accessed through open-source repositories like GitHub, or developed internally. These platforms automate many of the tedious and error-prone tasks involved in machine learning, freeing up data scientists' time to focus on more complex tasks. Auto-ML uses advanced algorithms and techniques to optimize the model and improve its accuracy, leading to better results.
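At its core, this kind of automation is a search loop: try several candidate configurations, score each on held-out data, and keep the best. The sketch below does this for a single hyperparameter (the number of neighbours `k` of a tiny one-dimensional k-nearest-neighbours regressor). It is a deliberately minimal illustration with made-up data; real Auto-ML platforms search over many model families and settings with far more sophisticated strategies.

```python
def knn_predict(train, x, k):
    """Predict y for x as the mean y of the k training points nearest to x."""
    nearest = sorted(train, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / k

def validation_error(train, validation, k):
    """Mean squared error of the k-NN regressor on the validation set."""
    return sum((knn_predict(train, x, k) - y) ** 2
               for x, y in validation) / len(validation)

def auto_select_k(train, validation, candidates):
    """The automated step: pick the candidate k with the lowest validation error."""
    return min(candidates, key=lambda k: validation_error(train, validation, k))

# Made-up data following y = 2x; candidates must not exceed len(train).
train = [(x, 2 * x) for x in range(10)]
validation = [(2.5, 5.0), (6.5, 13.0)]

best_k = auto_select_k(train, validation, candidates=[1, 2, 5])
print(best_k)  # 2
```

Here `k = 2` wins because averaging the two neighbours on either side of each validation point reproduces the linear trend exactly, which a person would otherwise have to find by trial and error.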
One of the key benefits of Auto-ML is that it reduces the risk of human error. Since many of the tasks involved in machine learning are tedious and repetitive, there is a high chance of error when performed manually. Auto-ML automates these tasks, reducing the risk of human error and improving the overall accuracy of the model. In addition to reducing errors, Auto-ML also provides transparency by documenting the entire process. This makes it easier for researchers to understand how the model was developed and to replicate the process. Auto-ML can also be used by teams of data scientists, enabling collaboration and sharing of insights.
Auto-GPT is a popular tool that is often mentioned alongside Auto-ML. Strictly speaking, it is neither an Auto-ML platform nor a language model itself: it is an open-source application that chains calls to a GPT language model in order to pursue a user-defined goal with little human intervention. The underlying language model uses deep learning to generate human-like text and can be applied to a range of natural language processing tasks, including text classification, sentiment analysis, and language translation. By automating multi-step, text-based work, such tools let researchers focus on more complex tasks, such as data analysis and model deployment. This is one more example of how automation is making machine learning more accessible to organizations of all sizes.
SEDIMARK aims to enhance data quality and reduce the reliance on domain experts in the data curation process. To accomplish this objective, the SEDIMARK team in the Insight Centre for Data Analytics of University College Dublin (UCD) is actively exploring the utilization of Auto-ML techniques. By leveraging Auto-ML, SEDIMARK strives to optimize its data curation process and minimize the involvement of domain experts, leading to more efficient and accurate results.
He, Xin, Kaiyong Zhao, and Xiaowen Chu. "AutoML: A survey of the state-of-the-art." Knowledge-Based Systems 212 (2021): 106622.