
Enhancing Data Interoperability and Quality with Data Formatter and Mapper

SEDIMARK · June 28, 2024
Data Mapper representation

In the era of big data, integrating and analyzing data from various sources has become increasingly complex. Data providers often use different formats and structures, which makes data interoperability hard to achieve. Robust mechanisms are therefore needed to convert and harmonize data so that they can be used effectively for analysis and decision-making. SEDIMARK has identified two critical components in this process and is actively working on them: the data formatter and the data mapper.

A data formatter converts data from various providers, each using a different format, into the standardized NGSI-LD format. This standardization is crucial because it allows data from disparate sources to be compared, combined, and analyzed in a consistent manner. Without a data formatter, the heterogeneity of data formats would pose a significant barrier to interoperability. For example, one provider might supply data in XLSX format, another in JSON, and yet another in CSV. A data formatter processes these different formats and transforms them into a unified format that can be easily managed and analyzed by SEDIMARK tools.
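To make the idea concrete, here is a minimal sketch of a formatter in Python. It is not SEDIMARK's implementation: the function names (`read_records`, `to_ngsi_ld`), the `WeatherObserved` entity type, and the `station` and `temperature` fields are assumptions chosen for illustration. Only CSV and JSON are handled, since both can be read with the standard library; XLSX would need a third-party reader.

```python
import csv
import io
import json

# Core NGSI-LD context published by ETSI.
NGSI_LD_CONTEXT = "https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context.jsonld"


def read_records(raw: str, fmt: str) -> list[dict]:
    """Parse raw CSV or JSON text into a list of flat dictionaries."""
    if fmt == "csv":
        # Note: csv.DictReader yields every value as a string.
        return list(csv.DictReader(io.StringIO(raw)))
    if fmt == "json":
        data = json.loads(raw)
        return data if isinstance(data, list) else [data]
    raise ValueError(f"unsupported format: {fmt}")


def to_ngsi_ld(record: dict, entity_type: str, id_field: str) -> dict:
    """Wrap one flat record as an NGSI-LD entity (illustrative layout)."""
    entity = {
        "id": f"urn:ngsi-ld:{entity_type}:{record[id_field]}",
        "type": entity_type,
        "@context": [NGSI_LD_CONTEXT],
    }
    for key, value in record.items():
        if key == id_field:
            continue  # the identifier is already encoded in the URN
        entity[key] = {"type": "Property", "value": value}
    return entity


# Two hypothetical providers delivering the same kind of data differently.
csv_text = "station,temperature\ns1,21.5\n"
json_text = '[{"station": "s2", "temperature": 19.0}]'

entities = [
    to_ngsi_ld(rec, "WeatherObserved", "station")
    for raw, fmt in ((csv_text, "csv"), (json_text, "json"))
    for rec in read_records(raw, fmt)
]
```

Both providers end up with the same entity shape, which is the point of the formatter. A fuller version would also cast CSV strings to numeric types before wrapping them.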

A data mapper comes into play after data processing: it stores the data and maps them to a specific data model. This process involves not only aligning the data with the model but also enriching them with quality metrics and metadata. During this stage, the data mapper adds information about data quality obtained during the data processing step, such as identified outliers and their corresponding anomaly scores, as well as missing and redundant data. The enriched data model becomes a powerful asset for future analyses, giving a complete picture of the data.

By converting various data formats into a standard format and then mapping and enriching the data, SEDIMARK achieves a higher level of data integration. This process ensures that data from multiple sources can be used together seamlessly, facilitating more accurate and comprehensive analyses. Moreover, including data quality metrics during mapping makes the data more reliable and trustworthy. Information about outliers, missing data, and redundancy is crucial for data scientists and analysts, as it allows them to make informed decisions and apply appropriate processing techniques.
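The mapper's enrichment step can be sketched in the same spirit. The example below assumes that quality annotations have already been computed during data processing and only shows how they might be attached to an NGSI-LD entity. The names `enrich_with_quality`, `summarise_dataset`, `isOutlier`, and `anomalyScore` are hypothetical and not taken from SEDIMARK's data model.

```python
import copy


def enrich_with_quality(entity: dict, quality: dict) -> dict:
    """Return a copy of an NGSI-LD entity with quality annotations attached.

    `quality` maps attribute names to hypothetical annotations, for example
    {"temperature": {"isOutlier": True, "anomalyScore": 0.93}}.
    """
    enriched = copy.deepcopy(entity)  # leave the input entity untouched
    for attr, notes in quality.items():
        if attr not in enriched:
            continue  # ignore annotations for attributes the entity lacks
        for name, value in notes.items():
            # An NGSI-LD Property may carry nested Properties.
            enriched[attr][name] = {"type": "Property", "value": value}
    return enriched


def summarise_dataset(records: list[dict]) -> dict:
    """Count missing values and exact duplicate rows in flat records."""
    missing = sum(1 for r in records for v in r.values() if v in (None, ""))
    seen = set()
    duplicates = 0
    for r in records:
        key = tuple(sorted(r.items()))  # order-independent row fingerprint
        if key in seen:
            duplicates += 1
        seen.add(key)
    return {"missingValues": missing, "duplicateRows": duplicates}


entity = {
    "id": "urn:ngsi-ld:WeatherObserved:s1",
    "type": "WeatherObserved",
    "temperature": {"type": "Property", "value": 48.0},
}
enriched = enrich_with_quality(
    entity, {"temperature": {"isOutlier": True, "anomalyScore": 0.93}}
)
summary = summarise_dataset(
    [{"a": 1, "b": None}, {"a": 1, "b": None}, {"a": 2, "b": ""}]
)
```

Keeping the quality flags next to the value they describe lets an analyst filter or down-weight suspicious readings without consulting a separate report.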
