In the fast-evolving world of data science and AI, ensuring that workflows are reproducible, portable, and scalable is essential for success. However, many modern tools prioritize ease of use over standardization, making it difficult to share and execute workflows across different environments.
The SEDIMARK team tackled this challenge by creating a transformation methodology to convert Mage.ai workflows into Common Workflow Language (CWL) and Python-based workflows. This work aims to ensure compatibility with industry standards, improving the portability, reproducibility, and scalability of data pipelines. By integrating standardized workflows, SEDIMARK enables organizations to confidently manage their AI pipelines in secure, decentralized environments.
Mage.ai is a powerful tool for building data pipelines quickly and easily. However, its native workflows are not compatible with industry-standard formats like CWL, which are essential for reproducibility and portability.
CWL (Common Workflow Language) is an open standard that ensures workflows can be shared, reproduced, and executed across different platforms. It’s widely used in fields like bioinformatics and data science to standardize workflows for deployment in cloud environments, HPC clusters, and edge computing platforms.
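To make the idea of a CWL description concrete, here is a minimal sketch of a CWL `CommandLineTool`, built as a Python dictionary and printed as JSON (CWL documents are usually written in YAML, and JSON is accepted as well). The script name `clean_data.py`, the helper `make_tool`, and the input and output names are hypothetical and chosen only for illustration; they do not come from SEDIMARK.

```python
import json


def make_tool(script: str, input_name: str = "input_file") -> dict:
    """Describe a command-line step as a CWL CommandLineTool (as a dict).

    `script` is a hypothetical Python script that reads one file and
    writes its result to standard output.
    """
    return {
        "cwlVersion": "v1.2",
        "class": "CommandLineTool",
        # Command to run: python <script> <input_file>
        "baseCommand": ["python", script],
        "inputs": {
            # One File input, passed as the first positional argument.
            input_name: {"type": "File", "inputBinding": {"position": 1}},
        },
        "outputs": {
            # Capture standard output as the tool's result file.
            "result": {"type": "stdout"},
        },
        "stdout": "output.txt",
    }


tool = make_tool("clean_data.py")
print(json.dumps(tool, indent=2))
```

Because the description is plain data, the same file can be handed to any CWL-compliant runner, whether it targets a laptop, a cluster, or a cloud service.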
By converting Mage.ai pipelines to CWL, organizations participating in SEDIMARK can achieve portable, reproducible, and scalable data pipelines.
The SEDIMARK team developed a two-step methodology to transform Mage.ai pipelines into standardized workflows.
This transformation enables organizations to leverage SEDIMARK’s decentralized marketplace while ensuring that their data pipelines remain compatible with industry standards.
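One way to picture such a mapping is sketched below. This is not the SEDIMARK implementation: it assumes a simplified pipeline description in which each block has a `name` and a list of `upstream_blocks`, and it assumes that a CWL tool file named `<block>.cwl` with a single `result` output already exists for each block. The function `blocks_to_cwl_workflow` and all input and output names are hypothetical.

```python
import json

# Hypothetical, simplified pipeline: each block lists the blocks it depends on.
PIPELINE = [
    {"name": "load_data", "upstream_blocks": []},
    {"name": "clean_data", "upstream_blocks": ["load_data"]},
    {"name": "export_data", "upstream_blocks": ["clean_data"]},
]


def blocks_to_cwl_workflow(blocks: list) -> dict:
    """Turn a list of blocks with dependencies into a CWL Workflow dict."""
    steps = {}
    for block in blocks:
        name = block["name"]
        upstream = block["upstream_blocks"]
        if upstream:
            # Wire each upstream block's `result` output into this step.
            step_inputs = {f"from_{u}": f"{u}/result" for u in upstream}
        else:
            # Blocks without dependencies read the workflow-level input.
            step_inputs = {"source": "pipeline_input"}
        steps[name] = {
            "run": f"{name}.cwl",  # assumed to exist, one tool file per block
            "in": step_inputs,
            "out": ["result"],
        }

    # Blocks that nothing depends on provide the workflow outputs.
    used_upstream = {u for b in blocks for u in b["upstream_blocks"]}
    sinks = [b["name"] for b in blocks if b["name"] not in used_upstream]
    outputs = {
        f"{s}_result": {"type": "File", "outputSource": f"{s}/result"}
        for s in sinks
    }

    return {
        "cwlVersion": "v1.2",
        "class": "Workflow",
        "inputs": {"pipeline_input": "File"},
        "outputs": outputs,
        "steps": steps,
    }


workflow = blocks_to_cwl_workflow(PIPELINE)
print(json.dumps(workflow, indent=2))
```

The dependency graph carries over directly: each block becomes a workflow step, and each upstream relation becomes a `step/output` source reference, which is what lets a CWL runner work out the execution order on its own.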
The SEDIMARK project aims to establish a distributed data marketplace, where organizations can securely exchange data pipelines, AI models, and other digital assets. The transformation from Mage.ai to CWL ensures that these assets are portable, reproducible, and compatible with existing standards.
For data scientists, engineers, and organizations participating in SEDIMARK, this transformation bridges the gap between intuitive pipeline design and standardized execution, enabling scalable and secure data processing.