🐥

CWLCon 2024 Session 2

Published on 2024/05/10

Session 2 Schedule (UTC Time) APAC-Americas Friendly time

Japanese version is here

Session 2 (Online only)
Thursday 9 May / Friday 10 May
APAC-Americas Friendly time

Session 2 Schedule - CWLCon 2024

The discussions on handling data in cloud and distributed environments were particularly engaging. It's clear that the development environments, including various editors, are becoming more robust, making development more accessible.

One notable trend: several presentations used only CWL's CommandLineTool component, or wrote a self-contained Workflow and treated it as a single executable unit. Around that unit they built custom implementations for distributing tasks, combining them, and retrying on errors, all designed on the assumption that the work will be deployed across multiple environments.

For tasks intended to run in multiple environments, it seems that many of those environments already provide APIs. The environment-independent (abstract) parts are therefore written first, and environment-specific adapters are added afterwards to control each platform, similar to how Toil operates.
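A minimal sketch of that split between an abstract part and environment-specific adapters is shown here. All class and method names (Job, PlatformAdapter, LocalEchoAdapter, Launcher) are hypothetical and are not taken from Toil or from any tool presented in the session; the stand-in adapter only records submissions instead of calling a real platform API.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class Job:
    """Platform-neutral description of one CWL run (hypothetical shape)."""
    cwl_path: str
    inputs: dict = field(default_factory=dict)


class PlatformAdapter(ABC):
    """Environment-specific part: knows how to talk to one platform's API."""

    @abstractmethod
    def submit(self, job: Job) -> str:
        """Submit the job and return a platform-specific job ID."""


class LocalEchoAdapter(PlatformAdapter):
    """Stand-in adapter that only records submissions (assumption for the demo)."""

    def __init__(self) -> None:
        self.submitted: list[Job] = []

    def submit(self, job: Job) -> str:
        self.submitted.append(job)
        return f"local-{len(self.submitted)}"


class Launcher:
    """Abstract part: written once, independent of where the job runs."""

    def __init__(self, adapter: PlatformAdapter) -> None:
        self.adapter = adapter

    def run_all(self, jobs: list[Job]) -> list[str]:
        # The launcher never touches platform details; it only calls submit().
        return [self.adapter.submit(job) for job in jobs]


if __name__ == "__main__":
    launcher = Launcher(LocalEchoAdapter())
    jobs = [Job("align.cwl", {"sample": "A"}), Job("align.cwl", {"sample": "B"})]
    print(launcher.run_all(jobs))
```

Supporting another environment would then mean adding one more PlatformAdapter subclass, while the Launcher stays unchanged.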

Tasks were discussed in terms of granularity:

  • Per sample: Execution completes on a single compute node (instance) per file.
  • Per hardware: Differentiates targets like FPGA and GPU.
  • Per job queue system: Distributes tasks between systems like Slurm and Kubernetes.
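The three levels of granularity above can be pictured with a small sketch: one task per sample file, a hardware tag on each task, and a routing table that picks a job queue system. The mapping from hardware to queue (GPU and FPGA to Slurm, CPU to Kubernetes) is purely an illustrative assumption and was not stated in the talks; Task, fan_out and group_by_queue are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    sample_file: str       # per-sample granularity: one task per input file
    hardware: str = "cpu"  # per-hardware granularity: "cpu", "gpu" or "fpga"


# Per-job-queue-system granularity. This mapping is a made-up example.
QUEUE_FOR_HARDWARE = {"cpu": "kubernetes", "gpu": "slurm", "fpga": "slurm"}


def fan_out(sample_files, hardware="cpu"):
    """Create one independent task per sample file."""
    return [Task(path, hardware) for path in sample_files]


def group_by_queue(tasks):
    """Group tasks by the queue system that should receive them."""
    groups = {}
    for task in tasks:
        groups.setdefault(QUEUE_FOR_HARDWARE[task.hardware], []).append(task)
    return groups


if __name__ == "__main__":
    tasks = fan_out(["s1.fastq", "s2.fastq"], hardware="gpu") + fan_out(["s3.fastq"])
    for queue, members in group_by_queue(tasks).items():
        print(queue, [t.sample_file for t in members])
```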

JSON-Schemas for Validating Your CWL Code and CWL Code Inputs

JSON-Schemas for Validating Your CWL Code and CWL Code Inputs - CWLCon 2024 - Common Workflow Language Discourse

This talk relates to the CWL development environment. JSON Schema definitions can become a powerful tool during CWL development, both for the CWL code itself and for its inputs. This topic was previously discussed at the APAC EMEA meeting.
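To illustrate the idea of checking a CWL input object against a schema before running anything, here is a minimal sketch. The schema and the input names (reads, threads) are hypothetical, and the validator handles only a tiny subset of JSON Schema keywords (type, const, required, properties); it is not the implementation shown in the talk. In practice a full validator such as the Python jsonschema package would be the more likely choice.

```python
# Hypothetical schema for the input object of a tool that takes
# one File input and an integer thread count.
INPUT_SCHEMA = {
    "type": "object",
    "required": ["reads", "threads"],
    "properties": {
        "reads": {
            "type": "object",
            "required": ["class", "path"],
            "properties": {
                "class": {"const": "File"},
                "path": {"type": "string"},
            },
        },
        "threads": {"type": "integer"},
    },
}

_TYPES = {"object": dict, "string": str, "integer": int}


def validate(value, schema, where="$"):
    """Return a list of error strings (empty when the value is valid).

    Supports only the keywords type, const, required and properties.
    """
    expected = schema.get("type")
    if expected is not None:
        ok = isinstance(value, _TYPES[expected])
        if expected == "integer" and isinstance(value, bool):
            ok = False  # bool is a subclass of int in Python, but not a JSON integer
        if not ok:
            return [f"{where}: expected {expected}"]

    errors = []
    if "const" in schema and value != schema["const"]:
        errors.append(f"{where}: expected constant {schema['const']!r}")
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{where}: missing required property '{key}'")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in value:
                errors.extend(validate(value[key], sub_schema, f"{where}.{key}"))
    return errors


if __name__ == "__main__":
    job = {"reads": {"class": "File"}, "threads": "4"}
    for message in validate(job, INPUT_SCHEMA):
        print(message)
```

The benefit is that a mistyped or missing input is reported with its location before a job is ever submitted.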

If time permits, joining the regional meetings is a good way to pick up insights like this:
Community | Common Workflow Language (CWL)

NGS360 - A NGS Data Management and Analysis Platform

NGS360 - A NGS Data Management and Analysis Platform - CWLCon 2024 - Common Workflow Language Discourse

The task execution part employs CWL's CommandLineTool. They are developing an implementation called PAML for orchestrating sample inputs and error handling.

NGS360/PAML: Multi-Platform Launcher Framework

Currently, it supports:

  • Arvados
  • Seven Bridges

CWL @ ICA (Illumina Connected Analytics)

CWL @ ICA (Illumina Connected Analytics) - CWLCon 2024 - Common Workflow Language Discourse

I personally found it interesting how they use temporaryFailure to handle cases where a Spot Instance is terminated. The presentation itself was very engaging, and I am extremely interested in the internal implementation.
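The internal implementation was not shown, so the following is only a sketch of the general idea. CWL's CommandLineTool lets a tool declare successCodes, temporaryFailCodes and permanentFailCodes, and a runner can retry when the resulting status is temporaryFailure. The function names, the retry limit, and the use of exit code 143 to stand for an interrupted instance are all assumptions made for illustration.

```python
def classify_exit(code, success_codes=(), temporary_fail_codes=(), permanent_fail_codes=()):
    """Map a process exit code to a CWL-style status string."""
    if code in success_codes:
        return "success"
    if code in temporary_fail_codes:
        return "temporaryFailure"
    if code in permanent_fail_codes:
        return "permanentFailure"
    # No explicit rule matched: 0 is success, anything else is a permanent failure.
    return "success" if code == 0 else "permanentFailure"


def run_with_retry(run_once, temporary_fail_codes, max_attempts=3):
    """Call run_once() until it stops reporting temporaryFailure.

    run_once is any callable returning an exit code (hypothetical hook).
    Returns (final_status, attempts_used). max_attempts must be at least 1.
    """
    status = "permanentFailure"
    for attempt in range(1, max_attempts + 1):
        status = classify_exit(run_once(), temporary_fail_codes=temporary_fail_codes)
        if status != "temporaryFailure":
            return status, attempt
    return status, max_attempts


if __name__ == "__main__":
    # Simulate two interrupted attempts followed by a clean run.
    exit_codes = iter([143, 143, 0])
    print(run_with_retry(lambda: next(exit_codes), temporary_fail_codes=(143,)))
```

Treating an interruption as temporary rather than permanent is what allows the same step to be rescheduled without failing the whole workflow.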

CWL-Enabled Reusable and Reproducible Genomic Data Management and Analysis in R

CWL-Enabled Reusable and Reproducible Genomic Data Management and Analysis in R - CWLCon 2024 - Common Workflow Language Discourse

The discussion centered on using CWL from R. Notably, the implementation rworkflow/RcwlCloud allows CWL jobs to be submitted to cloud environments from R.

Backends include:

  • AnVIL
  • CAVATICA
  • Cancer Genomics Cloud

The working environment features interactive apps such as RStudio and Jupyter, and supports both CWL and WDL workflows.

Performance Evaluation of GPU-intensive Genome Analysis Workflows in HPC and Cloud

Performance Evaluation of GPU-intensive Genome Analysis Workflows in HPC and Cloud - CWLCon 2024 - Common Workflow Language Discourse

This presentation involved benchmarking analysis pipelines using CWL, focusing on GPU usage. Future discussions may include GPU scheduling challenges.

Extending CWL for High-Performance Computing: A Visual Workflow System with HPC Enhancements

Extending CWL for High-Performance Computing: A Visual Workflow System with HPC Enhancements - CWLCon 2024 - Common Workflow Language Discourse

Discussions covered operational uses in supercomputing centers, particularly visualizing job submissions. The API provided in the supercomputing environment uses Slurm and Kubernetes.

zetako/cwl.go: CWL Parser and Runner.

Developed in Go, this project has sparked considerable interest.

TODO

TODO: Need to follow up with Alexis regarding input file validation methods discussed previously.
