Fuzzfile Syntax Guide

Overview

Workflows are the fundamental unit of work in Fuzzball, orchestrating container execution, data movement, and resource allocation across compute clusters. They are described by Fuzzfiles: YAML 1.2 documents with a well-defined structure and syntax. A workflow definition consists of several top-level sections that together define the complete execution lifecycle: how data flows into and out of the workflow, what compute operations to perform, what resources those operations require, and how jobs depend on each other. The top-level sections are:

version: v4
files:
defaults:
annotations:
preemptible:
volumes:
jobs:
services:

Each section serves a specific purpose:

version: Specifies the workflow syntax version (v1 or v4; new workflows must use v4) [required]
files: Defines inline file content that can be referenced in volumes and jobs [optional]
defaults: Sets default configurations (mounts, environment, policy, resources) for jobs [optional]
annotations: Provides workflow-level metadata for scheduling and placement [optional]
preemptible: Marks all of the workflow’s allocations as eligible for preemption [optional]
volumes: Defines storage volumes with data ingress (importing) and egress (exporting) capabilities [optional]
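jobs: Specifies the computational work to be performed, organized as a directed acyclic graph (DAG) [optional; at least one job or service is required]
services: Defines long-running container services that support jobs or provide external endpoints [optional; at least one job or service is required]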

A workflow must contain at least one job or service.
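
As a minimal sketch of that rule, the following Fuzzfile contains only the required version field and a single job. The job name and the echoed text are illustrative placeholders, not values taken from this guide.

```yaml
# Smallest useful workflow: version plus one job.
version: v4
jobs:
  hello:                            # hypothetical job name
    image:
      uri: docker://alpine:latest
    command: [echo, "hello from a Fuzzfile"]
```

Every other top-level section is optional and can be added as the workflow grows.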

Top-Level Sections

version

Required. Denotes the workflow syntax version. Valid values are v1 (legacy) and v4. New workflows must use v4 and the V4 Fuzzfile syntax. The v1 format is supported only for backward compatibility with existing workflows. Note that all snippets in this reference use V4 syntax.

Fuzzball makes a best-effort attempt to auto-upgrade v1 workflows to v4 at execution time. V1 volume references (reference: volume://...) and old-format mounts are automatically converted. However, writing version: v4 for new workflows avoids the upgrade step and ensures access to all V4 features.

Default version in the Web UI

When creating a new workflow in the Web UI (via “Create New” in the Workflow Editor) or hydrating a workflow template from the Workflow Catalog that lacks an explicit version field, the Web UI automatically defaults to version: v4. This means that new workflows created through the Web UI will use v4 syntax for volumes, mounts, and other workflow elements unless you manually edit the version field to v1.

If you have existing v1 workflows and want to preserve v1 syntax when creating variations, either copy an existing v1 workflow as a starting point or explicitly set version: v1 in the Workflow Editor’s YAML view (press e to access it).

Structure:

version: v4

Example:

version: v4

Upgrading existing V1 workflows

If you have existing v1 Fuzzfiles that you want to convert to v4 format permanently (rather than relying on the automatic upgrade at execution time), use the CLI upgrade command:

$ fuzzball workflow upgrade my-workflow.fz > my-workflow-v4.fz

The upgrade runs server-side with your identity context, allowing persistent volume names to be resolved against the provisioners available to your group. The output is YAML by default; use -o json for JSON output.

To verify that a workflow (V1 or V4) is valid before submitting it:

$ fuzzball workflow validate my-workflow-v4.fz

The upgrade command is useful for batch-converting a library of V1 workflows. Redirect the output to a new file so you can review the changes before replacing the original:

$ fuzzball workflow upgrade old-workflow.fz > new-workflow.fz
$ diff old-workflow.fz new-workflow.fz

preemptible

Optional. When true, marks every allocation produced by the workflow — jobs, services, and the internal image, volume, and file stages — as eligible for preemption by higher-priority allocations on clusters where preemption is enabled. Defaults to false.

Setting this field is equivalent to starting the workflow with fuzzball workflow start --preemptible. A job or service that sets its own preemptible value explicitly overrides the workflow-level setting.

Structure:

preemptible: <boolean>

Example:

version: v4
preemptible: true
jobs:
  batch:                # preemptible via the workflow-level setting
    image:
      uri: docker://alpine:latest
    command: [run-batch]
  critical:
    image:
      uri: docker://alpine:latest
    command: [run-critical]
    preemptible: false  # overrides the workflow-level setting

files

Optional. Defines inline file contents that can be referenced in volume ingress and jobs using a file://<arbitrary_name> URI scheme. This is useful for embedding small configuration files, scripts, or data directly in the workflow definition rather than storing them externally. Note that some content may have to be base64-encoded to ensure that the resulting YAML file is valid.

Structure:

files:
  <file-name>: |
    <file-contents>
  <another-file-name>: |
    <more-contents>

Fields:

<file-name>: An arbitrary name to identify this inline file. This name is used in file:// URIs to reference the content. Can be any valid YAML key.
<file-contents>: The actual content of the file. Use the | literal block scalar for multi-line content.

Example:

files:
  config.ini: |
    [settings]
    debug=true
    timeout=30
  data.csv: |
    id,name,value
    1,alpha,42
    2,beta,3.14

Usage:

Inline files can be referenced in two ways:

In volume ingress - to copy the content into a volume:

volumes:
  scratch:
    use: ephemeral
    size: 50GB
    ingress:
      - source:
          uri: file://config.ini          # References the inline file
        destination:
          uri: file://config/app.ini      # Destination in the volume

In job files - to bind mount the content directly into a container:

jobs:
  my-job:
    image:
      uri: docker://alpine:latest
    files:
      /etc/app/config.ini: file://config.ini   # Mounted at this path in container
    script: |
      #!/bin/sh
      cat /etc/app/config.ini

defaults

Optional. Sets default configurations that are applied to all jobs in the workflow. Individual jobs can override these defaults by specifying their own values. This is useful for reducing repetition when multiple jobs share common settings like volume mounts or environment variables.

Structure:

defaults:
  job:
    env:                    # Optional
      - <VAR>=<value>
    mounts:                 # Optional
      <mount-path>: <volume-name>
    policy:                 # Optional
      timeout:
        execute: <duration>
    resource:               # Optional
      cpu:
        cores: <number>
        affinity: <CORE|SOCKET|NUMA>
        sockets: <number>
        threads: <boolean>
      memory:
        size: <size>
        by-core: <boolean>
      devices:
        <device-type>: <count>
      exclusive: <boolean>
      annotations:
        <key>: <value>

Fields:

job: Container for job-level defaults
- env (optional): List of environment variables in KEY=VALUE format that will be set in all job containers
- mounts (optional): Map of absolute volume mount paths to volume names.
- policy (optional): Default execution policies for all jobs
  - timeout (optional): Time limits for job execution
    - execute: Maximum duration (e.g., 2h, 30m, 1h30m)
- resource (optional): Default hardware resource requirements
  - cpu (required if resource is specified): CPU requirements
    - cores: Number of CPU cores
    - affinity: Binding strategy - CORE, SOCKET, or NUMA
    - sockets (optional): Number of CPU sockets
    - threads (optional): Whether to expose hardware threads (hyperthreading)
  - memory (required if resource is specified): Memory requirements
    - size: Amount of RAM (e.g., 4GB, 512MB)
    - by-core (optional): If true, size is per CPU core
  - devices (optional): Map of device types to counts (e.g., nvidia.com/gpu: 1)
  - exclusive (optional): If true, job gets exclusive access to the node
  - annotations (optional): Custom key-value pairs for advanced scheduling

Example:

defaults:
  job:
    env:
      - SCRATCH=/scratch
      - LOG_LEVEL=info
    mounts:
      /scratch: scratch
    policy:
      timeout:
        execute: 1h
    resource:
      cpu:
        cores: 2
        affinity: CORE
        threads: false
      memory:
        size: 4GB
        by-core: false

In this example, all jobs will automatically:

Have the SCRATCH and LOG_LEVEL environment variables set
Mount the scratch volume at /scratch
Timeout after 1 hour
Request 2 CPU cores and 4GB of memory

Individual jobs can override any of these defaults by specifying their own values. Default and job-specified environment variables are merged, with job variables taking precedence over defaults.
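
The override and merge behavior can be sketched as follows. This is an illustration of the rule as stated above, not verified output; the job names are hypothetical.

```yaml
version: v4
defaults:
  job:
    env:
      - LOG_LEVEL=info
      - SCRATCH=/scratch
    policy:
      timeout:
        execute: 1h
jobs:
  quick-check:                  # hypothetical job; inherits both variables and the 1h timeout
    image:
      uri: docker://alpine:latest
    command: [env]
  long-run:                     # hypothetical job; overrides one variable and the timeout
    image:
      uri: docker://alpine:latest
    command: [env]
    env:
      - LOG_LEVEL=debug         # job value takes precedence over the default LOG_LEVEL
    policy:
      timeout:
        execute: 6h             # replaces the default 1h limit for this job only
```

Under the merge rule, long-run would be expected to see LOG_LEVEL=debug together with the inherited SCRATCH=/scratch, while quick-check sees only the defaults.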

annotations

Optional. Sets workflow-level annotation defaults as key-value pairs. These annotations are merged into each job’s allocation annotations; per-job resource.annotations values take precedence over these defaults when the same key appears in both places. The merged set is what the scheduler matches key-by-key against each candidate node provisioner’s annotations map — node provisioners that do not have a matching entry for a given key are rejected for that job.

Structure:

annotations:
  <key>: <value>
  <another-key>: <another-value>

Fields:

<key>: String key identifying the annotation
<value>: String value for the annotation

Example:

annotations:
  nvidia.com/gpu.model: A100
  nvidia.com/gpu.memory: "80"

Common Use Cases:

GPU selection: Require a specific GPU model, family, or minimum memory
Node pool selection: Target specific sets of compute nodes that expose a matching annotation

Annotation matching is strict by default. Every key in the merged allocation annotations (from both workflow-level annotations and per-job resource.annotations) must either be present with a matching value in each candidate node provisioner’s annotations map, or the cluster admin must add the key to scheduler.ignoredAnnotations in the central configuration. The platform’s own internal annotation keys — a specific enumerated set under fuzzball.io/* such as fuzzball.io/workflow.id, fuzzball.io/job.name, and fuzzball.io/provisioner_cluster_id — are skipped automatically. This is an allowlist, not a prefix match: the fuzzball.io/ prefix is reserved for the platform, and any user-defined key placed under that prefix (e.g. fuzzball.io/my-team-label) is not auto-exempt and would still need to be added to scheduler.ignoredAnnotations. Treat the fuzzball.io/ namespace as off-limits for your own annotations.

Annotation keys used only for organizational metadata, auditing, or policy routing should not appear in either annotations or resource.annotations unless the cluster admin has added them to scheduler.ignoredAnnotations; otherwise the scheduler will reject any node provisioner that does not carry those keys.

See Scheduler annotation matching in the configuration reference for built-in GPU matchers and configuration details.
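
The precedence between workflow-level annotations and per-job resource.annotations can be sketched like this. The job names and commands are hypothetical, and whether the values match anything depends on the annotations your cluster's node provisioners actually carry.

```yaml
version: v4
annotations:
  nvidia.com/gpu.model: A100          # workflow-level default for every job
jobs:
  train:                              # hypothetical job; uses the default (A100)
    image:
      uri: docker://alpine:latest
    command: [run-train]              # placeholder command
    resource:
      cpu:
        cores: 4
      memory:
        size: 16GB
      devices:
        nvidia.com/gpu: 1
  render:                             # hypothetical job; same key set per job
    image:
      uri: docker://alpine:latest
    command: [run-render]             # placeholder command
    resource:
      cpu:
        cores: 4
      memory:
        size: 16GB
      devices:
        nvidia.com/gpu: 1
      annotations:
        nvidia.com/gpu.model: NVIDIA L40   # takes precedence over the workflow-level value
```

Because matching is strict, each job in this sketch can only be placed on a node provisioner whose annotations map contains a matching nvidia.com/gpu.model entry.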

volumes

Optional. Defines storage volumes that jobs can access. Volumes provide persistent or ephemeral storage for workflow data and support data ingress (importing data at workflow start) and egress (exporting data at workflow end).

Structure:

volumes:
  <volume-name>:
    use: <provisioner-name>    # Optional
    name: <persistent-name>    # Optional (only for persistent volumes)
    size: <capacity>           # Optional (only for ephemeral volumes)
    annotations:               # Optional (only for ephemeral volumes)
      <key>: <value>
    ingress:                   # Optional
      - source:
          uri: <source-uri>
          secret: <secret-ref>    # Optional
        destination:
          uri: <dest-uri>
        policy:                # Optional
          timeout:
            execute: <duration>
    egress:                    # Optional
      - source:
          uri: <source-uri>
        destination:
          uri: <dest-uri>
          secret: <secret-ref>    # Optional
        policy:                # Optional
          timeout:
            execute: <duration>

Fields:

<volume-name>: Arbitrary name for the volume, used to reference it in job mounts sections (e.g., scratch, data)
use (optional): Storage provisioner name. Can also be ephemeral or persistent for automatic provisioner selection
reference (deprecated): Legacy V1 volume URI format (volume://scope/class/name). Still supported for backward compatibility but use/name/size/annotations is preferred
ingress (optional): List of files to import into the volume at workflow start
- source (required): Where to fetch the data from
  - uri: Source location. Supported schemes: s3://, http://, https://, file:// (references inline files from the files section), and hf:// (pulls a model, dataset, or space repository from the HuggingFace Hub — see HuggingFace ingress)
  - secret (optional): Reference to credentials for accessing the source (format: secret://<scope>/<name>)
- destination (required): Where to place the data in the volume
  - uri: Destination path in the volume (format: file://<path>). When jobs mount this volume, the path will be relative to the mount point
- policy (optional): Execution policies for the transfer
  - timeout (optional): Time limit for the transfer
    - execute: Maximum duration (e.g., 5m, 1h)
egress (optional): List of files to export from the volume at workflow end. Structure is similar to ingress, but source is the file in the volume and destination is the external storage location

Ephemeral volume fields — these fields define a volume that is created when the workflow runs and destroyed when it completes:

size (optional): Requested volume capacity (e.g., 10GB, 1TiB)
annotations (optional): Key-value pairs for provisioner auto-selection. Fuzzball matches these against provisioner annotations to select the right storage backend

The size and annotations fields are only valid for ephemeral volumes. They cannot be combined with the name field.

Persistent volume fields — these fields reference an existing volume that was created before the workflow (via the CLI, Web UI, or API):

name (optional): Name of an existing persistent volume to bind. The volume must already exist on the provisioner specified by use, or on any accessible provisioner if use is omitted. Workflows that reference a volume that does not exist are rejected at submission time.

The name field is the signal that distinguishes a persistent volume from an ephemeral one. When name is set, the volume is persistent. When name is absent, the volume is ephemeral. Persistent volumes are never created dynamically — they must be created ahead of time.

Example:

volumes:
  scratch:
    use: my-provisioner
    size: 50GB
    ingress:
      - source:
          uri: s3://my-bucket/input-data.tar.gz
          secret: secret://group/AWS_CREDENTIALS
        destination:
          uri: file://inputs/data.tar.gz
        policy:
          timeout:
            execute: 10m
      - source:
          uri: file://config.ini    # References inline file
        destination:
          uri: file://config/app.ini
    egress:
      - source:
          uri: file://results/output.tar.gz
        destination:
          uri: s3://my-bucket/results/output.tar.gz
          secret: secret://group/AWS_CREDENTIALS
  data:
    use: my-provisioner
    name: research-data

In this example:

scratch is ephemeral (no name, has size), downloads data from S3 at start, and uploads results at end
data is persistent (has name), survives workflow completion with no data transfers
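
As a variation that relies on automatic provisioner selection instead of naming a provisioner, the sketch below omits use. The annotation key and value are hypothetical placeholders; they only have an effect if a storage provisioner on your cluster carries a matching annotation.

```yaml
volumes:
  fast-scratch:                         # ephemeral: no name, so size and annotations are allowed
    size: 100GB
    annotations:
      storage.example.com/tier: nvme    # hypothetical key and value
  reference-data:                       # persistent: name is set, so size and annotations are omitted
    name: research-data                 # must already exist on an accessible provisioner
```

The presence or absence of name is the only thing that distinguishes the two entries as persistent or ephemeral.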

Path Resolution Example:

If a job mounts scratch at /scratch, a file ingressed to file://inputs/data.tar.gz will be available in the job container at /scratch/inputs/data.tar.gz.
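
The path resolution rule can be sketched end to end as follows. The source URL and the job name are hypothetical.

```yaml
version: v4
volumes:
  scratch:
    use: ephemeral
    size: 10GB
    ingress:
      - source:
          uri: https://example.com/input-data.tar.gz   # hypothetical URL
        destination:
          uri: file://inputs/data.tar.gz               # path inside the volume
jobs:
  inspect:                                             # hypothetical job name
    image:
      uri: docker://alpine:latest
    mounts:
      /scratch: scratch                                # mount point in the container
    script: |
      #!/bin/sh
      # Volume path inputs/data.tar.gz appears under the mount point.
      ls -l /scratch/inputs/data.tar.gz
```

If the mount path were changed to /work, the same file would be expected at /work/inputs/data.tar.gz, since the destination URI is relative to wherever the volume is mounted.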

jobs

Naming Constraint: Job and service names must be unique within a workflow; a service and a job cannot share the same name. This constraint exists because jobs and services share the same DNS namespace for hostname resolution within the workflow. If you attempt to submit a workflow with a name conflict, you will receive a validation error:

service name '<name>' conflicts with job name

For example, this workflow definition is invalid and will be rejected:

version: v4
jobs:
  database:
    image:
      uri: docker://alpine:latest
    command: ["echo", "job"]
    resource:
      cpu:
        cores: 1
      memory:
        size: 1GB
services:
  database:  # ERROR: conflicts with job name
    image:
      uri: docker://postgres:16
    resource:
      cpu:
        cores: 2
      memory:
        size: 4GB

Ensure all job and service names are distinct before submission.

Optional (but at least one job or service is required). Defines the computational work to be performed as a directed acyclic graph (DAG) of compute steps. Jobs can run scientific software, data processing scripts, or any containerized application. Jobs are the fundamental execution unit in a workflow.

Structure:

jobs:
  <job-name>:
    image:                            # Required
      uri: <image-uri>
      secret: <secret-ref>            # Optional
      decryption-secret: <secret-ref> # Optional
    command: [<arg1>, <arg2>, ...]    # One of command or script required (mutually exclusive with script)
    script: |                         # One of command or script required (mutually exclusive with command)
      <script-content>
    args: [<arg1>, <arg2>, ...]       # Optional
    env:                              # Optional
      - <VAR>=<value>
    cwd: <working-directory>          # Optional
    mounts:                           # Optional
      <mount-path>: <volume-name>
    files:                            # Optional
      <container-path>: <file-uri>
    resource:                         # Optional
      cpu:                            # Required if resource specified
        cores: <number>
        affinity: <CORE|SOCKET|NUMA>
        sockets: <number>             # Optional
        threads: <boolean>            # Optional
      memory:                         # Required if resource specified
        size: <size>
        by-core: <boolean>            # Optional
      devices:                        # Optional
        <device-type>: <count>
      exclusive: <boolean>            # Optional
      annotations:                    # Optional
        <key>: <value>
    policy:                           # Optional
      timeout:
        execute: <duration>
    requires: [<job1>, <job2>, ...]   # Optional (deprecated, use depends-on)
    depends-on:                       # Optional
      - name: <job-name>
        status: <RUNNING|FINISHED>
        description: <text>           # Optional
    multinode:                        # Optional (mutually exclusive with task-array, network)
      nodes: <number>
      implementation: <ompi|openmpi|mpich|gasnet|generic>
      procs-per-node: <number>        # Optional
    network:                          # Optional (mutually exclusive with multinode, task-array)
      isolated: <boolean>             # Required (must be true)
      expose-tcp: [<number>, <number>, ...] # Optional
      expose-udp: [<number>, <number>, ...] # Optional
    task-array:                       # Optional (mutually exclusive with multinode, network)
      start: <number>
      end: <number>
      concurrency: <number>           # Optional
    preemptible: <boolean>            # Optional

Fields:

<job-name>: Arbitrary name identifying this job (e.g., preprocess-data, train-model). Must be a valid DNS subdomain and must be unique within a workflow. This name appears in fuzzball workflow status and is used in dependency specifications
image (required): Container image specification
- uri: Image location. Supported schemes: docker:// for OCI containers, oras:// for SIF images (e.g., oras://depot.ciq.com/fuzzball/fuzzball-applications/curl.sif:latest)
- secret (optional): Credentials for private registries (format: secret://<scope>/<name>)
- decryption-secret (optional): Secret to decrypt encrypted SIF images
command (required, mutually exclusive with script): List of arguments for the container entrypoint (e.g., [python3, script.py, --input, /data])
script (required, mutually exclusive with command): Multi-line shell script to execute. May start with a shebang line naming the interpreter (e.g., #!/bin/bash); when omitted, #!/bin/sh is assumed
args (optional): Additional arguments passed to command or script
env (optional): List of environment variables in KEY=VALUE format
cwd (optional): Working directory for the job. Must be an absolute path. Defaults to the image’s working directory or /
mounts (optional): Map of absolute volume mount paths to volume names (from the volumes section)
files (optional): Map of container paths to inline file URIs. Bind mounts inline files directly into the container (e.g., /etc/config.ini: file://my-config)
resource (optional): Hardware resource requirements for scheduling
- cpu (required if resource specified): CPU requirements
  - cores: Number of CPU cores (must be > 0)
  - affinity: Binding strategy - CORE (any cores), SOCKET (same socket), or NUMA (same NUMA domain). Defaults to CORE
  - sockets (optional): Number of physical CPU sockets
  - threads (optional): Whether to expose hardware threads (hyperthreading)
- memory (required if resource specified): Memory requirements
  - size: Amount of RAM with units (e.g., 4GB, 512MB, 2GiB)
  - by-core (optional): If true, size is per CPU core, i.e. total size is cores * size
- devices (optional): Map of device types to counts (e.g., nvidia.com/gpu: 2)
- exclusive (optional): If true, job gets exclusive node access
- annotations (optional): Custom key-value pairs for advanced node selection (e.g., CPU architecture, GPU model)
policy (optional): Execution policies
- timeout (optional): Time limits
  - execute: Maximum job duration (e.g., 2h, 30m)
requires (optional, deprecated): List of job names that must complete before this job starts. Use depends-on instead
depends-on (optional): List of concrete dependencies with status requirements. A dependency on a task-array job with status FINISHED waits for all of its tasks to finish.
- name: Job or service name to depend on
- status: Required status - FINISHED (job/service completed) or RUNNING (job/service is running). See Checking Workflow Status for how these map to the CLI status labels and the STAGE_STATUS_* enum.
- description (optional): Human-readable explanation of the dependency
multinode (optional, mutually exclusive with task-array/network): Multi-node parallel execution
- nodes: Number of nodes to allocate
- implementation: MPI/communication implementation - ompi, openmpi, mpich, gasnet, or generic
- procs-per-node (optional): Processes per node. Defaults to number of allocated CPUs
network (optional, mutually exclusive with multinode and task-array): If present, job is run inside its own (isolated) network namespace.
- isolated: Must be true when the network section is specified.
- expose-tcp (optional): List of container TCP ports to expose
- expose-udp (optional): List of container UDP ports to expose
task-array (optional, mutually exclusive with multinode and network): Embarrassingly parallel execution
- start: Starting task ID (inclusive, must be > 0)
- end: Ending task ID (inclusive, must be >= start)
- concurrency (optional): Maximum parallel tasks (max 200). Each task receives $FB_TASK_ID
preemptible (optional): If true, this job may be preempted by higher-priority allocations on clusters where preemption is enabled. When unset, the job inherits the workflow-level preemptible setting (or the --preemptible start flag); an explicit true or false overrides it
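
To make the by-core and task-array rules concrete, here is a small sketch. The job name and the numbers are illustrative only.

```yaml
jobs:
  simulate:                    # hypothetical job name
    image:
      uri: docker://alpine:latest
    script: |
      #!/bin/sh
      echo "running task ${FB_TASK_ID}"
    resource:
      cpu:
        cores: 8
        affinity: CORE
      memory:
        size: 2GB
        by-core: true          # size is per core: 8 cores * 2GB = 16GB in total
    task-array:
      start: 1                 # inclusive, must be > 0
      end: 20                  # inclusive, must be >= start
      concurrency: 5           # at most 5 of the 20 tasks run at the same time
```

With by-core set to false (or omitted), the same size value would instead describe the total memory for the job.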

Example:

jobs:
  preprocess:
    image:
      uri: docker://python:3.11
    script: |
      #!/bin/sh
      python3 preprocess.py --input /data/raw/${FB_TASK_ID} --output /data/processed/${FB_TASK_ID}
    env:
      - PYTHONUNBUFFERED=1
    mounts:
      /data: data
    resource:
      cpu:
        cores: 4
        affinity: NUMA
      memory:
        size: 8GB
    policy:
      timeout:
        execute: 30m
    task-array:
      start: 1
      end: 1000
      concurrency: 100

  train-multinode:
    image:
      uri: docker://nvcr.io/nvidia/pytorch:24.01-py3
      secret: secret://user/NGC_API_KEY
    script: |
      #!/bin/bash
      python -m torch.distributed.run train.py --input /data/processed
    depends-on:
      - name: preprocess
        status: FINISHED
    resource:
      cpu:
        cores: 32
        affinity: SOCKET
      memory:
        size: 128GB
      devices:
        nvidia.com/gpu: 4
      annotations:
        nvidia.com/gpu.model: NVIDIA L40
    multinode:
      nodes: 4
      implementation: openmpi
      procs-per-node: 4

This example demonstrates:

preprocess: Task-array job with resource requests
train-multinode: Multi-node MPI job with GPUs and explicit dependencies
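
The network field and a RUNNING dependency are not covered by the example above, so here is a hedged sketch of both. It assumes that the job name resolves as a hostname for other jobs in the workflow and that a port listed in expose-tcp is reachable from them; the text implies this but does not state it outright. Job names are hypothetical.

```yaml
jobs:
  api:                                   # hypothetical job running a simple HTTP server
    image:
      uri: docker://python:3.11
    command: [python3, -m, http.server, "8080"]
    network:                             # cannot be combined with multinode or task-array
      isolated: true                     # must be true whenever network is specified
      expose-tcp: [8080]
  smoke-test:                            # hypothetical client job
    image:
      uri: docker://alpine:latest
    depends-on:
      - name: api
        status: RUNNING                  # start once api is running, not after it finishes
        description: api must be up before the check runs
    script: |
      #!/bin/sh
      # Assumption: "api" resolves to the job above and port 8080 is reachable.
      wget -qO- http://api:8080/
```

A FINISHED dependency would not work here, because the api job only ends when its timeout or the workflow stops it.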

services

Naming Constraint: Service and job names must be unique within a workflow; a service and a job cannot share the same name. This constraint exists because jobs and services share the same DNS namespace for hostname resolution within the workflow. If you attempt to submit a workflow with a name conflict, you will receive a validation error:

service name '<name>' conflicts with job name

See the jobs section for a detailed example of this validation constraint.

Optional (but at least one job or service is required). Defines long-running container services that support jobs or provide external endpoints. Unlike jobs, which complete and exit, services run continuously for the workflow duration (or until dependent jobs finish). Services are useful for databases, web servers, message queues, interactive computing (e.g., Jupyter or RStudio), AI inference servers, or any persistent service that jobs need to access.

Structure:

services:
  <service-name>:
    image:                         # Required
      uri: <image-uri>
      secret: <secret-ref>         # Optional
      decryption-secret: <secret-ref> # Optional
    command: [<arg1>, <arg2>, ...] # One of command or script required (mutually exclusive with script)
    script: |                      # One of command or script required (mutually exclusive with command)
      <script-content>
    args: [<arg1>, <arg2>, ...]    # Optional
    env:                           # Optional
      - <VAR>=<value>
    cwd: <working-directory>       # Optional
    mounts:                        # Optional
      <mount-path>: <volume-name>
    files:                         # Optional
      <container-path>: <file-uri>
    resource:                      # Optional
      cpu:
        cores: <number>
        affinity: <CORE|SOCKET|NUMA>
        sockets: <number>          # Optional
        threads: <boolean>         # Optional
      memory:
        size: <size>
        by-core: <boolean>         # Optional
      devices:                     # Optional
        <device-type>: <count>
      exclusive: <boolean>         # Optional
      annotations:                 # Optional
        <key>: <value>
    requires: [<job1>, <svc1>, ...] # Optional (deprecated, use depends-on)
    depends-on:                     # Optional
      - name: <job-or-service-name>
        status: <RUNNING|FINISHED>
        description: <text>        # Optional
    multinode:                     # Optional
      nodes: <number>
      implementation: <ompi|openmpi|mpich|gasnet|generic>
      procs-per-node: <number>     # Optional
    network:                       # Optional
      host: <boolean>              # Optional
      ports:
        - name: <port-name>
          port: <port-number>
          protocol: <tcp|udp>      # Optional
      endpoints:                   # Optional
        - name: <endpoint-name>
          port-name: <port-name>   # References port name above
          protocol: <http|https|grpc|grpcs|tcp|tls>
          type: <subdomain|path>
          scope: <endpoint-scope>  # One of: user, group, organization, public
    persist: <boolean>             # Optional
    readiness-probe:                # Optional
      exec:                        # One of: exec, http-get, tcp-socket, grpc
        command: [<arg1>, <arg2>]
      http-get:
        path: <path>
        port: <port-number>
        scheme: <HTTP|HTTPS>       # Optional
        http-headers:               # Optional
          - name: <header-name>
            value: <header-value>
      tcp-socket:
        port: <port-number>
      grpc:
        port: <port-number>
        service: <service-name>    # Optional
      initial-delay-seconds: <seconds>  # Optional
      period-seconds: <seconds>        # Optional
      timeout-seconds: <seconds>       # Optional
      success-threshold: <number>      # Optional
      failure-threshold: <number>      # Optional
    autoscaler:                     # Optional
      replicas:                    # Required for a self-scaling service
        min: <number>              # Optional (0 enables scale-to-zero)
        max: <number>              # Required, >= 1 and >= min
      metrics:                     # Optional
        enabled: <boolean>
        path: <metrics-path>
        port: <port-number>
        interval: <seconds>
        includes: [<metric1>, ...]
      scale-up:                    # Optional
        triggers:
          - metrics-query: <promql-expression>
            cooldown-period: <seconds>  # Optional
            services: [<svc1>, ...]     # Optional (cross-service scaling)
      scale-down:                  # Optional
        metrics-query: <promql-expression>
        cooldown-period: <seconds> # Optional
    dynamic-config:                 # Optional
      path: <container-path>
      services: [<svc1>, <svc2>]
      script: |
        <bash-script>
    preemptible: <boolean>         # Optional
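
Before the field-by-field reference, a compact sketch may help show how the main pieces fit together: a service with one port, one endpoint, and a readiness probe, plus a job that waits for it. All names are hypothetical, and the job's request assumes the service name resolves as a hostname inside the workflow.

```yaml
version: v4
services:
  web:                                 # hypothetical service name
    image:
      uri: docker://python:3.11
    command: [python3, -m, http.server, "8000"]
    network:
      ports:                           # a YAML list: one "-" entry per port
        - name: http
          port: 8000
          protocol: tcp
      endpoints:                       # also a list
        - name: web-ui
          port-name: http              # refers to the port named above
          protocol: http
          type: path
          scope: user                  # only the workflow creator can reach it
    readiness-probe:
      http-get:
        path: /
        port: 8000
      initial-delay-seconds: 2
      period-seconds: 5
jobs:
  check:                               # hypothetical job name
    image:
      uri: docker://alpine:latest
    depends-on:
      - name: web
        status: RUNNING                # satisfied only after the readiness probe succeeds
    script: |
      #!/bin/sh
      # Assumption: "web" resolves to the service above.
      wget -qO- http://web:8000/
```

Since persist is left at its default of false, the service would be expected to stop once check, its only dependent, has finished.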

Fields:

Important: A multinode service is not designed to run multiple identical instances of a service with load-balanced client connections. It is intended for distributed services (e.g., MPI-based, vLLM cluster …), where the endpoint is served exclusively on rank 0. Clients always connect to rank 0, which coordinates with the other instances internally.

Services share many fields with jobs (image, command, script, args, env, cwd, mounts, files, resource, depends-on, multinode, preemptible). See the jobs section for details on these common fields. Service-specific fields are:

<service-name>: Arbitrary name identifying this service. Must be a valid DNS subdomain and must be unique within a workflow.
network (optional): Network configuration for service exposure. If present and the list of exposed ports is not empty, the service runs inside its own (isolated) network namespace.
- host (optional): If true, the service uses the host network namespace instead.
- ports: List of ports the service listens on
  - name: Identifier for this port (used in endpoints)
  - port: Port number (1-65535)
  - protocol (optional): tcp or udp. Defaults to tcp
- endpoints (optional): List of external endpoints to create
  - name: Endpoint identifier
  - port-name: References a port name from the ports list
  - protocol: Protocol - http, https, grpc, grpcs, tcp, or tls
  - type: Endpoint style - subdomain (creates <name>.<workflow-id>.<account>.fuzzball) or path (creates /endpoints/<account>/<workflow-id>/<name>)
  - scope: Determines who can access the endpoint. One of user (only the workflow creator), group (anyone in the same account), organization (anyone in the same organization), public (anyone without authentication). Defaults to group if not specified.
ports and endpoints are YAML lists: each entry begins with a -, and a service may declare more than one. Write them as sequence items (as shown in the structure above), not as a single mapping.
persist (optional): If true, the service keeps running after all dependent jobs finish, until the workflow is cancelled. If false (default), the service stops once no jobs or services depend on it.
readiness-probe (optional): Kubernetes-style health check that determines when the service is ready. The service status transitions from STARTED to RUNNING only after the probe succeeds.
- exec: Run a command in the container. Success if exit code is 0
  - command: Command to execute
- http-get: HTTP GET request. Success if status code is 200-399
  - path: HTTP path
  - port: Port number
  - scheme (optional): HTTP or HTTPS
  - http-headers (optional): Custom HTTP headers
- tcp-socket: TCP connection attempt. Success if connection establishes
  - port: Port number
- grpc: gRPC health check. Success per gRPC health checking protocol
  - port: Port number
  - service (optional): gRPC service name
- initial-delay-seconds (optional): Delay before first probe
- period-seconds (optional): Frequency of probes
- timeout-seconds (optional): Probe timeout
- success-threshold (optional): Consecutive successes needed
- failure-threshold (optional): Consecutive failures before marking unhealthy
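
The fragment below is a sketch of a gRPC readiness probe that sets every timing field listed above. The service name inference, the image URI, the port number, and the gRPC service name are hypothetical; the timing values are arbitrary examples rather than recommended settings.

```yaml
services:
  inference:                                # hypothetical service name
    image:
      uri: docker://example/inference:1.0   # hypothetical image
    network:
      ports:
        - name: grpc
          port: 50051
    readiness-probe:
      grpc:
        port: 50051
        service: inference        # optional; hypothetical gRPC service name
      # Timing fields sit next to the probe type, not inside it.
      initial-delay-seconds: 15   # delay before the first probe
      period-seconds: 5           # how often to probe
      timeout-seconds: 2          # per-probe timeout
      success-threshold: 1        # consecutive successes needed
      failure-threshold: 6        # consecutive failures before marking unhealthy
```

A job or service that lists inference in depends-on with status RUNNING would then wait until this probe succeeds.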
autoscaler (optional): Automatically scales the number of replicas of this service up and down based on PromQL queries evaluated against the service’s metrics. See Service Autoscaling for a full guide.
- replicas: Bounds the replica count for a self-scaling service. Required unless the service’s scale-up triggers target other services.
  - min (optional): Minimum number of replicas. 0 enables scale-to-zero (the service runs no replicas until a trigger fires).
  - max: Maximum number of replicas. Must be at least 1 and not less than min.
- metrics (optional): Configures collection of the service’s own metrics, used by the scale policies.
  - enabled: Set to true to scrape the service’s metrics endpoint.
  - path: HTTP path of the metrics endpoint (e.g. /metrics).
  - port: Container port exposing the metrics endpoint.
  - interval: Scrape interval, in seconds.
  - includes (optional): Restricts collection to the named metrics. All metrics are collected when empty.
- scale-up (optional): One or more triggers that add replicas (including from zero).
  - triggers: List of independent scale-up conditions. Each trigger:
    - metrics-query: A PromQL expression. The trigger fires when the query returns a true (non-zero) result. Use the bool modifier so the comparison yields 1/0 (e.g. vllm:num_requests_waiting >= bool 2).
    - cooldown-period (optional): Minimum number of seconds between consecutive scaling actions for the targeted service.
    - services (optional): Names of other services this trigger scales, instead of this service’s own replicas (cross-service scaling — see the autoscaling guide). Leave empty to scale the owning service itself.
- scale-down (optional): A single policy that removes replicas of this service.
  - metrics-query: PromQL expression; the policy fires when it returns a true result.
  - cooldown-period (optional): Minimum number of seconds between consecutive scale-downs.
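
To show how the autoscaler fields nest, here is a sketch of a service that scales its own replicas between 1 and 4. The service name vllm-worker, the image URI, the port, the thresholds, and the cooldown values are hypothetical, and the fragment assumes the container exports the vllm:num_requests_waiting metric on /metrics.

```yaml
services:
  vllm-worker:                          # hypothetical service name
    image:
      uri: docker://example/vllm:1.0    # hypothetical image
    network:
      ports:
        - name: http
          port: 8000
    autoscaler:
      replicas:
        min: 1
        max: 4                  # must be at least 1 and not less than min
      metrics:
        enabled: true
        path: /metrics
        port: 8000
        interval: 15            # scrape interval, in seconds
        includes:               # collect only the metric the queries use
          - "vllm:num_requests_waiting"
      scale-up:
        triggers:               # a YAML list: one "-" per trigger
          - metrics-query: "vllm:num_requests_waiting >= bool 2"
            cooldown-period: 60     # seconds between scaling actions
            # services omitted, so the trigger scales this service itself
      scale-down:               # a single policy, not a list
        metrics-query: "vllm:num_requests_waiting == bool 0"
        cooldown-period: 300
```

Both queries use the bool modifier so that each comparison yields 1 or 0. The longer scale-down cooldown is only one possible choice, meant to avoid removing replicas during short lulls.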
dynamic-config (optional): Renders a configuration file from the live IP addresses of other services and bind-mounts it into this service’s container, re-rendering it as those services' replicas scale up and down.
- path: Destination path of the rendered file inside the container.
- services: Services whose addresses are made available to the script. Each service name is exposed as a bash array named SERVICES_<NAME> (uppercased, non-identifier characters replaced with _); for example service worker becomes ${SERVICES_WORKER[@]}.
- script: A bash script whose standard output becomes the file contents. It runs in a restricted interpreter: builtins and heredocs only — no external commands (except a no-argument cat for cat <<EOF templating) and no filesystem access.
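
The following sketch illustrates dynamic-config with a script that writes one address per line. The service names router and worker, the image URI, the destination path, and port 8000 are hypothetical. The fragment assumes that a service named worker is defined elsewhere in the same workflow and that the restricted interpreter permits for loops along with builtins such as echo.

```yaml
services:
  router:                               # hypothetical service name
    image:
      uri: docker://example/router:1.0  # hypothetical image
    dynamic-config:
      path: /etc/router/backends.txt    # hypothetical destination path
      services:
        - worker                # exposed to the script as SERVICES_WORKER
      script: |
        # Standard output becomes the file contents.
        # Emit one "address:port" line per live worker replica.
        for ip in "${SERVICES_WORKER[@]}"; do
          echo "${ip}:8000"
        done
```

Because the file is re-rendered as worker replicas come and go, the router process would need its own way to pick up changes to the file; that mechanism is outside this fragment.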

Example:

services:
  postgres:
    image:
      uri: docker://postgres:16
    env:
      - POSTGRES_PASSWORD=secret
      - POSTGRES_DB=myapp
    mounts:
      /var/lib/postgresql/data: db
    resource:
      cpu:
        cores: 4
      memory:
        size: 8GB
    network:
      ports:
        - name: postgres
          port: 5432
          protocol: tcp
    readiness-probe:
      tcp-socket:
        port: 5432
      initial-delay-seconds: 5
      period-seconds: 10
    persist: true

  api-server:
    image:
      uri: docker://mycompany/api:v1.2.3
      secret: secret://group/REGISTRY_CREDS
    env:
      - DATABASE_URL=postgresql://postgres:5432/myapp
    depends-on:
      - name: postgres
        status: RUNNING
        description: "API needs database connection"
    resource:
      cpu:
        cores: 2
      memory:
        size: 4GB
    network:
      ports:
        - name: http
          port: 8080
      endpoints:
        - name: api
          port-name: http
          protocol: https
          type: subdomain
          scope: group
    readiness-probe:
      http-get:
        path: /health
        port: 8080
      initial-delay-seconds: 10
      period-seconds: 5
      failure-threshold: 3

jobs:
  data-processor:
    image:
      uri: docker://mycompany/processor:latest
    script: |
      #!/bin/sh
      python process.py --api-url http://api-server:8080
    depends-on:
      - name: api-server
        status: RUNNING
    resource:
      cpu:
        cores: 8
      memory:
        size: 16GB

This example shows:

postgres: Persistent database service with readiness probe
api-server: REST API depending on postgres, exposed through an HTTPS subdomain endpoint with group scope (anyone in the same account)
data-processor: Job that starts only after api-server reaches RUNNING