Storage Volumes and Workflows
Storage volumes can be created explicitly from the CLI or the web UI, or automatically by workflows. Automatic creation is especially convenient for ephemeral volumes that are used by a single workflow.
We have seen that a storage class can restrict the storage volume scope within its definition. This scope governs the creation (but not the usage) of volumes. Scopes can be one of:
group: Owners of a group can create storage volumes for their groups. This excludes the user’s personal group.
user: Single users can create storage volumes for their user only. This excludes other users and groups.
all: There is no scope restriction.
Why do these settings not control the usage scope?
Storage volumes scoped to a user are usable from any group the user is a member of. This allows, for
example, a user of the group Apple to use their home directory while running
workflows from the group Apple.
Storage volumes can be used by their reference from workflow volumes. A storage volume reference includes (at a minimum) a scope and the storage class name. An optional custom name can be specified depending on the storage class naming configuration. So a storage volume reference takes this form:
volume://<scope>/<storage_class_name>/<optional_custom_name>
Where scope can be user or group depending on the
defined storage class scope, and storage_class_name is the name of the storage class (e.g.,
scratch, data, nfs). If the storage class scope is set to group, the reference will use
the prefix volume://group/. If the storage class scope is set to user, the reference will use the
prefix volume://user/.
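The reference format above can be illustrated with a small parser. This is a minimal sketch in Python, not part of Fuzzball: the names VolumeReference and parse_volume_reference are hypothetical, the custom name "results" is made up for illustration, and the check that the scope is user or group is an assumption based only on the description above.

```python
from typing import NamedTuple, Optional

PREFIX = "volume://"
# Assumption: the text above only describes "user" and "group" as reference scopes.
REFERENCE_SCOPES = ("user", "group")


class VolumeReference(NamedTuple):
    scope: str
    storage_class: str
    custom_name: Optional[str] = None  # omitted when the storage class generates the name


def parse_volume_reference(reference: str) -> VolumeReference:
    """Split volume://<scope>/<storage_class_name>/<optional_custom_name>."""
    if not reference.startswith(PREFIX):
        raise ValueError(f"not a volume reference: {reference!r}")
    parts = reference[len(PREFIX):].split("/")
    if len(parts) not in (2, 3) or not all(parts):
        raise ValueError(f"expected 2 or 3 non-empty segments: {reference!r}")
    if parts[0] not in REFERENCE_SCOPES:
        raise ValueError(f"unknown scope: {parts[0]!r}")
    return VolumeReference(*parts)


# A reference without a custom name, and one with a hypothetical custom name.
scratch = parse_volume_reference("volume://user/scratch")
named = parse_volume_reference("volume://group/data/results")
print(scratch)
print(named)
```

The parser only checks the shape of the string. Whether a custom name is allowed or required still depends on the naming configuration of the storage class.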
For example, to use a storage volume from an ephemeral storage class such as the scratch class
we defined earlier (which has its scope set to user), you would use the reference
volume://user/scratch, where scratch is the storage class name. In our definition of the scratch
ephemeral class, the custom name portion of the reference is generated automatically from the
workflow ID, so you don’t need to include /<optional_custom_name> in the reference.
When a group is created, a group ID (GID) is allocated for the group. The allocated GID is added to every member of the group.
When a storage volume is created
within a group, the group ownership of its top-level directory is set to either the primary group
ID of the user or the root group ID. This choice is configured in the
storage volume definition. Group permissions on the top-level directory are set to read/write.
setgid is also set on the top-level directory to ensure that data written to the storage
volume has its group ownership set to the allocated GID.
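To make the permission bits described above concrete, here is a minimal Python sketch that reports whether a directory mode has setgid and group read/write set. The helper name describe_directory_mode and the example modes 0o2770 and 0o755 are assumptions chosen for illustration; the actual mode Fuzzball applies to a volume directory is not stated here.

```python
import stat


def describe_directory_mode(mode: int) -> dict:
    """Report the permission bits that matter for a shared top-level directory."""
    return {
        # Render the mode the way `ls -ld` would, e.g. drwxrws---
        "symbolic": stat.filemode(stat.S_IFDIR | stat.S_IMODE(mode)),
        # setgid on a directory makes new entries inherit the directory's group
        "setgid": bool(mode & stat.S_ISGID),
        "group_read_write": bool(mode & stat.S_IRGRP) and bool(mode & stat.S_IWGRP),
    }


shared = describe_directory_mode(0o2770)  # hypothetical shared volume directory
plain = describe_directory_mode(0o755)    # ordinary directory, for comparison
print(shared)
print(plain)
```

To inspect a real directory, pass os.stat(path).st_mode to the helper.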
By default, umask is typically set to 002. With this umask, the owning user and the members of the
owning group get full access to newly created files and directories, while other users can only
read and execute them, but not alter them. As
a result, group members can share data written to a storage volume. Setting a different umask in a
workflow changes the permissions of files
created within the storage volume.
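The effect of the umask can be checked outside of Fuzzball. The following Python sketch creates files in a temporary directory under two different umasks and reads back their permission bits. The helper name create_file_mode and the alternative umask 027 are illustrative assumptions; the sketch assumes a POSIX system.

```python
import os
import tempfile


def create_file_mode(directory: str, name: str, umask: int) -> int:
    """Create a file under the given umask and return its permission bits."""
    previous = os.umask(umask)  # os.umask sets the new mask and returns the old one
    try:
        path = os.path.join(directory, name)
        # Request rw for everyone (0o666); bits set in the umask are cleared.
        os.close(os.open(path, os.O_CREAT | os.O_WRONLY, 0o666))
    finally:
        os.umask(previous)  # the umask is process-wide, so restore it
    return os.stat(path).st_mode & 0o777


with tempfile.TemporaryDirectory() as top:
    shared_mode = create_file_mode(top, "shared.txt", 0o002)
    restricted_mode = create_file_mode(top, "restricted.txt", 0o027)

print(oct(shared_mode), oct(restricted_mode))
```

With umask 002 the file stays writable by the group, which is what lets group members share data. With a stricter umask such as 027, the group loses write access and other users lose all access.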
Let’s use volumes from the scratch (ephemeral) and data (persistent) classes created previously
in a workflow. Neither uses the optional custom name part of a volume reference. The persistent data
volume for the user running the example workflow below may already exist, having been created either
explicitly or automatically by a previous workflow. If it does not yet exist, it will be created
automatically. The ephemeral scratch volume will be created automatically and deleted at the end of
the workflow.
# this is test.fz
version: v1
volumes:
  data:
    reference: volume://user/data
  scratch:
    reference: volume://user/scratch
    ingress:
      - source:
          uri: https://raw.githubusercontent.com/ErikSchierboom/sentencegenerator/master/samples/the-king-james-bible.txt
        destination:
          uri: file://bible.txt
jobs:
  read:
    image:
      uri: docker://alpine:latest
    mounts:
      scratch:
        location: /scratch
      data:
        location: /data
    script: |
      #!/bin/sh
      cat /scratch/bible.txt | tr '[:lower:]' '[:upper:]' > /data/bible.txt
    policy:
      timeout:
        execute: 10m
      retry:
        attempts: 3
    resource:
      cpu:
        cores: 1
        affinity: NUMA
      memory:
        size: 1GiB
After creating the Fuzzfile, you can run the workflow like so:
# fuzzball workflow start test.fz
Workflow "9ce2772a-924c-4b1a-b10e-87a864fb16d7" started.
# fuzzball workflow describe 9ce2772a-924c-4b1a-b10e-87a864fb16d7
Name: test.fz
Email: bob@me.llc
UserId: b8077c97-0185-437c-9425-feb063a5884b
Status: STAGE_STATUS_FINISHED
Cluster: unset-cluster
Created: 2025-04-15 02:02:17PM
Started: 2025-04-15 02:02:17PM
Finished: 2025-04-15 02:02:28PM
Error:
Stages:
KIND | STATUS | NAME | STARTED | FINISHED
Workflow | Finished | 9ce2772a-924c-4b1a-b10e-87a864fb16d7 | 2025-04-15 02:02:17PM | 2025-04-15 02:02:28PM
Volume | Finished | data | 2025-04-15 02:02:17PM | 2025-04-15 02:02:19PM
Volume | Finished | scratch | 2025-04-15 02:02:17PM | 2025-04-15 02:02:19PM
Image | Finished | docker://alpine:latest | 2025-04-15 02:02:17PM | 2025-04-15 02:02:18PM
File | Finished | https://raw.githubusercontent.com/ErikSchi... | 2025-04-15 02:02:22PM | 2025-04-15 02:02:23PM
Job | Finished | read | 2025-04-15 02:02:24PM | 2025-04-15 02:02:25PM
We can see the persistent storage volume on the NFS server containing the upper-cased text file.
# tree /srv/fuzzball/storage_data
/srv/fuzzball/storage_data
├── alice
│   ├── ...
└── bob
    └── bible.txt
2 directories, 4 files