Storage Volumes and Workflows
Storage volumes can be created from the CLI or the web UI, but they can also be created automatically by workflows. This is especially convenient for ephemeral volumes used by a single workflow.
We have seen that a storage class can restrict the storage volume scope within its definition. This scope governs the creation (but not the usage) of volumes. The scope can be one of:
account: Owners of an account can create storage volumes for their accounts. This excludes the user account.
user: Single users can create storage volumes for their user account only. This excludes other accounts.
all: There is no scope restriction.
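The scope rules above can be summarized as a small decision function. This is only an illustrative sketch: can_create is a hypothetical helper, not part of the Fuzzball CLI, and it models just the scope check (it does not model who counts as an account owner).

```shell
# Hypothetical helper (not a Fuzzball command).
# Usage: can_create <storage_class_scope> <target>
#   storage_class_scope: account | user | all
#   target:              account | user  (where the volume would be created)
can_create() {
    case "$1:$2" in
        all:user|all:account|user:user|account:account) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the full decision table.
for scope in account user all; do
    for target in account user; do
        if can_create "$scope" "$target"; then
            echo "class scope=$scope, target=$target: allowed"
        else
            echo "class scope=$scope, target=$target: denied"
        fi
    done
done
```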
Why do these settings not control the usage scope?
Storage volumes created within a user account are usable from any account the user is a member
of. This allows, for example, a user of the account Apple to use their home directory while running
workflows from the account Apple.
Workflows use storage volumes by referencing them in their volume definitions. A storage volume reference includes (at a minimum) a scope and the storage class to which the volume belongs. An optional custom name can be specified, depending on the storage class naming configuration. A storage volume reference therefore takes this form:
volume://<scope>/<storage_volume_name>/<optional_custom_name>
Here, scope can be either user or account,
depending on the scope defined in the storage class. If the storage class scope is set to account, the
reference must start with the prefix volume://account. Conversely, if the storage class
scope is set to user, the reference must start with the prefix volume://user.
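To make the reference format concrete, here is a minimal shell sketch that assembles a reference from its parts. The function volume_ref and the custom name results are hypothetical and exist only for illustration. The sketch assumes the second path component is the storage class name, as in the prose above.

```shell
# Hypothetical helper (not a Fuzzball command): build a volume reference.
# Usage: volume_ref <scope> <storage_class> [custom_name]
volume_ref() {
    scope=${1:-}
    class=${2:-}
    name=${3:-}
    # Only the two scopes described above are accepted as a prefix.
    case "$scope" in
        user|account) ;;
        *) echo "scope must be 'user' or 'account', got '$scope'" >&2; return 1 ;;
    esac
    if [ -z "$class" ]; then
        echo "a storage class name is required" >&2
        return 1
    fi
    ref="volume://$scope/$class"
    # The custom name is optional; append it only when one is given.
    if [ -n "$name" ]; then
        ref="$ref/$name"
    fi
    printf '%s\n' "$ref"
}

volume_ref user scratch          # prints volume://user/scratch
volume_ref account data results  # prints volume://account/data/results
```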
So if you want to use a storage volume from an
ephemeral storage class such as the scratch class
we defined earlier, whose scope is set to user, you would use the reference
volume://user/scratch. In our definition of the scratch ephemeral class, the volume name is
automatically set to the workflow ID, so the custom name is optional and will be ignored.
When an account is created, a group ID (GID) is allocated for the account. All account members have the allocated GID added to their user.
When a storage volume is created
within an account, the group ownership of the top-level directory is set to either the primary group
ID of the user or the root group ID. Which of the two is used is configured in the
storage volume definition. Group permissions on the top-level directory are set to read/write.
The setgid bit is set on the top-level directory to ensure that data written to the storage
volume has its group ownership set to the allocated GID.
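The effect of setgid on a directory can be tried locally. The following is a sketch, assuming a Linux system with GNU stat. The temporary directory only stands in for a volume's top-level directory, and the mode 2770 is an illustrative choice rather than the exact mode Fuzzball applies.

```shell
# Create a stand-in for a volume's top-level directory in a temp location.
vol=$(mktemp -d)/volume
mkdir "$vol"

# 2 = setgid bit; 770 = full access for owner and group, none for others.
chmod 2770 "$vol"

# Entries created below inherit the directory's group;
# new subdirectories also inherit the setgid bit.
mkdir "$vol/results"
touch "$vol/results/output.txt"

# GNU stat: %a = octal mode, %G = group name, %n = file name.
stat -c '%a %G %n' "$vol" "$vol/results" "$vol/results/output.txt"
```

The group shown for results and output.txt matches the group of the top-level directory, regardless of the primary group of the process that created them.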
By default, the umask is typically set to 002. This umask leaves read, write, and execute
permission available to the owner and the group of newly created files, while other users can only
read and execute them, not alter them. As
a result, account members can share data written to a storage volume. Setting a different umask in a
workflow changes the permissions of files
created within the storage volume.
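A quick way to see what a umask does is to create files under different values and inspect their modes. This sketch assumes a Linux system with GNU stat. The value 077 is just an example of a more restrictive setting.

```shell
dir=$(mktemp -d)

# umask 002: owner and group may write, others may only read.
( umask 002; touch "$dir/shared.txt" )

# umask 077: only the owner gets any access.
( umask 077; touch "$dir/private.txt" )

# touch requests mode 666, so the results are 664 and 600 respectively.
stat -c '%a %n' "$dir/shared.txt" "$dir/private.txt"
```

Each umask is set inside a subshell so it does not leak into the rest of the session. In a workflow job, the equivalent is to run umask at the start of the job command.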
Let’s use volumes from the scratch (ephemeral) and data (persistent) classes created previously
in a workflow. Neither reference uses the optional custom name. The persistent data
volume for the user running the example workflow below may already exist, having been created either
explicitly or automatically by a previous workflow. If it does not yet exist, it will be created
automatically. The ephemeral scratch volume will be created automatically and deleted at the end of
the workflow.
# this is test.fz
version: v1
volumes:
  data:
    reference: volume://user/data
  scratch:
    reference: volume://user/scratch
    ingress:
      - source:
          uri: https://raw.githubusercontent.com/ErikSchierboom/sentencegenerator/master/samples/the-king-james-bible.txt
        destination:
          uri: file://bible.txt
jobs:
  read:
    image:
      uri: docker://alpine:latest
    mounts:
      scratch:
        location: /scratch
      data:
        location: /data
    command:
      - sh
      - -c
      - "cat /scratch/bible.txt | tr '[:lower:]' '[:upper:]' > /data/bible.txt"
    policy:
      timeout:
        execute: 10m
      retry:
        attempts: 3
    resource:
      cpu:
        cores: 1
        affinity: NUMA
      memory:
        size: 1GiB
After creating the Fuzzfile, you can run the workflow like so:
# fuzzball workflow start test.fz
Workflow "9ce2772a-924c-4b1a-b10e-87a864fb16d7" started.
# fuzzball workflow describe 9ce2772a-924c-4b1a-b10e-87a864fb16d7
Name: test.fz
Email: bob@me.llc
UserId: b8077c97-0185-437c-9425-feb063a5884b
Status: STAGE_STATUS_FINISHED
Cluster: unset-cluster
Created: 2025-04-15 02:02:17PM
Started: 2025-04-15 02:02:17PM
Finished: 2025-04-15 02:02:28PM
Error:
Stages:
KIND | STATUS | NAME | STARTED | FINISHED
Workflow | Finished | 9ce2772a-924c-4b1a-b10e-87a864fb16d7 | 2025-04-15 02:02:17PM | 2025-04-15 02:02:28PM
Volume | Finished | data | 2025-04-15 02:02:17PM | 2025-04-15 02:02:19PM
Volume | Finished | scratch | 2025-04-15 02:02:17PM | 2025-04-15 02:02:19PM
Image | Finished | docker://alpine:latest | 2025-04-15 02:02:17PM | 2025-04-15 02:02:18PM
File | Finished | https://raw.githubusercontent.com/ErikSchi... | 2025-04-15 02:02:22PM | 2025-04-15 02:02:23PM
Job | Finished | read | 2025-04-15 02:02:24PM | 2025-04-15 02:02:25PM
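When scripting around workflows, it can be handy to pull just the status out of the describe output. The sketch below is an assumption-laden illustration: workflow_status is a hypothetical helper, and it relies only on the "Status:" line format shown in the output above, which may differ between Fuzzball versions.

```shell
# Hypothetical helper: read `fuzzball workflow describe` output on stdin
# and print the value of the first line whose first field is "Status:".
workflow_status() {
    awk '$1 == "Status:" { print $2; exit }'
}

# Intended use against a live cluster (not run here):
#   fuzzball workflow describe <workflow-id> | workflow_status

# Demonstration using a few lines copied from the output above.
sample='Name: test.fz
Status: STAGE_STATUS_FINISHED
Cluster: unset-cluster'
printf '%s\n' "$sample" | workflow_status   # prints STAGE_STATUS_FINISHED
```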
We can see the persistent storage volume on the NFS server containing the upper-cased text file.
# tree /srv/fuzzball/storage_data
/srv/fuzzball/storage_data
├── alice
│   ├── ...
└── bob
    └── bible.txt
2 directories, 4 files