Storage Class Definitions
A storage class definition defines how storage volumes are created/published/deleted/named on a cluster. Only organization owners can control storage class definitions for security reasons.
Below are example definitions of persistent and ephemeral storage classes.
Here is an example of a persistent storage class backed by an NFS share using the NFS storage driver
installed in the previous section. We assume access
to an NFS server at ${NFS_SERVER_IP} providing a /srv/fuzzball/storage_data share:
```yaml
version: v1
name: data
description: Persistent data
driver: nfs.csi.k8s.io
properties:
  persistent: true
  retainOnDelete: true
parameters:
  server: ${NFS_SERVER_IP}
  share: /srv/fuzzball/storage_data
capacity:
  size: 100
  unit: GiB
access:
  type: filesystem
  mode: multi_node_multi_writer
mount:
  options:
    - nfsvers=4
  user: user
  group: user
  permissions: 770
scope: user
volumes:
  nameArgs:
    - USERNAME
  nameFormat: "{{username}}"
  maxByAccount: 1
```
We will discuss the fields of this YAML definition in detail. First, it is necessary to provide the definition version, a name, and a description.
The name cannot be changed later, so ensure it is correct when first created.
```yaml
# always v1 for now
version: v1
# The name is mandatory and is used by Fuzzball as a reference to the volume
name: data
# A human-readable description for this storage class
description: Persistent data
# The storage driver to use
driver: nfs.csi.k8s.io
properties:
  persistent: true
  retainOnDelete: true
```
In the above example, `persistent: true` specifies that the storage volumes will persist and can be used for input and output in multiple concurrent and/or subsequent workflows. This is in contrast to ephemeral volumes, which only exist for the duration of a single workflow. `retainOnDelete: true` specifies that the content of a storage volume should be retained even when the storage volume itself is deleted in Fuzzball by its owner.

Note that the `retainOnDelete` setting only retains data when the owner deletes the volume itself. If the owner explicitly deletes the data within the volume, this setting will not allow it to be recovered.
The `parameters` field takes key/value pairs, and its content is driver dependent. Consult the CSI driver documentation to determine appropriate keys for a given driver. In this example, `server` and `share` are set for the NFS CSI driver to specify the NFS server and the export used for the volume mounts:
```yaml
parameters:
  server: ${NFS_SERVER_IP}
  share: /srv/fuzzball/storage_data
```
The following arguments can be used in parameter values and will be substituted:

- `{{uid}}`: set to the user ID as defined by `mount.user`
- `{{gid}}`: set to the group ID as defined by `mount.group`
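As an illustration, these placeholders could be used to point each user at a per-UID subdirectory of the export. The subdirectory layout below is an assumption for illustration, not part of the example class above:

```yaml
parameters:
  server: ${NFS_SERVER_IP}
  # hypothetical per-user subdirectory; {{uid}} is replaced with the user ID
  # derived from mount.user
  share: /srv/fuzzball/storage_data/{{uid}}
```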
```yaml
capacity:
  size: 100
  unit: GiB
```
This allows you to specify the storage volume capacity. The recognized unit values (case
insensitive) are as follows:
- MiB
- GiB
- TiB
- PiB
- EiB
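For example, a class providing larger volumes could declare its capacity in TiB (the size shown here is arbitrary):

```yaml
capacity:
  size: 2
  unit: TiB
```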
```yaml
access:
  type: filesystem
  mode: multi_node_multi_writer
```
This section defines the access type and the access mode of the volumes.

Recognized (case insensitive) values for `type`:

- `filesystem`

Recognized (case insensitive) values for `mode`:

- `MULTI_NODE_READER_ONLY`: allows one or multiple Substrate nodes to mount the volume in read-only mode
- `MULTI_NODE_MULTI_WRITER`: allows one or multiple Substrate nodes to mount the volume in read-write mode
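For instance, a class publishing shared reference data that jobs should only read could combine the filesystem type with the read-only mode. This is a hypothetical variant of the example class, not part of it:

```yaml
access:
  type: filesystem
  # nodes may mount the volume, but only read from it
  mode: multi_node_reader_only
```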
```yaml
mount:
  options:
    - nfsvers=4
  user: user
  group: user
  permissions: 770
```
The `mount` section specifies additional mount options to pass. Options are driver dependent (see the CSI driver documentation). In addition to mount options, an organization owner controls the ownership and permissions applied to storage volume directories when they are used. For that purpose, the `user` and `group` keys can be set as follows.

Recognized values for `user`:

- `user`: sets the storage volume directory owner to the user's UID
- `root`: sets the storage volume directory owner to root (UID=0)

Recognized values for `group`:

- `user`: sets the storage volume directory group to the user's primary group ID
- `root`: sets the storage volume directory group to the root group ID
- `account`: sets the storage volume directory group to the account group ID
Note that the value `user` requires that Keycloak be configured with an LDAP provider.
Additionally, an organization owner can define permission bits to apply to storage volume directories. Recognized values:

- 700
- 750
- 755
- 775
- 770
- 500
- 550
- 555
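As an example, a class whose volumes should be writable by every member of an account, but inaccessible to anyone else, might combine account group ownership with 770 permissions. This is a hypothetical variant of the example class:

```yaml
mount:
  options:
    - nfsvers=4
  # directories owned by the user's UID, grouped under the account's group ID
  user: user
  group: account
  permissions: 770
```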
Storage volumes can have a restricted scope. The `scope` parameter controls the creation scope (but not the usage scope). Possible scopes are `account`, `user`, or `all`.
```yaml
scope: user
```
The scope determines which account type can create a storage volume:
- `account`: storage volumes can be created from any account except the user account
- `user`: storage volumes can be created from the user account only
- `all`: storage volumes can be created from any account
Many CSI drivers use the storage volume name as the backing directory name when managing storage volumes. The `volumes` section below provides a flexible way to define storage volume names automatically:
```yaml
volumes:
  nameArgs:
    - USERNAME
  nameFormat: "{{username}}"
  # the maximum number of storage volumes that can be created by an account
  maxByAccount: 1
```
`nameArgs` and `nameFormat` work like `sprintf(nameFormat, nameArgs...)`: each `{{arg_name}}` placeholder in `nameFormat` is substituted with the value of the corresponding argument. Placeholder argument names are case-insensitive.
Recognized name arguments:
- `USERNAME`: substituted with the username of the user creating/using a storage volume
- `ORGANIZATION_ID`: substituted with the ID of the organization the user belongs to when creating/using a storage volume
- `ACCOUNT_ID`: substituted with the ID of the account the user is using when creating/using a storage volume
- `WORKFLOW_ID`: substituted with the ID of the workflow the user is running when creating/using a storage volume
- `CUSTOM_NAME`: substituted with the name the user provided when creating/using a storage volume
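For example, a class meant to provide one shared volume per account (rather than per user) could name its volumes by account ID. This is a hypothetical variant of the `volumes` section, not part of the example class:

```yaml
volumes:
  nameArgs:
    - ACCOUNT_ID
  nameFormat: "{{account_id}}"
  # with a maximum of one volume per account, the account ID alone
  # yields a unique volume name
  maxByAccount: 1
```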
In the example above, when user `userx` creates/uses a storage volume, the resulting volume name is `userx`. The NFS CSI driver uses the volume name as the directory name, so when `userx` creates/uses a storage volume based on this definition, the CSI driver will use `/srv/fuzzball/storage_data/userx` as the resulting volume path.
When `CUSTOM_NAME` is used as part of the volume name, users can affect the name of the volume by using volume references in the format `volume://<scope>/<storage_volume_name>/<optional_custom_name>`. This allows users to create multiple distinct persistent storage volumes for a storage class, up to the maximum specified with `maxByAccount`. If `CUSTOM_NAME` is not used, then the `<optional_custom_name>` part of the volume reference is ignored. If `CUSTOM_NAME` is used, it may be advisable to include additional information in the `nameFormat` to ensure unique paths on the storage backend. For example:

```yaml
volumes:
  nameArgs:
    - USERNAME
    - CUSTOM_NAME
  nameFormat: "{{username}}-{{custom_name}}"
  maxByAccount: 2
```
While it is possible to use `WORKFLOW_ID` for persistent volume naming, it is generally not helpful to include the workflow ID of the creating job in the name of a persistent volume.
The name arguments `ORGANIZATION_ID` and `ACCOUNT_ID` do not ensure a unique volume name, with the exception of `ACCOUNT_ID` when `maxByAccount` is set to 1. All other arguments can be used in isolation and will produce unique volume names.
A definition file for the storage class underlying ephemeral volumes for workflows is broadly similar:
```yaml
version: v1
name: scratch
description: Ephemeral Scratch Volumes
driver: nfs.csi.k8s.io
properties:
  persistent: false
  retainOnDelete: false
parameters:
  server: ${NFS_SERVER_IP}
  share: /srv/fuzzball/storage_scratch
capacity:
  size: 100
  unit: GiB
access:
  type: filesystem
  mode: multi_node_multi_writer
mount:
  options:
    - nfsvers=4
  user: user
  group: user
  permissions: 770
scope: user
volumes:
  nameArgs:
    - WORKFLOW_ID
  nameFormat: "{{workflow_id}}"
```
Here we use `WORKFLOW_ID` as the sole element for naming the ephemeral volumes, which ensures that each workflow gets a unique, private scratch volume backed by a subdirectory of the scratch share created earlier, named after the workflow ID.
To highlight the changes from the persistent example: the `persistent` property is set to `false`, and because we do not want to keep data after a workflow has finished, `retainOnDelete` is also set to `false`.