Viewing BLAST Results
The stock BLAST workflow template saves its results to a persistent volume. You can access and post-process these results by mounting the same persistent volume in another Fuzzball workflow, for example a Jupyter workflow or an interactive shell. In the following example, we use the Fuzzball CLI to start a workflow that sleeps until it times out or is stopped, and then connect to it to access the BLAST results.
$ cat <<EOF > shell.fz
version: v1
volumes:
  data:
    reference: volume://user/persistent
jobs:
  shell:
    image:
      uri: docker://rockylinux:9
    mounts:
      data:
        location: /data
    env:
      - "PS1=(fuzzball)$ "
    cwd: /data/results
    command:
      - /bin/bash
      - "-c"
      - sleep 8h
    resource:
      cpu:
        cores: 1
        threads: true
      memory:
        size: 2GiB
    policy:
      timeout:
        execute: 8h
EOF
$ fuzzball workflow start shell.fz
Workflow "ed24d154-4bd5-4695-b458-7f821710e4a7" started.
$ fuzzball workflow describe ed24d154-4bd5-4695-b458-7f821710e4a7
Name: shell.fz
Email: wresch@ciq.com
UserId: 87145648-b830-4291-ab7e-40880d61334e
Status: STAGE_STATUS_STARTED
Cluster: fuzzball-aws-stable
Created: 2025-04-23 01:45:59PM
Started: 2025-04-23 01:45:59PM
Finished: N/A
Error:
Stages:
KIND | STATUS | NAME | STARTED | FINISHED
Workflow | Started | ed24d154-4bd5-4695-b458-7f821710e4a7 | 2025-04-23 01:45:59PM | N/A
Volume | Finished | data | 2025-04-23 01:46:00PM | 2025-04-23 01:46:32PM
Image | Finished | docker://rockylinux:9 | 2025-04-23 01:46:00PM | 2025-04-23 01:46:21PM
Job | Started | shell | 2025-04-23 01:48:48PM | N/A
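Before connecting, it can be useful to confirm that the job stage itself has started, since the volume and image stages finish first. The following is a minimal sketch that reads the describe output from standard input and prints the status of one job. The function name job_status is illustrative and not part of the Fuzzball CLI, and the parsing assumes the pipe-separated table layout shown in the sample output above.

```shell
# Print the STATUS column of the "Job" row whose NAME matches $1.
# Assumes the " | "-separated Stages table shown in the describe output above.
job_status() {
  awk -F'|' -v name="$1" '
    {
      # Trim surrounding whitespace from every column.
      for (i = 1; i <= NF; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
    }
    $1 == "Job" && $3 == name { print $2 }
  '
}

# Example (workflow ID left as a placeholder):
#   fuzzball workflow describe <workflow-id> | job_status shell
```

For the sample output above, this would print Started for the shell job.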
The workflow is running and you can connect to it like so:
$ fuzzball workflow exec --tty ed24d154-4bd5-4695-b458-7f821710e4a7 shell /bin/bash
(fuzzball)$ pwd
/data/results
(fuzzball)$ ls -lh blast
total 40K
drwxrwxr-x. 2 user group 6.0K Apr 18 17:35 301eb4c1-1f24-4cac-8d7d-8a6c67db16bb
drwxrwxr-x. 2 user group 6.0K Apr 18 17:27 58abcdba-022f-4256-a516-2bb762a2b6b2
drwxrwxr-x. 2 user group 6.0K Apr 22 18:03 64888a33-ac09-4fb6-8db6-20aa35fbddc9
drwxrwxr-x. 2 user group 6.0K Apr 21 21:43 6bf078f5-575d-48a1-ae44-6c5532551f45
drwxrwxr-x. 2 user group 6.0K Apr 21 21:33 8c9625fa-802c-42dd-a065-2bf66a8a6680
drwxrwxr-x. 2 user group 6.0K Apr 18 19:44 cceab18b-dd1e-4553-bb05-a42bda1d76b3
drwxrwxr-x. 2 user group 6.0K Apr 18 20:54 d2b33c1c-8f0e-43fe-a052-bebc476c68d9
drwxrwxr-x. 2 user group 6.0K Apr 18 18:50 dad618e7-e965-4ee3-9ebc-f611ffe20924
drwxrwxr-x. 2 user group 6.0K Apr 18 13:49 e7b1e1d6-0d53-4aa1-a17f-b57a4603ee72
drwxrwxr-x. 2 user group 6.0K Apr 18 19:00 ed5e81ba-c354-43c5-9f23-b13d86596294
(fuzzball)$ head -30 blast/e7b1e1d6-0d53-4aa1-a17f-b57a4603ee72/pox_efc.blast.out
BLASTP 2.16.0+
Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.
Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.
Database: PDB protein database
170,598 sequences; 48,617,182 total letters
Query= YP_232915.1 serine protease inhibitor-like [Vaccinia virus]
Length=369
Score E
Sequences producing significant alignments: (Bits) Value
4KDS_A Chain A, Plasminogen activator inhibitor 1 [Oncorhynchus m... 153 3e-42
(fuzzball)$ exit
$ fuzzball workflow stop ed24d154-4bd5-4695-b458-7f821710e4a7
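In an interactive session like the one above, the run directories under blast/ have UUID-like names, so it is not obvious which one holds the most recent run. The sketch below picks the newest subdirectory by modification time. The function name latest_run_dir is illustrative, and the sketch assumes each run writes into its own subdirectory, as in the listing above.

```shell
# Print the most recently modified subdirectory of $1 (default: ./blast).
# The printed path ends with a trailing slash.
latest_run_dir() {
  local base="${1:-blast}"
  # -d lists the directories themselves, -t sorts newest first.
  ls -1dt -- "$base"/*/ 2>/dev/null | head -n 1
}

# Example, run from /data/results inside the shell job:
#   ls -lh "$(latest_run_dir blast)"
```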
In a previous section, we created a modified workflow template that
saved its BLAST result file {{.RunName}}.blast.out to an S3 bucket at the destination
s3://<bucket>/<path...>. In our example, the saved object was
s3://co-ciq-misc-support/results/pox_efc.blast.out, which we will use below.
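Because the object name follows the pattern s3://<bucket>/<path...>/{{.RunName}}.blast.out, the URI for any run can be assembled from its three parts. The following is a small sketch of that composition. The function name blast_result_uri and its argument order are assumptions made for illustration only.

```shell
# Build the S3 URI of a BLAST result from bucket, path prefix, and run name.
# A single trailing slash on the prefix is tolerated.
blast_result_uri() {
  local bucket="$1" prefix="${2%/}" run_name="$3"
  printf 's3://%s/%s/%s.blast.out\n' "$bucket" "$prefix" "$run_name"
}

# Example, using the values from this guide:
#   blast_result_uri co-ciq-misc-support results pox_efc
#   prints s3://co-ciq-misc-support/results/pox_efc.blast.out
```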
One way to download the results file to your workstation is to use the AWS CLI to access the S3 bucket where your results are stored. If you do not have the AWS CLI installed, see the AWS CLI installation instructions.
The AWS CLI command aws s3 cp downloads the result file to your workstation. The
command below copies the result file at S3 URI s3://<bucket>/<path...>/{{.RunName}}.blast.out to
your working directory.
$ aws s3 cp s3://co-ciq-misc-support/results/pox_efc.blast.out .
From there, you can use any standard tools to view, parse, or process the results in whichever BLAST output format you chose.
$ cat pox_efc.blast.out
BLASTP 2.16.0+
Reference: Stephen F. Altschul, Thomas L. Madden, Alejandro A.
Schaffer, Jinghui Zhang, Zheng Zhang, Webb Miller, and David J.
Lipman (1997), "Gapped BLAST and PSI-BLAST: a new generation of
protein database search programs", Nucleic Acids Res. 25:3389-3402.
Reference for composition-based statistics: Alejandro A. Schaffer,
L. Aravind, Thomas L. Madden, Sergei Shavirin, John L. Spouge, Yuri
I. Wolf, Eugene V. Koonin, and Stephen F. Altschul (2001),
"Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements", Nucleic Acids
Res. 29:2994-3005.
Database: PDB protein database
170,598 sequences; 48,617,182 total letters
Query= YP_233018.1 A16
Length=377
Score E
Sequences producing significant alignments: (Bits) Value
8GP6_A Chain A, Virion membrane protein A16 [Vaccinia virus WR] 720 0.0
7AE2_B Chain B, MNT ANTITOXIN [Aphanizomenon flos-aquae 2012/KM1/D3] 30.8 2.7
3G02_A Chain A, Epoxide hydrolase [Aspergillus niger] 30.0 8.6
1QO7_A Chain A, EPOXIDE HYDROLASE [Aspergillus niger] 30.0 9.0
>8GP6_A Chain A, Virion membrane protein A16 [Vaccinia virus WR]
Length=348
Score = 720 bits (1859), Expect = 0.0, Method: Compositional matrix adjust.
Identities = 342/342 (100%), Positives = 342/342 (100%), Gaps = 0/342 (0%)
Query 1 MGAAVTLNRIKIAPGIADIRDKYMELGFNYPEYNRAVKFAEESYTYYYETSPGEIKPKFC 60
MGAAVTLNRIKIAPGIADIRDKYMELGFNYPEYNRAVKFAEESYTYYYETSPGEIKPKFC
Sbjct 1 MGAAVTLNRIKIAPGIADIRDKYMELGFNYPEYNRAVKFAEESYTYYYETSPGEIKPKFC 60
...snip...
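As an illustration of processing the default pairwise output with standard tools, the sketch below reduces a result file to one tab-separated line per alignment: query ID, subject ID, bit score, and E-value. The function name blast_hits is illustrative, and the parsing assumes the layout shown above, with subject headers beginning with > and score lines of the form "Score = ... bits (...), Expect = ...". Other layouts would need a different pattern.

```shell
# Summarize BLAST pairwise output as: query <TAB> subject <TAB> bits <TAB> evalue
blast_hits() {
  awk '
    /^Query= /    { query = $2 }            # remember the current query ID
    /^>/          { hit = substr($1, 2) }   # subject ID without the leading ">"
    /^ *Score = / {
      e = $8; sub(/,$/, "", e)              # drop the comma after the E-value
      printf "%s\t%s\t%s\t%s\n", query, hit, $3, e
    }
  ' "$@"
}

# Example:
#   blast_hits pox_efc.blast.out
```

If the workflow is configured to write tabular output instead (for example with the BLAST+ option -outfmt 6), the file is already tab-separated and this step is unnecessary.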