Executing a LAMMPS Workflow
The Fuzzfile below runs a standard Lennard-Jones 3D melt experiment using LAMMPS. The workflow generates an output file, log.lammps, which contains simulation and benchmark results. To save this file through data egress, uncomment the egress section and supply a destination S3 URI and a secret. This workflow uses a LAMMPS container from NVIDIA’s NGC catalog and runs on a single node with one GPU.
version: v1
volumes:
  lammps-data-volume:
    name: lammps-data-volume
    reference: volume://user/ephemeral
    ingress:
      - source:
          uri: https://www.lammps.org/inputs/in.lj.txt
        destination:
          uri: file://in.lj.txt
    # egress:
    #   - source:
    #       uri: file://log.lammps
    #     destination:
    #       uri:
    #       secret:
jobs:
  run-lammps:
    image:
      uri: docker://nvcr.io/hpc/lammps:29Sep2021
    env:
      - LAMMPS_INPUT_FILE_PATH=/data/in.lj.txt
    command:
      - /bin/sh
      - '-c'
      - >-
        lmp -k on g 1 -sf kk -pk kokkos cuda/aware on neigh full comm device binsize 2.8 -var x 8 -var y 4 -var z 8
        -in ${LAMMPS_INPUT_FILE_PATH}
    cwd: /data
    resource:
      cpu:
        cores: 2
        affinity: NUMA
      memory:
        size: 14GiB
      devices:
        nvidia.com/gpu: 1
    mounts:
      lammps-data-volume:
        location: /data
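The -var x 8 -var y 4 -var z 8 arguments in the job command set the problem size. The sketch below estimates the resulting atom count. It assumes that the in.lj.txt input scales each box dimension to 20 lattice cells per unit of x, y, and z and fills an fcc lattice with 4 atoms per unit cell. The variable names cells_per_unit and atoms_per_cell are illustrative and do not come from the input file.

```shell
#!/bin/sh
# Values passed on the lmp command line in the Fuzzfile.
x=8
y=4
z=8

# Assumptions about in.lj.txt (not stated in this guide):
# each dimension spans 20 lattice cells per unit of x/y/z,
# and an fcc unit cell holds 4 atoms.
cells_per_unit=20
atoms_per_cell=4

atoms=$(( (cells_per_unit * x) * (cells_per_unit * y) * (cells_per_unit * z) * atoms_per_cell ))
echo "$atoms"   # 8192000
```

Under these assumptions the result agrees with the "Created 8192000 atoms" line in the job log later in this guide.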
You can run this workflow through either the GUI or the CLI.
To use the GUI, click “Workflow Editor” and then “Create New” to open a blank page in the workflow editor.
Next, either click the ellipsis (...) menu in the lower right and select “Edit YAML”, or simply press e on your keyboard. An editor with a Fuzzfile stub will appear.
Delete the current contents and paste in the workflow definition from above.
Pressing “save” returns you to the interactive workflow editor, which now shows a job named run-lammps instead of a blank page. The Fuzzball GUI automatically validates the YAML file for syntax errors.
Submitting your workflow to Fuzzball with the GUI is easy. Simply press the triangular “Start Workflow” button in the lower right corner of the workflow editor. You will be prompted to provide an optional descriptive name for your workflow.
Now you can click on “Start Workflow” in the lower right corner of the dialog box and your workflow will be submitted. If you click “Go to Status” you can view the workflow status page. The screenshot below shows the status page for a hello world workflow submission.
To retrieve logs produced by this workflow, select a job within the workflow, such as run-lammps, and click the “Logs” option on the right.
To run this workflow through the CLI, you will need access to the Fuzzball CLI. You can install it using the Fuzzball CLI installation instructions.
First, create a Fuzzfile named lammps-gpu-ngc.fz with the contents above using the text editor of your choice.
You can then start the workflow by running the following command:
$ fuzzball workflow start lammps-gpu-ngc.fz
Workflow "45d81a83-dd54-4052-afab-09dd3216abbe" started.
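When scripting, it can help to capture the workflow UUID for use in later commands. The following is a minimal sketch that assumes the start command prints a line of the form Workflow "<uuid>" started., as in the sample output above. The helper name extract_workflow_uuid is hypothetical and is not part of the Fuzzball CLI.

```shell
#!/bin/sh
# Hypothetical helper: read `fuzzball workflow start` output on stdin and
# print the UUID found between the double quotes.
extract_workflow_uuid() {
    sed -n 's/^Workflow "\([0-9a-fA-F-]*\)" started\.$/\1/p'
}

# Demonstration using the sample output from this guide.
sample='Workflow "45d81a83-dd54-4052-afab-09dd3216abbe" started.'
WORKFLOW_UUID=$(printf '%s\n' "$sample" | extract_workflow_uuid)
echo "$WORKFLOW_UUID"   # 45d81a83-dd54-4052-afab-09dd3216abbe

# Against a live cluster, the same helper could be used like this
# (assumes the output format does not change between releases):
#   WORKFLOW_UUID=$(fuzzball workflow start lammps-gpu-ngc.fz | extract_workflow_uuid)
```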
You can monitor the workflow’s status by running the following command:
$ fuzzball workflow describe <workflow uuid>
Name: lammps-gpu-ngc.fz
Email: bphan@ciq.co
UserId: e554e134-bd2d-455b-896e-bc24d8d9f81e
Status: STAGE_STATUS_FINISHED
Created: 2024-06-21 09:23:58AM
Started: 2024-06-21 09:23:58AM
Finished: 2024-06-21 09:26:44AM
Error:
Stages:
KIND | STATUS | NAME | STARTED | FINISHED
Workflow | Finished | 45d81a83-dd54-4052-afab-09dd3216abbe | 2024-06-21 09:23:58AM | 2024-06-21 09:26:44AM
Volume | Finished | lammps-data-volume | 2024-06-21 09:23:58AM | 2024-06-21 09:24:16AM
Image | Finished | docker://nvcr.io/hpc/lammps:29Sep2021 | 2024-06-21 09:23:58AM | 2024-06-21 09:24:15AM
File | Finished | https://www.lammps.org/inputs/in.lj.txt | 2024-06-21 09:24:31AM | 2024-06-21 09:24:35AM
| | ->... | |
Job | Finished | run-lammps | 2024-06-21 09:25:46AM | 2024-06-21 09:26:09AM
File | Finished | file://log.lammps -> | 2024-06-21 09:26:24AM | 2024-06-21 09:26:28AM
| | s3://co-ciq-misc-supp... | |
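To check the status from a script, the Status field can be pulled out of the describe output. The sketch below assumes the field appears on a line beginning with "Status:", as in the sample above. The helper name workflow_status is hypothetical, and the only status value taken from this guide is STAGE_STATUS_FINISHED.

```shell
#!/bin/sh
# Hypothetical helper: read `fuzzball workflow describe` output on stdin and
# print the value of the first "Status:" field.
workflow_status() {
    awk '$1 == "Status:" { print $2; exit }'
}

# Demonstration using an excerpt of the sample output from this guide.
status=$(workflow_status <<'EOF'
Name: lammps-gpu-ngc.fz
Status: STAGE_STATUS_FINISHED
Created: 2024-06-21 09:23:58AM
KIND | STATUS | NAME | STARTED | FINISHED
EOF
)
echo "$status"   # STAGE_STATUS_FINISHED

# Against a live cluster, a simple polling loop could look like this.
# It waits forever if the workflow never reaches the finished state, so a
# real script would also need a timeout or a check for failure states.
#   until [ "$(fuzzball workflow describe "$WORKFLOW_UUID" | workflow_status)" = "STAGE_STATUS_FINISHED" ]; do
#       sleep 10
#   done
```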
You can view outputs logged by the workflow using the fuzzball workflow log command, providing the workflow UUID and job name. For example, the following command outputs logs from the job run-lammps in the workflow:
$ fuzzball workflow logs <workflow uuid> run-lammps
LAMMPS (29 Sep 2021)
KOKKOS mode is enabled (src/KOKKOS/kokkos.cpp:97)
will use up to 1 GPU(s) per node
using 1 OpenMP thread(s) per MPI task
Lattice spacing in x,y,z = 1.6795962 1.6795962 1.6795962
Created orthogonal box = (0.0000000 0.0000000 0.0000000) to (268.73539 134.36770 268.73539)
1 by 1 by 1 MPI processor grid
Created 8192000 atoms
using lattice units in orthogonal box = (0.0000000 0.0000000 0.0000000) to (268.73539 134.36770 268.73539)
create_atoms CPU = 3.047 seconds
Neighbor list info ...
update every 20 steps, delay 0 steps, check no
max neighbors/atom: 2000, page size: 100000
master list distance cutoff = 2.8
ghost atom cutoff = 2.8
binsize = 2.8, bins = 96 48 96
1 neighbor lists, perpetual/occasional/extra = 1 0 0
(1) pair lj/cut/kk, perpetual
attributes: full, newton off, kokkos_device
pair build: full/bin/kk/device
stencil: full/bin/3d
bin: kk/device
Setting up Verlet run ...
Unit style : lj
Current step : 0
Time step : 0.005
Per MPI rank memory allocation (min/avg/max) = 1181.0 | 1181.0 | 1181.0 Mbytes
Step Temp E_pair E_mol TotEng Press
0 1.44 -6.7733681 0 -4.6133683 -5.0196694
100 0.75927734 -5.761232 0 -4.6223161 0.19102612
Loop time of 1.92832 on 1 procs for 100 steps with 8192000 atoms
Performance: 22402.943 tau/day, 51.859 timesteps/s
72.4% CPU use with 1 MPI tasks x 1 OpenMP threads
MPI task timing breakdown:
Section | min time | avg time | max time |%varavg| %total
---------------------------------------------------------------
Pair | 0.16802 | 0.16802 | 0.16802 | 0.0 | 8.71
Neigh | 0.29642 | 0.29642 | 0.29642 | 0.0 | 15.37
Comm | 0.072074 | 0.072074 | 0.072074 | 0.0 | 3.74
Output | 0.00063816 | 0.00063816 | 0.00063816 | 0.0 | 0.03
Modify | 1.3215 | 1.3215 | 1.3215 | 0.0 | 68.53
Other | | 0.06967 | | | 3.61
Nlocal: 8.19200e+06 ave 8.192e+06 max 8.192e+06 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Nghost: 727553.0 ave 727553 max 727553 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Neighs: 0.00000 ave 0 max 0 min
Histogram: 1 0 0 0 0 0 0 0 0 0
FullNghs: 6.15293e+08 ave 6.15293e+08 max 6.15293e+08 min
Histogram: 1 0 0 0 0 0 0 0 0 0
Total # of neighbors = 6.1529347e+08
Ave neighs/atom = 75.109066
Neighbor list builds = 5
Dangerous builds not checked
Total wall time: 0:00:13
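The benchmark figures can be extracted from this log for comparison across runs. The sketch below assumes the "Loop time" and "Performance:" lines keep the layout shown above. The helper name lammps_performance and the key names in its output are illustrative.

```shell
#!/bin/sh
# Hypothetical helper: read a LAMMPS log on stdin and print the loop time,
# step count, atom count, and throughput as key=value pairs.
lammps_performance() {
    awk '
        $1 == "Loop" && $2 == "time" {
            printf "loop_time_s=%s steps=%s atoms=%s\n", $4, $9, $12
        }
        $1 == "Performance:" {
            printf "tau_per_day=%s timesteps_per_s=%s\n", $2, $4
        }
    '
}

# Demonstration using two lines from the sample log in this guide.
summary=$(lammps_performance <<'EOF'
Loop time of 1.92832 on 1 procs for 100 steps with 8192000 atoms
Performance: 22402.943 tau/day, 51.859 timesteps/s
EOF
)
echo "$summary"
# loop_time_s=1.92832 steps=100 atoms=8192000
# tau_per_day=22402.943 timesteps_per_s=51.859

# Against a live cluster, the log command from the example above could be
# piped straight into the helper:
#   fuzzball workflow logs <workflow uuid> run-lammps | lammps_performance
```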