Running AlphaFold 3 Reference Generation on HPC with Apptainer
This guide explains how to generate Linux reference outputs on HPC clusters that don't support Docker but provide Apptainer (formerly Singularity).
Advanced and optional workflow
This is not required for normal macOS/MLX prediction usage. Use this only if you need Linux reference artifacts for cross-platform parity checks.
Overview
The reference image generates canonical Linux outputs for cross-platform parity validation. These outputs are compared against macOS outputs to ensure the C++ extensions produce identical results.
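As a rough illustration of what such a parity check might look like, the sketch below compares two NPZ files array by array. The function name `compare_npz` and the choice to skip the `platform` and `python_version` keys are assumptions made for this example; the repository's own comparison logic may differ.

```python
import numpy as np

# Metadata keys expected to differ between platforms (an assumption based on
# the metadata fields this guide lists).
METADATA_KEYS = {"platform", "python_version"}


def compare_npz(linux_path, macos_path, skip_keys=METADATA_KEYS):
    """Return a list of human-readable mismatches between two NPZ files."""
    mismatches = []
    with np.load(linux_path, allow_pickle=True) as ref, \
            np.load(macos_path, allow_pickle=True) as other:
        ref_keys = set(ref.files) - set(skip_keys)
        other_keys = set(other.files) - set(skip_keys)
        # Keys that exist in only one of the two files
        for key in sorted(ref_keys ^ other_keys):
            mismatches.append(f"{key}: present in only one file")
        # Keys present in both files must hold identical arrays
        for key in sorted(ref_keys & other_keys):
            if not np.array_equal(ref[key], other[key]):
                mismatches.append(f"{key}: arrays differ")
    return mismatches
```

An empty list means the two files agree on every non-metadata array. Exact equality is used here because the guide asks for identical results; a tolerance-based comparison would be a different design choice.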
Prerequisites
- Apptainer 1.1+ (or Singularity 3.8+)
- Access to the AlphaFold 3 repository
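A quick way to see which container runtime a node offers is to look for the binaries on `PATH`. The helper below is a minimal sketch; `find_container_runtime` is a hypothetical name, and the function only detects the binary, it does not check the version requirement listed above.

```python
import shutil


def find_container_runtime(which=shutil.which):
    """Return 'apptainer' or 'singularity' if found on PATH, else None."""
    # Prefer Apptainer; fall back to the older Singularity binary name.
    for name in ("apptainer", "singularity"):
        if which(name):
            return name
    return None


if __name__ == "__main__":
    runtime = find_container_runtime()
    print(runtime or "No Apptainer or Singularity binary found on PATH")
```

On clusters that use environment modules, the binary may only appear after `module load apptainer`, so run the check in the same environment as the job.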
Building the Image Locally
Option 1: Build with Docker, Convert to SIF
On a machine with Docker:
# Build the reference image
cd /path/to/alphafold3
docker build -f docker/Dockerfile.reference -t alphafold3-reference .
# Save as tarball for transfer
docker save alphafold3-reference -o alphafold3-reference.tar
# Transfer to HPC, then convert
apptainer build alphafold3-reference.sif docker-archive://alphafold3-reference.tar
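Option 2: Build Directly with Apptainer
If the build machine runs a Docker daemon that already holds the alphafold3-reference image (built with the docker build command from Option 1):
# Convert the image from the local Docker daemon. Apptainer cannot build from a
# Dockerfile directly; --fakeroot lets the build run without root privileges.
apptainer build --fakeroot alphafold3-reference.sif \
    docker-daemon://alphafold3-reference:latest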
Generating Reference Outputs
# Create output directory
mkdir -p reference_outputs
# Run reference generation
apptainer run \
--bind ./reference_outputs:/output \
alphafold3-reference.sif
# Outputs will be in ./reference_outputs/
ls reference_outputs/*.npz
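To confirm that the run produced every expected file rather than eyeballing the `ls` output, a small check along these lines could be used. The file names come from the table in this guide; the function name `missing_reference_files` is hypothetical.

```python
from pathlib import Path

# File names as listed in the "What Gets Generated" table of this guide.
EXPECTED_FILES = (
    "fasta_parsing_reference.npz",
    "msa_profile_reference.npz",
    "string_array_reference.npz",
)


def missing_reference_files(output_dir="reference_outputs"):
    """Return the expected reference files that are absent from output_dir."""
    out = Path(output_dir)
    return [name for name in EXPECTED_FILES if not (out / name).is_file()]


if __name__ == "__main__":
    missing = missing_reference_files()
    print("All reference files present" if not missing else f"Missing: {missing}")
```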
Running with Slurm
#!/bin/bash
#SBATCH --job-name=af3-reference
#SBATCH --time=00:30:00
#SBATCH --mem=8G
#SBATCH --cpus-per-task=4
module load apptainer
# The bind source directory must exist before the container starts
mkdir -p "$PWD/reference_outputs"
apptainer run \
    --bind "$PWD/reference_outputs:/output" \
    "$PWD/alphafold3-reference.sif"
Verifying Outputs
The generated NPZ files should have platform=Linux metadata:
import numpy as np

files = [
    'reference_outputs/fasta_parsing_reference.npz',
    'reference_outputs/msa_profile_reference.npz',
    'reference_outputs/string_array_reference.npz',
]
for f in files:
    data = np.load(f, allow_pickle=True)
    print(f"{f}: platform={data['platform']}")
    # Should print: platform=Linux
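For use in a script or CI job, the same check can fail loudly instead of printing. The sketch below assumes that `platform` and `python_version` are stored as scalar string entries in each NPZ file; the helper names `read_metadata` and `assert_linux_reference` are hypothetical.

```python
import numpy as np


def read_metadata(path):
    """Return the platform and python_version entries of an NPZ file as strings."""
    with np.load(path, allow_pickle=True) as data:
        # Assumes both entries are 0-d (scalar) string arrays.
        return {key: str(data[key]) for key in ("platform", "python_version")}


def assert_linux_reference(path):
    """Raise ValueError unless the file was generated on Linux; return its metadata."""
    meta = read_metadata(path)
    if meta["platform"] != "Linux":
        raise ValueError(
            f"{path}: expected platform=Linux, got {meta['platform']!r}"
        )
    return meta
```

Calling `assert_linux_reference` on each of the three reference files catches the case where outputs were accidentally produced outside the container.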
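Troubleshooting
Permission Errors on Bind Mount
Check that the output directory exists and is writable by your user, then request a read-write bind explicitly:
apptainer run --bind ./reference_outputs:/output:rw alphafold3-reference.sif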
Fakeroot Not Available
Contact your HPC admin to enable fakeroot, or build the image on a machine with Docker and transfer the SIF file.
Network Issues During Build
Pre-download the base image on a login node with network access:
apptainer pull docker://python:3.12-slim-bookworm
What Gets Generated
| File | Description | C++ Module |
|---|---|---|
| fasta_parsing_reference.npz | FASTA parsing outputs | fasta_iterator |
| msa_profile_reference.npz | MSA profile computation | msa_profile |
| string_array_reference.npz | String array operations | string_array |

Each NPZ file contains:
- Input data or checksums
- Output arrays
- platform metadata (should be "Linux")
- python_version metadata
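To see what a given reference file actually holds, its entries can be listed with their shapes and dtypes. This is a minimal sketch; `summarize_npz` is a hypothetical helper, and the key names inside each file are not specified in this guide, so the code simply reports whatever it finds.

```python
import numpy as np


def summarize_npz(path):
    """Return {key: (shape, dtype string)} for every entry in an NPZ file."""
    with np.load(path, allow_pickle=True) as data:
        return {
            key: (data[key].shape, str(data[key].dtype)) for key in data.files
        }


if __name__ == "__main__":
    import sys

    # Usage: python summarize.py reference_outputs/msa_profile_reference.npz
    for path in sys.argv[1:]:
        for key, (shape, dtype) in summarize_npz(path).items():
            print(f"{path}: {key} shape={shape} dtype={dtype}")
```

Scalar metadata entries such as `platform` show up with an empty shape `()`.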