Skip to content

Running AlphaFold 3 Reference Generation on HPC with Apptainer

This guide explains how to generate Linux reference outputs on HPC clusters that don't support Docker but provide Apptainer (formerly Singularity).

Advanced and optional workflow

This is not required for normal macOS/MLX prediction usage. Use this only if you need Linux reference artifacts for cross-platform parity checks.

Overview

The reference image generates canonical Linux outputs for cross-platform parity validation. These outputs are compared against macOS outputs to ensure the C++ extensions produce identical results.

Prerequisites

  • Apptainer 1.1+ (or Singularity 3.8+)
  • Access to the AlphaFold 3 repository

Building the Image Locally

Option 1: Build with Docker, Convert to SIF

On a machine with Docker:

# Build the reference image
cd /path/to/alphafold3
docker build -f docker/Dockerfile.reference -t alphafold3-reference .

# Save as tarball for transfer
docker save alphafold3-reference -o alphafold3-reference.tar

# Transfer to HPC, then convert
apptainer build alphafold3-reference.sif docker-archive://alphafold3-reference.tar

Option 2: Build Directly with Apptainer

If your HPC has network access during build:

# Build from Dockerfile (requires root or fakeroot)
apptainer build --fakeroot alphafold3-reference.sif \
    docker-daemon://alphafold3-reference:latest

Generating Reference Outputs

# Create output directory
mkdir -p reference_outputs

# Run reference generation
apptainer run \
    --bind ./reference_outputs:/output \
    alphafold3-reference.sif

# Outputs will be in ./reference_outputs/
ls reference_outputs/*.npz

Running with Slurm

#!/bin/bash
#SBATCH --job-name=af3-reference
#SBATCH --time=00:30:00
#SBATCH --mem=8G
#SBATCH --cpus-per-task=4

module load apptainer

apptainer run \
    --bind $PWD/reference_outputs:/output \
    $PWD/alphafold3-reference.sif

Verifying Outputs

The generated NPZ files should have platform=Linux metadata:

import numpy as np

files = [
    'reference_outputs/fasta_parsing_reference.npz',
    'reference_outputs/msa_profile_reference.npz',
    'reference_outputs/string_array_reference.npz',
]

for f in files:
    data = np.load(f, allow_pickle=True)
    print(f"{f}: platform={data['platform']}")
    # Should print: platform=Linux

Troubleshooting

Permission Errors on Bind Mount

apptainer run --bind ./reference_outputs:/output:rw alphafold3-reference.sif

Fakeroot Not Available

Contact your HPC admin to enable fakeroot, or build the image on a machine with Docker and transfer the SIF file.

Network Issues During Build

Pre-download the base image on a login node with network access:

apptainer pull docker://python:3.12-slim-bookworm

What Gets Generated

File Description C++ Module
fasta_parsing_reference.npz FASTA parsing outputs fasta_iterator
msa_profile_reference.npz MSA profile computation msa_profile
string_array_reference.npz String array operations string_array

Each NPZ file contains: - Input data or checksums - Output arrays - platform metadata (should be "Linux") - python_version metadata