Data standards and tools for sharing protein structure predictions at scale.
Structure prediction generates data at a scale that existing sharing methods cannot handle: thousands of structures add up to a volume that is hard to manage, output formats differ between predictors, and the raw output contains far more than collaborators need.
TSP (TSL Structure Package) captures the essential information from predictions—structures, confidence scores, PAE matrices—in a compact, queryable format. Users can filter by confidence or protein ID before downloading structure files. A TSP is a single portable object that can be easily shared, archived, or distributed via Zenodo with a permanent DOI.
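The filter-before-download idea can be sketched as follows. This is an illustration only: the metadata table, its column names (`protein_id`, `model_id`, `plddt_mean`, `structure_file`), and the helper `select_structures` are hypothetical and are not taken from the TSP format or from either package.

```python
import csv
import io

# Hypothetical metadata table; the real TSP schema may differ.
METADATA_CSV = """protein_id,model_id,plddt_mean,structure_file
P12345,P12345_AF3_1,93.4,structures/P12345_AF3_1.cif
P12345,P12345_AF3_2,71.0,structures/P12345_AF3_2.cif
Q67890,Q67890_AF3_1,95.1,structures/Q67890_AF3_1.cif
"""


def select_structures(metadata_text, min_plddt=90.0, protein_id=None):
    """Return metadata rows with mean pLDDT strictly above min_plddt.

    If protein_id is given, only rows for that protein are kept.
    """
    rows = csv.DictReader(io.StringIO(metadata_text))
    selected = []
    for row in rows:
        if float(row["plddt_mean"]) <= min_plddt:
            continue  # below or at the confidence threshold
        if protein_id is not None and row["protein_id"] != protein_id:
            continue  # belongs to a different protein
        selected.append(row)
    return selected


# Decide which structure files are worth fetching using metadata alone.
high_conf = select_structures(METADATA_CSV, min_plddt=90.0)
files_to_fetch = [row["structure_file"] for row in high_conf]
```

Only the small metadata table is read to make the selection; the structure files themselves are touched only once the list in `files_to_fetch` is known.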
| Tool | Purpose |
|---|---|
| tsp-maker (Python) | Automates parsing and packaging of raw predictor output (AF2, AF3, Boltz2) |
| tslstructures (R) | Enables easy analysis and extraction from large structure datasets |
A Zenodo community provides a central index where datasets remain under individual ownership but are discoverable in one place.
pak::pak("TeamMacLean/tslstructures")
library(tslstructures)
datasets <- list_datasets()
install_dataset("10.5281/zenodo.12345678")
ds <- load_dataset("my-structures")
high_conf <- ds |> filter(plddt_mean > 90)
structure <- get_structure(ds, "P12345_AF3_1")

pip install git+https://github.com/TeamMacLean/tsp-maker.git
tsp-maker parse af3 /predictions /intermediate
tsp-maker build /intermediate /my-dataset --name my-structures
tsp-maker upload /my-dataset --publish

Full documentation: teammaclean.github.io/share_structures
| Package | Language | Purpose | Links |
|---|---|---|---|
| tslstructures | R | Access and analyse datasets | GitHub · Docs |
| tsp-maker | Python | Create and upload datasets | GitHub · Docs |
share_structures/
├── docs/ # Project documentation (MkDocs)
├── tslstructures/ # R package source
├── tsp-maker/ # Python package source
└── mkdocs.yml
MIT
The Sainsbury Laboratory