AI Synthesis and Generative Molecular Design Pro
ChemistryAtlas exposes a two-tier AI chemistry layer:
- AI Synthesis for retrosynthesis, forward prediction, reaction context, impurity prediction, route scoring, and synthesis feasibility.
- Generative Molecular Design Pro for target-property generation, scaffold hopping, R-group optimization, active learning, ChemProp-style prediction, MS/MS AI workflows, docking, and virtual screening.
The apps are designed to work immediately with local deterministic backends and to upgrade automatically when optional sidecar services are configured.
Optional Sidecars
Set these environment variables on the backend service to enable external engines:
| Engine | Environment variable | Used by |
|---|---|---|
| ASKCOS | CHEMISTRYATLAS_ASKCOS_URL | Retrosynthesis Pro, Forward Predictor Pro, ASKCOS Connector |
| ChemProp | CHEMISTRYATLAS_CHEMPROP_URL | ChemProp Model Builder Pro, Batch Predictor |
| ms-pred / ICEBERG | CHEMISTRYATLAS_MSPRED_URL | AI MS/MS Predictor |
| PyScreener | CHEMISTRYATLAS_PYSCREENER_URL | PyScreener Docking Workflow |
| MolPAL | CHEMISTRYATLAS_MOLPAL_URL | MolPAL Active-Learning Virtual Screening |
If a sidecar is unavailable, the app returns sidecar_status=fallback and runs the local backend.
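The fallback rule above can be sketched as follows. This is a minimal illustration, not the ChemistryAtlas implementation: the `run_with_fallback` helper, the two callables it takes, and the shape of the returned dictionary are hypothetical. Only the environment variable names and the `sidecar_status` values `used` and `fallback` come from this document.

```python
import os

# Environment variable names taken from the table above.
SIDECAR_ENV = {
    "askcos": "CHEMISTRYATLAS_ASKCOS_URL",
    "chemprop": "CHEMISTRYATLAS_CHEMPROP_URL",
    "mspred": "CHEMISTRYATLAS_MSPRED_URL",
    "pyscreener": "CHEMISTRYATLAS_PYSCREENER_URL",
    "molpal": "CHEMISTRYATLAS_MOLPAL_URL",
}


def run_with_fallback(engine, payload, call_sidecar, run_local, env=None):
    """Try the configured sidecar first, otherwise run the local backend.

    call_sidecar(url, payload) and run_local(payload) are hypothetical
    callables supplied by the caller; they stand in for the real engines.
    """
    env = os.environ if env is None else env
    url = env.get(SIDECAR_ENV[engine], "").strip()
    if url:
        try:
            return {"sidecar_status": "used", "result": call_sidecar(url, payload)}
        except OSError:
            # Connection-level failures (refused, timed out, DNS) are treated
            # as "sidecar unavailable" and fall through to the local backend.
            pass
    return {"sidecar_status": "fallback", "result": run_local(payload)}
```

Passing the sidecar call and the local backend in as callables keeps the decision logic testable without a running service.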
For the local Docker backend, docker-compose.v2.yml and docker-compose.roost.yml default the following sidecar URLs to host.docker.internal:
CHEMISTRYATLAS_CHEMPROP_URL=http://host.docker.internal:8091
CHEMISTRYATLAS_MSPRED_URL=http://host.docker.internal:8092
CHEMISTRYATLAS_PYSCREENER_URL=http://host.docker.internal:8093
CHEMISTRYATLAS_MOLPAL_URL=http://host.docker.internal:8094
Installed Source Status
The lightweight reaction packages are installed directly in the backend as durable dependencies:
- rdchiral
- rdcanon
The heavier projects are checked out under external/coley/ for sidecar setup:
- external/coley/chemprop
- external/coley/ms-pred
- external/coley/pyscreener
- external/coley/molpal
ChemProp is also installed into a host-side sidecar virtualenv:
external/coley/envs/chemprop
The current ChemProp smoke test trains a one-epoch model through both the HTTP sidecar and the full ChemistryAtlas app API. ms-pred, PyScreener, and MolPAL now have runnable HTTP sidecars. Their heavy native engines are intentionally kept separate from the main backend, so these sidecars fall back to deterministic RDKit/manifest workflows until the dedicated environments are installed.
ASKCOS v2 is listed by the Coley group at https://gitlab.com/mlpds_mit/askcosv2, but an anonymous clone attempted from this environment was prompted for GitLab credentials. Clone it into external/coley/askcosv2 once access is available.
Local Backends
The local implementations are not placeholders. They generate CSV and HTML reports from:
- RDKit reaction SMARTS and product prediction
- RDKit descriptors and fingerprint similarity
- Random Forest QSAR / active-learning fallbacks
- R-group enumeration and multi-objective ranking
- neutral-loss MS/MS heuristics and spectrum cosine scoring
- docking manifests, ligand-efficiency ranking, and prep checklists
- route scoring and synthesis feasibility heuristics
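As an illustration of one item in the list above, spectrum cosine scoring can be sketched in plain Python. This is a generic binned cosine similarity, not necessarily the scoring the local backend uses; the function names, the `(mz, intensity)` peak format, and the 0.01 bin width are assumptions made for the example.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=0.01):
    """Sum peak intensities into fixed-width m/z bins.

    peaks is an iterable of (mz, intensity) pairs. Peaks that sit near a
    bin edge may land in neighbouring bins, a known limit of fixed binning.
    """
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[round(mz / bin_width)] += intensity
    return dict(binned)


def cosine_score(peaks_a, peaks_b, bin_width=0.01):
    """Cosine similarity of two binned spectra, from 0.0 to 1.0."""
    a = bin_spectrum(peaks_a, bin_width)
    b = bin_spectrum(peaks_b, bin_width)
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty spectrum matches nothing
    return dot / (norm_a * norm_b)
```

Because the score is normalised, two spectra with the same peaks at different absolute intensities score close to 1, and spectra with no shared bins score 0.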
Key Apps
AI Synthesis
- Reaction Template Explorer
- Forward Reaction Predictor Pro
- Retrosynthesis Planner Pro
- Reaction Context Recommender
- Impurity / Side Product Predictor
- Route Scorer
- Synthesis Feasibility Report
- ASKCOS Sidecar Connector
- Literature Reaction Similarity Search
Generative Molecular Design Pro
- Target-Property Molecule Generator
- Scaffold Hopping Designer
- R-Group Optimization Pro
- Multi-Objective Molecular Optimizer
- Synthesis-Aware Molecule Generator
- Matched Pair Design Suggestions
- Active-Learning Molecular Design Loop
- Generative Library Builder
- ChemProp Model Builder Pro
- ChemProp Batch Predictor
- ChemProp Uncertainty / Applicability Domain
AI Spectra + Virtual Screening
- AI MS/MS Predictor
- Spectrum Similarity Scorer
- Candidate Structure Ranker
- Formula + Spectrum Elucidation
- PyScreener Docking Workflow
- MolPAL Active-Learning Virtual Screening
- Protein/Ligand Prep Checklist
- Docking Result Ranker
- Docking-to-Design Loop
Sample Payloads
Retrosynthesis Pro:
aspirin
max_depth=3
Forward Predictor Pro:
reaction_type=amide
reactants=CC(=O)O.NC
R-Group Optimization Pro:
core=c1ccc([*:1])cc1
r=F,Cl,Me,OMe,NH2,CN
target_logp=2
mw_max=250
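To make the R-group payload concrete, the sketch below expands the core against the listed substituents by plain string substitution. It is an illustration only: the abbreviation-to-SMILES table and the `enumerate_r_groups` helper are hypothetical, and textual substitution works here only because each fragment starts with its attachment atom and the label sits inside a branch. Filtering on target_logp and mw_max needs a descriptor toolkit such as RDKit and is not shown.

```python
# Hypothetical mapping from the payload's abbreviations to SMILES fragments.
# Each fragment begins with the atom that bonds to the core.
R_GROUP_SMILES = {
    "F": "F",
    "Cl": "Cl",
    "Me": "C",
    "OMe": "OC",
    "NH2": "N",
    "CN": "C#N",
}


def enumerate_r_groups(core, r_groups, label="[*:1]"):
    """Return {abbreviation: product SMILES} for a single attachment point."""
    if core.count(label) != 1:
        raise ValueError("core must contain exactly one %s label" % label)
    return {name: core.replace(label, R_GROUP_SMILES[name]) for name in r_groups}


products = enumerate_r_groups(
    "c1ccc([*:1])cc1", ["F", "Cl", "Me", "OMe", "NH2", "CN"]
)
for name, smiles in products.items():
    print(name, smiles)
```

For the sample core this yields fluorobenzene, chlorobenzene, toluene, anisole, aniline, and benzonitrile as SMILES strings.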
ChemProp Model Builder Pro:
target=target
predict=CCO,CCN
smiles,target
CCO,0.2
CCN,0.5
c1ccccc1,1.1
CCCl,0.7
AI MS/MS Predictor:
caffeine
precursor_mz=195.0877
adduct=[M+H]+
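The precursor_mz value in this payload can be checked from the molecular formula. The sketch below computes the [M+H]+ m/z for caffeine (C8H10N4O2) from monoisotopic masses; the helper names are hypothetical, the mass table covers only the four elements needed here, and the formula parser handles flat formulas without parentheses or charges.

```python
import re

# Monoisotopic masses in daltons for the elements used in this example.
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307401,
    "O": 15.99491462,
}
PROTON_MASS = 1.00727647  # mass added by the [M+H]+ adduct


def monoisotopic_mass(formula):
    """Monoisotopic mass of a flat formula such as 'C8H10N4O2'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC_MASS[element] * (int(count) if count else 1)
    return total


def mz_m_plus_h(formula):
    """m/z of the singly protonated molecule, [M+H]+."""
    return monoisotopic_mass(formula) + PROTON_MASS


print(round(mz_m_plus_h("C8H10N4O2"), 4))
```

The neutral monoisotopic mass comes to about 194.0804, so adding one proton reproduces the 195.0877 given in the payload.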
PyScreener Docking Workflow:
protein=target.pdb
ligands=aspirin,caffeine
center=0,0,0
box=20,20,20
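Assuming the docking payload is submitted as key=value lines exactly as displayed above, a client could validate it before sending with a sketch like the one below. The `parse_docking_payload` helper and its checks are hypothetical and do not describe the app's actual input handling.

```python
def parse_docking_payload(text):
    """Parse key=value lines into a dict with typed center and box fields."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError("expected key=value, got: %r" % line)
        fields[key.strip()] = value.strip()

    center = tuple(float(x) for x in fields["center"].split(","))
    box = tuple(float(x) for x in fields["box"].split(","))
    if len(center) != 3 or len(box) != 3:
        raise ValueError("center and box each need three comma-separated values")
    if any(side <= 0 for side in box):
        raise ValueError("box dimensions must be positive")

    return {
        "protein": fields["protein"],
        "ligands": [name.strip() for name in fields["ligands"].split(",") if name.strip()],
        "center": center,
        "box": box,
    }


sample = """protein=target.pdb
ligands=aspirin,caffeine
center=0,0,0
box=20,20,20"""
print(parse_docking_payload(sample))
```

Rejecting a malformed box or center on the client side gives a clearer error than a failed docking job would.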
Verification
Run the live backend suite:
python3 scripts/check_real_chemistry_backends.py http://127.0.0.1:8083
The suite includes sample jobs for the complete AI synthesis, generative design, MS/MS, docking, and active-screening layer. With the sidecars running, the suite verifies sidecar_status=used for ChemProp, ms-pred, PyScreener, and MolPAL routes.