ChemistryAtlas App · Discovery + Design
Diversity / Cluster Selection
Select diverse compounds by greedy max-min distance over RDKit Morgan fingerprints.
Overview
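This app picks a diverse subset from a list of compounds by greedy max-min selection over RDKit Morgan fingerprints: each step adds the compound that is farthest from its nearest already-selected compound. It belongs to the Discovery + Design category, which is intended to move from individual molecules to search, design, route, and library decisions.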
When To Use It
- You need a focused workflow for diversity / cluster selection without leaving ChemistryAtlas.
- You want a result that can be saved, shared, or chained into another chemistry app.
- You want the calculation assumptions and limitations visible next to the output.
Inputs
- Chemistry input (type: textarea): Use formulas, names, SMILES-like text, reactions, or key=value options. Heavier engines will plug into this same app surface.
Recommended Workflow
- Define the target structure, scaffold, reaction, or library rule; run the design/search operation; triage candidates using feasibility, novelty, safety, and property filters.
- Start with the smallest representative input, confirm the parser understood it, then scale to a larger list or workflow.
- Save the generated report when the result will feed a notebook entry, route review, model comparison, or team discussion.
Outputs
- A Markdown-style chemistry report with parsed inputs, assumptions, and calculated or predicted results.
- Structured tables when the app returns multiple compounds, reagents, routes, peaks, candidates, or model rows.
- Warnings, fallback notes, and sidecar availability messages when a specialized engine is not installed or not reachable.
Method And Backend Notes
This app has a runnable ChemistryAtlas backend path (backend type: model). On the ChemistryAtlas roadmap it is at the MVP stage: it produces a runnable report now, and a specialist cheminformatics/model backend is planned to plug into this same app surface next. Use the output as a structured starting point for chemistry judgment, follow-up calculation, or experimental planning.
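The greedy max-min selection this app is built around can be sketched in a few lines of Python. This is an illustrative sketch, not the ChemistryAtlas backend: it assumes each fingerprint is already available as a set of on-bit indices, it assumes Tanimoto distance as the metric (the documentation does not name one), and the function names `tanimoto_distance` and `maxmin_select` and the toy fingerprints are hypothetical.

```python
from typing import FrozenSet, List, Sequence


def tanimoto_distance(a: FrozenSet[int], b: FrozenSet[int]) -> float:
    """Return 1 - Tanimoto similarity for two sets of on-bit indices."""
    union = len(a | b)
    if union == 0:
        return 0.0  # two empty fingerprints are treated as identical
    return 1.0 - len(a & b) / union


def maxmin_select(
    fps: Sequence[FrozenSet[int]], budget: int, seed_index: int = 0
) -> List[int]:
    """Greedy max-min pick: return indices of up to `budget` diverse items."""
    if not fps or budget <= 0:
        return []
    picked = [seed_index]
    # min_dist[i] is the distance from item i to its nearest picked item.
    min_dist = [tanimoto_distance(fp, fps[seed_index]) for fp in fps]
    while len(picked) < min(budget, len(fps)):
        # Add the unpicked item that is farthest from everything picked so far.
        candidates = (i for i in range(len(fps)) if i not in picked)
        best = max(candidates, key=lambda i: min_dist[i])
        picked.append(best)
        # Refresh nearest-pick distances against the newly added item.
        for i, fp in enumerate(fps):
            min_dist[i] = min(min_dist[i], tanimoto_distance(fp, fps[best]))
    return picked


# Toy fingerprints (assumed data, not real Morgan bits).
toy_fps = [
    frozenset({1, 2, 3}),
    frozenset({1, 2, 4}),
    frozenset({7, 8, 9}),
    frozenset({1, 2, 3, 4}),
    frozenset({7, 8}),
]
print(maxmin_select(toy_fps, budget=3))  # [0, 2, 1]
```

The result depends on the seed compound and on how ties are broken, so two runs with different seeds can return different but equally valid subsets. With RDKit installed, the on-bit sets could be derived from Morgan bit vectors, and RDKit also ships its own `MaxMinPicker`; whether this app uses either is not stated here.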
How To Interpret Results
- Use ranked results as shortlists for expert review, literature checks, synthesis feasibility, and orthogonal model validation.
- Compare results across related molecules, controls, blanks, literature examples, or known reactions whenever possible.
- For decisions that affect safety, synthesis scale-up, biological testing, purchasing, or publication, verify with primary data and expert review.
Example Input
aspirin
benzoic acid
caffeine
CCO
CCN
budget=3
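To make the structure of the example above concrete, here is a minimal sketch of how such input could be split into compound entries and key=value options. The app's actual parser is not documented, so everything here is an assumption: the function name `parse_chemistry_input` is hypothetical, and `KNOWN_OPTIONS` contains only `budget` because that is the only option key the example shows.

```python
from typing import Dict, List, Tuple

# Assumed option keys; only "budget" appears in the documented example.
KNOWN_OPTIONS = {"budget"}


def parse_chemistry_input(text: str) -> Tuple[List[str], Dict[str, str]]:
    """Split textarea input into compound entries and key=value options."""
    entries: List[str] = []
    options: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        key, sep, value = line.partition("=")
        # Only recognized keys count as options, because "=" is also the
        # double-bond symbol in SMILES (for example "C=O").
        if sep and key.strip().lower() in KNOWN_OPTIONS:
            options[key.strip().lower()] = value.strip()
        else:
            entries.append(line)
    return entries, options


example = """aspirin
benzoic acid
caffeine
CCO
CCN
budget=3"""

entries, options = parse_chemistry_input(example)
print(entries)
print(options)
```

Matching against a fixed set of option keys is one way to keep a SMILES string that contains `=` from being read as an option, which is the kind of misparse worth checking for when confirming that the parser understood a small input.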
Common Checks Before Acting
- Confirm names, salts, stereochemistry, tautomers, protonation state, and hydration state.
- Check units, concentrations, equivalent definitions, and significant figures.
- Record external database versions, model versions, sidecar availability, and any manual edits made after the app output.
Related Apps
- Similarity + Substructure Search (similarity-substructure-search)
- Retrosynthesis Planner (retrosynthesis-planner)
- Forward Reaction / Product Predictor (forward-reaction-predictor)
- Scaffold + R-Group Analyzer (scaffold-rgroup-analyzer)
- Generative Molecular Design (generative-molecular-design)
Acknowledgements And Validation
- ChemistryAtlas documentation and UI were prepared for chemistry discovery workflows.
- Where available, calculations may use open-source cheminformatics, reaction-informatics, spectra, docking, or machine-learning engines such as RDKit-family tooling, ASKCOS-style sidecars, ChemProp, ms-pred/ICEBERG, PyScreener, and MolPAL.
- Always verify important results against primary literature, official SDS records, instrument software, validated models, and local laboratory procedures.
- Model-driven outputs should include model version, training domain, uncertainty, and independent validation before operational use.