- ESM3 GitHub: https://github.com/evolutionaryscale/esm
- ESM3 Paper: https://www.evolutionaryscale.ai/papers/esm3-simulating-500-million-years-of-evolution-with-a-language-model
- Neural Discrete Representation Learning (paper referenced by ESM3 for structure tokenization): https://arxiv.org/abs/1711.00937
- SaProt: Protein Language Modeling with Structure-aware Vocabulary: https://www.biorxiv.org/content/10.1101/2023.10.01.560349v1
- Fast and accurate protein structure search with Foldseek: https://www.nature.com/articles/s41587-023-01773-0
- Bilingual Language Model for Protein Sequence and Structure: https://www.biorxiv.org/content/10.1101/2023.07.23.550085v2.abstract
- Learning the Language of Protein Structure: https://arxiv.org/abs/2405.15840v1
- ProTrek: Navigating the Protein Universe through Tri-Modal Contrastive Learning: https://www.biorxiv.org/content/10.1101/2024.05.30.596740v1
- MULAN: Multimodal Protein Language Model for Sequence and Structure Encoding: https://www.biorxiv.org/content/10.1101/2024.05.30.596565v1
- The Continuous Language of Protein Structure: https://www.biorxiv.org/content/10.1101/2024.05.11.593685v1