- ESM3 Github: https://github.com/evolutionaryscale/esm
- ESM3 Paper: https://www.evolutionaryscale.ai/papers/esm3-simulating-500-million-years-of-evolution-with-a-language-model
- Neural Discrete Representation Learning (paper referenced by ESM3 for structure tokenization): https://arxiv.org/abs/1711.00937
- SaProt: Protein Language Modeling with Structure-aware Vocabulary: https://www.biorxiv.org/content/10.1101/2023.10.01.560349v1
- Fast and accurate protein structure search with Foldseek: https://www.nature.com/articles/s41587-023-01773-0
- Bilingual Language Model for Protein Sequence and Structure: https://www.biorxiv.org/content/10.1101/2023.07.23.550085v2.abstract
- Learning the Language of Protein Structure: https://arxiv.org/abs/2405.15840v1
- ProTrek: Navigating the Protein Universe through Tri-Modal Contrastive Learning: https://www.biorxiv.org/content/10.1101/2024.05.30.596740v1
- MULAN: Multimodal Protein Language Model for Sequence and Structure Encoding: https://www.biorxiv.org/content/10.1101/2024.05.30.596565v1
- The Continuous Language of Protein Structure: https://www.biorxiv.org/content/10.1101/2024.05.11.593685v1