"Accurate de novo design of high-affinity protein-binding macrocycles using deep learning"
Today we share a research article led by the groups of David Baker and Gaurav Bhardwaj, published in Nature Chemical Biology. The study introduces RFpeptides, a denoising diffusion-based generative pipeline for the de novo design of macrocyclic peptides that integrates the RoseTTAFold2 (RF2) structure prediction network with the RFdiffusion protein backbone generation framework. The work reports, for the first time, the accurate design of high-affinity macrocyclic peptide binders against multiple protein targets, with X-ray crystallography showing close agreement between the designed and experimental structures (Cα RMSD < 1.5 Å). It provides a scalable and efficient platform for the rational design of macrocyclic peptides.
01 Research Background
Macrocyclic peptides hold great potential as therapeutic agents because they bridge the gap between small molecules and large biologics: they combine the cell membrane permeability of small molecules with the high affinity and selectivity of biologics. However, traditional macrocyclic peptide discovery relies on natural product screening or large-scale library screening (e.g., phage display), both of which are resource-intensive and time-consuming and offer limited control over the binding mode. Despite advances in protein design, existing computational methods (e.g., physics-based design or language models) cannot robustly design macrocyclic peptide binders de novo, especially when no binding partner or predefined motif is known. Major challenges include the combinatorial explosion of the macrocyclic peptide sequence space, complex structure-activity relationships, and the need to optimize multiple properties simultaneously, such as binding affinity, selectivity, and permeability. A computational method that can accurately generate high-affinity macrocyclic peptide binders is therefore highly desirable.
02 Innovative Highlights
First diffusion model-based pipeline for macrocyclic peptide generation: RFpeptides combines the diffusion framework of RFdiffusion with the structure prediction capability of RF2 and introduces a cyclic relative positional encoding. This allows the model to generate macrocyclic peptide backbones directly in a continuous embedding space, without pre-training a latent space or relying on known binding motifs.
Deep integration of sequence design and structural optimization: The pipeline employs ProteinMPNN for sequence design combined with Rosetta Relax for structural optimization, ensuring the generated peptide sequences are compatible with the backbones while meeting the requirements of the binding interface and physicochemical properties.
End-to-end validation and high-precision structural matching: A small number of candidates (≤20 per target) were selected from a large pool of generated peptides for synthesis and experimental testing, yielding medium- to high-affinity binders against four different protein targets (MCL1, MDM2, GABARAP, and RbtA). X-ray crystallography confirmed high agreement between the designed and experimental structures (Cα RMSD < 1.5 Å), demonstrating the atomic-level accuracy of the method.
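The cyclic relative positional encoding mentioned in the first highlight can be illustrated with a small example. This summary does not spell out the exact encoding used in RFpeptides, so the following is only a minimal sketch of one common way to make residue offsets "wrap around" a ring. The function names and the wrapping convention are assumptions made for illustration.

```python
import numpy as np

def linear_offsets(n: int) -> np.ndarray:
    """Relative offsets for a linear chain: offset[i, j] = j - i."""
    idx = np.arange(n)
    return idx[None, :] - idx[:, None]

def cyclic_offsets(n: int) -> np.ndarray:
    """Relative offsets for a head-to-tail cyclized chain of n residues.

    Assumed convention (illustrative only): each offset is wrapped onto the
    shortest path around the ring, so the first and last residues become
    sequence neighbours instead of being n - 1 positions apart.
    """
    d = linear_offsets(n)
    half = n // 2
    return (d + half) % n - half

if __name__ == "__main__":
    n = 8
    # In a linear chain, residue 0 sees residue 7 at offset +7.
    print(linear_offsets(n)[0])
    # In the ring, residue 7 is one step "behind" residue 0:
    # row 0 is [0, 1, 2, 3, -4, -3, -2, -1].
    print(cyclic_offsets(n)[0])
```

With offsets of this kind, a network that was trained on linear chains receives the same "adjacent residue" signal for the two termini as for any other pair of neighbours, which is the intuition behind generating closed rings without retraining.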
03 Results and Discussion
3.1 Computational Framework and Characteristics of Generated Peptides
The RFpeptides pipeline operates through a three-step process: first, generating macrocyclic peptide backbones using a modified RFdiffusion; second, designing amino acid sequences using ProteinMPNN; and finally, filtering using complex structures predicted by AfCycDesign or RoseTTAFold. The generated peptides exhibit high backbone diversity, and their physicochemical properties are similar to those of natural macrocyclic peptides. The sequence naturalness (evaluated by ProGen2 perplexity) shows no significant difference from the training set (Training set PPL=17.93 vs. Generated set PPL=17.90). Selected peptides are enriched in key residues such as lysine (K), leucine (L), and arginine (R), enhancing cationicity and amphipathicity, thereby optimizing membrane-targeting potential.
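The three-step flow described above (generate backbones, design sequences, filter by predicted complex) can be sketched as a simple loop. This is not the authors' code: the callables stand in for the backbone generator, the sequence designer, and the structure predictor, and every name, threshold, and score here is hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, List

@dataclass
class Candidate:
    backbone_id: int
    sequence: str
    confidence: float  # hypothetical score from the predicted complex, 0..1

def run_pipeline(
    n_backbones: int,
    generate_backbone: Callable[[int], Any],    # stand-in for step 1
    design_sequence: Callable[[Any], str],      # stand-in for step 2
    score_complex: Callable[[Any, str], float], # stand-in for step 3
    min_confidence: float,
    max_picks: int,
) -> List[Candidate]:
    """Generate, design, and filter; return the top-scoring candidates."""
    kept: List[Candidate] = []
    for i in range(n_backbones):
        backbone = generate_backbone(i)
        sequence = design_sequence(backbone)
        confidence = score_complex(backbone, sequence)
        if confidence >= min_confidence:
            kept.append(Candidate(i, sequence, confidence))
    # Rank survivors so only a handful go forward to synthesis.
    kept.sort(key=lambda c: c.confidence, reverse=True)
    return kept[:max_picks]

# Toy stand-ins so the sketch runs end to end; they carry no biological meaning.
def toy_backbone(i: int) -> dict:
    return {"id": i, "length": 12 + i % 5}

def toy_sequence(backbone: dict) -> str:
    return "G" * backbone["length"]

def toy_score(backbone: dict, sequence: str) -> float:
    return (backbone["id"] % 10) / 10.0

if __name__ == "__main__":
    picks = run_pipeline(100, toy_backbone, toy_sequence, toy_score,
                         min_confidence=0.8, max_picks=20)
    for c in picks[:3]:
        print(c)
```

Passing the three stages in as callables keeps the sketch independent of any particular tool's interface; in practice each stage would wrap the corresponding program.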

Fig. 1. RFpeptides is a diffusion-based pipeline for the de novo design of protein-binding macrocycles.
3.2 De Novo Design for MCL1 and MDM2
For the protein targets MCL1 (myeloid cell leukemia 1) and MDM2 (mouse double minute 2 homolog, a negative regulator of p53), RFpeptides generated thousands of macrocyclic peptide backbones, which were filtered using a combination of AfCycDesign predictions and Rosetta interface metrics. For MCL1, 14 out of 27 designed peptides were successfully synthesized, with 3 showing binding activity. The best binder, MCB_D2, had a Kd of 2 µM. The X-ray crystal structure (2.1 Å resolution) showed high agreement with the design model (Cα RMSD = 0.7 Å). The binding interface involved helical and loop regions, including cation-π interactions not observed in natural binders.
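The Cα RMSD values quoted in this article compare a design model with a crystal structure after optimal superposition. The summary does not state which software was used for this, so the following is only a minimal sketch using the standard Kabsch algorithm in NumPy. It assumes two arrays of matched Cα coordinates (same residues, same order), and the coordinates in the demo are synthetic.

```python
import numpy as np

def kabsch_rmsd(P, Q) -> float:
    """RMSD between two (N, 3) coordinate sets after optimal superposition.

    P and Q must list the same atoms in the same order (e.g. matched C-alpha
    atoms of a design model and a crystal structure).
    """
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    # Remove translation by centering both sets on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # Covariance matrix and its singular value decomposition.
    H = Pc.T @ Qc
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) > 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    # Rotate P onto Q and measure the remaining deviation.
    P_rot = Pc @ R.T
    return float(np.sqrt(np.mean(np.sum((P_rot - Qc) ** 2, axis=1))))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    model = rng.normal(size=(12, 3))           # synthetic "design" coordinates
    theta = 0.7                                 # arbitrary rotation about z
    Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0,            0.0,           1.0]])
    # Same shape, rotated, shifted, and slightly perturbed.
    crystal = model @ Rz.T + np.array([5.0, -2.0, 1.0])
    crystal += rng.normal(scale=0.1, size=crystal.shape)
    print(f"RMSD after superposition: {kabsch_rmsd(model, crystal):.2f}")
```

Because the rotation and translation are removed before the deviation is measured, the reported number reflects only differences in shape, which is why it is a natural measure of how closely a designed backbone matches the experimental one.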

Fig. 2. De novo design and characterization of macrocyclic binders to MCL1 and MDM2.
3.3 Design of High-Affinity Binders for GABARAP
For the GABA type A receptor-associated protein (GABARAP), RFpeptides designed 80,000 candidate peptides, from which 13 diverse peptides were selected via clustering for experimental testing. Six peptides were successfully synthesized, with GAB_D8 and GAB_D23 exhibiting nanomolar affinity (Kd = 6 nM and 36 nM, respectively). Their ability to inhibit the GABARAP-K1 peptide interaction was validated by AlphaScreen (IC50 = 0.7 nM and 2.5 nM, respectively). X-ray structures showed high agreement between the designed and experimental structures (Cα RMSD = 1.2 Å for GAB_D8 and 1.7 Å for GAB_D23). The binding mode involved β-sheet and helical structures, demonstrating the method's adaptability to different pocket shapes.

Fig. 3. De novo design of high-affinity macrocycle binders to GABARAP.
3.4 Design Targeting the Predicted Structure of RbtA
For Rhombotarget A (RbtA) from Acinetobacter baumannii (for which no experimental structure existed), the target structure was first predicted with AlphaFold2 (AF2) and RF2, and macrocyclic peptide binders were then designed against the predicted structure. From 20,000 generated backbones, 26 candidate peptides were selected, with RBB_D10 showing high affinity (Kd = 9.4 nM). X-ray crystallography confirmed high agreement between the predicted and experimental target structures (Cα RMSD = 1.2 Å), and the complex structure matched the design model (Cα RMSD = 1.4 Å), demonstrating the method's accuracy even for predicted targets.

Fig. 4. Accurate de novo design of a high-affinity cyclic peptide binder against the predicted structure of RbtA from A. baumannii.
04 Conclusion and Future Perspectives
The RFpeptides pipeline developed in this study represents a major breakthrough in the field of macrocyclic peptide design. Its core value lies in the deep integration of generative AI and structure prediction, enabling fully automated design of high-affinity binding peptides from target sequences or structures. Experimental validation shows that this method can generate highly active and specific binders for diverse targets (including those with flat binding surfaces), with high agreement between designed and experimental structures, achieving a success rate of 76% (35 out of 46 tested peptides were active). Future work could extend to conditional generation (e.g., targeting specific epitopes), incorporation of non-canonical amino acids and cyclization chemistries, and application to a wider range of therapeutic targets (e.g., membrane proteins or protein-protein interaction interfaces). RFpeptides provides a scalable platform for accelerating peptide drug discovery, holding promise for advancing precision medicine and anti-infective therapies.
Original Article:
Rettie SA, Juergens D, Adebomi V, Bueso YF, Zhao Q, Leveille AN, Liu A, Bera AK, Wilms JA, Üffing A, Kang A, Brackenbrough E, Lamb M, Gerben SR, Murray A, Levine PM, Schneider M, Vasireddy V, Ovchinnikov S, Weiergräber OH, Willbold D, Kritzer JA, Mougous JD, Baker D, DiMaio F, Bhardwaj G. Accurate de novo design of high-affinity protein-binding macrocycles using deep learning. Nat Chem Biol. 2025 Jun 20.














