PaddlePaddle / PaddleHelix

  • Thursday, September 5, 2024, 00:00:03
https://github.com/PaddlePaddle/PaddleHelix

Bio-Computing Platform Featuring Large-Scale Representation Learning and Multi-Task Deep Learning (“螺旋桨” bio-computing toolkit)




Latest News

2024.08.30 The initial version of the HelixFold3 server, designed for biomolecular structure prediction, is now available on the PaddleHelix website (https://paddlehelix.baidu.com/app/all/helixfold3/forecast). We encourage everyone to explore its capabilities and use it in their research.

2024.08.15 PaddleHelix released the code and model parameters of HelixFold3, a biomolecular structure prediction model that replicates the capabilities of AlphaFold3. HelixFold3 achieves accuracy comparable to AlphaFold3 in predicting the structures of conventional ligands, nucleic acids, and proteins. The initial release of HelixFold3 is available as open source on GitHub for non-commercial academic research. Refer to the code for more details.

2024.05.23 PaddleHelix released the code of HelixDock, a model pre-trained on large-scale generated docking conformations for protein-ligand structure prediction, which significantly improves prediction accuracy and generalizability. Please refer to the paper and code for more details. Visit the PaddleHelix website to try the online structure prediction service.

2024.05.13 The paper "Multi-purpose RNA Language Modeling with Motif-aware Pre-training and Type-guided Fine-tuning" has been accepted by Nature Machine Intelligence. Please refer to the paper and code for more details.

2024.04.16 PaddleHelix released the technical report of HelixFold-Multimer, a protein complex structure prediction model that performs particularly well on antigen-antibody and peptide-protein structure prediction. Please refer to the report for more details. The online structure prediction services for general protein complexes and for antigen-antibody complexes are now available on the PaddleHelix platform (link1 and link2, respectively).

2023.10.09 The HelixFold-Single paper, "A method for multiple-sequence-alignment-free protein structure prediction using a protein language model", has been accepted by Nature Machine Intelligence. Please refer to the paper for more details.

2022.12.08 The paper "HelixMO: Sample-Efficient Molecular Optimization in Scene-Sensitive Latent Space" has been accepted by BIBM 2022. Please refer to link1 or link2 for more details. We also deployed the drug design service on the PaddleHelix website.

2022.08.11 PaddleHelix released the code of HelixGEM-2, a novel molecular property prediction network that models full-range many-body interactions. It ranked 1st on the OGB PCQM4Mv2 leaderboard. Please refer to the paper and code for more details.

2022.07.29 PaddleHelix released the code of HelixFold-Single, an MSA-free protein structure prediction pipeline that relies only on primary sequences and can predict protein structures within seconds. Please refer to the paper and code for more details. Visit the PaddleHelix website to try the online structure prediction service.

2022.07.18 PaddleHelix fully released HelixFold, including the training and inference pipelines. The total training time has been reduced from 11 days to 5.12 days. Prediction of ultra-long monomer proteins (around 6600 AA) is now supported. Please refer to the paper and code for more details.

2022.07.07 The paper "BatchDTA: implicit batch alignment enhances deep learning-based drug–target affinity estimation" was published in Briefings in Bioinformatics. Please refer to the paper and code for more details.

2022.05.24 The paper "HelixADMET: a robust and endpoint extensible ADMET system incorporating self-supervised knowledge transfer" was published in Bioinformatics. Refer to the paper for more information.

2022.02.07 The paper "Geometry-enhanced molecular representation learning for property prediction" was published in Nature Machine Intelligence. Please refer to the paper and code to explore the algorithm.

2022.01.07 PaddleHelix released a PaddlePaddle reproduction of the AlphaFold 2 inference pipeline in HelixFold.

2021.11.23 The paper "Multimodal Pre-Training Model for Sequence-based Prediction of Protein-Protein Interaction" was accepted by MLCB 2021. Please refer to the paper and code for more details.

2021.10.25 The paper "Docking-based Virtual Screening with Multi-Task Learning" was accepted by BIBM 2021.

2021.09.29 The paper "Property-Aware Relation Networks for Few-shot Molecular Property Prediction" was accepted by NeurIPS 2021 as a Spotlight Paper. Please refer to PAR for more details.

2021.07.29 PaddleHelix released a novel geometry-level molecular pre-training model that takes advantage of the 3D spatial structures of molecules. Please refer to GEM for more details.

2021.06.17 The PaddleHelix team won 2nd place in the PCQM4M-LSC track of the OGB-LSC KDD Cup 2021, which predicts the DFT-calculated HOMO-LUMO energy gap of molecules. Please refer to the solution for more details.

2021.05.20 PaddleHelix v1.0 released. 1) Updated from the static framework to the dynamic framework; 2) Added new applications: molecular generation and drug-drug synergy.

2021.05.18 The paper "Structure-aware Interactive Graph Neural Networks for the Prediction of Protein-Ligand Binding Affinity" was accepted by KDD 2021. The code is available here.

2021.03.15 The PaddleHelix team ranked 1st on the ogbg-molhiv and ogbg-molpcba leaderboards of OGB for molecular property prediction.


Introduction

PaddleHelix is a bio-computing tool that uses machine learning, especially deep neural networks, to facilitate development in the following areas:

  • Drug Discovery. Provides 1) large-scale pre-training models for compounds and proteins; 2) various applications: molecular property prediction, drug-target affinity prediction, and molecular generation.
  • Vaccine Design. Provides RNA design algorithms, including LinearFold and LinearPartition.
  • Precision Medicine. Provides a drug-drug synergy application.

Resources

Application Platform

The PaddleHelix platform provides AI and biochemistry capabilities for drug discovery, vaccine design, and precision medicine.

Installation Guide

PaddleHelix is a bio-computing repository based on PaddlePaddle, a high-performance parallelized deep learning platform. The installation prerequisites and guide can be found here.
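
Before running any of the examples, it can help to confirm that the prerequisites are visible to your Python environment. The snippet below is a minimal sketch, not part of PaddleHelix: the helper `missing_packages` is hypothetical, and the import names `paddle` (PaddlePaddle) and `pahelix` (PaddleHelix) are assumptions that should be checked against the installation guide.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level package names that cannot be found in this environment."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names; verify them against the installation guide.
REQUIRED = ["paddle", "pahelix"]

missing = missing_packages(REQUIRED)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required packages were found.")
```

The check only looks the packages up and does not import them, so it runs quickly and works even when neither package is installed.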

Tutorials

We provide abundant tutorials to help you navigate the repository and start quickly.

Examples

We also provide examples that implement various algorithms and show how to run them.

Competition Solutions

The PaddleHelix team participated in multiple competitions related to bio-computing. The solutions can be found here.

Guide for Developers

  • To develop new functions based on the PaddleHelix source code, please refer to the guide for developers.
  • For more details on the APIs, please refer to the documents.

Copyright and License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
