Autonomous self-evolving research on biomedical data: the DREAM paradigm
Abstract
In contemporary biomedical research, the efficiency of data-driven approaches is hindered by large data volumes, tool selection complexity, and human resource limitations, necessitating the development of fully autonomous research systems to meet complex analytical needs. Such a system should be able to autonomously generate research questions, write analytical code, configure the computational environment, judge and interpret the results, and iteratively generate in-depth questions or solutions, all without human intervention. Here we developed DREAM, the first biomedical Data-dRiven self-Evolving Autonomous systeM, which can independently conduct scientific research without human involvement. Using one clinical dataset and two omics datasets, DREAM demonstrated its ability to raise and deepen scientific questions, with difficulty scores for clinical data questions surpassing those of top published articles by 5.7% and outperforming GPT-4 and bioinformatics graduate students by 58.6% and 56.0%, respectively. Overall, DREAM has a success rate of 80% in autonomous clinical data mining. Humans can also participate in individual steps of DREAM to achieve more personalized goals. After evolution, 10% of the questions exceeded the average scores of questions from top published articles on originality and complexity. In autonomous environment configuration for eight bioinformatics workflows, DREAM exhibited an 88% success rate, whereas GPT-4 failed to configure any workflow. On the clinical dataset, DREAM was over 10,000 times more efficient than the average scientist with a single computer core, and was capable of revealing new findings. As a self-evolving autonomous research system, DREAM provides an efficient and reliable solution for future biomedical research. This paradigm may also have a revolutionary impact on other data-driven scientific research fields.
- Publication: arXiv e-prints
- Pub Date: July 2024
- DOI: 10.48550/arXiv.2407.13637
- arXiv: arXiv:2407.13637
- Bibcode: 2024arXiv240713637D
- Keywords: Quantitative Biology - Quantitative Methods
- E-Print: 11 pages, 4 figures, content added, typos in figure corrected, references revised and font changed