pathfinder: A Semantic Framework for Literature Review and Knowledge Discovery in Astronomy
Abstract
The exponential growth of astronomical literature poses significant challenges for researchers navigating and synthesizing general insights or even domain-specific knowledge. We present pathfinder, a machine learning framework designed to enable literature review and knowledge discovery in astronomy, focusing on semantic searching with natural language instead of syntactic searches with keywords. Utilizing state-of-the-art large language models (LLMs) and a corpus of 385,166 peer-reviewed papers from the Astrophysics Data System, pathfinder offers an innovative approach to scientific inquiry and literature exploration. Our framework couples advanced retrieval techniques with LLM-based synthesis to search astronomical literature by semantic context as a complement to currently existing methods that use keywords or citation graphs. It addresses complexities of jargon, named entities, and temporal aspects through time-based and citation-based weighting schemes. We demonstrate the tool's versatility through case studies, showcasing its application in various research scenarios. The system's performance is evaluated using custom benchmarks, including single-paper and multipaper tasks. Beyond literature review, pathfinder offers unique capabilities for reformatting answers in ways that are accessible to various audiences (e.g., in a different language or as simplified text), visualizing research landscapes, and tracking the impact of observatories and methodologies. This tool represents a significant advancement in applying artificial intelligence to astronomical research, aiding researchers at all career stages in navigating modern astronomy literature.
- Publication:
- The Astrophysical Journal Supplement Series
- Pub Date:
- December 2024
- DOI:
- arXiv:
- arXiv:2408.01556
- Bibcode:
- 2024ApJS..275...38I
- Keywords:
- Astronomical reference materials;
- Astronomy web services;
- History of astronomy;
- Computational methods;
- Astronomy data visualization;
- 90;
- 1856;
- 1868;
- 1965;
- 1968;
- Astrophysics - Instrumentation and Methods for Astrophysics;
- Computer Science - Digital Libraries;
- Computer Science - Information Retrieval
- E-Print:
- 25 pages, 9 figures, submitted to AAS journals. Comments are welcome, and the tools mentioned are available online at https://pfdr.app