The bibliographic databases maintained by the NASA Astrophysics Data System are updated approximately biweekly with records gathered from over 125 sources all over the world. Data are sent to us electronically, retrieved by our staff via semi-automated procedures, or entered in our databases through supervised OCR procedures. Perl scripts convert the data from their incoming format to our standard format so that they can be added to the master database at SAO. Once new data have been added, separate index files are created for authors, objects, title words, and text words, allowing these fields to be searched individually or in combination with each other. During the indexing procedure, discipline-specific knowledge is taken into account through rule-based procedures that perform string normalization, context-sensitive word translation, and synonym and stop word replacement. Once the master text and index files have been updated at SAO, an automated procedure mirrors the changes in the database to the ADS mirror site via a secure network connection. The use of a freely available software tool called rsync allows incremental updating of the database files, with significant savings in the amount of data being transferred. In the past year, the ADS Abstract Service databases have grown by approximately 30%, including 50% growth in the Physics, 25% growth in the Astronomy, and 10% growth in the Instrumentation datasets. The ADS Abstract Service now contains over 1.4 million abstracts (475K in Astronomy, 430K in Physics, 510K in Instrumentation, and 3K in Preprints), 175,000 journal abstracts, and 115,000 full-text articles. In addition, we provide links to over 40,000 electronic HTML articles at other sites, 20,000 PDF articles, and 10,000 PostScript articles, as well as many links to other external data sources.
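The indexing step described above (string normalization, stop word handling, and synonym replacement, followed by building a per-field index) can be illustrated with a minimal Python sketch. This is not the ADS implementation, which the abstract says is rule-based and driven by Perl scripts; the rule tables `STOP_WORDS` and `SYNONYMS`, the function names, and the choice to drop stop words outright are all assumptions made for illustration.

```python
import re

# Hypothetical rule tables; the actual ADS rules are not given in the abstract.
STOP_WORDS = {"the", "of", "a", "an", "and", "in", "on", "for"}
SYNONYMS = {"galaxies": "galaxy", "x-ray": "xray", "xrays": "xray"}


def normalize(text):
    """String normalization: lowercase and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", text.strip().lower())


def index_terms(text):
    """Return the index terms for one field value.

    Stop words are dropped (an assumption about what "stop word
    replacement" means) and synonyms are mapped to one canonical form.
    """
    terms = []
    for token in re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", normalize(text)):
        if token in STOP_WORDS:
            continue
        terms.append(SYNONYMS.get(token, token))
    return terms


def build_index(records):
    """Build an inverted index: term -> set of record identifiers."""
    index = {}
    for rec_id, text in records.items():
        for term in index_terms(text):
            index.setdefault(term, set()).add(rec_id)
    return index
```

With these assumed tables, `build_index({"a": "X-ray galaxies", "b": "The galaxy survey"})` files both records under the term `galaxy`, so a search on either word form finds both. One such index per field (authors, objects, title words, text words) would allow the fields to be searched separately or combined.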
American Astronomical Society Meeting Abstracts #194
- Pub Date: May 1999