Modeling citation concentration through a mixture of Leimkuhler curves
Abstract
Plotting the cumulative percentage of total citations to articles, ordered from most cited to least cited, against the cumulative percentage of articles yields a Leimkuhler curve. In this study, we observe that standard Leimkuhler functions may not provide sufficiently accurate fits to various empirical informetric data. We therefore introduce a new approach to Leimkuhler curves: fitting a known probability density function to the initial Leimkuhler curve while accounting for the presence of a heterogeneity factor. As a contribution to the existing literature, we introduce a pair of mixture distributions (called PG and PIG) to bibliometrics. In addition, we present closed-form expressions for the Leimkuhler curves. Some measures of citation concentration are examined empirically for the basic models (based on the Power and Pareto distributions) and for the mixed models derived from these. The models are applied to two sources of informetric data to examine how the mixture models outperform the standard basic models. All models were fitted using non-linear least squares estimation.
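To make the construction concrete, the following minimal Python sketch builds an empirical Leimkuhler curve from a list of citation counts and fits a one-parameter curve to it by least squares. It is an illustration only, not the authors' implementation: the function names `leimkuhler_points` and `fit_power_form` and the citation counts are hypothetical, the fitted form K(p) = p^theta is the Leimkuhler curve of a classical Pareto distribution (with theta = 1 - 1/alpha, alpha > 1) rather than the PG or PIG mixture models, and a simple grid search stands in for a proper non-linear least squares routine.

```python
def leimkuhler_points(citations):
    """Return (p, K) pairs for the empirical Leimkuhler curve.

    p is the cumulative share of articles and K the cumulative share of
    citations, with articles ranked from most cited to least cited.
    """
    ranked = sorted(citations, reverse=True)
    total = sum(ranked)
    n = len(ranked)
    if n == 0 or total == 0:
        raise ValueError("need at least one article and one citation")
    points = [(0.0, 0.0)]
    running = 0
    for i, count in enumerate(ranked, start=1):
        running += count
        points.append((i / n, running / total))
    return points


def fit_power_form(points, grid_size=999):
    """Least-squares fit of K(p) = p**theta over a grid of theta in (0, 1).

    Assumption: a grid search is used here only as a dependency-free
    stand-in for a non-linear least squares solver.
    """
    best_theta, best_sse = None, float("inf")
    for k in range(1, grid_size + 1):
        theta = k / (grid_size + 1)
        sse = sum((K - p ** theta) ** 2 for p, K in points)
        if sse < best_sse:
            best_theta, best_sse = theta, sse
    return best_theta, best_sse


# Made-up citation counts for ten articles (illustrative data only).
citations = [50, 20, 10, 10, 5, 3, 1, 1, 0, 0]
points = leimkuhler_points(citations)
theta, sse = fit_power_form(points)
print(points[:3])
print(f"theta = {theta:.3f}, SSE = {sse:.5f}")
```

A smaller fitted theta corresponds to a curve that rises more steeply near the origin, that is, to citations concentrated in fewer articles. The same sum-of-squares objective could be handed to a dedicated optimizer when the curve has more than one parameter.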
- Publication: arXiv e-prints
- Pub Date: January 2024
- arXiv: arXiv:2401.07052
- Bibcode: 2024arXiv240107052G
- Keywords: Computer Science - Digital Libraries; Statistics - Applications; 62P25
- E-Print: 21 pages, 2 figures, 2 tables