Multicollinearity Resolution Based on Machine Learning: A Case Study of Carbon Emissions in Sichuan Province

doi:10.48550/arXiv.2309.01115

This study preprocessed 2000-2019 energy consumption data for 46 key industries in Sichuan Province using matrix normalization. DBSCAN clustering then grouped the industries objectively, identifying 16 feature classes. Penalized regression models were applied to these classes because they control overfitting, handle high-dimensional data, and perform feature selection, properties well suited to complex energy data. The results showed that the second cluster, centered on coal, had the highest emissions because of production needs; emissions from the gasoline-focused and coke-focused clusters were also significant. Based on these findings, the proposed emission reduction measures include clean coal technologies, transportation management, coal-to-electricity substitution in the steel industry, and industry standardization. By introducing unsupervised learning to select factors objectively, the study aims to open new avenues for emission reduction and to better inform decision-making.
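The pipeline described in the abstract (normalize, cluster with DBSCAN, then fit a penalized regression on the cluster-level features) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic, only three fuel types are simulated (so three clusters emerge rather than the 16 reported), `MinMaxScaler` stands in for the unspecified "matrix normalization", ridge regression stands in for the unspecified family of penalized models, and all variable names and parameter values (`eps`, `min_samples`, `alpha`) are hypothetical.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.linear_model import Ridge
from sklearn.preprocessing import MinMaxScaler, StandardScaler

rng = np.random.default_rng(0)
n_industries, n_years, n_fuels = 46, 20, 3  # 3 fuels is an illustrative choice

# Synthetic energy mix: each industry is dominated by one fuel (assumption).
dominant = np.arange(n_industries) % n_fuels
mix = np.full((n_industries, n_fuels), 0.1)
mix[np.arange(n_industries), dominant] = 0.8
mix += rng.normal(0.0, 0.01, size=mix.shape)

# Step 1: normalize each fuel column to [0, 1].
mix_scaled = MinMaxScaler().fit_transform(mix)

# Step 2: group industries by energy mix with DBSCAN (label -1 marks noise).
labels = DBSCAN(eps=0.3, min_samples=3).fit_predict(mix_scaled)
cluster_ids = sorted(k for k in set(labels.tolist()) if k != -1)

# Synthetic yearly consumption with a shared growth trend, which makes the
# cluster-level totals strongly correlated (the multicollinearity problem).
trend = np.linspace(1.0, 2.0, n_years)[:, None]
consumption = trend * rng.uniform(0.5, 1.5, size=(n_years, n_industries))
cluster_totals = np.column_stack(
    [consumption[:, labels == k].sum(axis=1) for k in cluster_ids]
)

# Synthetic emissions: a weighted sum of cluster totals plus small noise.
true_weights = np.array([3.0, 1.0, 0.5])
emissions = cluster_totals @ true_weights + rng.normal(0.0, 0.5, size=n_years)

# Step 3: ridge regression (L2 penalty) on standardized cluster features.
X = StandardScaler().fit_transform(cluster_totals)
model = Ridge(alpha=1.0).fit(X, emissions)
r2 = model.score(X, emissions)

print("clusters found:", len(cluster_ids))
print("ridge coefficients:", model.coef_)
print("in-sample R^2:", r2)
```

The L2 penalty keeps the coefficient estimates stable when the cluster-level regressors move together over time; swapping `Ridge` for `Lasso` or `ElasticNet` from the same scikit-learn module would additionally drive some coefficients to zero, which corresponds to the feature-selection property the abstract mentions.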

Publication:

arXiv e-prints

Pub Date:

September 2023

DOI:

10.48550/arXiv.2309.01115

arXiv:

arXiv:2309.01115

Bibcode:

2023arXiv230901115Z

Keywords:

Computer Science - Machine Learning

E-Print:

21 pages, 19 figures

NASA/ADS
