Use of machine learning algorithms to analyze billed electricity data. Case: Chile 2015 – 2021
Published: Nov 9, 2022
Abstract
In the Chilean electricity market, end users are classified as either free customers or regulated customers. Analyzing their behavior is important for designing and applying public policies in the sector. This research studies the monthly billed electricity data of Chilean regulated customers for the 2015-2021 period, in order to detect patterns and predict the category to which each customer belongs. The K-Means algorithm is used for pattern detection, K-nearest neighbors (K-NN) for customer category prediction, and principal component analysis (PCA) to determine the most significant variables in the data set. K-Means showed that the data group according to customer type; K-NN produced a model that predicts the customer type to which a record belongs; and PCA showed that customer type, year, and month are the most important variables in the data set. More than 96% of the customers analyzed are residential; they consumed 50% of the energy billed during the study period and drove the monthly seasonality of the data. The machine learning algorithms applied to the data made it possible to build models that group the records, predict their category, and identify the variables that are most significant in terms of variance.
Keywords
Clustering, principal component analysis, machine learning, billed energy
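The three techniques named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation and does not use the Chilean billing data: the two feature columns, the three customer-type centers, the sample size, the farthest-point initialization, and the choice of k = 5 neighbors are all assumptions made here so that the example is small and deterministic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for monthly billing records (hypothetical, not the study's data).
# Columns: log10(billed kWh per customer), log10(number of customers).
TYPE_CENTERS = np.array([[2.3, 4.0],   # 0 = residential-like
                         [3.5, 2.5],   # 1 = commercial-like
                         [4.7, 1.0]])  # 2 = industrial-like
N_PER_TYPE = 60
y = np.repeat(np.arange(3), N_PER_TYPE)
X = TYPE_CENTERS[y] + rng.normal(0.0, 0.15, size=(y.size, 2))
Z = (X - X.mean(axis=0)) / X.std(axis=0)  # standardize each column


def kmeans(Z, k, n_iter=100):
    """Lloyd's algorithm with a deterministic farthest-point initialization."""
    centroids = [Z[0]]
    for _ in range(k - 1):
        # Distance from every point to its closest already-chosen centroid.
        d = np.min([np.linalg.norm(Z - c, axis=1) for c in centroids], axis=0)
        centroids.append(Z[np.argmax(d)])
    centroids = np.array(centroids)
    for _ in range(n_iter):
        dists = np.linalg.norm(Z[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid; keep the old one if a cluster is empty.
        new = np.array([Z[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids


def knn_predict(Z_train, y_train, Z_query, k=5):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    dists = np.linalg.norm(Z_query[:, None, :] - Z_train[None, :, :], axis=2)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return np.array([np.bincount(votes).argmax() for votes in y_train[nearest]])


def pca_explained_variance_ratio(Z):
    """Share of total variance carried by each principal component."""
    centered = Z - Z.mean(axis=0)
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    return variances / variances.sum()


# K-Means: do the clusters line up with the (known) synthetic customer types?
cluster_labels, _ = kmeans(Z, k=3)
purity = sum(np.bincount(y[cluster_labels == j]).max()
             for j in np.unique(cluster_labels)) / y.size

# K-NN: hold out 25% of the records and predict their customer type.
idx = rng.permutation(y.size)
train, test = idx[:135], idx[135:]
accuracy = np.mean(knn_predict(Z[train], y[train], Z[test]) == y[test])

# PCA: how much variance does each component explain?
ratios = pca_explained_variance_ratio(Z)

print("cluster purity:", purity)
print("K-NN accuracy:", accuracy)
print("explained variance ratio:", ratios)
```

Because the synthetic customer types are deliberately well separated, the clusters should match the types closely and the first principal component should carry most of the variance. Real billing data would need its own feature selection, scaling, and validation.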

