Applying Machine Learning to Weather and Pollution Data Analysis for a Better Management of Local Areas: The Case of Napoli, Italy

Published in International Conference on Internet of Things, Big Data and Security, IoTBDS - Proceedings, 2021

Recommended citation: Lelio Campanile, Pasquale Cantiello, Mauro Iacono, Roberta Lotito, Fiammetta Marulli, Michele Mastroianni, "Applying Machine Learning to Weather and Pollution Data Analysis for a Better Management of Local Areas: The Case of Napoli, Italy." International Conference on Internet of Things, Big Data and Security, IoTBDS - Proceedings, 2021. https://www.scopus.com/inward/record.uri?eid=2-s2.0-85135227609&partnerID=40&md5=5a7c117fa01d0ba8d779b0e092bc0f63

Cited by: 4


Abstract: Local pollution is a problem that affects urban areas, with effects on quality of life and on health conditions. To avoid resorting to strict measures and to better manage territories, national authorities have applied a wide range of predictive models. The application of machine learning to simplify this problem has been studied in recent decades across a variety of cases and in a variety of forms. In this paper, we apply a regression-based analysis technique to a dataset containing official historical local pollution and weather data to look for criteria that allow forecasting critical conditions. The method was applied to the case study of Napoli, Italy, where the local environmental protection agency manages a set of fixed monitoring stations that record both chemical and meteorological data. The problem of joining the two raw datasets was overcome with a maximum inclusion strategy, performing the join in "outer" mode. Of the four regression models applied, namely the Linear Regression Model calculated with Ordinary Least Squares (LN-OLS), the Ridge regression Model (Ridge), the Lasso Model (Lasso) and Supervised Nearest Neighbors Regression (KNN), the Ridge regression model performed best, with an R2 (Coefficient of Determination) value of 0.77 and low values for both MAE (Mean Absolute Error) and MSE (Mean Squared Error), equal to 0.12 and 0.04 respectively. © 2021 by SCITEPRESS - Science and Technology Publications, Lda.
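The workflow summarized in the abstract (an "outer" join of the weather and pollution tables, then a comparison of four regression models by R2, MAE and MSE) can be illustrated with a minimal sketch. It assumes pandas and scikit-learn, which the abstract does not name, and runs on synthetic data. The column names (`timestamp`, `temperature`, `wind_speed`, `pm10`), the hyperparameters, and the handling of missing rows are all hypothetical and are not taken from the paper, so the scores it produces say nothing about the published results.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import r2_score, mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-ins for the two raw datasets (all column names are hypothetical).
rng = np.random.default_rng(0)
all_hours = pd.date_range("2020-01-01", periods=210, freq=pd.Timedelta(hours=1))

# Weather covers the first 200 hours, pollution the last 200,
# so the two tables only partly overlap.
weather = pd.DataFrame({
    "timestamp": all_hours[:200],
    "temperature": rng.normal(15, 5, 200),
    "wind_speed": rng.uniform(0, 10, 200),
})
pollution = pd.DataFrame({
    "timestamp": all_hours[10:],
    "pm10": rng.normal(30, 5, 200),
})

# "Maximum inclusion": an outer join keeps every timestamp found in either table,
# filling the columns of the absent side with NaN.
merged = pd.merge(weather, pollution, on="timestamp", how="outer")

# Assumption of this sketch: rows with missing values are dropped before training.
# The paper's own treatment of gaps may differ.
data = merged.dropna()
X = data[["temperature", "wind_speed"]]
y = data["pm10"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The four model families named in the abstract; hyperparameters are placeholders.
models = {
    "LN-OLS": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.1),
    "KNN": KNeighborsRegressor(n_neighbors=5),
}

# Fit each model and collect the three metrics reported in the abstract.
scores = {}
for name, model in models.items():
    pred = model.fit(X_train, y_train).predict(X_test)
    scores[name] = {
        "R2": r2_score(y_test, pred),
        "MAE": mean_absolute_error(y_test, pred),
        "MSE": mean_squared_error(y_test, pred),
    }

print(pd.DataFrame(scores).T)
```

Here the outer join yields one row per timestamp present in either table, whereas an inner join would keep only the overlapping hours. Because the synthetic target is pure noise, the printed scores are not meaningful; the point is only the shape of the pipeline.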

Author Keywords: Air Quality; Campania; Data Analysis; Forecasting; Machine Learning; Regression

Bibtex citation:

@CONFERENCE{Campanile2021354,
    author = "Campanile, Lelio and Cantiello, Pasquale and Iacono, Mauro and Lotito, Roberta and Marulli, Fiammetta and Mastroianni, Michele",
    title = "Applying Machine Learning to Weather and Pollution Data Analysis for a Better Management of Local Areas: The Case of Napoli, Italy",
    year = "2021",
    journal = "International Conference on Internet of Things, Big Data and Security, IoTBDS - Proceedings",
    volume = "2021-April",
    pages = "354--363",
    type = "Conference paper"
}
