Evaluation of Support Vector Machine Performance for Carbon Monoxide Prediction

Authors
Abstract
Carbon monoxide (CO) is one of the main air pollutants in the atmosphere of Tehran, Iran. CO concentration is generally difficult to predict and control because it behaves as a nonlinear, time-varying system. Controlling pollutant levels such as CO concentration has recently come to be regarded as one of the most important factors in environmental protection. This paper describes the forecasting of CO concentration and, more specifically, the determination of uncertainty in the modeling process using a support vector machine (SVM) technique. Uncertainty in air pollution modeling studies strongly affects the simulation results, and quantifying it is important because of the consequences for the health of people exposed to the pollution. This research therefore aims to calibrate and verify an SVM model and to determine its uncertainty in modeling air pollution in the atmosphere of Tehran. To achieve this goal, the SVM model was used to predict the arithmetic average of daily measured CO concentration. The model was calibrated and verified using six daily air pollutant variables, namely particulate matter with a diameter of 10 micrometers or less (PM10), total hydrocarbons (THC), nitrogen oxides (NOx), methane (CH4), sulfur dioxide (SO2), and ozone (O3), together with daily meteorological variables including pressure (Press), temperature (Temp), wind direction (WD), wind speed (WS), and relative humidity (Hum). The data were collected at the Gholhak station in the north of Tehran, Iran, during 2004-2005. The best SVM model for predicting CO concentration was then chosen based on the coefficient of determination (R2). Finally, to determine the SVM uncertainty, the model was run many times with different calibration data, which produced many different results because the model is sensitive to the selected calibration data. The model uncertainty in the CO prediction process was then evaluated using the width of the uncertainty band (d-factor) and the percentage of measured data bracketed by the 95 percent prediction uncertainty (95PPU). The results confirmed the strong performance of the SVM model in predicting CO concentration in the atmosphere of Tehran. The daily average CO concentrations predicted by the SVM model agreed well with those measured at the Gholhak air quality monitoring station: the coefficient of determination was 0.89 for calibration and 0.88 for validation. Furthermore, the SVM model showed an acceptable level of uncertainty in predicting CO concentration: in the validation step, the d-factor was 0.74 and 76 percent of the measured data were bracketed by the 95PPU. It can therefore be concluded that the SVM model is able to predict CO concentration in the atmosphere of Tehran with an acceptable level of uncertainty. Finally, because the proposed methodology is general, the authors suggest applying it to analyze the uncertainty of SVM models in other fields of science and engineering.
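The two uncertainty measures named in the abstract can be illustrated with a short sketch. The abstract gives no formulas, so the code assumes the definitions commonly used in the uncertainty-analysis literature: the 95PPU band runs from the 2.5th to the 97.5th percentile of the ensemble of predictions at each time step, and the d-factor is the mean width of that band divided by the standard deviation of the measured data. The function names, the percentile interpolation rule, and the synthetic data are all hypothetical; the ensemble stands in for the outputs of many SVM runs calibrated on different data subsets.

```python
import math
import random


def percentile(values, q):
    """Percentile (q in [0, 100]) using linear interpolation between order statistics."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def uncertainty_metrics(observed, ensemble):
    """Return (d_factor, percent_bracketed) for an ensemble of predictions.

    observed: list of n measured values.
    ensemble: list of model runs, each a list of n predictions.
    Assumed definitions (not stated in the abstract):
      95PPU band = [2.5th, 97.5th] percentile of the runs at each time step
      d-factor   = mean band width / sample standard deviation of observed
    """
    n = len(observed)
    lower, upper = [], []
    for t in range(n):
        preds_t = [run[t] for run in ensemble]
        lower.append(percentile(preds_t, 2.5))
        upper.append(percentile(preds_t, 97.5))

    mean_obs = sum(observed) / n
    std_obs = math.sqrt(sum((x - mean_obs) ** 2 for x in observed) / (n - 1))
    mean_width = sum(u - l for l, u in zip(lower, upper)) / n
    d_factor = mean_width / std_obs

    inside = sum(1 for x, l, u in zip(observed, lower, upper) if l <= x <= u)
    return d_factor, 100.0 * inside / n


if __name__ == "__main__":
    # Synthetic stand-in data: 60 "daily" values and 200 noisy model runs.
    rng = random.Random(0)
    observed = [3.0 + math.sin(t / 5.0) + rng.gauss(0, 0.2) for t in range(60)]
    ensemble = [
        [3.0 + math.sin(t / 5.0) + rng.gauss(0, 0.3) for t in range(60)]
        for _ in range(200)
    ]
    d_factor, bracketed = uncertainty_metrics(observed, ensemble)
    print(f"d-factor = {d_factor:.2f}, bracketed by 95PPU = {bracketed:.0f}%")
```

Under these assumed definitions, a narrower band (smaller d-factor) that still brackets a large share of the measurements indicates lower predictive uncertainty, which is the trade-off the reported validation values of 0.74 and 76 percent describe.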

Keywords

