Presenter: Yavar Pourmohamad
Computing PhD Student, Data Science emphasis
Location: In person in CCP 259 or register to attend via Zoom
Abstract: Wildfires are increasingly impacting social and environmental systems in the United States. A majority of these wildfires are human-caused and preventable. The National Wildfire Coordinating Group identifies 13 distinct fire causes. The most complete record of recent fire ignitions in the US lists the cause of nearly 30% of ignitions as unknown because of ambiguities in the source data. Building on the hypothesis that fire ignition data have spatial and temporal structure, we developed machine learning models to classify the causes of fire ignitions on the basis of 124 separate physical (e.g., weather, climate, distance to infrastructure), biological (e.g., land cover, vegetation greenness), and social (e.g., population density, social vulnerability index) attributes. We then used these models to probabilistically classify the cause of unknown ignitions. We implemented the probabilistic classification by (1) selecting fires attributed to one of 12 known causes, (2) dividing the known-cause fires into training (65%), validation (15%), and test (20%) data, and (3) using GBoost, AdaBoost, Random Forest, deep neural networks, and XGBoost to classify all fires in the conterminous United States (CONUS). Of these models, XGBoost yielded the highest accuracy (67.34% on the test data). Next, we applied the XGBoost model to fire data from the western United States (WUS), where it reached 68.07% accuracy.
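Steps (1) and (2) of the abstract can be illustrated with a minimal Python sketch. It is not the presenter's code: the record layout, the "cause" field, the "Unknown" label, the cause names, and the function name split_known_fires are all hypothetical, and a plain random shuffle is assumed since the abstract does not say how the split was drawn.

```python
import random

def split_known_fires(fires, seed=0):
    """Separate unknown-cause fires, then split the rest 65/15/20.

    `fires` is a list of dicts with a hypothetical "cause" key;
    the label "Unknown" is an assumption for illustration.
    """
    known = [f for f in fires if f["cause"] != "Unknown"]
    unknown = [f for f in fires if f["cause"] == "Unknown"]

    # Shuffle reproducibly before cutting into the three subsets.
    random.Random(seed).shuffle(known)

    n_train = round(0.65 * len(known))
    n_val = round(0.15 * len(known))
    train = known[:n_train]
    val = known[n_train:n_train + n_val]
    test = known[n_train + n_val:]  # remaining ~20%
    return train, val, test, unknown

# Toy data: 1000 known-cause fires and 300 unknown-cause fires.
causes = ["Arson", "Debris burning", "Natural", "Equipment use"]
fires = [{"id": i, "cause": causes[i % len(causes)]} for i in range(1000)]
fires += [{"id": 1000 + i, "cause": "Unknown"} for i in range(300)]

train, val, test, unknown = split_known_fires(fires)
print(len(train), len(val), len(test), len(unknown))
```

A model trained on the training subset and tuned on the validation subset would then be scored on the test subset, and its predicted class probabilities would be applied to the unknown-cause records.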