Track: Machine Learning

Improving Survey Design with Data-Driven Stratification: Predicting Fishing Likelihood with Machine Learning Models

Tuesday, April 13, 4:30-5:10pm EDT

To support sustainable fisheries, ECS improved recreational fishing effort estimation by building a machine learning model that predicts recreational fishing likelihood from state, month, resident county population, distance to the coast, and boat ownership. Five years of fishing survey results were matched with fishing license registrations, boat registrations, and 2019 census results, and model features were selected objectively with the Boruta algorithm (a wrapper algorithm built around random forest). Several machine learning models were trained on the selected features, including random forest, XGBoost (XGB), C5.0 decision tree, Bayesian GLM, neural network, discriminant analysis, and logistic regression. Each model was trained with tenfold cross-validation repeated three times and evaluated by accuracy and kappa. Two models, an XGB and a C5.0 decision tree, performed best, and final predictions were made by ensembling them. Fishing likelihood was estimated at the county level, and a county with a probability greater than 0.5 was considered “fishing-likely.” Compared with the previous strata, the new fishing-likely strata were closer to the coastline in winter and extended farther from the coast in summer, consistent with long-term observations. The model also indicated the importance of boat ownership information. In the reanalysis study, more than 60% of the cases showed improvement with the new stratification design. The variance of the resulting estimates decreased, yielding more precise survey results. These survey results will help produce more precise fishing effort estimates, ultimately supporting sustainable fisheries in the United States.
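As a rough illustration of the evaluation and ensembling steps described in the abstract, the Python sketch below computes accuracy and Cohen's kappa and combines two models' predicted probabilities before applying the 0.5 cutoff. The abstract does not say how the two models were ensembled, so simple probability averaging is an assumption here, and all function names and sample values are hypothetical rather than taken from the ECS work.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the observed labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement beyond what chance alone would produce."""
    n = len(y_true)
    observed = accuracy(y_true, y_pred)
    labels = set(y_true) | set(y_pred)
    # Chance agreement from the marginal label frequencies.
    expected = sum(
        (list(y_true).count(k) / n) * (list(y_pred).count(k) / n)
        for k in labels
    )
    if expected == 1.0:
        return 1.0  # only one label present on both sides
    return (observed - expected) / (1.0 - expected)


def ensemble_probability(prob_a, prob_b):
    """Average two models' probabilities (an assumed ensembling rule)."""
    return [(a + b) / 2.0 for a, b in zip(prob_a, prob_b)]


def fishing_likely(probs, threshold=0.5):
    """Flag counties whose probability is strictly greater than the threshold."""
    return [p > threshold for p in probs]


if __name__ == "__main__":
    # Hypothetical per-county probabilities from two models.
    xgb_probs = [0.9, 0.2, 0.5]
    c50_probs = [0.7, 0.4, 0.5]
    combined = ensemble_probability(xgb_probs, c50_probs)
    print(fishing_likely(combined))
    print(cohen_kappa([1, 1, 0, 0], [1, 0, 0, 0]))
```

Kappa is reported alongside accuracy because it discounts agreement expected by chance, which matters when fishing and non-fishing cases are imbalanced.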

Katherine Slater

Research Associate at ECS Federal, LLC

Dr. Wencheng Katherine Slater holds a doctorate in oceanography from the University of Maryland, where she specialized in ecosystem research and environmental statistics. She was awarded the NOAA John A. Knauss Marine Policy Fellowship in 2016 and worked on ecosystem fishery management during her fellowship year. In her current role at ECS, she supports the Fisheries Statistics Division in the Office of Science and Technology at NOAA Fisheries by incorporating data science into fishery management.