Tools and Guidance for Applying Neural Networks to Eddy Covariance Data
Abstract
Eddy covariance (EC) is used to monitor fluxes of energy, water, and carbon in ecosystems across the globe. The assumptions underpinning EC are not valid under all meteorological or operational conditions, which results in an observation bias that must be considered when working with EC data. Trace gas fluxes exhibit spatially and temporally variable, non-linear dependence on multiple drivers. Multi-year EC data sets contain hundreds of thousands of data points, and flux time series contain both noise and data gaps. These factors make EC data well suited to analysis with neural network (NN) models, a flexible machine learning method for mapping relationships in large multivariate data sets with non-linear dependencies. NN models have often been treated as black boxes. However, careful inspection of model derivatives provides a method for checking that the relationships mapped by a NN are physically plausible. Customizing the structure of a model can also help ensure that the model emulates real-world phenomena. Here we present guidance for applying NN models to EC data and provide a corresponding GitHub repository with functional NN examples written in Python. We use carbon dioxide and methane flux data from wetland sites in southwestern British Columbia to demonstrate how model derivatives can be used to detect, visualize, and rank the importance of functional relationships driving fluxes. We also show how NNs can be used for gap-filling and upscaling, and we offer comparisons to other common machine learning methods such as random forests.
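To make the idea of inspecting model derivatives concrete, the following is a minimal sketch and not the code from the accompanying repository. It assumes a single-hidden-layer network with tanh activations and random, untrained weights. The function names, the driver labels, and the mean-squared-derivative ranking score are all hypothetical choices made for illustration. It uses only the Python standard library, checks the analytic derivatives against finite differences, and ranks inputs by their average squared partial derivative.

```python
import math
import random


def init_params(n_inputs, n_hidden, seed=0):
    """Random weights for a one-hidden-layer network (untrained, illustrative only)."""
    rng = random.Random(seed)
    W1 = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    b1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    w2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b2 = rng.uniform(-1, 1)
    return W1, b1, w2, b2


def predict(params, x):
    """Forward pass: y = w2 . tanh(W1 x + b1) + b2."""
    W1, b1, w2, b2 = params
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W1, b1)
    ]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


def input_derivatives(params, x):
    """Analytic partial derivatives dy/dx_i via the chain rule."""
    W1, b1, w2, _ = params
    grads = [0.0] * len(x)
    for row, b, w_out in zip(W1, b1, w2):
        h = math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for i, w_in in enumerate(row):
            # d tanh(z)/dz = 1 - tanh(z)^2
            grads[i] += w_out * (1.0 - h * h) * w_in
    return grads


def finite_difference(params, x, eps=1e-6):
    """Central-difference estimate of dy/dx_i, used only as a sanity check."""
    grads = []
    for i in range(len(x)):
        hi, lo = list(x), list(x)
        hi[i] += eps
        lo[i] -= eps
        grads.append((predict(params, hi) - predict(params, lo)) / (2 * eps))
    return grads


def rank_drivers(params, samples, names):
    """Rank inputs by mean squared derivative over the samples.

    Assumption: this score is one simple choice of importance measure, and
    it is only comparable across inputs that share a scale (e.g. standardized).
    """
    totals = [0.0] * len(names)
    for x in samples:
        for i, g in enumerate(input_derivatives(params, x)):
            totals[i] += g * g
    scores = [(name, total / len(samples)) for name, total in zip(names, totals)]
    return sorted(scores, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical driver labels; the inputs are synthetic standardized values.
    names = ["air_temperature", "water_table_depth", "net_radiation"]
    params = init_params(n_inputs=3, n_hidden=4, seed=1)
    rng = random.Random(42)
    samples = [[rng.gauss(0, 1) for _ in names] for _ in range(200)]
    for name, score in rank_drivers(params, samples, names):
        print(f"{name}: {score:.4f}")
```

With a trained model, plotting each partial derivative against its input is one way to judge whether a mapped relationship is physically plausible, for example whether the sign of the response to temperature matches expectations. In practice, an automatic differentiation framework would typically replace the hand-written chain rule shown here.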