How to validate raw data to create Analysis Ready Data?

Description

The Framework for Improving Low-cost Technology Effectiveness and Reliability (FILTER) provides advanced quality control and correction algorithms for air pollutants (fine particulate matter, PM2.5) and noise. The advantage of this tool is that it is device agnostic: it works with all types of sensing devices and does not require the device to be co-located with a reference station. In its current version, FILTER does require that the sensing device is located near other sensing devices, that is, within a low-cost sensor (LCS) network.

Why is this relevant?

Harmonized, quality-controlled, and corrected datasets from low-cost sensor networks are generally lacking. To increase the uptake of data from sensing devices, users need to know the quality of the data. The lack of data standardization and the presence of data of unknown quality limit the use of citizen-generated data for official environmental assessments, policymaking, and scientific research.

FILTER is a tool comprising a set of statistical algorithms for quality flagging and value correction. It is available in MATLAB and as an open-source R package. The R package includes the full FILTER methodology along with detailed use cases to support applications to both historical and near real-time datasets.
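
To give a feel for why nearby devices matter for quality flagging, here is a minimal Python sketch of a neighbour consistency check. It is not the FILTER algorithm and does not use the FILTER R package or MATLAB code; the function name, the flag labels, the 15 µg/m³ threshold, and the minimum of two neighbours are all hypothetical choices made for illustration only.

```python
from statistics import median


def flag_against_neighbours(readings, neighbours, max_abs_diff=15.0, min_neighbours=2):
    """Assign an illustrative quality flag to each sensor for one time step.

    readings:   dict of sensor id -> PM2.5 value in µg/m³
    neighbours: dict of sensor id -> list of nearby sensor ids
    All thresholds and flag names are assumptions, not FILTER parameters.
    """
    flags = {}
    for sensor_id, value in readings.items():
        # Collect the values reported by nearby sensors at the same time step.
        nearby = [readings[n] for n in neighbours.get(sensor_id, []) if n in readings]
        if len(nearby) < min_neighbours:
            # Too few neighbours to judge the reading.
            flags[sensor_id] = "unchecked"
        elif abs(value - median(nearby)) > max_abs_diff:
            # Reading is far from what the surrounding sensors report.
            flags[sensor_id] = "suspect"
        else:
            flags[sensor_id] = "consistent"
    return flags


# Made-up example values: sensor "d" reports far more than its neighbours,
# and sensor "e" has only one neighbour.
readings = {"a": 12.0, "b": 14.0, "c": 13.0, "d": 95.0, "e": 10.0}
neighbours = {
    "a": ["b", "c", "d"],
    "b": ["a", "c", "d"],
    "c": ["a", "b", "d"],
    "d": ["a", "b", "c"],
    "e": ["a"],
}
flags = flag_against_neighbours(readings, neighbours)
print(flags)
```

In this sketch an isolated sensor cannot be checked at all, which mirrors the requirement that a device be near other sensing devices. For the actual methodology, refer to the scientific paper listed under Useful resources.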

Useful resources

Scientific paper describing the methodology: Hassani, A., Salamalikis, V., Schneider, P., Stebel, K., Castell, N. (2025). A scalable framework for harmonizing, standardization, and correcting crowd-sourced low-cost sensor PM2.5 data across Europe. Journal of Environmental Management, 380, 125100.

Validated Dataset for Europe: https://doi.org/10.6084/m9.figshare.27195720.v1
