Scoring your predictions
You can score your predictions using the EarthNet toolkit (pip install earthnet).
Save your predictions for one test set in one folder in the following way:
Name your NDVI prediction variable as
Then use the data_dir/dataset/split folder as the targets.
Then compute the normalized NSE over the full dataset:
    import earthnet as entk

    # Placeholder paths: replace with your own target and prediction folders
    scores = entk.score_over_dataset("Path/to/targets", "Path/to/predictions")
    print(scores["veg_score"])
Alternatively you can score a single minicube:
    import earthnet as entk

    # Placeholder paths: replace with your own target and prediction minicubes
    df = entk.normalized_NSE("Path/to/target_minicube", "Path/to/prediction_minicube")
    print(df.describe())
EarthNet2021x uses a vegetation score to benchmark different models. It is computed in four steps:
- For each pixel, compute the Nash-Sutcliffe Model Efficiency (NSE) at cloud-free observations
- Rescale this with 1 / (2 - nse) to the range 0-1 for robust averaging
- Average over all natural vegetation pixels (landcover class Trees, Scrub or Grassland)
- Scale back with 2 - 1 / mean_nnse to the range -Inf, 1
The Vegetation Score is 1 if the prediction is perfect. It is 0 if on average predictions are as good as the mean over the target period. It is negative if on average predictions are worse than the mean of the target period.
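The two rescaling steps can be checked with a few lines of arithmetic. The sketch below is only an illustration of the formulas stated above, not the toolkit's implementation; the function names `normalize` and `score_from_nse` and the example NSE values are made up for this purpose.

```python
def normalize(nse):
    """Map an NSE value from (-inf, 1] to (0, 1]."""
    return 1 / (2 - nse)


def score_from_nse(nse_values):
    """Average in the normalized space, then map back to (-inf, 1]."""
    mean_nnse = sum(normalize(v) for v in nse_values) / len(nse_values)
    return 2 - 1 / mean_nnse


print(score_from_nse([1.0, 1.0]))  # perfect predictions give 1.0
print(score_from_nse([0.0, 0.0]))  # as good as the target mean gives 0.0

# Hypothetical per-pixel NSE values with one badly predicted pixel
pixel_nse = [0.9, 0.8, -50.0]
print(sum(pixel_nse) / len(pixel_nse))  # plain mean, dominated by the outlier
print(score_from_nse(pixel_nse))        # normalized mean, roughly 0.3
```

Because every normalized value lies between 0 and 1, a single very negative NSE cannot dominate the average, which is what "robust averaging" refers to.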
In pseudocode, it is computed as follows:

    nse = NSE(targ_ndvi, pred_ndvi).where(targ_ndvi has no clouds)
    nnse = 1 / (2 - nse)
    veg_score = 2 - 1 / mean(nnse.where(landcover == Trees, Scrub or Grassland))
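For readers who want to trace the pseudocode on concrete numbers, here is a minimal plain-Python sketch. It is not the EarthNet toolkit's code: the data layout (one tuple per pixel), the function names, and the choice to skip pixels whose NSE is undefined are all assumptions made for illustration. Landcover is given by class name, since the integer codes used in the data are not stated here.

```python
from statistics import mean

# Landcover classes counted as natural vegetation (names from the text above)
VEGETATION = {"Trees", "Scrub", "Grassland"}


def nse(target, pred, cloud_free):
    """Nash-Sutcliffe efficiency of one pixel's NDVI series, cloud-free steps only."""
    t = [x for x, ok in zip(target, cloud_free) if ok]
    p = [x for x, ok in zip(pred, cloud_free) if ok]
    if not t:
        return None  # no cloud-free observation, pixel cannot be scored
    t_mean = mean(t)
    denom = sum((x - t_mean) ** 2 for x in t)
    if denom == 0:
        return None  # constant target, NSE undefined (skipping is an assumption)
    return 1 - sum((x - y) ** 2 for x, y in zip(t, p)) / denom


def veg_score(pixels):
    """pixels: iterable of (target, pred, cloud_free, landcover) tuples."""
    nnse = []
    for target, pred, cloud_free, landcover in pixels:
        if landcover not in VEGETATION:
            continue  # only natural vegetation pixels are averaged
        value = nse(target, pred, cloud_free)
        if value is not None:
            nnse.append(1 / (2 - value))  # rescale to (0, 1]
    if not nnse:
        return None  # nothing to average
    return 2 - 1 / mean(nnse)  # scale back to (-inf, 1]


# Toy data: three pixels with four fully cloud-free time steps each
target = [0.2, 0.4, 0.6, 0.8]
clear = [True, True, True, True]
pixels = [
    (target, target, clear, "Trees"),         # perfect prediction
    (target, [0.5] * 4, clear, "Grassland"),  # predicts the target mean
    (target, [0.0] * 4, clear, "Water"),      # ignored, not natural vegetation
]
print(veg_score(pixels))
```

In this toy case the first pixel has NSE 1 and the second NSE 0, so the normalized values are 1 and 0.5 and the score comes out at about 0.67.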
Models can use a context length for spin-up and are benchmarked over a target length, which is specified for the different test sets (tracks) as follows (same as EarthNet2021):
- IID: 50 days context, 100 days target
- OOD: 50 days context, 100 days target
- Extreme: 100 days context, 200 days target
- Seasonal: 350 days context, 700 days target
Here, five days correspond to one Sentinel-2 (Sen2) observation.
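Since models usually work with time steps rather than days, it can help to convert the track lengths into numbers of observations. The following sketch only applies the five-day rule to the values listed above; the names `TRACKS`, `DAYS_PER_FRAME` and `frames` are illustrative and not part of the toolkit.

```python
DAYS_PER_FRAME = 5  # one observation every five days, as stated above

# (context days, target days) per track, copied from the list above
TRACKS = {
    "IID": (50, 100),
    "OOD": (50, 100),
    "Extreme": (100, 200),
    "Seasonal": (350, 700),
}


def frames(track):
    """Return (context frames, target frames) for a track."""
    context_days, target_days = TRACKS[track]
    return context_days // DAYS_PER_FRAME, target_days // DAYS_PER_FRAME


for name in TRACKS:
    print(name, frames(name))
```

For example, the IID track corresponds to 10 context and 20 target observations, and the Seasonal track to 70 and 140.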