Paper published in Nature Communications
In our paper “Data splitting to avoid information leakage with DataSAIL”, we present an algorithm and Python package for leakage-reduced data splitting, which enables realistic evaluation of ML models intended for use in out-of-distribution scenarios.
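To illustrate the general idea of leakage-reduced splitting, here is a minimal sketch in plain Python. It is not DataSAIL's API or its algorithm; the function `group_split`, its parameters, and the toy cluster labels are all hypothetical. It assumes each sample has already been assigned to a similarity group, and it keeps every group entirely in one split so that near-duplicates of test samples never appear in training.

```python
import random
from collections import defaultdict


def group_split(groups, test_fraction=0.2, seed=0):
    """Assign whole groups to train or test so no group spans both splits.

    `groups` maps each sample ID to a group label (e.g. a similarity cluster).
    This is an illustrative simplification, not the DataSAIL method.
    """
    # Collect the samples belonging to each group.
    members = defaultdict(list)
    for sample, group in groups.items():
        members[group].append(sample)

    # Shuffle the group labels reproducibly.
    labels = sorted(members)
    random.Random(seed).shuffle(labels)

    target = test_fraction * len(groups)
    train, test = [], []
    for label in labels:
        # Fill the test split with whole groups until it reaches the target size.
        if len(test) < target:
            test.extend(members[label])
        else:
            train.extend(members[label])
    return train, test


# Toy data: sample IDs mapped to hypothetical similarity clusters.
groups = {
    "s1": "A", "s2": "A", "s3": "B", "s4": "B",
    "s5": "C", "s6": "D", "s7": "D", "s8": "E",
}
train, test = group_split(groups, test_fraction=0.25)
```

Because groups are moved as a unit, the test split may slightly overshoot the requested fraction; a random per-sample split would hit the fraction exactly but could place members of the same cluster on both sides.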