Communication
Harmonization of multi-center MRI data to improve the performance of machine learning models.
Saponaro S., Giuliano A., Bellotti R., Lombardi A., Tangaro S., Oliva P., Calderoni S., Retico A.
Multi-center data collection has further encouraged the use of machine learning (ML) analyses in medical imaging. However, multi-center data may suffer from batch effects which, especially in MRI studies, should be corrected to avoid confounding ML classifiers. This is particularly important when studying populations that are barely separable on the basis of MRI data, such as subjects with autism spectrum disorders (ASD) compared with typically developing (TD) controls. In this study, we show how applying a harmonization framework to brain structural features improves case-control separation by ML in the analysis of a multi-center MRI dataset. This effect is demonstrated on the ABIDE data collection. After data harmonization, the overall ASD vs. TD discrimination capability of a Random Forest (RF) classifier improves from a very low performance (AUC $= 0.58 \pm 0.04$) to a still low, but reasonably significant, AUC $= 0.67 \pm 0.03$. The performance of the classifier has also been evaluated in age-specific subgroups. Distinctive and consistent patterns of anatomical differences related to the ASD condition have been identified.
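As a purely illustrative sketch of why a site-related batch effect can lower the AUC, the snippet below applies a simple per-site location-scale adjustment to a single feature and compares the AUC before and after. The abstract does not specify the harmonization framework, so this adjustment is an assumed stand-in, not the method used in the study. The function names `harmonize_by_site` and `auc` are hypothetical, and the toy values are invented for the example (they are not ABIDE data). Note that such a naive adjustment would also remove genuine differences (for instance in age or diagnosis mix) between sites, which dedicated harmonization methods typically try to preserve by modeling covariates explicitly.

```python
from statistics import fmean, pstdev


def harmonize_by_site(values, sites):
    """Location-scale adjustment of one feature: align each site's mean and spread.

    Simplified stand-in for a harmonization step (assumption), not the
    framework used in the study.
    """
    grand_mean = fmean(values)
    # Group the feature values by acquisition site.
    by_site = {}
    for v, s in zip(values, sites):
        by_site.setdefault(s, []).append(v)
    site_mean = {s: fmean(vs) for s, vs in by_site.items()}
    # Guard against zero spread (e.g. a site with a single subject).
    site_std = {s: pstdev(vs) or 1.0 for s, vs in by_site.items()}
    # Common target spread: standard deviation of the site-centered residuals.
    pooled_std = pstdev([v - site_mean[s] for v, s in zip(values, sites)])
    return [
        grand_mean + (v - site_mean[s]) / site_std[s] * pooled_std
        for v, s in zip(values, sites)
    ]


def auc(scores, labels):
    """AUC as the probability that a case scores above a control (ties count 0.5)."""
    cases = [x for x, y in zip(scores, labels) if y == 1]
    controls = [x for x, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Toy data (invented): site B adds a constant offset of 10 to the feature.
values = [1.0, 2.0, 3.0, 4.0, 11.0, 12.0, 13.0, 14.0]
sites = ["A"] * 4 + ["B"] * 4
labels = [0, 0, 1, 1, 0, 0, 1, 1]  # 0 = control, 1 = case

harmonized = harmonize_by_site(values, sites)
print(auc(values, labels))      # 0.75: the site offset blurs the group difference
print(auc(harmonized, labels))  # 1.0: groups separate once the offset is removed
```

In this toy setting the group difference is identical within each site, so removing the site offset restores perfect separation. With real multi-site MRI features the gain would be expected to be far more modest, as the AUC values reported above indicate.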