Early-life vaccination with whole-cell or acellular pertussis vaccines shapes long-term immune trajectories that influence responses to booster immunizations. In this study, we analyzed immune responses following tetanus, diphtheria, and acellular pertussis (Tdap) booster vaccination to investigate how infant vaccination influences recall responses later in life. We applied machine learning to immune data from the CMI-PB Challenge, including gene expression, antibody titers, cytokine levels, and cell frequencies from annual donor cohorts collected from 2020 to 2022.
Each measurement type was treated as a separate modality. We applied cohort-wise normalization and SHAP-based feature selection within each modality, followed by feature-level fusion to integrate the selected features across modalities. A range of classifiers, including random forests, support vector machines (SVM), k-nearest neighbors (KNN), logistic regression, multilayer perceptrons, and XGBoost, was applied to individual modalities, modality pairs, and the fused dataset to distinguish between whole-cell and acellular priming.
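As a minimal sketch of the preprocessing described above, the following Python code normalizes each feature within its cohort and then concatenates selected features across modalities. It rests on several assumptions not stated in the text: cohort-wise normalization is taken to be z-scoring, the function names `cohort_zscore` and `fuse` are hypothetical, the toy values are synthetic, and the selected column indices are supplied by hand as a stand-in for the SHAP-based selection step.

```python
from statistics import mean, pstdev


def cohort_zscore(rows, cohorts):
    """Z-score each feature column separately within each cohort (assumed scheme)."""
    out = [None] * len(rows)
    for cohort in set(cohorts):
        idx = [i for i, c in enumerate(cohorts) if c == cohort]
        n_features = len(rows[idx[0]])
        mus = [mean(rows[i][j] for i in idx) for j in range(n_features)]
        # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
        sds = [pstdev([rows[i][j] for i in idx]) or 1.0 for j in range(n_features)]
        for i in idx:
            out[i] = [(rows[i][j] - mus[j]) / sds[j] for j in range(n_features)]
    return out


def fuse(modalities, selected):
    """Feature-level fusion: concatenate the selected columns of every modality."""
    n_samples = len(next(iter(modalities.values())))
    fused = [[] for _ in range(n_samples)]
    for name, rows in modalities.items():
        for i in range(n_samples):
            fused[i].extend(rows[i][j] for j in selected[name])
    return fused


# Synthetic toy data: four donors from two cohorts, two modalities.
cohorts = [2020, 2020, 2021, 2021]
antibody = [[1.0, 10.0], [3.0, 10.0], [100.0, 5.0], [300.0, 7.0]]
cytokine = [[0.5], [1.5], [20.0], [40.0]]

normalized = {
    "antibody": cohort_zscore(antibody, cohorts),
    "cytokine": cohort_zscore(cytokine, cohorts),
}
# Hypothetical selection result: keep column 0 of each modality.
selected = {"antibody": [0], "cytokine": [0]}
fused = fuse(normalized, selected)
```

Normalizing within each cohort removes the large between-cohort scale differences in the toy data, so that the fused rows from 2020 and 2021 become directly comparable before any classifier is fitted.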
SHAP analysis identified IgG4 antibodies to filamentous hemagglutinin and pertussis toxin, along with cytokines such as CCL8, CCL2, IL-1 alpha, and CXCL9, as key predictors, suggesting that repeated boosting may shape both antibody profiles and cytokine-driven immune responses. In model evaluations, no individual modality consistently outperformed the others across cohorts. For example, when training on 2021 and 2022 and testing on 2020, gene expression achieved an AUROC of 0.859 while the multimodal model reached 0.958. When testing on 2022, antibody features yielded an AUROC of 0.792 and the multimodal model achieved 0.866, potentially reflecting immune signatures shaped by COVID-19 vaccination. These findings highlight the importance of multimodal fusion for cohort-generalizable immune response prediction.
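The cross-cohort evaluation described above (train on two cohorts, test on the held-out one, report AUROC) can be sketched as follows. This is an illustration under stated assumptions rather than the study's pipeline: the helper names `auroc`, `leave_one_cohort_out`, and `centroid_scores` are hypothetical, a simple nearest-centroid scorer stands in for the classifiers listed earlier, the label encoding (1 for whole-cell, 0 for acellular priming) is assumed, and the data are synthetic.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def leave_one_cohort_out(cohorts):
    """Yield (held_out_cohort, train_indices, test_indices) for every cohort."""
    for held_out in sorted(set(cohorts)):
        train = [i for i, c in enumerate(cohorts) if c != held_out]
        test = [i for i, c in enumerate(cohorts) if c == held_out]
        yield held_out, train, test


def centroid_scores(train_x, train_y, test_x):
    """Stand-in classifier: higher score means closer to the class-1 centroid."""
    def centroid(label):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        return [sum(col) / len(rows) for col in zip(*rows)]

    def sqdist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))

    c0, c1 = centroid(0), centroid(1)
    return [sqdist(x, c0) - sqdist(x, c1) for x in test_x]


# Synthetic toy data: one fused feature, two donors per cohort.
cohorts = [2020, 2020, 2021, 2021, 2022, 2022]
labels = [0, 1, 0, 1, 0, 1]  # assumed encoding: 1 = whole-cell, 0 = acellular
features = [[-1.0], [1.0], [-0.8], [1.2], [-1.1], [0.9]]

results = {}
for held_out, train, test in leave_one_cohort_out(cohorts):
    scores = centroid_scores(
        [features[i] for i in train],
        [labels[i] for i in train],
        [features[i] for i in test],
    )
    results[held_out] = auroc([labels[i] for i in test], scores)
```

Holding out an entire cohort, rather than a random subset of donors, is what makes the resulting AUROC a measure of generalization across collection years; any real analysis would replace the toy scorer with the fitted classifier of interest.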