MOMENTA — Mixture-of-Experts for Multimodal Misinformation Detection
- PyTorch
- multimodal
- mixture-of-experts
- misinformation
What it is
The full, reproducible implementation behind the MOMENTA paper. It detects multimodal misinformation by combining four ideas in one architecture:
- modality-specific Mixture-of-Experts to capture diverse misinformation patterns,
- bidirectional co-attention plus a discrepancy-aware branch that explicitly models when text and image disagree,
- temporal aggregation with drift and momentum encoding over overlapping windows, and
- domain-adversarial learning and a prototype memory bank for cross-dataset generalization.
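As a rough illustration of the first idea, a modality-specific Mixture-of-Experts can be sketched as a set of small feed-forward experts whose outputs are blended by a learned softmax gate. This is only a minimal sketch: the class name `SoftMoE`, the dense (non-sparse) gating, the expert count, and the layer sizes are all assumptions made for illustration and are not taken from the MOMENTA code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class SoftMoE(nn.Module):
    """Hypothetical dense mixture of experts over one modality's features."""

    def __init__(self, dim: int, num_experts: int = 4, hidden: int = 128):
        super().__init__()
        # Each expert is a small two-layer MLP that maps dim -> dim.
        self.experts = nn.ModuleList(
            [
                nn.Sequential(
                    nn.Linear(dim, hidden), nn.GELU(), nn.Linear(hidden, dim)
                )
                for _ in range(num_experts)
            ]
        )
        # The gate scores every expert for every input vector.
        self.gate = nn.Linear(dim, num_experts)

    def forward(self, x: torch.Tensor):
        # x: (batch, dim)
        weights = F.softmax(self.gate(x), dim=-1)  # (batch, num_experts)
        outputs = torch.stack(
            [expert(x) for expert in self.experts], dim=1
        )  # (batch, num_experts, dim)
        # Weighted sum of expert outputs, one mixture per input.
        mixed = (weights.unsqueeze(-1) * outputs).sum(dim=1)  # (batch, dim)
        return mixed, weights


if __name__ == "__main__":
    moe = SoftMoE(dim=16)
    mixed, weights = moe(torch.randn(8, 16))
    print(mixed.shape, weights.shape)
```

Under this sketch, one such module would be instantiated per modality (for example, one for text features and one for image features), so that each gate specializes on its own input distribution.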
Trained with a multi-objective loss (classification, alignment, contrastive, temporal consistency, domain robustness) on Fakeddit, MMCoVaR, Weibo, and XFacta.
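One common way to combine several objectives like these is a weighted sum of scalar loss terms. The sketch below assumes that form. The weight values and the helper name `combine_losses` are hypothetical, since this page does not say how the five terms are balanced.

```python
import torch

# Hypothetical weights, chosen only for illustration.
LOSS_WEIGHTS = {
    "classification": 1.0,
    "alignment": 0.5,
    "contrastive": 0.5,
    "temporal_consistency": 0.1,
    "domain_robustness": 0.1,
}


def combine_losses(terms, weights=LOSS_WEIGHTS):
    """Return the weighted sum of scalar loss tensors keyed by objective name."""
    missing = set(weights) - set(terms)
    if missing:
        raise KeyError(f"missing loss terms: {sorted(missing)}")
    # Stack the weighted scalars and sum them into one differentiable total.
    return torch.stack([weights[name] * terms[name] for name in weights]).sum()


if __name__ == "__main__":
    # Placeholder scalar losses standing in for the real per-objective values.
    terms = {name: torch.tensor(1.0, requires_grad=True) for name in LOSS_WEIGHTS}
    total = combine_losses(terms)
    total.backward()
    print(float(total))
```

Keeping the weights in one dictionary makes it easy to ablate a single objective by setting its weight to zero.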
Stack
PyTorch · transformer/vision backbones · LOSO evaluation · calibration & t-SNE analysis
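The LOSO protocol can be sketched as follows, assuming LOSO here means leave-one-source-out over the four datasets listed above (the page does not expand the acronym). The function name `loso_splits` is hypothetical.

```python
DATASETS = ["Fakeddit", "MMCoVaR", "Weibo", "XFacta"]


def loso_splits(sources):
    """Yield (train_sources, held_out_source) pairs, holding out each source once."""
    for held_out in sources:
        train = [s for s in sources if s != held_out]
        yield train, held_out


if __name__ == "__main__":
    for train, held_out in loso_splits(DATASETS):
        print(f"train on {train}, evaluate on {held_out}")
```

Each split trains on three sources and evaluates on the fourth, which is one way to measure cross-dataset generalization.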
Links
- GitHub repository
- Related paper: arXiv:2604.16172