Cross-Domain Application of Multimodal Models to Understand Real-World Performance
The goal of this project is to establish the use of accurate benchmarks for machine learning models on real-world data. Machine learning models are often trained and tested on simplified representations of data that cannot be replicated in the real world. Reported results thus inflate machine learning performance while obscuring expected performance "in the wild". To combat this trend, we deploy a multimodal fusion model that has been shown to generalize across a multitude of domains with reasonable accuracy. In each domain, the model is implemented to minimize human intervention and to maximize deployability on real-world data at scale. Additionally, we continue to investigate the cross-domain generalization capabilities of the model pipeline and to expand the scenarios in which it can be used.