Nomic Embed Vision: Expanding the Latent Space

doi:10.48550/arXiv.2406.18587

This technical report describes the training of nomic-embed-vision, a highly performant, open-code, open-weights image embedding model that shares the same latent space as nomic-embed-text. Together, nomic-embed-vision and nomic-embed-text form the first unified latent space to achieve high performance across vision, language, and multimodal tasks.

Publication: arXiv e-prints

Pub Date: June 2024

DOI: 10.48550/arXiv.2406.18587

arXiv: arXiv:2406.18587

Bibcode: 2024arXiv240618587N

Keywords: Computer Science - Computer Vision and Pattern Recognition; Computer Science - Artificial Intelligence
