VILA-13B
NVIDIA, Massachusetts Institute of Technology (MIT), Chat, Visual question answering, Image captioning, Language modeling/generation, Question answering, Open weights
Developed by NVIDIA and the Massachusetts Institute of Technology (MIT) and released in 2023, VILA-13B is a chat model with 13,350,839,296 parameters (about 13.35 billion) and openly available weights.
About VILA-13B
Visual language models (VLMs) rapidly progressed with the recent success of large language models. There have been growing efforts on visual instruction tuning to extend the LLM with visual inputs, but lacks an in-depth study of the visual language p
Details
- Provider: NVIDIA, Massachusetts Institute of Technology (MIT)
- Task: Chat, Visual question answering, Image captioning, Language modeling/generation, Question answering
- Parameters: 13,350,839,296
- Released: 2023-12-12
- Open weights: Yes