Vision-Language | Synthetic Data | Unknown

LLaVA-Video-178K

by lmms-lab

Silver 54
31.1K downloads
189 likes

Description

Dataset Card for LLaVA-Video-178K

Uses

This dataset is used to train the LLaVA-Video model. We allow use of this dataset only for academic research and educational purposes. For data generated with OpenAI GPT-4, we recommend that users check the OpenAI Usage Policy.

Data Sources

To train LLaVA-Video, we used video-language data from five primary sources:

LLaVA-Video-178K: This dataset includes 178,510 caption entries, 960,792 open-ended…

See the full description on the dataset page: https://huggingface.co/datasets/lmms-lab/LLaVA-Video-178K.

Tags

task_categories:visual-question-answering
task_categories:video-text-to-text
language:en
size_categories:1M<n<10M
modality:text
modality:video
arxiv:2410.02713
region:us
video
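
The tags on this listing mostly follow a `key:value` pattern, with an occasional bare tag such as `video`. As a minimal sketch of how they might be grouped for filtering or display, the snippet below splits each tag on its first colon. The function name `group_tags` and the `_bare` bucket are hypothetical choices made for illustration and are not part of any Hugging Face API.

```python
from collections import defaultdict

# Tag strings as they appear on this listing. "key:value" tags carry a
# namespace; bare tags such as "video" do not.
TAGS = [
    "task_categories:visual-question-answering",
    "task_categories:video-text-to-text",
    "language:en",
    "size_categories:1M<n<10M",
    "modality:text",
    "modality:video",
    "arxiv:2410.02713",
    "region:us",
    "video",
]


def group_tags(tags):
    """Group 'key:value' tags by key.

    Tags without a colon are collected under the hypothetical key '_bare'.
    """
    grouped = defaultdict(list)
    for tag in tags:
        # partition splits on the first colon only, so values keep any later colons
        key, sep, value = tag.partition(":")
        if sep:
            grouped[key].append(value)
        else:
            grouped["_bare"].append(tag)
    return dict(grouped)


grouped = group_tags(TAGS)
for key, values in grouped.items():
    print(f"{key}: {', '.join(values)}")
```

Grouping this way makes it easy to see, for example, that the dataset is tagged with two task categories and two modalities.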