Image Recognition
Unknown
FineVision
by HuggingFaceM4
149.6K downloads
478 likes
10M<n<100M
Description
Fine Vision
FineVision is a massive collection of datasets with 17.3M images, 24.3M samples, 88.9M turns, and 9.5B answer tokens, designed for training state-of-the-art open Vision-Language Models.
More detail can be found in the blog post: https://huggingface.co/spaces/HuggingFaceM4/FineVision
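To get a feel for the scale, the headline figures can be turned into rough per-item averages. The sketch below only divides the rounded numbers quoted in the description, so the results are approximate; the variable and function names are illustrative and not part of the dataset.

```python
# Headline figures quoted in the description (rounded values).
IMAGES = 17.3e6
SAMPLES = 24.3e6
TURNS = 88.9e6
ANSWER_TOKENS = 9.5e9


def ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator rounded to two decimals."""
    return round(numerator / denominator, 2)


# Rough averages derived from the rounded totals above.
averages = {
    "samples_per_image": ratio(SAMPLES, IMAGES),
    "turns_per_sample": ratio(TURNS, SAMPLES),
    "answer_tokens_per_turn": ratio(ANSWER_TOKENS, TURNS),
    "answer_tokens_per_sample": ratio(ANSWER_TOKENS, SAMPLES),
}

for name, value in averages.items():
    print(f"{name}: {value}")
```

By this estimate a sample has roughly 3.7 turns on average, and each turn carries on the order of 107 answer tokens.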
Load the data
from datasets import load_dataset, get_dataset_config_names
# Get all subset names and load the first one
available_subsets =…
See the full description on the dataset page: https://huggingface.co/datasets/HuggingFaceM4/FineVision
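The snippet above is cut off on this page, so here is a minimal sketch of one way to list the subsets and peek at the first one. It is not the dataset card's own code. It assumes the `datasets` library is installed, that each subset exposes a `train` split, and that streaming is acceptable; the helper name `peek_first_subset` and its `list_configs` / `load` parameters are hypothetical and exist only so the function can be tried with stand-ins.

```python
from itertools import islice
from typing import Any, Callable, Iterable, List, Optional, Tuple

REPO_ID = "HuggingFaceM4/FineVision"  # dataset id taken from the page URL


def peek_first_subset(
    repo_id: str = REPO_ID,
    n: int = 2,
    list_configs: Optional[Callable[[str], List[str]]] = None,
    load: Optional[Callable[..., Iterable[Any]]] = None,
) -> Tuple[str, List[Any]]:
    """Return the first subset name and its first n samples (hypothetical helper)."""
    if list_configs is None or load is None:
        # Imported lazily so the helper can also be exercised with stand-ins.
        from datasets import get_dataset_config_names, load_dataset

        list_configs = list_configs or get_dataset_config_names
        load = load or load_dataset

    subsets = list_configs(repo_id)
    if not subsets:
        raise ValueError(f"no subsets found for {repo_id}")
    first = subsets[0]

    # Assumption: the subset has a "train" split. streaming=True iterates
    # over the data lazily instead of downloading the whole subset first.
    stream = load(repo_id, name=first, split="train", streaming=True)
    return first, list(islice(stream, n))


# Example call (needs network access and the `datasets` package):
# name, samples = peek_first_subset()
# print(name, [list(sample.keys()) for sample in samples])
```

Streaming is used here because the collection is large; dropping `streaming=True` would download the selected subset in full before returning.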
Tags
size_categories:10M<n<100M, format:parquet, modality:image, modality:text, library:datasets, library:dask, library:mlcroissant, library:polars, arxiv:2510.17269, region:us