Vision-Language, Pretraining, Non-Commercial

cc12m-wds

by pixparse

Silver50

30.7K downloads

38 likes

10M<n<100M

Description

Dataset Card for Conceptual Captions 12M (CC12M)

Dataset Summary

Conceptual 12M (CC12M) is a dataset of 12 million image-text pairs intended for vision-and-language pre-training. Its data collection pipeline is a relaxed version of the one used for Conceptual Captions 3M (CC3M).

Usage

This instance of Conceptual Captions is in webdataset .tar format. It can be used with the webdataset library or upcoming releases of Hugging Face datasets.…

See the full description on the dataset page: https://huggingface.co/datasets/pixparse/cc12m-wds.
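To make the webdataset .tar format mentioned under Usage more concrete, here is a minimal sketch of how such a shard is laid out and read. It uses only the Python standard library and is not the webdataset library's own API. It assumes the usual webdataset convention that files sharing a base name form one sample, with the extension naming the field. The field names `jpg` and `txt`, the helper names `iter_wds_samples` and `make_demo_shard`, and the demo payloads are all illustrative assumptions, not details confirmed by this dataset card.

```python
import io
import tarfile
from typing import Any, Dict, Iterator


def iter_wds_samples(fileobj) -> Iterator[Dict[str, Any]]:
    """Yield one dict per sample from a webdataset-style tar stream.

    Consecutive tar members that share a key (the path up to the first dot
    of the file name) are grouped into a single sample.
    """
    current_key = None
    sample: Dict[str, Any] = {}
    # "r|*" reads the archive as a forward-only stream, which is how
    # sharded tar files are typically consumed.
    with tarfile.open(fileobj=fileobj, mode="r|*") as tar:
        for member in tar:
            if not member.isfile():
                continue
            dirname, _, base = member.name.rpartition("/")
            stem, _, ext = base.partition(".")
            key = f"{dirname}/{stem}" if dirname else stem
            # A new key means the previous sample is complete.
            if key != current_key and sample:
                yield sample
                sample = {}
            current_key = key
            sample["__key__"] = key
            sample[ext] = tar.extractfile(member).read()
    if sample:
        yield sample


def make_demo_shard() -> io.BytesIO:
    """Build a tiny in-memory shard with two image-text pairs (fake payloads)."""
    files = [
        ("000000.jpg", b"<jpeg bytes>"),   # assumed image field
        ("000000.txt", b"a caption"),      # assumed caption field
        ("000001.jpg", b"<jpeg bytes 2>"),
        ("000001.txt", b"another caption"),
    ]
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, payload in files:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    buf.seek(0)
    return buf
```

Calling `list(iter_wds_samples(make_demo_shard()))` gives one dictionary per image-text pair, keyed by field extension. For real shards, the webdataset library or Hugging Face datasets would normally handle this grouping, along with image decoding and shuffling.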

Tags