Science & Research | Commercial OK

NewsWire

by dell-research-harvard

Bronze 48
5.5K downloads
90 likes
1M<n<10M

Description

Dataset Card for NewsWire

Dataset Summary

NewsWire contains 2.7 million unique public domain U.S. news wire articles, written between 1878 and 1977. Locations in these articles are georeferenced, topics are tagged using customized neural topic classification, named entities are recognized, and individuals are disambiguated to Wikipedia using a novel entity disambiguation model.

Languages

English (en)

Dataset Structure

Each year in the dataset is…

See the full description on the dataset page: https://huggingface.co/datasets/dell-research-harvard/newswire.
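Since the tags list the `datasets` library, one way to look at a few records without downloading the whole corpus might be to stream them. This is a minimal sketch, not the authors' documented loading procedure: the repository id is taken from the dataset page URL above, the helper names `dataset_page_url` and `peek_records` are hypothetical, and because the split and file layout are not spelled out in this summary, the code simply takes whichever split is listed first.

```python
from itertools import islice

# Repository id taken from the dataset page URL in the description.
REPO_ID = "dell-research-harvard/newswire"


def dataset_page_url(repo_id: str) -> str:
    """Return the Hugging Face dataset page URL for a repository id."""
    return f"https://huggingface.co/datasets/{repo_id}"


def peek_records(repo_id: str = REPO_ID, n: int = 1) -> list:
    """Stream the first n records of the first available split (hypothetical helper)."""
    # Imported lazily so this module loads even without the library installed.
    from datasets import load_dataset

    # streaming=True reads records on demand instead of downloading every file.
    splits = load_dataset(repo_id, streaming=True)
    # Assumption: split names are not stated in this summary, so take the first one.
    first_split = next(iter(splits.values()))
    return list(islice(first_split, n))
```

Calling `peek_records()` needs network access and the `datasets` package. Printing `sorted(record.keys())` for a returned record shows the actual field names, which avoids assuming a schema that this summary does not give.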

Tags

task_categories:text-classification
task_categories:text-generation
task_categories:text-retrieval
task_categories:summarization
task_categories:question-answering
language:en
license:cc-by-4.0
size_categories:1M<n<10M
format:json
modality:tabular
modality:text
library:datasets
library:dask
library:mlcroissant
arxiv:2406.09490
doi:10.57967/hf/2423
region:us
social science
economics
news