Text - General
Unknown

tldr

by trl-lib

Bronze 43
2.5K downloads
30 likes

Description

TL;DR Dataset Summary

The TL;DR dataset is a processed version of Reddit posts, curated for training summarization models with the TRL library. It relies on the common Reddit practice of appending a "TL;DR" (Too Long; Didn't Read) summary to a lengthy post, which yields paired text data for training summarization models.

Data Structure

Format: Standard
Type: Prompt-completion
Columns: "prompt": The unabridged Reddit…

See the full description on the dataset page: https://huggingface.co/datasets/trl-lib/tldr.
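To make the prompt-completion pairing concrete, here is a minimal Python sketch of how a post ending in a TL;DR summary could be split into one record. It is an illustration only, not the preprocessing actually used to build this dataset: the function name `to_prompt_completion`, the marker pattern, and the sample post are all hypothetical, and the "completion" column name is assumed from the stated Prompt-completion type.

```python
import re
from typing import Optional

# Hypothetical marker pattern: matches "TL;DR", "tl;dr:", "TLDR" and similar.
# The real dataset may have been built with different rules.
_TLDR_MARKER = re.compile(r"tl;?dr[:\s]*", re.IGNORECASE)


def to_prompt_completion(post: str) -> Optional[dict]:
    """Split a post at its last TL;DR marker into a prompt-completion record.

    Returns None when no marker is found or when either side is empty.
    """
    matches = list(_TLDR_MARKER.finditer(post))
    if not matches:
        return None
    last = matches[-1]
    prompt = post[: last.start()].strip()      # the full post body
    completion = post[last.end():].strip()     # the author's own summary
    if not prompt or not completion:
        return None
    return {"prompt": prompt, "completion": completion}


# Made-up sample post, for illustration only.
example = to_prompt_completion(
    "I spent three weeks debugging a flaky test. It was a timezone issue. "
    "TL;DR: flaky test was caused by timezones."
)
print(example)
```

The published data itself would normally be loaded from the dataset page listed above, for example through the `datasets` library named in the tags, rather than rebuilt this way.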

Tags

size_categories:100K<n<1M
format:parquet
modality:text
library:datasets
library:pandas
library:mlcroissant
library:polars
region:us
trl