
LongBench

by zai-org

91.8K downloads
175 likes
1K<n<10K

Description

LongBench is a multilingual (English and Chinese), multi-task benchmark designed to evaluate how well pre-trained language models understand long text. The dataset consists of twenty different tasks covering key long-text application scenarios: multi-document QA, single-document QA, summarization, few-shot learning, synthetic tasks, and code completion.


Tags

task_categories: question-answering, text-generation, summarization, text-classification
language: en, zh
size_categories: 1K<n<10K
arxiv: 2308.14508, 2108.00573, 1712.07040, 2105.03011, 2104.02112, 2104.05938, 2305.05280, 2303.09752, 1910.10683, 2306.14893, 2306.03091
region: us
Long Context