OPUS EUconst
by Helsinki-NLP
6.9K downloads
8 likes
10K<n<100K
Description
Dataset Card for OPUS EUconst
Dataset Summary
A parallel corpus collected from the European Constitution.
EUconst's Numbers:
Languages: 21
Bitexts: 210
Number of files: 986
Number of tokens: 3.01M
Sentence fragments: 0.22M
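The bitext count above matches what you would get by pairing each of the 21 languages with every other one. The short sketch below checks that arithmetic. It assumes one bitext per unordered language pair, which the card does not state explicitly, and the `xx-yy` pair labels are hypothetical, used here for illustration only.

```python
from itertools import combinations
from math import comb

NUM_LANGUAGES = 21  # "Languages: 21" from the card

# Assumption: one bitext per unordered pair of distinct languages.
num_bitexts = comb(NUM_LANGUAGES, 2)
print(num_bitexts)  # 210, matching "Bitexts: 210"

# Illustration with the first four language codes listed in the card.
# The "xx-yy" label format is a hypothetical naming choice.
sample_codes = ["cs", "da", "de", "el"]
sample_pairs = [f"{a}-{b}" for a, b in combinations(sample_codes, 2)]
print(sample_pairs)
# ['cs-da', 'cs-de', 'cs-el', 'da-de', 'da-el', 'de-el']
```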
Supported Tasks and Leaderboards
The underlying task is machine translation.
Languages
The languages in the dataset are:
Czech (cs)
Danish (da)
German (de)
Greek (el)
English (en)
Spanish (es)
Estonian (et)
Finnish (fi)
French…
See the full description on the dataset page: https://huggingface.co/datasets/Helsinki-NLP/euconst.
Tags
task_categories:translation
annotations_creators:found
language_creators:found
multilinguality:multilingual
source_datasets:original
language:cs
language:da
language:de
language:el
language:en
language:es
language:et
language:fi
language:fr
language:ga
language:hu
language:it
language:lt
language:lv
language:mt