Code, Human Annotated, Other

sts22-crosslingual-sts

by mteb

Silver 53
453.5K downloads
10 likes

Description

STS22.v2 is an MTEB (Massive Text Embedding Benchmark) dataset based on SemEval 2022 Task 8: Multilingual News Article Similarity. Version 2 updates STS22 by filtering out pairs in which one of the entries is an empty sentence.

Task category: t2t
Domains: News, Written
Reference: https://competitions.codalab.org/competitions/33835

How to evaluate on this task

You can evaluate an embedding model on this dataset using the following code: import mteb task =… See the full description on the dataset page: https://huggingface.co/datasets/mteb/sts22-crosslingual-sts.
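The evaluation snippet above is truncated in this listing, so it is not reproduced here. As a rough illustration of what scoring a semantic-similarity task usually involves, the sketch below compares the cosine similarity of paired embeddings against human similarity labels using Spearman rank correlation. This is an assumption about a common STS metric, not a copy of the MTEB implementation; the function names (`cosine_similarity`, `average_ranks`, `spearman`, `sts_score`) and the toy vectors and labels are hypothetical and made up for illustration.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors score 0
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; NaN when either input has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0.0 or sd_y == 0.0:
        return float("nan")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def sts_score(embeddings_a, embeddings_b, gold_scores):
    """Correlate predicted pair similarities with human labels."""
    predicted = [cosine_similarity(u, v)
                 for u, v in zip(embeddings_a, embeddings_b)]
    return spearman(predicted, gold_scores)


if __name__ == "__main__":
    # Toy, made-up embeddings and labels; not taken from the dataset.
    emb_a = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
    emb_b = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
    gold = [4.0, 2.5, 1.0]
    print(sts_score(emb_a, emb_b, gold))
```

In practice the embeddings would come from the model under evaluation and the gold scores from the dataset's annotations. Rank correlation is used here because it only asks that the model order pairs the way annotators did, not that it reproduce the label scale.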


Tags

task_categories:sentence-similarity
task_ids:semantic-similarity-scoring
annotations_creators:human-annotated
multilinguality:multilingual
source_datasets:mteb/sts22-crosslingual-sts
language:ara
language:cmn
language:deu
language:eng
language:fra
language:ita
language:pol
language:rus
language:spa
language:tur
license:unknown
size_categories:10K<n<100K
format:json
modality:text
library:datasets