sts22-crosslingual-sts
by mteb
453.5K downloads
10 likes
Description
STS22.v2
An MTEB dataset
Massive Text Embedding Benchmark
SemEval 2022 Task 8: Multilingual News Article Similarity. Version 2 updates STS22 by removing pairs in which one of the entries contains an empty sentence.
Task category
t2t
Domains
News, Written
Reference
https://competitions.codalab.org/competitions/33835
How to evaluate on this task
You can evaluate an embedding model on this dataset using the following code:
import mteb
task =…
See the full description on the dataset page: https://huggingface.co/datasets/mteb/sts22-crosslingual-sts.
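The snippet above is truncated on this page, so it is left as-is. To illustrate what a semantic-similarity-scoring evaluation usually computes, here is a minimal sketch using only the Python standard library. It assumes the common STS recipe: take the cosine similarity of the two embeddings in each pair, then report the Spearman rank correlation between those similarities and the human scores. The toy vectors, the gold scores, and the helper names (cosine, average_ranks, pearson, spearman) are hypothetical and are not part of mteb or of this dataset.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy embeddings for three text pairs (not real model output).
pairs = [
    ([1.0, 0.0], [1.0, 0.1]),  # nearly the same direction
    ([1.0, 0.0], [0.5, 0.5]),  # partially related
    ([1.0, 0.0], [0.0, 1.0]),  # orthogonal
]
# Hypothetical human scores, assuming higher means more similar.
gold = [4.0, 2.5, 1.0]

predicted = [cosine(u, v) for u, v in pairs]
score = spearman(predicted, gold)
print(f"Spearman correlation: {score:.4f}")
```

Spearman correlation compares only the ordering of the pairs, so a model is not penalized when its cosine values sit on a different scale from the human scores.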
Tags
task_categories:sentence-similarity, task_ids:semantic-similarity-scoring, annotations_creators:human-annotated, multilinguality:multilingual, source_datasets:mteb/sts22-crosslingual-sts, language:ara, language:cmn, language:deu, language:eng, language:fra, language:ita, language:pol, language:rus, language:spa, language:tur, license:unknown, size_categories:10K<n<100K, format:json, modality:text, library:datasets