Instruction Following, Synthetic Data, Commercial OK
HuggingFaceFW/finephrase
by HuggingFaceFW
546.2K downloads
91 likes
n>1M
Description
Dataset Card for HuggingFaceFW/finephrase
Dataset Summary
Synthetic data generated by DataTrove:
Model: HuggingFaceTB/SmolLM2-1.7B-Instruct (main)
Source dataset: HuggingFaceFW/fineweb-edu, config sample-350BT, split train
Generation config: temperature=1.0, top_p=1.0, top_k=50, max_tokens=2048, model_max_context=8192
Speculative decoding: {"method":"suffix","num_speculative_tokens":32}
System prompt: None
Input column: text
Prompt families:
faq prompt
Rewrite the…
See the full description on the dataset page: https://huggingface.co/datasets/HuggingFaceFW/finephrase.
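To make the generation config above concrete, here is a minimal sketch of how temperature, top_k, and top_p usually shape next-token sampling. It is an illustration in plain Python, not the code used to build this dataset: the function names filter_probs and sample are hypothetical, and the order of operations (temperature, then top-k, then top-p) is an assumption based on common practice. With the listed values (temperature=1.0, top_p=1.0, top_k=50), only the top-k step changes the distribution.

```python
import math
import random


def filter_probs(logits, temperature=1.0, top_k=50, top_p=1.0):
    """Return {token_index: probability} after temperature, top-k and top-p.

    Hypothetical helper for illustration; defaults mirror the card's config.
    """
    # Temperature scaling: 1.0 leaves the logits unchanged.
    scaled = [x / temperature for x in logits]

    # Top-k: keep only the k highest-scoring token indices.
    order = sorted(range(len(scaled)), key=lambda i: scaled[i], reverse=True)
    kept = order[:top_k]

    # Softmax over the kept tokens (shifted by the max for numerical stability).
    peak = max(scaled[i] for i in kept)
    weights = [math.exp(scaled[i] - peak) for i in kept]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-p: keep the smallest prefix whose cumulative mass reaches top_p.
    # With top_p=1.0 this normally keeps every token that survived top-k.
    cumulative = 0.0
    cutoff = len(kept)
    for n, p in enumerate(probs, start=1):
        cumulative += p
        if cumulative >= top_p:
            cutoff = n
            break
    kept, probs = kept[:cutoff], probs[:cutoff]

    # Renormalize so the surviving probabilities sum to 1.
    norm = sum(probs)
    return {i: p / norm for i, p in zip(kept, probs)}


def sample(logits, rng, **config):
    """Draw one token index from the filtered distribution."""
    dist = filter_probs(logits, **config)
    tokens = list(dist)
    return rng.choices(tokens, weights=[dist[t] for t in tokens], k=1)[0]


if __name__ == "__main__":
    # Toy vocabulary of 100 tokens with increasing logits.
    toy_logits = [i * 0.1 for i in range(100)]
    dist = filter_probs(toy_logits)
    print(len(dist), "candidate tokens remain after filtering")
    print("sampled token index:", sample(toy_logits, random.Random(0)))
```

On the toy vocabulary, the default settings leave exactly the 50 highest-scoring tokens as candidates, which is the practical effect of the config listed on this card.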
Tags
task_categories:text-generation
task_ids:language-modeling
annotations_creators:machine-generated
language_creators:found
source_datasets:HuggingFaceFW/fineweb-edu/sample-350BT
language:en
license:odc-by
size_categories:1B<n<10B
modality:tabular
modality:text
region:us
SmolLM2-1.7B-Instruct
fineweb-edu
synthetic
datatrove