tiny-shakespeare

by Trelis

Description

Data source

Downloaded via Andrej Karpathy's nanoGPT repo.

Data format

The entire dataset is split into train (90%) and test (10%). Every row is at most 1024 tokens long, as measured with the Llama 2 tokenizer. Rows are split at sentence boundaries, so no sentence is cut in the middle.
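The format described above can be illustrated with a minimal sketch. This is not the script used to build the dataset; it only shows one way to pack whole sentences into rows under a token budget and then cut the rows 90/10. The function names (`pack_sentences`, `split_train_test`, `count_tokens_whitespace`) are hypothetical, the sentence splitter is a simple regex, the split is assumed to be sequential rather than shuffled, and a whitespace word count stands in for the Llama 2 tokenizer so the example runs without downloading a model.

```python
import re
from typing import Callable, List, Tuple


def count_tokens_whitespace(text: str) -> int:
    """Stand-in token counter (assumption): one token per whitespace-separated word.

    Swap in a real tokenizer's length function to measure Llama 2 tokens.
    """
    return len(text.split())


def pack_sentences(
    text: str,
    max_tokens: int = 1024,
    count_tokens: Callable[[str], int] = count_tokens_whitespace,
) -> List[str]:
    """Greedily pack whole sentences into rows of at most max_tokens tokens.

    A single sentence longer than max_tokens is kept whole in its own row,
    so that row would exceed the budget; handle that case separately if needed.
    """
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    rows: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and count_tokens(candidate) > max_tokens:
            # Adding this sentence would overflow the row: close it and start a new one.
            rows.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        rows.append(current)
    return rows


def split_train_test(
    rows: List[str], train_fraction: float = 0.9
) -> Tuple[List[str], List[str]]:
    """Sequential split (assumption): the first 90% of rows go to train, the rest to test."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


if __name__ == "__main__":
    sample = "One two three. Four five six! Seven eight? Nine."
    rows = pack_sentences(sample, max_tokens=6)
    train, test = split_train_test(rows)
    print(rows)
    print(len(train), len(test))
```

To check row lengths against the actual limit, pass a `count_tokens` function backed by the Llama 2 tokenizer in place of the whitespace counter.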
