Code, Commercial OK

github-code-clean

by codeparrot

Silver 53

23.7K downloads

136 likes

Description

The GitHub Code clean dataset is a more heavily filtered version of the codeparrot/github-code dataset. It consists of 115M code files from GitHub in 32 programming languages with 60 extensions, totaling almost 1 TB of text data.
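At almost 1 TB, the dataset is usually more practical to stream than to download in full. The following is a minimal sketch of that idea, not an official recipe. It assumes the Hub ID is `codeparrot/github-code-clean` with a `train` split, and that each record is a dict with `language` and `code` fields; those names, and the helper functions `stream_github_code_clean`, `filter_by_language`, and `take`, are illustrative assumptions. Depending on the installed `datasets` version, loading may require additional arguments.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List


def stream_github_code_clean():
    """Open the dataset lazily instead of downloading ~1 TB up front."""
    # Imported here so the helpers below also work without `datasets` installed.
    from datasets import load_dataset

    # Assumption: Hub ID and split name; not confirmed by the text above.
    return load_dataset(
        "codeparrot/github-code-clean", split="train", streaming=True
    )


def filter_by_language(records: Iterable[Dict], language: str) -> Iterator[Dict]:
    """Yield only records whose (assumed) "language" field matches."""
    for record in records:
        if record.get("language") == language:
            yield record


def take(records: Iterable[Dict], n: int) -> List[Dict]:
    """Materialize at most n records from a possibly endless stream."""
    return list(islice(records, n))
```

A possible usage, under the same assumptions, is `take(filter_by_language(stream_github_code_clean(), "Python"), 10)`, which reads only as much of the stream as is needed to collect ten matching files.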
