Find the perfect fine-tuned model for your project
The curated directory of fine-tuned AI models and training datasets. No ML expertise required — browse by use case, modality, or training method.
3,452 fine-tuned models
10,000 training datasets
100 uncensored models
47 base model families
Top models
Kokoro-82M (hexgrad)
WanVideo_comfy (Kijai)
Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled (Jackrong)
Qwen3.5-35B-A3B-GGUF (unsloth)
Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive (HauhauCS)
all-MiniLM-L12-v2 (sentence-transformers)
Top datasets
Grade School Math 8K (openai)
GSM8K (Grade School Math 8K) is a dataset of 8.5K high-quality, linguistically diverse grade school math word problems. The dataset was created to support the task of question answering on basic mathematical problems that require multi-step reasoning. These problems take between 2 and 8 steps to solve, and solutions primarily involve performing a sequence of elementary calculations using basic arithmetic operations (+ − × ÷) to reach the… See the full description on the dataset page: https://huggingface.co/datasets/openai/gsm8k.
PhysicalAI-Autonomous-Vehicles (nvidia)
The PhysicalAI-Autonomous-Vehicles dataset provides one of the largest, geographically diverse collections of multi-sensor data, empowering AV researchers to build the next generation of Physical-AI-based end-to-end driving systems. The dataset is ready for commercial and non-commercial AV use per the license agreement. Data collection method: automatic/sensor. Labeling method: automatic/sensor. This dataset has a total of 1,700 hours of driving… See the full description on the dataset page: https://huggingface.co/datasets/nvidia/PhysicalAI-Autonomous-Vehicles.
FineWeb (HuggingFaceFW)
🍷 FineWeb: 15 trillion tokens of the finest data the 🌐 web has to offer. The 🍷 FineWeb dataset consists of more than 18.5T tokens (originally 15T tokens) of cleaned and deduplicated English web data from CommonCrawl. The data processing pipeline is optimized for LLM performance and runs on the 🏭 datatrove library, our large-scale data processing library. 🍷 FineWeb was originally meant to be a fully open replication of 🦅 RefinedWeb, with a release… See the full description on the dataset page: https://huggingface.co/datasets/HuggingFaceFW/fineweb.
WikiText (Salesforce)
The WikiText language modeling dataset is a collection of over 100 million tokens extracted from the set of verified Good and Featured articles on Wikipedia. The dataset is available under the Creative Commons Attribution-ShareAlike License. Compared to the preprocessed version of Penn Treebank (PTB), WikiText-2 is over 2 times larger and WikiText-103 is over 110 times larger. The WikiText dataset also features a far larger… See the full description on the dataset page: https://huggingface.co/datasets/Salesforce/wikitext.
prompts.chat (fka)
a.k.a. Awesome ChatGPT Prompts. This Hugging Face dataset is a mirror of prompts.chat, a social platform for AI prompts. For the latest prompts, features, and community contributions, visit the 🌐 website at prompts.chat or the 📦 GitHub repository at github.com/f/awesome-chatgpt-prompts. prompts.chat is an open-source platform where users can share, discover, and collect AI prompts from the community. The project can be… See the full description on the dataset page: https://huggingface.co/datasets/fka/prompts.chat.
Xperience-10M (ropedia-ai)
⚠️ Important: If you have already submitted an access request but have not completed the required DocuSign agreement, your request will remain pending. Please complete signing, and access will be granted once verified. Xperience-10M ("Interactive Intelligence from Human Xperience") is a large-scale egocentric multimodal dataset of human experience for embodied AI, robotics, world models, and spatial… See the full description on the dataset page: https://huggingface.co/datasets/ropedia-ai/xperience-10m.
What is fine-tuning?
Fine-tuning takes a pre-trained AI model and trains it further on specialized data — teaching a generalist to become an expert in your field. Instead of training from scratch (millions of dollars), you start from an existing model and adapt it with your own dataset.
This catalog helps you find both fine-tuned models others have created and the training datasets you need for your own fine-tuning projects. We filter out pure quantizations and format conversions, listing only genuine fine-tunes that involved real training.
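The core idea, start from learned weights rather than from scratch, can be shown with a toy sketch. The linear model, data, and learning rate below are made-up stand-ins for a real pre-trained network and domain dataset; in practice you would use a framework such as PyTorch or the Hugging Face libraries, but the principle is the same:

```python
# Toy illustration of fine-tuning: "pre-train" a tiny model on general data,
# then adapt the resulting weights with a few gradient steps on specialized
# data. All numbers here are hypothetical.

def sgd_fit(w, b, data, lr=0.05, steps=200):
    """Fit y ~ w*x + b by gradient descent on mean squared error."""
    n = len(data)
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in data:
            err = (w * x + b) - y
            gw += 2 * err * x / n
            gb += 2 * err / n
        w -= lr * gw
        b -= lr * gb
    return w, b

def mse(w, b, data):
    return sum(((w * x + b) - y) ** 2 for x, y in data) / len(data)

# "Pre-training": plentiful general-purpose data following y = 2x + 1.
general = [(x / 10, 2 * (x / 10) + 1) for x in range(-10, 11)]
w, b = sgd_fit(0.0, 0.0, general)

# "Fine-tuning": a small specialized dataset with a shifted relationship
# (y = 2x + 3). We start from the pre-trained weights, not from zero.
domain = [(0.0, 3.0), (0.5, 4.0), (1.0, 5.0), (1.5, 6.0)]
before = mse(w, b, domain)
w, b = sgd_fit(w, b, domain, steps=100)
after = mse(w, b, domain)

print(f"domain error before fine-tuning: {before:.3f}")
print(f"domain error after fine-tuning:  {after:.3f}")
```

Because the pre-trained weights already capture the shared structure (the slope), the fine-tuning pass only has to learn the small domain-specific shift, which is why it needs far less data and compute than training from scratch.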
Built on generosity
Every model and dataset in this catalog exists because someone chose to share their work with the world. Behind each entry is real human expertise — researchers, engineers, and hobbyists who invested their knowledge, their time, and often significant compute resources to create something valuable, then gave it away freely.
Fine-tuning a model can take days of GPU time. Curating a training dataset can take months of careful annotation. These contributions represent a quiet, extraordinary act of generosity — people sharing the fruits of their labor so that others can build on them, learn from them, and push the boundaries of what's possible.
To every open-source contributor in this catalog: thank you.
