H2O LLM Studio
Create your own large language models, build enterprise-grade GenAI solutions
H2O LLM Studio was created by our top Kaggle Grandmasters and provides organizations with a no-code fine-tuning framework for building their own custom, state-of-the-art LLMs for enterprise applications.
Data stages for H2O-Danube3-4B. The model is trained over three stages with different data mixes. The first stage consists of 90.6% web data, which gradually decreases to 81.7% in the second stage and 51.6% in the third. The first two stages include the majority of the tokens, 4.6T and 1.35T respectively, while the third stage comprises 0.05T tokens.
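The caption above can be turned into absolute token counts per stage. A minimal sketch, using only the totals and web-data fractions stated in the caption; treating the remainder as "other" (curated/non-web) data is an assumption:

```python
# Token budget per training stage of H2O-Danube3-4B, taken from the
# caption above: (stage name, total tokens in trillions, web-data fraction).
stages = [
    ("stage 1", 4.60, 0.906),
    ("stage 2", 1.35, 0.817),
    ("stage 3", 0.05, 0.516),
]

for name, total_t, web_frac in stages:
    web = total_t * web_frac          # web-data tokens for this stage
    other = total_t - web             # everything else (assumed non-web data)
    print(f"{name}: {web:.2f}T web tokens, {other:.2f}T other tokens")
```

Summing the stage totals gives a 6.0T-token overall budget, with the web share dropping as training shifts toward higher-quality data.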
Train SLM foundation models
Deepspeed distributed training on GPU clusters
SLMs are cheaper to train and operate than LLMs (fewer GPUs)
SLMs are faster than LLMs (more tokens/sec, lower latency)
SLMs are more customizable than LLMs (faster to fine-tune)
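DeepSpeed training of the kind mentioned above is driven by a JSON configuration. A minimal sketch of a ZeRO stage-2 config for multi-GPU fine-tuning; the specific values are illustrative assumptions, not LLM Studio defaults:

```python
import json

# Illustrative DeepSpeed config for multi-GPU fine-tuning (not LLM Studio
# defaults): ZeRO-2 shards optimizer state and gradients across GPUs.
ds_config = {
    "train_micro_batch_size_per_gpu": 2,
    "gradient_accumulation_steps": 8,
    "bf16": {"enabled": True},          # bfloat16 mixed precision
    "zero_optimization": {
        "stage": 2,                     # shard optimizer state + gradients
        "overlap_comm": True,           # overlap communication with backward pass
    },
}

# DeepSpeed consumes this as a JSON file passed at launch time.
print(json.dumps(ds_config, indent=2))
```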
Fine-tune state-of-the-art large language models using LLM Studio, a no-code GUI framework
Fine-tune SLMs for NLP use cases
Distill LLMs into SLMs (with H2O LLM Data Studio)
Instruction/chat fine-tuned custom GPTs for mobile and offline applications
Causal classification and regression fine-tuned SLMs for conversational use cases
DPO/IPO/KTO optimization and alignment
Fine-tuned SLMs can be more accurate than LLMs for specific use cases
Lower TCO with fine-tuned SLMs compared to using large general-purpose LLMs
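The DPO alignment objective listed above can be sketched in a few lines. A minimal example of the standard DPO loss for a single preference pair, assuming summed log-probabilities of the chosen and rejected responses under the tuned policy and a frozen reference model; `beta=0.1` is an illustrative choice:

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Direct Preference Optimization loss for one (chosen, rejected) pair.

    Each argument is the summed log-probability of a response under either
    the policy being tuned or the frozen reference model.
    """
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    margin = chosen_reward - rejected_reward
    # -log(sigmoid(margin)): shrinks toward 0 as the policy prefers "chosen"
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# When policy and reference agree exactly, the margin is 0 and the
# loss is log(2) ≈ 0.6931.
print(round(dpo_loss(-10.0, -12.0, -10.0, -12.0), 4))
```

IPO and KTO replace this log-sigmoid objective with their own losses over the same policy/reference log-probability margins.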