Instructions to use AI-MO/Kimina-Prover-Preview-Distill-1.5B with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use AI-MO/Kimina-Prover-Preview-Distill-1.5B with Transformers:
# Use a pipeline as a high-level helper from transformers import pipeline pipe = pipeline("text-generation", model="AI-MO/Kimina-Prover-Preview-Distill-1.5B") messages = [ {"role": "user", "content": "Who are you?"}, ] pipe(messages)# Load model directly from transformers import AutoTokenizer, AutoModelForCausalLM tokenizer = AutoTokenizer.from_pretrained("AI-MO/Kimina-Prover-Preview-Distill-1.5B") model = AutoModelForCausalLM.from_pretrained("AI-MO/Kimina-Prover-Preview-Distill-1.5B") messages = [ {"role": "user", "content": "Who are you?"}, ] inputs = tokenizer.apply_chat_template( messages, add_generation_prompt=True, tokenize=True, return_dict=True, return_tensors="pt", ).to(model.device) outputs = model.generate(**inputs, max_new_tokens=40) print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:])) - Inference
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use AI-MO/Kimina-Prover-Preview-Distill-1.5B with vLLM:
Install from pip and serve model
# Install vLLM from pip: pip install vllm # Start the vLLM server: vllm serve "AI-MO/Kimina-Prover-Preview-Distill-1.5B" # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:8000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "AI-MO/Kimina-Prover-Preview-Distill-1.5B", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }'Use Docker
docker model run hf.co/AI-MO/Kimina-Prover-Preview-Distill-1.5B
- SGLang
How to use AI-MO/Kimina-Prover-Preview-Distill-1.5B with SGLang:
Install from pip and serve model
# Install SGLang from pip: pip install sglang # Start the SGLang server: python3 -m sglang.launch_server \ --model-path "AI-MO/Kimina-Prover-Preview-Distill-1.5B" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "AI-MO/Kimina-Prover-Preview-Distill-1.5B", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }'Use Docker images
docker run --gpus all \ --shm-size 32g \ -p 30000:30000 \ -v ~/.cache/huggingface:/root/.cache/huggingface \ --env "HF_TOKEN=<secret>" \ --ipc=host \ lmsysorg/sglang:latest \ python3 -m sglang.launch_server \ --model-path "AI-MO/Kimina-Prover-Preview-Distill-1.5B" \ --host 0.0.0.0 \ --port 30000 # Call the server using curl (OpenAI-compatible API): curl -X POST "http://localhost:30000/v1/chat/completions" \ -H "Content-Type: application/json" \ --data '{ "model": "AI-MO/Kimina-Prover-Preview-Distill-1.5B", "messages": [ { "role": "user", "content": "What is the capital of France?" } ] }' - Docker Model Runner
How to use AI-MO/Kimina-Prover-Preview-Distill-1.5B with Docker Model Runner:
docker model run hf.co/AI-MO/Kimina-Prover-Preview-Distill-1.5B
Kimina-Prover-Preview-Distill-1.5B
Kimina-Prover-Preview-Distill-1.5B is a theorem proving model developed by Project Numina and Kimi teams, focusing on competition style problem solving capabilities in Lean 4. It is a distillation of Kimina-Prover-Preview, a model trained via large scale reinforcement learning. It achieves SOTA results on MiniF2F-test in its model size and compute budget at the time of release.
Quick Start with vLLM
You can easily do inference using vLLM:
from vllm import LLM, SamplingParams
from transformers import AutoTokenizer
model_name = "AI-MO/Kimina-Prover-Preview-Distill-1.5B"
model = LLM(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
problem = "The volume of a cone is given by the formula $V = \frac{1}{3}Bh$, where $B$ is the area of the base and $h$ is the height. The area of the base of a cone is 30 square units, and its height is 6.5 units. What is the number of cubic units in its volume?"
formal_statement = """import Mathlib
import Aesop
set_option maxHeartbeats 0
open BigOperators Real Nat Topology Rat
/-- The volume of a cone is given by the formula $V = \frac{1}{3}Bh$, where $B$ is the area of the base and $h$ is the height. The area of the base of a cone is 30 square units, and its height is 6.5 units. What is the number of cubic units in its volume? Show that it is 65.-/
theorem mathd_algebra_478 (b h v : ℝ) (h₀ : 0 < b ∧ 0 < h ∧ 0 < v) (h₁ : v = 1 / 3 * (b * h))
(h₂ : b = 30) (h₃ : h = 13 / 2) : v = 65 := by
"""
prompt = "Think about and solve the following problem step by step in Lean 4."
prompt += f"\n# Problem:{problem}"""
prompt += f"\n# Formal statement:\n```lean4\n{formal_statement}\n```\n"
messages = [
{"role": "system", "content": "You are an expert in mathematics and Lean 4."},
{"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(
messages,
tokenize=False,
add_generation_prompt=True
)
sampling_params = SamplingParams(temperature=0.6, top_p=0.95, max_tokens=8096)
output = model.generate(text, sampling_params=sampling_params)
output_text = output[0].outputs[0].text
print(output_text)
Citation
If you find our work helpful, you can cite our paper: https://github.com/MoonshotAI/Kimina-Prover-Preview
@article{kimina_prover_2025,
title = {Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning},
author = {Wang, Haiming and Unsal, Mert and Lin, Xiaohan and Baksys, Mantas and Liu, Junqi and Santos, Marco Dos and Sung, Flood and Vinyes, Marina and Ying, Zhenzhe and Zhu, Zekai and Lu, Jianqiao and Saxcé, Hugues de and Bailey, Bolton and Song, Chendong and Xiao, Chenjun and Zhang, Dehao and Zhang, Ebony and Pu, Frederick and Zhu, Han and Liu, Jiawei and Bayer, Jonas and Michel, Julien and Yu, Longhui and Dreyfus-Schmidt, Léo and Tunstall, Lewis and Pagani, Luigi and Machado, Moreira and Bourigault, Pauline and Wang, Ran and Polu, Stanislas and Barroyer, Thibaut and Li, Wen-Ding and Niu, Yazhe and Fleureau, Yann and Hu, Yangyang and Yu, Zhouliang and Wang, Zihan and Yang, Zhilin and Liu, Zhengying and Li, Jia},
year = {2025},
url = {http://arxiv.org/abs/2504.11354},
}
- Downloads last month
- 1,226
Model tree for AI-MO/Kimina-Prover-Preview-Distill-1.5B
Base model
Qwen/Qwen2.5-1.5B