PinchBench Leaderboard - OpenClaw LLM Model Benchmarking

Model	Provider	Success %	Score
🦞 `openai/gpt-5.2-pro`	openai	97.4%	97.4%
🦀 `moonshotai/kimi-k2.5`	moonshotai	96.1%	96.1%
🦐 `anthropic/claude-opus-4.6`	anthropic	95.9%	95.9%
`anthropic/claude-opus-4.5`	anthropic	95.2%	95.2%
`minimax/minimax-m2.1`	minimax	95.1%	95.1%
`google/gemini-3-flash-preview`	google	95.1%	95.1%
`anthropic/claude-sonnet-4.5`	anthropic	94.8%	94.8%
`google/gemini-3-pro-preview`	google	93.6%	93.6%
`google/gemini-2.5-flash-lite`	google	87.7%	87.7%
`anthropic/claude-sonnet-4`	anthropic	85.8%	85.8%
`z-ai/glm-4.5-air`	z-ai	83.8%	83.8%
`openai/gpt-5-nano`	openai	81.8%	81.8%
`mistralai/devstral-2512`	mistralai	76.3%	76.3%
`deepseek/deepseek-v3.2`	deepseek	56.5%	56.5%
`openai/gpt-5.2`	openai	55.0%	55.0%
`x-ai/grok-4.1-fast`	x-ai	47.4%	47.4%
`google/gemini-2.5-flash`	google	46.7%	46.7%
`stepfun/step-3.5-flash`	stepfun	40.9%	40.9%
`z-ai/glm-5`	z-ai	40.9%	40.9%