Local Models vs Cloud: A Tool-Calling Reality Check
February 2, 2026
After getting OpenClaw running in a VM (see Part 1), I set out to use local LLMs via Ollama. What followed was an educational journey through the current limitations of local model tool-calling.
The Memory Search API Key Problem
The first issue: OpenClaw's memory_search tool requires embeddings, which by default need an OpenAI API key. I tried several approaches:
Option 1: Disable the memory plugin entirely
```json
{
  "plugins": {
    "slots": {
      "memory": "none"
    }
  }
}
```
Option 2: Use local embeddings
```json
{
  "agents": {
    "defaults": {
      "memorySearch": {
        "provider": "local"
      }
    }
  }
}
```
I also tried configuring Ollama for embeddings, but memorySearch.provider only accepts "openai" or "local", not custom endpoints.
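For reference, Ollama does expose its own embeddings endpoint (POST /api/embeddings), so if OpenClaw ever accepted custom endpoints this would be the target. A minimal sketch, assuming a locally pulled nomic-embed-text model and the default Ollama port (the model name and host are my assumptions, not OpenClaw configuration):

```python
import json
import urllib.request

OLLAMA_HOST = "http://localhost:11434"  # Ollama's default port

def build_embedding_request(text, model="nomic-embed-text"):
    """Build the JSON payload for Ollama's /api/embeddings endpoint."""
    return {"model": model, "prompt": text}

def embed(text, model="nomic-embed-text"):
    """POST the payload to a locally running Ollama instance and
    return the embedding vector. Requires a running server and
    `ollama pull nomic-embed-text`; this is a sketch, not how
    OpenClaw actually wires up memory search."""
    req = urllib.request.Request(
        f"{OLLAMA_HOST}/api/embeddings",
        data=json.dumps(build_embedding_request(text, model)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["embedding"]
```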
Model Tool-Calling Issues
With memory sorted, I tested multiple local models. All had issues with OpenClaw's complex tool schemas:
| Model | Result |
|---|---|
| qwen2.5:32b | Called tools but with wrong parameters |
| qwen3:30b-a3b | Reasoning model, tool format issues |
| gemma3:27b-it-qat | "does not support tools" error |
| qwen2.5-coder:14b | Still had parameter issues |
The Gemma3 error was explicit:
"errorMessage": "400 registry.ollama.ai/library/gemma3:27b-it-qat does not support tools"
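Rather than waiting for this 400, you can check up front: recent Ollama releases report a capabilities list in the POST /api/show response. A hedged sketch of the check (the capabilities field exists in newer Ollama versions; the example responses below are illustrative, not captured output):

```python
def supports_tools(show_response: dict) -> bool:
    """Given a parsed response from Ollama's POST /api/show,
    report whether the model advertises tool calling.

    Newer Ollama releases include a top-level "capabilities" list;
    if it's absent, conservatively assume no tool support."""
    return "tools" in show_response.get("capabilities", [])

# Illustrative response shapes:
print(supports_tools({"capabilities": ["completion", "tools"]}))   # True
print(supports_tools({"capabilities": ["completion", "vision"]}))  # False
```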
The Qwen models had a subtler problem: they would recognize they needed to call
a tool but send empty {} for complex nested parameters like cron job
specifications. They'd get stuck in loops trying repeatedly with the same empty parameters.
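One mitigation I'd consider is detecting that stuck-loop signature and bailing out instead of burning tokens. This is a hypothetical guard sketched outside OpenClaw; the function name and repeat threshold are my own, not anything OpenClaw ships:

```python
def should_abort(call_history, max_repeats=3):
    """Return True once the tail of call_history contains max_repeats
    identical (tool, args) pairs with empty args -- the stuck-loop
    signature the Qwen models exhibited."""
    if len(call_history) < max_repeats:
        return False
    tail = call_history[-max_repeats:]
    first = tail[0]
    return all(c == first and c[1] == {} for c in tail)

# Three identical empty calls trip the guard; real arguments don't.
print(should_abort([("cron_add", {})] * 3))                          # True
print(should_abort([("cron_add", {"schedule": "0 * * * *"})] * 3))   # False
```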
Trying Cloud Models
I switched to OpenCode Zen as the primary provider:
```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "opencode/kimi-k2.5",
        "fallbacks": ["ollama/qwen2.5-coder:14b"]
      }
    }
  }
}
```
Rate Limiting with Free Tier
The free tier (opencode/kimi-k2.5-free) hit rate limits quickly:
"errorMessage": "429 Request didn't generate first token before the given deadline, the service is overloaded"
Switching to the paid tier opencode/kimi-k2.5 resolved this.
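If you stay on the free tier, retrying with exponential backoff plus jitter softens the 429s; it won't fix a genuinely overloaded service, but it avoids hammering it. A sketch of the delay schedule only, with parameters I chose myself; wiring it into OpenClaw's provider layer is left out:

```python
import random

def backoff_delays(retries=5, base=1.0, cap=30.0):
    """Yield exponentially growing delays with full jitter,
    capped at `cap` seconds: a standard schedule for HTTP 429."""
    for attempt in range(retries):
        yield random.uniform(0, min(cap, base * 2 ** attempt))

# A caller would time.sleep(delay) between retries on each 429.
for delay in backoff_delays():
    print(f"waiting up to {delay:.1f}s before retrying")
```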
Kimi Still Struggled with Complex Tools
Even the cloud Kimi model had issues with complex tool schemas. When trying to add cron jobs, it repeatedly sent empty job objects despite its "thinking" showing it knew what parameters were needed.
The Solution: Claude Haiku
I finally switched to opencode/claude-haiku-4-5 for reliable tool calling:
```json
{
  "agents": {
    "defaults": {
      "model": {
        "primary": "opencode/claude-haiku-4-5",
        "fallbacks": ["ollama/qwen2.5-coder:14b"]
      }
    }
  }
}
```
This worked reliably. The local Ollama model serves as a fallback for simpler tasks or when the cloud is unavailable.
Lessons Learned
- Local model tool-calling is immature. Even capable models like Qwen 2.5 32B struggle with complex nested tool schemas. Simple chat works; complex agentic workflows don't.
- Ollama tool support varies by model. Some models (Gemma3) explicitly don't support tools. Check before assuming.
- Memory search requires embeddings. Either disable it, use local embeddings, or accept the OpenAI API key requirement.
- Claude models handle tools reliably. When tool-calling reliability matters, Claude (even Haiku) is a safer choice than experimental alternatives.
Final Model Configuration
| Component | Value |
|---|---|
| Primary Model | opencode/claude-haiku-4-5 |
| Fallback Model | ollama/qwen2.5-coder:14b |
| Memory Search | Disabled |
Part 2: Local Models vs Cloud: A Tool-Calling Reality Check (this post)
Part 3: Running OpenClaw: Security, Automation & Maintenance