Riding the Agent Wave: The Future of Agentic Software
Sharing my experience setting up a personal agent with PicoClaw, a Raspberry Pi, and a Mac Studio to handle real user tasks.

Jonathan Cecil
Editor
Abstract
The "Agent Wave" is not just about chatbots, it is a fundamental shift on how we integrate distinct workflows to complete end to end task. Here is how I setup a local agentic system using PicoClaw running on Raspberry Pi 3B+, a Mac Studio hosting the Inference Engine (Ollama), and a Telegram bot for interaction., without a single dollar spent on new hardware. Having a digital worker that completes an personal end to end task running locally isn't just a privacy win; it's a blueprint for the next decade of computing.
The Architecture Stack
One of my goals was to run my agent on local models. I ended up separating the agent runtime (PicoClaw) from the inference engine (Ollama). My original attempt to use LM Studio for inference hit a number of hiccups — context-window persistence, Jinja template issues, and a lack of consistency across models — that made me switch to Ollama.
The stack: Telegram (chat) → PicoClaw (agent runtime) → Ollama (inference)
Models in Use
Currently, my daily driver is qwen3.5:35b, which provides a massive context window and stable performance for general orchestration. For tasks requiring deeper logic, I pivot to the qwen3-5-27b-opus-thinking model, which uses a chain-of-thought scratchpad to navigate complex instructions. For intense deep-dive sessions with rapid iterations, I switch to cloud models like Gemini 3, leveraging their speed and multimodal capabilities to complete complex tasks on time.
Wins: Things that went Well
1. Local LLMs
Setting up local LLMs was relatively straightforward and a no-brainer, considering I already have a Mac Studio with 32GB of RAM, which allowed me to run a 35B model with 100% GPU offloading. It also means I can share sensitive data with my agent — my homelab configurations, IP addresses, and personal notes — without worrying about data privacy.
2. Scalable Local Memory Architecture
PicoClaw uses a lightweight, filesystem-based memory system optimized for the Raspberry Pi. This approach avoids heavy databases by separating memory into distinct, easily parsed files: active knowledge (context, goals, and patterns), task management (fast objective tracking), chronological archives (keeping the workspace uncluttered), and secure contexts for accessing isolated, sensitive domains.
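A minimal sketch of what such a flat-file memory could look like — the paths and file names here are my illustration of the four tiers, not PicoClaw's actual layout:

```python
from pathlib import Path
import json

# Hypothetical layout mirroring the four memory tiers described above.
MEMORY_ROOT = Path("workspace/memory")
FILES = {
    "active": MEMORY_ROOT / "context.md",   # active knowledge: context, goals, patterns
    "tasks": MEMORY_ROOT / "tasks.json",    # fast objective tracking
    "archive": MEMORY_ROOT / "archive",     # chronological storage, one file per day
    "secure": MEMORY_ROOT / "secure",       # isolated, sensitive domains
}

def remember_task(task: str, done: bool = False) -> None:
    """Append a task record to the flat-file task store."""
    FILES["tasks"].parent.mkdir(parents=True, exist_ok=True)
    tasks = json.loads(FILES["tasks"].read_text()) if FILES["tasks"].exists() else []
    tasks.append({"task": task, "done": done})
    FILES["tasks"].write_text(json.dumps(tasks, indent=2))

remember_task("check homelab uptime")
```

The appeal is that every tier stays greppable and human-editable — no database process competing with the agent for the Pi's limited RAM.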
3. Telegram ID Based Routing
Instead of a single, chaotic chat history, I leveraged Telegram groups with topic isolation. PicoClaw natively reads the group ID, effectively treating messages from each group as an isolated conversation. This is an important feature for me: I can now use my agent for multiple purposes without worrying about context switching or context contamination.
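The routing idea boils down to keying conversation state on the chat ID. A minimal sketch — the handler and session structure are my illustration, not PicoClaw's API:

```python
from collections import defaultdict

# One isolated conversation history per Telegram group/chat ID.
sessions: dict[int, list[dict]] = defaultdict(list)

def handle_update(chat_id: int, text: str) -> list[dict]:
    """Route an incoming message into its group's isolated context."""
    history = sessions[chat_id]  # never mixes with other groups' histories
    history.append({"role": "user", "content": text})
    return history

handle_update(-100123, "check server status")  # monitoring group
handle_update(-100456, "summarize my notes")   # notes group
```

Because the chat ID comes for free on every Telegram update, this isolation needs no extra configuration per group.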
4. Tmpfs Systemd Ramdisk
This was a critical pain point: the Raspberry Pi service ran with strict read-only /tmp restrictions, which effectively broke all skill installations until I found this fix. I bypassed it by injecting PrivateTmp=yes and ReadWritePaths=/tmp into the systemd override file. This was a total game changer, allowing PicoClaw to download and install new skills directly without my having to tell it to use a tmp directory inside the workspace.
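For reference, the override file looked roughly like this — the service name `picoclaw` is an assumption; substitute whatever your unit is called and apply it via `systemctl edit`:

```ini
# /etc/systemd/system/picoclaw.service.d/override.conf
[Service]
PrivateTmp=yes
ReadWritePaths=/tmp
```

After a `systemctl daemon-reload` and service restart, the agent gets a writable /tmp of its own.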
(Chart: model utilization distribution across local and cloud models.)
Challenges: Things that did not Go Well
Building an edge compute instance is never without its challenges. Here are the roadblocks I hit and how I navigated them.
Retries / Unintended DDoS
When I first implemented a subagent monitoring task meant to guarantee overall task completion, I triggered an unintended "infinite retry" loop due to the heartbeat configuration. Failed subagent tasks were retried relentlessly, effectively DDoS-ing my local inference engine and blocking real tasks indefinitely. This became a multi-day troubleshooting exercise before I identified and disabled the faulty configuration in the inference runtime.
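The lesson in code form: retries need a ceiling and a backoff. A sketch of what the monitor should have done from the start — the function and defaults are illustrative, not PicoClaw's actual retry logic:

```python
import time

def retry_with_backoff(task, max_attempts: int = 3, base_delay: float = 1.0):
    """Run task(), retrying a bounded number of times with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return task()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # give up instead of hammering the inference engine
            time.sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
```

With a hard cap and exponential delays, a persistently failing subagent backs off and eventually surfaces the error, rather than saturating the local model with an endless heartbeat of retries.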
The 32 Bit ARM Dependency Trap
My attempt to set up the scrapling skill to bypass bot checks failed. The underlying Playwright library does not support 32-bit ARM (armv7l), and many newer skills require Python 3.10+, while the Pi's Bullseye OS caps out at Python 3.9.
- Fix: I had to abandon heavy browser automation on this hardware for now. To future-proof this three-year-old, $50 setup, I'll likely need to either reimage the Pi with a 64-bit Bookworm OS or finally bite the bullet and upgrade the agent runtime to a Mac Mini (ride the trend).
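A cheap preflight check could fail fast on both traps before a skill install even starts. A sketch — the function name and the 3.10 floor are my assumptions based on the skills I tried:

```python
import platform
import sys

def skill_supported(min_python=(3, 10)):
    """Check for the two traps above: 32-bit ARM and an old Python."""
    if platform.machine() == "armv7l":
        return False, "32-bit ARM (armv7l): Playwright-based skills will not install"
    if sys.version_info < min_python:
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        return False, f"Python {found} is below the required {min_python}"
    return True, "ok"
```

Running this before `pip install` turns a cryptic mid-install wheel failure into a one-line diagnosis.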
Context Limit Crashes
I kept hitting `cannot truncate prompt with n_keep >= n_ctx (4096)` errors. Because PicoClaw sends massive system prompts (including all tool instructions), it instantly blew past the default 4K context window.
- Fix: I switched to Ollama and permanently increased the context length to 32K in the model configuration.
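In Ollama this is a two-line Modelfile — the base tag matches my setup, and the derived model name `qwen35-32k` is just my choice:

```
FROM qwen3.5:35b
PARAMETER num_ctx 32768
```

Then `ollama create qwen35-32k -f Modelfile` bakes the 32K window in, so PicoClaw's oversized system prompts never hit the truncation path again.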
The "Thinking" Dilemma
The qwen3-5-27b-opus-thinking model flooded my Telegram chats with massive walls of <think> text before actually answering.
- Fix: I injected `{%- set enable_thinking = false %}` into the Jinja template and stripped the hardcoded `<think>` tag to force the model to skip the scratchpad phase.
- The Unintended Consequence: Stripping the model's ability to "think" out loud severely lobotomized its logic. It started failing to format JSON tool calls correctly; it would retry infinitely, hit the `max_tool_iterations` limit, and never reply. I ultimately had to re-enable thinking and just accept the chat clutter.
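In hindsight, a gentler middle ground is to let the model keep its scratchpad and strip it only at the delivery edge, before the reply reaches Telegram. A sketch of that post-processing step — the regex approach is my own idea, not a PicoClaw feature:

```python
import re

# Matches a <think>...</think> scratchpad block, including its trailing whitespace.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)

def strip_thinking(reply: str) -> str:
    """Remove the chain-of-thought scratchpad before sending the reply to chat."""
    return THINK_BLOCK.sub("", reply).strip()

strip_thinking("<think>plan: check uptime, then summarize</think>\nAll services are up.")
```

This keeps the model's reasoning (and therefore its tool-calling reliability) intact while sparing the chat from the walls of `<think>` text.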
Real World Use Cases
Despite the hurdles, the agent is actively handling real work:
- Near-Realtime Monitoring Stack: My primary use case is a fleet of agents that monitor the various websites and services I own and manage. If a site goes down or a service becomes unresponsive, the agent detects the latency spike and response code and immediately pushes an alert to my Telegram monitoring group, letting me react before my users notice.
- Knowledge Retrieval (Local RAG): I use the bot to query Notion for my home lab documentation, network configurations, or past configuration commands so I don't have to hunt for them manually.
- GitHub Ops: The agent also monitors my GitHub repositories for CI/CD runs and deployments and updates me when new PRs or issues have been raised. It is turning out to be a handy proxy for running gh commands directly in chat instead of the terminal.
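The monitoring check at the heart of the first use case reduces to a small decision function. A sketch — the thresholds are illustrative, and the fetcher is injected so any HTTP client (and a test stub) can slot in; none of this is PicoClaw's actual skill code:

```python
def check_site(fetch, url, max_latency=2.0):
    """Return an alert string if the site is down, erroring, or slow; else None.

    `fetch` is any callable taking a URL and returning (status_code, latency_seconds).
    """
    try:
        status, latency = fetch(url)
    except Exception as exc:
        return f"DOWN {url}: {exc}"
    if status >= 500:
        return f"ERROR {url}: HTTP {status}"
    if latency > max_latency:
        return f"SLOW {url}: {latency:.1f}s"
    return None
```

The agent loop then simply calls this per site on a schedule and forwards any non-None result to the Telegram monitoring group.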
What’s Next?
While the current setup is a major win for privacy, there is still room to grow.
- Multi Bot Sandboxing: I am exploring the option of running multiple distinct Telegram bots on the same Raspberry Pi by cloning the orchestration directories and spinning up isolated `systemd` services.
- OpenClaw Pivot: I am also evaluating whether to move certain complex workflows to OpenClaw. While PicoClaw is incredibly lightweight and perfect for the Pi, OpenClaw's richer ecosystem might be worth the extra overhead for reasoning tasks.
- Token & Success Rate Monitoring: While anecdotally I use Qwen3.5 for 85% of my tasks, I am looking to build a monitoring dashboard that tracks input/output tokens and other metrics on a monthly basis, to build a comprehensive, data-driven summary of agent performance and potential costs.