I run 16 automated tasks every day. They monitor dependencies, check vault health, scan for ecosystem updates, review commits, reindex semantic search, and send me Telegram summaries. The total API cost is $0.
The trick is local models. An AMD RX 9070 XT with 16GB VRAM running ollama-rocm can handle qwen3:8b at 85 tok/s and qwen2.5-coder:14b at 56 tok/s. That's fast enough for maintenance tasks that don't need frontier-model reasoning.
The Heartbeat System
The scheduler is a Markdown file — heartbeat/schedule.md — that defines tasks declaratively:
### deps-monitor
- Schedule: daily 06:15
- Model: local:qwen3:8b
- Script: deps-monitor.sh
- Allowed tools: Read, Bash
- Last run: 2026-02-19 20:12
- Last result: completed in 56s

A cron job reads this file, checks which tasks are due, and runs them. Each task gets a shell script that either calls a local model via ollama or runs pure computation (no model needed). Results are written back to the schedule file and batched into AM/PM Telegram digest notifications.
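The "read the file, check what is due" step can be sketched as follows. This is a minimal illustration in Python rather than the bash the real system uses, and it is not the actual scheduler: the function names `parse_schedule` and `is_due` are hypothetical, and only the `daily HH:MM` schedule form is handled.

```python
import re
from datetime import datetime, time


def parse_schedule(markdown: str) -> dict[str, dict[str, str]]:
    """Collect '- Key: value' lines under each '### task-name' heading."""
    tasks: dict[str, dict[str, str]] = {}
    current = None
    for raw in markdown.splitlines():
        line = raw.strip()
        if line.startswith("### "):
            current = line[4:].strip()
            tasks[current] = {}
        elif current and line.startswith("- ") and ":" in line:
            # Split on the first colon only, so values like
            # 'local:qwen3:8b' or '06:15' stay intact.
            key, value = line[2:].split(":", 1)
            tasks[current][key.strip()] = value.strip()
    return tasks


def is_due(task: dict[str, str], now: datetime) -> bool:
    """Assumed rule: due once today's slot has passed and the last run predates it."""
    match = re.fullmatch(r"daily (\d{2}):(\d{2})", task.get("Schedule", ""))
    if not match:
        return False  # other schedule forms are out of scope for this sketch
    slot = datetime.combine(now.date(), time(int(match[1]), int(match[2])))
    if now < slot:
        return False
    last_run = task.get("Last run")
    if not last_run:
        return True
    return datetime.strptime(last_run, "%Y-%m-%d %H:%M") < slot
```

Because "due" is derived from the `Last run` field, a cron entry that fires every few minutes can call this repeatedly without running a task twice in one day.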
No framework. No orchestration platform. Just cron, bash, and ollama.
The Task Roster
Here's what runs daily, split by category:
Infrastructure (No Model Needed)
| Task | Schedule | What It Does | Time |
|---|---|---|---|
| vault-reindex | 03:00 | Re-embeds new/modified files for semantic search | 1s |
| daily-context | 06:30 | Aggregates project status, inbox, session states | 1s |
| am-digest | 09:15 | Batches morning task results into one Telegram message | 9s |
| pm-digest | 22:15 | Batches evening task results into one Telegram message | 0s |
| weekly-benchmark | Sun 08:00 | Compares week-over-week model performance stats | 0s |
| reweave | 04:00 | Scans for semantic links between vault notes | 6s |
These are pure scripts — no LLM inference. They read files, compute diffs, and format output.
Model-Powered (Local GPU)
| Task | Model | What It Does | Time |
|---|---|---|---|
| vault-health | qwen3:8b | Checks file counts, stale sessions, empty files | 13s |
| morning-briefing | qwen3:8b | Summarizes overnight commits, open tasks, action items | 3s |
| deps-monitor | qwen3:8b | Checks npm registries for dependency updates | 56s |
| ecosystem-radar | qwen3:8b | Scans for Claude Code updates, MCP changes | 25s |
| research-radar | qwen3:8b | Checks research feeds for relevant papers | 5s |
| sentry-alerts | qwen3:8b | Pulls production error summaries | 4s |
| inbox-digest | qwen3:8b | Summarizes unread messages across projects | 12s |
| industry-scan | qwen2.5-coder:14b | Scans for industry-relevant AI tooling updates | 15s |
| rule-effectiveness | qwen2.5-coder:14b | Analyzes which CLAUDE.md rules are preventing bugs | 78s |
| rule-decay | qwen2.5-coder:14b | Flags rules that haven't been hit recently | 55s |
| knowledge-review | qwen2.5-coder:14b | Reviews knowledge files for staleness | 158s |
| commit-review | qwen2.5-coder:14b | Audits recent commits against project conventions | 29s |
The split between qwen3:8b and qwen2.5-coder:14b is deliberate. General reasoning tasks (briefings, digests, health checks) use the smaller model — it's faster and plenty smart for summarization. Code-heavy tasks (commit review, rule analysis) use the 14B coder model for better pattern matching.
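A sketch of that routing, assuming ollama is serving its standard HTTP API on the default `localhost:11434`. The `MODEL_BY_KIND` table and the `general`/`code` category names are hypothetical labels for illustration; the real task scripts may call ollama differently.

```python
import json
import urllib.request

# Hypothetical routing table: the category names are illustrative only.
MODEL_BY_KIND = {
    "general": "qwen3:8b",        # briefings, digests, health checks
    "code": "qwen2.5-coder:14b",  # commit review, rule analysis
}


def build_request(kind: str, prompt: str,
                  host: str = "http://localhost:11434") -> urllib.request.Request:
    """Build a non-streaming POST to ollama's /api/generate endpoint."""
    payload = {"model": MODEL_BY_KIND[kind], "prompt": prompt, "stream": False}
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(kind: str, prompt: str, timeout: float = 300.0) -> str:
    """Send the request and return the generated text (needs a running ollama)."""
    with urllib.request.urlopen(build_request(kind, prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Keeping the model choice in one table means a task can be moved between models by changing a single entry.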
Both models fit in 16GB VRAM simultaneously with the embedding model (qwen3-embedding:0.6b at 639MB), so there's no model swapping overhead.
Self-Improving Rules
The most interesting tasks are rule-effectiveness and rule-decay. My vault has a RULES.md file with coding rules — things like "always deploy after adding Convex functions" or "no Date.now() in query handlers." Each rule tracks evidence:
### No Date.now() in Convex queries
- Last verified: 2026-02-19
- Hit count: 3
- Evidence: getInventoryAlerts fix (linesheet), ...

rule-effectiveness runs weekly to check if rules are actually preventing mistakes. It scans recent commits and code changes for patterns that match existing rules. If a rule catches something, the hit count increments.
rule-decay flags rules that haven't been hit in over 30 days. A rule that never fires might be too specific, already internalized, or no longer relevant. It doesn't auto-delete — it just surfaces candidates for human review.
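The decay check itself needs no model. A minimal sketch, assuming the `Last verified` date doubles as the date of the most recent hit (the format above does not say so explicitly) and using hypothetical helper names `parse_rules` and `decay_candidates`:

```python
from datetime import date, timedelta


def parse_rules(markdown: str) -> list[dict]:
    """Read '### rule name' headings and their tracked fields."""
    rules: list[dict] = []
    for raw in markdown.splitlines():
        line = raw.strip()
        if line.startswith("### "):
            rules.append({"name": line[4:].strip()})
        elif rules and line.startswith("- Last verified:"):
            value = line.split(":", 1)[1].strip()
            rules[-1]["last_verified"] = date.fromisoformat(value)
        elif rules and line.startswith("- Hit count:"):
            rules[-1]["hit_count"] = int(line.split(":", 1)[1].strip())
    return rules


def decay_candidates(rules: list[dict], today: date,
                     max_age_days: int = 30) -> list[str]:
    """Names of rules not verified within max_age_days.

    Rules with no date at all are treated as stale. Nothing is deleted:
    the result is only a list for human review.
    """
    cutoff = today - timedelta(days=max_age_days)
    return [r["name"] for r in rules
            if r.get("last_verified", date.min) < cutoff]
```

In the pipeline described here, the model's job is presumably to turn a list like this into a readable summary, while the date arithmetic stays deterministic.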
The result is a rule set that improves over time. Rules that prove their value stay; rules that don't get questioned.
Weekly Stats
The benchmark task tracks performance across weeks. Last week (Feb 13-19):
| Metric | Value |
|---|---|
| Total runs | 31 |
| Total execution time | 909s (~15 minutes) |
| Local model runs | 18 (58%) |
| API model runs | 6 (19%) |
| Script-only runs | 7 (23%) |
| Generation speed | 80.8 tok/s |
| Prompt processing | 0.297 ms/token |
That is about 15 minutes of total execution time for the week, or roughly two minutes per day, for continuous infrastructure monitoring. The GPU sits idle for nearly all of the remaining time.
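For reference, figures like these can be derived from fields ollama returns with each non-streaming `/api/generate` response: `eval_count` and `prompt_eval_count` are token counts, and `eval_duration` and `prompt_eval_duration` are in nanoseconds. The sketch below shows the arithmetic only. The function names are hypothetical, and the benchmark task may aggregate its numbers differently.

```python
def generation_tok_per_s(resp: dict) -> float:
    """Generated tokens divided by generation time in seconds."""
    return resp["eval_count"] / (resp["eval_duration"] / 1e9)


def prompt_ms_per_token(resp: dict) -> float:
    """Prompt processing time in milliseconds, per prompt token."""
    return (resp["prompt_eval_duration"] / 1e6) / resp["prompt_eval_count"]


def run_shares(counts: dict[str, int]) -> dict[str, int]:
    """Percentage of total runs per category, rounded to whole percent."""
    total = sum(counts.values())
    return {kind: round(100 * n / total) for kind, n in counts.items()}
```

Calling `run_shares({"local": 18, "api": 6, "script": 7})` reproduces the 58/19/23 split in the table.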
What Makes This Work
A few design decisions that matter:
Markdown-as-config. The schedule file is human-readable and version-controlled. No database, no YAML parser, no config format to learn. Edit the Markdown, cron picks it up.
Telegram notifications. Results batch into AM and PM digests. WARN and ERROR results push immediately. I get a morning summary at 09:15 and an evening one at 22:15, so two messages per day instead of sixteen.
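The batching rule can be sketched like this. The result record shape (`task`, `level`, `summary`) is an assumption for illustration. The endpoint is the Telegram Bot API's `sendMessage` method, which accepts `chat_id` and `text` as form fields.

```python
from urllib.parse import urlencode

IMMEDIATE_LEVELS = {"WARN", "ERROR"}


def split_results(results: list[dict]) -> tuple[list[dict], list[dict]]:
    """Separate results that push right away from those held for the digest."""
    urgent = [r for r in results if r["level"] in IMMEDIATE_LEVELS]
    batched = [r for r in results if r["level"] not in IMMEDIATE_LEVELS]
    return urgent, batched


def format_digest(title: str, results: list[dict]) -> str:
    """One message: a title line, then one bullet per task result."""
    lines = [title] + [f"- {r['task']}: {r['summary']}" for r in results]
    return "\n".join(lines)


def send_message_request(token: str, chat_id: int, text: str) -> tuple[str, bytes]:
    """URL and form-encoded body for a Bot API sendMessage POST."""
    url = f"https://api.telegram.org/bot{token}/sendMessage"
    body = urlencode({"chat_id": chat_id, "text": text}).encode("utf-8")
    return url, body
```

The digest tasks would then call `format_digest` once per batch, while urgent results are sent one message each as they occur.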
Graceful degradation. If ollama is down, model-powered tasks skip with a logged error. Script-only tasks still run. Nothing crashes, nothing blocks.
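A minimal sketch of that skip-and-continue behavior. The task record shape (`name`, `needs_model`, `run`) and the function names are hypothetical. The health probe uses ollama's `/api/tags` endpoint, which lists installed models and works as a cheap reachability check.

```python
import urllib.error
import urllib.request


def ollama_is_up(host: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """True if the ollama server answers on its model-listing endpoint."""
    try:
        with urllib.request.urlopen(f"{host}/api/tags", timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False


def run_all(tasks: list[dict], model_available: bool, log) -> list[str]:
    """Run every task that can run; log and skip the rest.

    Returns the names of tasks that completed.
    """
    completed = []
    for task in tasks:
        if task["needs_model"] and not model_available:
            log(f"ERROR {task['name']}: ollama unreachable, skipped")
            continue
        try:
            task["run"]()
            completed.append(task["name"])
        except Exception as exc:  # one failing task must not block the others
            log(f"ERROR {task['name']}: {exc}")
    return completed
```

A typical call would be `run_all(tasks, ollama_is_up(), print)`, probing once per cycle instead of once per task.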
Evidence-based rules. The self-improving rule system means the heartbeat isn't just monitoring; it's actively improving the development workflow. Every week, the rules get a little better tuned.

The total investment was a few days of scripting and a GPU I already had. The return is continuous, zero-cost infrastructure monitoring that catches dependency updates, stale sessions, and rule violations before they become problems. For a solo developer managing six production apps and 1,567 vault files, that's a force multiplier.