How many automated tasks run daily in the heartbeat system?

16 tasks run daily, covering infrastructure monitoring (vault reindexing, health checks, semantic linking), dependency scanning (npm registries, ecosystem updates), and code analysis (commit review, rule effectiveness tracking). The roster lists 12 model-powered tasks and 6 pure scripts; two of them, weekly-benchmark and rule-effectiveness, are described as running weekly rather than daily.

What is the total API cost for this automation system?

The total API cost is $0. The scheduled tasks run on local models served by Ollama (ollama-rocm) on an AMD RX 9070 XT GPU with 16GB VRAM, so routine inference needs no paid external API. That makes the setup a good fit for offline environments or cost-sensitive deployments.

Which language models are used and what are their speeds?

Qwen3:8b runs at 85 tok/s for general reasoning tasks (morning briefings, health checks, inbox summaries). Qwen2.5-coder:14b runs at 56 tok/s for code-heavy tasks (commit review, rule analysis, knowledge review). Both fit simultaneously with the embedding model (qwen3-embedding:0.6b) in 16GB VRAM without swapping.

How much GPU time is actually required daily?

About 15 minutes of GPU time per day. The benchmark reports 909 seconds of total execution time across 31 runs: 18 local model runs, 6 API model runs, and 7 script-only runs. The GPU sits idle for the remaining 23 hours and 45 minutes, so it can cool and power down.

What happens if Ollama or the GPU becomes unavailable?

The system degrades gracefully: script-only tasks (the six infrastructure tasks in the roster) continue running normally since they don't require inference. Model-powered tasks skip with logged errors, which prevents cascading failures. Nothing crashes or blocks, so partial observability survives GPU maintenance or outages.

Zero-Cost Automation: 16 Tasks on a Local GPU

I run 16 automated tasks every day. They monitor dependencies, check vault health, scan for ecosystem updates, review commits, reindex semantic search, and send me Telegram summaries. The total API cost is $0.

The trick is local models. An AMD RX 9070 XT with 16GB VRAM running ollama-rocm can handle qwen3:8b at 85 tok/s and qwen2.5-coder:14b at 56 tok/s. That's fast enough for maintenance tasks that don't need frontier-model reasoning.
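Throughput figures like these can be derived from the timing fields Ollama returns with a non-streaming generation response. The sketch below assumes the `eval_count`, `eval_duration`, `prompt_eval_count`, and `prompt_eval_duration` fields of the `/api/generate` response, with durations in nanoseconds. The sample numbers are made up for illustration and are not the author's measurements.

```python
def generation_speed(resp: dict) -> float:
    """Tokens generated per second, from an Ollama response dict."""
    # eval_duration is reported in nanoseconds.
    return resp["eval_count"] / (resp["eval_duration"] / 1e9)


def prompt_ms_per_token(resp: dict) -> float:
    """Milliseconds spent per prompt token during prompt processing."""
    return (resp["prompt_eval_duration"] / 1e6) / resp["prompt_eval_count"]


# Illustrative values only, not real benchmark output.
sample = {
    "eval_count": 170,
    "eval_duration": 2_000_000_000,       # 2 s of generation
    "prompt_eval_count": 1000,
    "prompt_eval_duration": 300_000_000,  # 0.3 s of prompt processing
}

print(f"{generation_speed(sample):.1f} tok/s")        # 85.0 tok/s
print(f"{prompt_ms_per_token(sample):.3f} ms/token")  # 0.300 ms/token
```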

The Heartbeat System

The schedule lives in a Markdown file, heartbeat/schedule.md, that defines tasks declaratively:

### deps-monitor
- Schedule: daily 06:15
- Model: local:qwen3:8b
- Script: deps-monitor.sh
- Allowed tools: Read, Bash
- Last run: 2026-02-19 20:12
- Last result: completed in 56s

A cron job reads this file, checks which tasks are due, and runs them. Each task gets a shell script that either calls a local model via ollama or runs pure computation (no model needed). Results are written back to the schedule file and batched into AM/PM Telegram digest notifications.
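The read-and-check step can be sketched in a few lines. The author's version is a cron job plus bash; what follows is a Python illustration under stated assumptions, not the actual implementation. The function names `parse_schedule` and `is_due` are hypothetical, and the due rule (run once today's slot has passed and the last run predates it) is a guess at reasonable behavior that only handles `daily HH:MM` schedules.

```python
import re
from datetime import datetime, time

SAMPLE = """\
### deps-monitor
- Schedule: daily 06:15
- Model: local:qwen3:8b
- Script: deps-monitor.sh
- Last run: 2026-02-19 20:12
"""


def parse_schedule(text: str) -> dict[str, dict[str, str]]:
    """Map each '### name' heading to its '- Key: value' fields."""
    tasks: dict[str, dict[str, str]] = {}
    current = None
    for line in text.splitlines():
        if line.startswith("### "):
            current = line[4:].strip()
            tasks[current] = {}
        elif current and (m := re.match(r"- ([^:]+): (.*)", line)):
            tasks[current][m.group(1)] = m.group(2).strip()
    return tasks


def is_due(fields: dict[str, str], now: datetime) -> bool:
    """Assumed rule: due when today's slot has passed and the last run predates it."""
    kind, hhmm = fields["Schedule"].split()
    if kind != "daily":
        return False  # weekly entries such as 'Sun 08:00' are out of scope here
    slot = datetime.combine(now.date(), time.fromisoformat(hhmm))
    last = datetime.strptime(fields["Last run"], "%Y-%m-%d %H:%M")
    return now >= slot and last < slot


tasks = parse_schedule(SAMPLE)
print(is_due(tasks["deps-monitor"], datetime(2026, 2, 20, 6, 20)))  # True
```

Only the first colon on a line separates key from value, so a model name like `local:qwen3:8b` survives intact.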

No framework. No orchestration platform. Just cron, bash, and ollama.

The Task Roster

Here's what runs daily, split by category:

Infrastructure (No Model Needed)

Task	Schedule	What It Does	Time
vault-reindex	03:00	Re-embeds new/modified files for semantic search	1s
daily-context	06:30	Aggregates project status, inbox, session states	1s
am-digest	09:15	Batches morning task results into one Telegram message	9s
pm-digest	22:15	Batches evening task results into one Telegram message	0s
weekly-benchmark	Sun 08:00	Compares week-over-week model performance stats	0s
reweave	04:00	Scans for semantic links between vault notes	6s

These are pure scripts — no LLM inference. They read files, compute diffs, and format output.

Model-Powered (Local GPU)

Task	Model	What It Does	Time
vault-health	qwen3:8b	Checks file counts, stale sessions, empty files	13s
morning-briefing	qwen3:8b	Summarizes overnight commits, open tasks, action items	3s
deps-monitor	qwen3:8b	Checks npm registries for dependency updates	56s
ecosystem-radar	qwen3:8b	Scans for Claude Code updates, MCP changes	25s
research-radar	qwen3:8b	Checks research feeds for relevant papers	5s
sentry-alerts	qwen3:8b	Pulls production error summaries	4s
inbox-digest	qwen3:8b	Summarizes unread messages across projects	12s
industry-scan	coder:14b	Scans for industry-relevant AI tooling updates	15s
rule-effectiveness	coder:14b	Analyzes which CLAUDE.md rules are preventing bugs	78s
rule-decay	coder:14b	Flags rules that haven't been hit recently	55s
knowledge-review	coder:14b	Reviews knowledge files for staleness	158s
commit-review	coder:14b	Audits recent commits against project conventions	29s

The split between qwen3:8b and qwen2.5-coder:14b is deliberate. General reasoning tasks (briefings, digests, health checks) use the smaller model — it's faster and plenty smart for summarization. Code-heavy tasks (commit review, rule analysis) use the 14B coder model for better pattern matching.

Both models fit in 16GB VRAM simultaneously with the embedding model (qwen3-embedding:0.6b at 639MB), so there's no model swapping overhead.

Self-Improving Rules

The most interesting tasks are rule-effectiveness and rule-decay. My vault has a RULES.md file with coding rules — things like "always deploy after adding Convex functions" or "no Date.now() in query handlers." Each rule tracks evidence:

### No Date.now() in Convex queries
- Last verified: 2026-02-19
- Hit count: 3
- Evidence: getInventoryAlerts fix (linesheet), ...

rule-effectiveness runs weekly to check if rules are actually preventing mistakes. It scans recent commits and code changes for patterns that match existing rules. If a rule catches something, the hit count increments.
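To make the hit-counting idea concrete, here is a simplified stand-in. The real task uses the coder model to match changes against rules; this sketch swaps that for a plain regular expression so it can run anywhere. The `pattern` field, the `count_hits` helper, and the sample diff (including its file path) are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    title: str
    pattern: str  # hypothetical field, not part of the RULES.md format
    hit_count: int = 0


def count_hits(rule: Rule, diff: str) -> int:
    """Count added lines in a unified diff that match the rule's pattern."""
    regex = re.compile(rule.pattern)
    added = [
        line[1:]
        for line in diff.splitlines()
        if line.startswith("+") and not line.startswith("+++")
    ]
    return sum(1 for line in added if regex.search(line))


# Made-up diff for illustration.
DIFF = """\
--- a/convex/alerts.ts
+++ b/convex/alerts.ts
@@ -10 +10,2 @@
-  const cutoff = args.now;
+  const cutoff = Date.now();
+  return items.filter((i) => i.at > cutoff);
"""

rule = Rule("No Date.now() in Convex queries", r"Date\.now\(\)", hit_count=3)
rule.hit_count += count_hits(rule, DIFF)
print(rule.hit_count)  # 4
```

A regex only catches literal matches; the appeal of a model-based check is that it can recognize violations that are phrased differently.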

rule-decay flags rules that haven't been hit in over 30 days. A rule that never fires might be too specific, already internalized, or no longer relevant. It doesn't auto-delete — it just surfaces candidates for human review.
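The decay check itself reduces to a date comparison. This sketch assumes that "Last verified" records the most recent hit, and the second rule entry is invented to show a stale case. The names `last_verified_dates` and `decay_candidates` are hypothetical.

```python
import re
from datetime import date, timedelta

RULES_MD = """\
### No Date.now() in Convex queries
- Last verified: 2026-02-19
- Hit count: 3

### Example stale rule
- Last verified: 2025-12-01
- Hit count: 1
"""


def last_verified_dates(text: str) -> dict[str, date]:
    """Map each '### title' to the date on its 'Last verified' line."""
    dates: dict[str, date] = {}
    title = None
    for line in text.splitlines():
        if line.startswith("### "):
            title = line[4:].strip()
        elif title and (m := re.match(r"- Last verified: (\d{4}-\d{2}-\d{2})", line)):
            dates[title] = date.fromisoformat(m.group(1))
    return dates


def decay_candidates(text: str, today: date, max_age_days: int = 30) -> list[str]:
    """Titles of rules last verified more than max_age_days ago."""
    cutoff = today - timedelta(days=max_age_days)
    return [t for t, seen in last_verified_dates(text).items() if seen < cutoff]


print(decay_candidates(RULES_MD, date(2026, 2, 20)))  # ['Example stale rule']
```

The function returns candidates and nothing more, which matches the intent that a human decides what to delete.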

The result is a rule set that improves over time. Rules that prove their value stay; rules that don't get questioned.

Weekly Stats

The benchmark task tracks performance across weeks. Last week (Feb 13-19):

Metric	Value
Total runs	31
Total execution time	909s (~15 minutes)
Local model runs	18 (58%)
API model runs	6 (19%)
Script-only runs	7 (23%)
Generation speed	80.8 tok/s
Prompt processing	0.297 ms/token

15 minutes of GPU time per day for continuous infrastructure monitoring. The GPU sits idle the other 23 hours and 45 minutes.
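The shares and the roughly 15-minute total in the table can be checked with a few lines of arithmetic. This uses only the numbers from the table; the average per run is a derived figure that the table does not state.

```python
runs = {"local model": 18, "API model": 6, "script-only": 7}
total_runs = sum(runs.values())
total_seconds = 909

# Share of each run type, rounded to whole percent as in the table.
shares = {kind: round(100 * n / total_runs) for kind, n in runs.items()}
minutes = total_seconds / 60
avg_seconds = total_seconds / total_runs

print(total_runs)                  # 31
print(shares)                      # {'local model': 58, 'API model': 19, 'script-only': 23}
print(f"{minutes:.2f} minutes")    # 15.15 minutes
print(f"{avg_seconds:.1f} s/run")  # 29.3 s/run
```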

What Makes This Work

A few design decisions that matter:

Markdown-as-config. The schedule file is human-readable and version-controlled. No database, no YAML parser, no config format to learn. Edit the Markdown, cron picks it up.

Telegram notifications. Results batch into AM and PM digests. WARN and ERROR results push immediately. I get a morning summary at 09:15 and an evening one at 22:15: two messages per day instead of sixteen.
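The routing rule (batch everything, push warnings right away) might look like the sketch below. It assumes results carry a level string and that the digest still includes the urgent items. The `Result` type and the `route` function are hypothetical, the sample results are invented, and the actual sending to Telegram is left out.

```python
from dataclasses import dataclass


@dataclass
class Result:
    task: str
    level: str    # "WARN" and "ERROR" come from the article; "OK" is assumed
    summary: str


def route(results: list[Result]) -> tuple[list[str], str]:
    """Return (messages to push immediately, one batched digest body)."""
    urgent = [
        f"[{r.level}] {r.task}: {r.summary}"
        for r in results
        if r.level in {"WARN", "ERROR"}
    ]
    # Assumption: the digest lists every result, urgent ones included.
    digest = "\n".join(f"{r.task}: {r.summary}" for r in results)
    return urgent, digest


# Invented results for illustration.
results = [
    Result("vault-health", "OK", "no stale sessions"),
    Result("deps-monitor", "WARN", "updates available"),
]
urgent, digest = route(results)
print(urgent)  # ['[WARN] deps-monitor: updates available']
print(digest)
```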

Graceful degradation. If ollama is down, model-powered tasks skip with a logged error. Script-only tasks still run. Nothing crashes, nothing blocks.
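A minimal sketch of that skip logic, assuming Ollama's default local address and its `/api/tags` endpoint as a liveness probe. The `ollama_available` and `plan_run` names are hypothetical, and the author's bash scripts may detect an outage differently.

```python
import urllib.request
from typing import Optional

OLLAMA_URL = "http://localhost:11434/api/tags"  # Ollama's default local address


def ollama_available(url: str = OLLAMA_URL, timeout: float = 2.0) -> bool:
    """True if the Ollama server answers the probe request."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # connection refused, timeout, DNS failure, HTTP error
        return False


def plan_run(tasks: dict[str, Optional[str]], model_up: bool) -> dict[str, str]:
    """tasks maps task name to a model name, or None for script-only tasks."""
    plan = {}
    for name, model in tasks.items():
        if model is None or model_up:
            plan[name] = "run"
        else:
            plan[name] = "skip: ollama unavailable"  # logged, never raised
    return plan


print(plan_run({"vault-reindex": None, "deps-monitor": "qwen3:8b"}, model_up=False))
# {'vault-reindex': 'run', 'deps-monitor': 'skip: ollama unavailable'}
```

In a real run the second argument would come from `ollama_available()`. Keeping the probe separate from the planning makes the skip behavior easy to test without a GPU.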

Evidence-based rules. The self-improving rule system means the heartbeat isn't just monitoring; it's actively improving the development workflow. Every week, the rules get a little better tuned.

The total investment was a few days of scripting and a GPU I already had. The return is continuous, zero-cost infrastructure monitoring that catches dependency updates, stale sessions, and rule violations before they become problems. For a solo developer managing six production apps and 1,567 vault files, that's a force multiplier.

About the Author

Vache Sarkissian