<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://blogs.tusharsaurabh.com/feed.xml" rel="self" type="application/atom+xml" /><link href="https://blogs.tusharsaurabh.com/" rel="alternate" type="text/html" /><updated>2026-05-10T13:30:03+00:00</updated><id>https://blogs.tusharsaurabh.com/feed.xml</id><title type="html">What I learnt!</title><subtitle>Journal of concepts I learn while programming, building projects, or implementing solutions.</subtitle><entry><title type="html">Does a Code Assistant Need Large Models?</title><link href="https://blogs.tusharsaurabh.com/2026/03/18/does-code-assistant-need-large-models.html" rel="alternate" type="text/html" title="Does a Code Assistant Need Large Models?" /><published>2026-03-18T00:00:00+00:00</published><updated>2026-03-18T00:00:00+00:00</updated><id>https://blogs.tusharsaurabh.com/2026/03/18/does-code-assistant-need-large-models</id><content type="html" xml:base="https://blogs.tusharsaurabh.com/2026/03/18/does-code-assistant-need-large-models.html"><![CDATA[<style>
    :root {
      --bg:          #0f1117;
      --surface:     #1a1d27;
      --surface2:    #22263a;
      --border:      #2e3450;
      --accent:      #6c8ef7;
      --accent2:     #a78bfa;
      --green:       #34d399;
      --yellow:      #fbbf24;
      --red:         #f87171;
      --text:        #e2e8f0;
      --muted:       #94a3b8;
      --code-bg:     #141720;
      --radius:      10px;
      --max-w:       780px;
    }

    * { box-sizing: border-box; margin: 0; padding: 0; }

    body {
      background: var(--bg);
      color: var(--text);
      font-family: "Inter", "Segoe UI", system-ui, sans-serif;
      font-size: 17px;
      line-height: 1.75;
      padding: 2rem 1rem 6rem;
    }

    .container {
      max-width: var(--max-w);
      margin: 0 auto;
    }

    /* ── Header ── */
    header {
      padding: 3rem 0 2rem;
      border-bottom: 1px solid var(--border);
      margin-bottom: 2.5rem;
    }

    .tag-row {
      display: flex;
      gap: 0.5rem;
      flex-wrap: wrap;
      margin-bottom: 1.2rem;
    }

    .tag {
      font-size: 0.72rem;
      font-weight: 600;
      letter-spacing: 0.08em;
      text-transform: uppercase;
      padding: 0.25rem 0.75rem;
      border-radius: 99px;
      background: var(--surface2);
      color: var(--accent);
      border: 1px solid var(--border);
    }

    h1 {
      font-size: clamp(1.8rem, 5vw, 2.6rem);
      font-weight: 800;
      line-height: 1.2;
      background: linear-gradient(135deg, #e2e8f0 30%, var(--accent));
      -webkit-background-clip: text;
      -webkit-text-fill-color: transparent;
      background-clip: text;
      margin-bottom: 1rem;
    }

    .subtitle {
      color: var(--muted);
      font-size: 1.05rem;
      max-width: 600px;
      line-height: 1.6;
    }

    .meta {
      display: flex;
      align-items: center;
      gap: 1rem;
      margin-top: 1.5rem;
      color: var(--muted);
      font-size: 0.875rem;
    }

    .meta-dot { color: var(--border); }

    /* ── Typography ── */
    h2 {
      font-size: 1.45rem;
      font-weight: 700;
      color: #fff;
      margin: 2.5rem 0 0.9rem;
      padding-left: 0.8rem;
      border-left: 3px solid var(--accent);
    }

    h3 {
      font-size: 1.1rem;
      font-weight: 600;
      color: var(--accent2);
      margin: 1.8rem 0 0.6rem;
    }

    p { margin-bottom: 1.1rem; color: var(--text); }

    a {
      color: var(--accent);
      text-decoration: none;
      border-bottom: 1px solid transparent;
      transition: border-color 0.2s;
    }
    a:hover { border-color: var(--accent); }

    strong { color: #fff; font-weight: 600; }

    /* ── Callout boxes ── */
    .callout {
      background: var(--surface);
      border: 1px solid var(--border);
      border-left: 3px solid var(--accent);
      border-radius: var(--radius);
      padding: 1rem 1.25rem;
      margin: 1.5rem 0;
      font-size: 0.95rem;
    }
    .callout.green  { border-left-color: var(--green);  }
    .callout.yellow { border-left-color: var(--yellow); }
    .callout.purple { border-left-color: var(--accent2);}

    .callout-title {
      font-weight: 700;
      font-size: 0.8rem;
      letter-spacing: 0.06em;
      text-transform: uppercase;
      margin-bottom: 0.4rem;
      color: var(--accent);
    }
    .callout.green  .callout-title { color: var(--green);  }
    .callout.yellow .callout-title { color: var(--yellow); }
    .callout.purple .callout-title { color: var(--accent2);}

    /* ── Code ── */
    code {
      background: var(--code-bg);
      border: 1px solid var(--border);
      border-radius: 4px;
      padding: 0.15em 0.45em;
      font-family: "JetBrains Mono", "Fira Code", "Cascadia Code", monospace;
      font-size: 0.84em;
      color: #c4b5fd;
    }

    pre {
      background: var(--code-bg);
      border: 1px solid var(--border);
      border-radius: var(--radius);
      padding: 1.2rem 1.4rem;
      overflow-x: auto;
      margin: 1.2rem 0;
      font-family: "JetBrains Mono", "Fira Code", monospace;
      font-size: 0.83rem;
      line-height: 1.7;
      color: #c4b5fd;
    }
    pre code { background: none; border: none; padding: 0; color: inherit; }

    /* ── Pipeline diagram ── */
    .pipeline {
      display: flex;
      align-items: center;
      flex-wrap: wrap;
      gap: 0.4rem;
      margin: 1.4rem 0;
      padding: 1rem 1.2rem;
      background: var(--surface);
      border: 1px solid var(--border);
      border-radius: var(--radius);
    }

    .phase {
      background: var(--surface2);
      border: 1px solid var(--border);
      border-radius: 6px;
      padding: 0.35rem 0.75rem;
      font-size: 0.78rem;
      font-weight: 600;
      color: var(--accent2);
      white-space: nowrap;
    }

    .arrow {
      color: var(--muted);
      font-size: 1rem;
    }

    /* ── Benchmark table ── */
    .table-wrap {
      overflow-x: auto;
      margin: 1.5rem 0;
      border-radius: var(--radius);
      border: 1px solid var(--border);
    }

    table {
      width: 100%;
      border-collapse: collapse;
      font-size: 0.875rem;
    }

    thead th {
      background: var(--surface2);
      color: var(--muted);
      font-weight: 600;
      font-size: 0.75rem;
      text-transform: uppercase;
      letter-spacing: 0.05em;
      padding: 0.75rem 1rem;
      text-align: left;
      border-bottom: 1px solid var(--border);
    }

    tbody tr {
      border-bottom: 1px solid var(--border);
      transition: background 0.15s;
    }
    tbody tr:last-child { border-bottom: none; }
    tbody tr:hover { background: var(--surface); }

    tbody td {
      padding: 0.7rem 1rem;
      color: var(--text);
      vertical-align: middle;
    }

    .model-badge {
      display: inline-block;
      font-size: 0.7rem;
      font-weight: 700;
      padding: 0.2rem 0.55rem;
      border-radius: 99px;
    }
    .badge-ca     { background: rgba(108,142,247,0.15); color: var(--accent); }
    .badge-claude { background: rgba(167,139,250,0.15); color: var(--accent2); }

    .val-good { color: var(--green);  font-weight: 600; }
    .val-warn { color: var(--yellow); font-weight: 600; }
    .val-bad  { color: var(--red);    font-weight: 600; }
    .val-muted{ color: var(--muted); }

    /* ── Architecture cards ── */
    .cards {
      display: grid;
      grid-template-columns: repeat(auto-fit, minmax(220px, 1fr));
      gap: 1rem;
      margin: 1.4rem 0;
    }

    .card {
      background: var(--surface);
      border: 1px solid var(--border);
      border-radius: var(--radius);
      padding: 1.1rem 1.2rem;
      transition: border-color 0.2s;
    }
    .card:hover { border-color: var(--accent); }

    .card-icon {
      font-size: 1.4rem;
      margin-bottom: 0.5rem;
    }

    .card-title {
      font-weight: 700;
      font-size: 0.92rem;
      color: #fff;
      margin-bottom: 0.35rem;
    }

    .card-body {
      font-size: 0.82rem;
      color: var(--muted);
      line-height: 1.55;
      margin: 0;
    }

    /* ── Config layer diagram ── */
    .config-layers {
      display: flex;
      flex-direction: column;
      gap: 0.4rem;
      margin: 1.2rem 0;
    }

    .config-layer {
      display: flex;
      align-items: center;
      gap: 0.8rem;
      padding: 0.6rem 1rem;
      border-radius: 6px;
      border: 1px solid var(--border);
      font-size: 0.85rem;
    }

    .layer-num {
      font-weight: 800;
      font-size: 0.75rem;
      width: 1.4rem;
      height: 1.4rem;
      border-radius: 50%;
      display: flex;
      align-items: center;
      justify-content: center;
      flex-shrink: 0;
    }

    .layer-1 { background: rgba(108,142,247,0.8);  color: #fff; }
    .layer-2 { background: rgba(108,142,247,0.6);  color: #fff; }
    .layer-3 { background: rgba(108,142,247,0.4);  color: #fff; }
    .layer-4 { background: rgba(108,142,247,0.25); color: #fff; }
    .layer-5 { background: rgba(108,142,247,0.1);  color: #fff; }

    .layer-label { font-weight: 600; color: var(--text); min-width: 160px; }
    .layer-desc  { color: var(--muted); font-size: 0.8rem; }

    /* ── Verdict section ── */
    .verdict {
      background: linear-gradient(135deg, rgba(108,142,247,0.08), rgba(167,139,250,0.08));
      border: 1px solid rgba(108,142,247,0.3);
      border-radius: var(--radius);
      padding: 1.5rem 1.8rem;
      margin: 2rem 0;
    }

    .verdict h3 {
      color: var(--accent);
      margin-top: 0;
      font-size: 1rem;
    }

    /* ── Divider ── */
    hr {
      border: none;
      border-top: 1px solid var(--border);
      margin: 2.5rem 0;
    }

    /* ── Hashtags ── */
    .hashtags {
      display: flex;
      flex-wrap: wrap;
      gap: 0.5rem;
      margin-top: 0.8rem;
    }

    .hashtag {
      color: var(--accent);
      font-size: 0.85rem;
      font-weight: 500;
    }

    /* ── Footer ── */
    footer {
      margin-top: 4rem;
      padding-top: 2rem;
      border-top: 1px solid var(--border);
      color: var(--muted);
      font-size: 0.85rem;
    }

    .github-links {
      display: flex;
      gap: 1.5rem;
      flex-wrap: wrap;
      margin-top: 0.8rem;
    }
  </style>

  <!-- ══ HEADER ══ -->
  <header>
    <div class="tag-row">
      <span class="tag">AI</span>
      <span class="tag">Open Source</span>
      <span class="tag">Developer Tools</span>
      <span class="tag">Rust</span>
    </div>

    <h1>Does a Code Assistant Need Large Models?</h1>

    <p class="subtitle">
      A curious engineer's journey from an OpenAI research paper to building a
      fully local, multi-agent coding assistant — and benchmarking it against Claude.
    </p>

    <div class="meta">
      <span>Tushar Saurabh</span>
      <span class="meta-dot">·</span>
      <span>March 2026</span>
      <span class="meta-dot">·</span>
      <span>12 min read</span>
    </div>
  </header>


  <!-- ══ SECTION 1 ══ -->
  <h2>The Question That Started It All</h2>

  <p>
    For a long time, I was puzzled by a fundamental question: how can an LLM — which is
    essentially just predicting the next token — write correct code? Coding is inherently
    logical. Logic shouldn't emerge from statistical word prediction, or so I thought.
  </p>

  <p>
    So I did what any curious engineer would do: I asked GPT and Claude. That conversation
    led me to a landmark paper —
    <a href="https://arxiv.org/pdf/2107.03374" target="_blank">
      <em>Evaluating Large Language Models Trained on Code</em>
    </a> by OpenAI. One result stood out immediately.
  </p>

  <div class="callout green">
    <div class="callout-title">Key Finding — Codex Paper (2021)</div>
    For a 12-billion-parameter model trained on code, the percentage of problems solved
    increased from <strong>28.8%</strong> with a single sample to <strong>77%</strong>
    when 100 samples were generated and evaluated against unit tests. A model that knows
    how to test, and what to test, converges to correct code through iteration.
  </div>

  <p>
    This answered the first part of my question: a 12B model trained on code is good
    enough. But it still did not explain <em>why</em> logic can emerge from token
    prediction.
  </p>

  <p>
    The answer was simpler than I expected. A programming language is just another
    language — but with far fewer keywords and a strict, unambiguous grammar. Code on
    GitHub and Stack Overflow always appears with surrounding context: problem statement,
    comments, variable names, tests. As long as an LLM has learned that mapping, it can
    generate code that fits the context. Logic is just a very regular sub-language of
    human writing.
  </p>

  <p>
    This realisation led to a second thought: <em>if programming is just another language
    with fewer words, a smaller and more specialised model should be sufficient.</em>
  </p>

  <div class="callout yellow">
    <div class="callout-title">The Practical Motivation</div>
    I can currently afford Claude, but what if pricing changes? The best tools should
    remain accessible. I wanted a coding assistant that runs entirely on local hardware —
    no API keys, no subscription, no data leaving my machine.
  </div>


  <!-- ══ SECTION 2 ══ -->
  <h2>Building the Local Code Assistant</h2>

  <p>
    I chose <a href="https://ollama.com" target="_blank">Ollama</a> as the inference
    backend — it runs quantised models locally with a clean API — and started with the
    <strong>Qwen 2.5 Coder</strong> family (7B, 14B, and 32B). Rather than a chat
    interface, I wanted an agent that could actually <em>write files, edit them,
    and run shell commands</em> — the things that matter for real development work.
  </p>

  <p>
    I also believe strongly in specialisation. A single all-knowing model tends to be
    average at everything. Instead, I defined distinct personas with different
    instructions, each doing one thing well.
  </p>

  <h3>Three Execution Modes</h3>

  <p><strong>1. Interactive mode</strong> — a standard REPL where you can ask questions,
  request edits, and work iteratively. The assistant maintains session history and can
  be resumed across sessions.</p>

  <p><strong>2. Pipeline mode</strong> — you hand it a requirement document and walk
  away. The full 7-phase flow runs sequentially:</p>

  <div class="pipeline">
    <span class="phase">Architect</span>
    <span class="arrow">→</span>
    <span class="phase">Implementer</span>
    <span class="arrow">→</span>
    <span class="phase">Reviewer</span>
    <span class="arrow">→</span>
    <span class="phase">Implementer (fix)</span>
    <span class="arrow">→</span>
    <span class="phase">Tester ×3</span>
    <span class="arrow">→</span>
    <span class="phase">Docs</span>
  </div>

  <p><strong>3. Quick mode</strong> — a single, fast, no-tools response for questions
  like "what does <code>git reflog</code> do?"</p>

  <p>
    Because only one model needs to be in RAM at a time in pipeline mode, this works on a
    <strong>32 GB machine</strong> without VRAM. Each phase loads its model, runs, then
    releases memory before the next phase begins.
  </p>
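
  <p>
    To make the load/run/release cycle concrete, here is a minimal sketch of how such a
    sequential pipeline can be driven through Ollama's Python client. The phase list and
    prompts are illustrative rather than the assistant's actual code; <code>keep_alive=0</code>
    asks Ollama to evict the model from memory as soon as the call returns.
  </p>

  <pre><code># Illustrative sketch: persona names and prompts are hypothetical
import ollama

PHASES = [
    ("architect",   "qwen2.5-coder:7b"),
    ("implementer", "qwen2.5-coder:14b"),
    ("reviewer",    "qwen2.5-coder:7b"),
]

def run_pipeline(requirement: str) -> dict:
    outputs = {}
    for persona, model in PHASES:
        response = ollama.chat(
            model=model,
            messages=[{"role": "user", "content": f"[{persona}] {requirement}"}],
            keep_alive=0,  # release model memory before the next phase loads
        )
        outputs[persona] = response["message"]["content"]
    return outputs</code></pre>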


  <!-- ══ SECTION 3 — ARCHITECTURE ══ -->
  <h2>Architecture Deep Dive</h2>

  <p>
    The assistant is built around four interlocking systems: a multi-agent core, a RAG
    retrieval layer, an AST symbol index, and a layered configuration engine.
  </p>

  <h3>Multi-Agent Core</h3>

  <div class="cards">
    <div class="card">
      <div class="card-icon">🏛️</div>
      <div class="card-title">Architect</div>
      <p class="card-body">Plans the approach, writes acceptance criteria, and classifies
      incoming intent (conversational vs implementation vs complex). Stays on a small,
      fast model — <code>7b</code> — even when the implementer is upgraded.</p>
    </div>
    <div class="card">
      <div class="card-icon">⚙️</div>
      <div class="card-title">Implementer</div>
      <p class="card-body">Writes and edits code using tool calls: <code>write_file</code>,
      <code>edit_file</code>, <code>read_file</code>, <code>run_shell</code>. The heaviest
      persona — benefits most from a larger model (<code>14b</code> or <code>32b</code>).</p>
    </div>
    <div class="card">
      <div class="card-icon">🔍</div>
      <div class="card-title">Reviewer</div>
      <p class="card-body">Reads the generated code and produces structured findings.
      The implementer then gets one more pass to fix the issues before tests run.</p>
    </div>
    <div class="card">
      <div class="card-icon">🧪</div>
      <div class="card-title">Tester</div>
      <p class="card-body">Runs acceptance criteria against the implementation — up to
      three rounds. Each failure feeds back into the implementer for a targeted fix.
      Inspired directly by the pass@k insight from the Codex paper.</p>
    </div>
  </div>

  <h3>RAG — Retrieval-Augmented Generation</h3>

  <p>
    Before answering any substantive query, the assistant embeds the question with
    <code>nomic-embed-text</code> and retrieves the top-K semantically relevant chunks
    from a local <strong>ChromaDB</strong> vector store. This means the model always has
    real project context — actual function signatures, file contents, module structure —
    injected into its prompt, rather than relying on what it learned during training.
  </p>

  <pre><code>/index src/          # embed your codebase into the RAG index
/index src-tauri/src # works with any language</code></pre>
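
  <p>
    The indexing flow itself is only a few lines. Here is a hedged sketch of the
    embed-and-retrieve loop using the <code>chromadb</code> and <code>ollama</code> Python
    packages; the collection name and chunking in the real assistant will differ.
  </p>

  <pre><code># Sketch of the RAG loop: chunking and metadata are simplified
import chromadb
import ollama

client = chromadb.PersistentClient(path=".ca_index")
collection = client.get_or_create_collection("codebase")

def embed(text: str) -> list[float]:
    return ollama.embeddings(model="nomic-embed-text", prompt=text)["embedding"]

def index_chunk(chunk_id: str, text: str) -> None:
    collection.add(ids=[chunk_id], embeddings=[embed(text)], documents=[text])

def retrieve(query: str, k: int = 5) -> list[str]:
    hits = collection.query(query_embeddings=[embed(query)], n_results=k)
    return hits["documents"][0]  # top-K chunks, injected into the prompt</code></pre>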

  <h3>AST Symbol Index</h3>

  <p>
    RAG retrieves semantically similar <em>chunks</em> of text, but sometimes you need
    <em>structural</em> answers: "what functions exist in <code>state.rs</code>?" or
    "where is <code>TerminalState</code> defined?" For this, the assistant builds a
    lightweight symbol table using <strong>tree-sitter</strong> — supporting Python,
    JavaScript, TypeScript, and Rust — stored in a local SQLite database (~1 MB for
    large codebases).
  </p>

  <p>
    At session start, a compact outline is injected into context automatically:
  </p>

  <pre><code># Symbol Map [Rust: 67 · TypeScript: 5 · Python: 8]

## src-tauri/src/state.rs [Rust]
pub struct TerminalState :6 · impl TerminalState → [new, update, reset] :13

## src-tauri/src/commands/mod.rs [Rust]
execute_command(...) :65 · register_commands(...) :12</code></pre>

  <p>
    The model also has a <code>find_symbols</code> tool it can call mid-session for
    targeted structural queries — complementing the semantic RAG search.
  </p>
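
  <p>
    A minimal sketch of the idea, assuming the <code>tree_sitter</code> and
    <code>tree_sitter_python</code> packages and a hypothetical SQLite schema; the
    assistant's real index covers four languages and stores richer metadata:
  </p>

  <pre><code># Sketch: extract top-level symbols from a Python file into SQLite
import sqlite3
from tree_sitter import Language, Parser
import tree_sitter_python

parser = Parser()
parser.language = Language(tree_sitter_python.language())

def extract_symbols(path: str):
    """Yield (name, kind, line) for each top-level def/class."""
    with open(path, "rb") as f:
        tree = parser.parse(f.read())
    for node in tree.root_node.children:
        if node.type in ("function_definition", "class_definition"):
            name = node.child_by_field_name("name").text.decode()
            kind = "fn" if node.type == "function_definition" else "class"
            yield name, kind, node.start_point[0] + 1  # rows are 0-based

db = sqlite3.connect("symbols.db")
db.execute("CREATE TABLE IF NOT EXISTS symbols (name, kind, file, line)")
for name, kind, line in extract_symbols("state.py"):
    db.execute("INSERT INTO symbols VALUES (?, ?, ?, ?)", (name, kind, "state.py", line))
db.commit()</code></pre>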

  <h3>Web Tools</h3>

  <p>
    Two tools give the model access to live information when local context isn't enough:
  </p>

  <ul style="margin: 0.5rem 0 1rem 1.5rem; color: var(--text);">
    <li style="margin-bottom: 0.5rem;">
      <strong>fetch_url</strong> — fetches and parses any URL using Python's stdlib
      (<code>urllib</code> + <code>html.parser</code>). No API key, always available.
      Useful for reading documentation, GitHub issues, or Stack Overflow answers. A
      minimal sketch follows this list.
    </li>
    <li>
      <strong>web_search</strong> — performs a web search using either
      <a href="https://serper.dev" target="_blank">Serper API</a> (fast, structured JSON)
      or <strong>DuckDuckGo</strong> (free, no key required). Toggled by
      <code>web_search_enabled = true</code> in config. Results are injected as context
      before the model responds.
    </li>
  </ul>
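
  <p>
    Because <code>fetch_url</code> leans only on the standard library, a working version
    fits in a short sketch. This is a simplified approximation, not the assistant's exact
    implementation:
  </p>

  <pre><code># Sketch: fetch a URL and reduce it to visible text (stdlib only)
from html.parser import HTMLParser
from urllib.request import urlopen

class TextExtractor(HTMLParser):
    """Collect visible text, skipping script/style blocks."""
    def __init__(self):
        super().__init__()
        self.skip = False
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self.skip = True

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self.skip = False

    def handle_data(self, data):
        if not self.skip and data.strip():
            self.chunks.append(data.strip())

def fetch_url(url: str, limit: int = 4000) -> str:
    html = urlopen(url, timeout=10).read().decode("utf-8", errors="replace")
    extractor = TextExtractor()
    extractor.feed(html)
    return "\n".join(extractor.chunks)[:limit]  # cap what enters the prompt</code></pre>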

  <h3>Config-Driven Design</h3>

  <p>
    Everything is driven by a layered configuration system — the same assistant can run on
    a 16 GB laptop with a 7B model today and a 128 GB workstation with a 70B model
    tomorrow without changing a line of code.
  </p>

  <div class="config-layers">
    <div class="config-layer">
      <div class="layer-num layer-1">1</div>
      <div class="layer-label">CLI flags / runtime</div>
      <div class="layer-desc">Highest priority — overrides everything</div>
    </div>
    <div class="config-layer">
      <div class="layer-num layer-2">2</div>
      <div class="layer-label"><code>CA_*</code> env vars</div>
      <div class="layer-desc">e.g. <code>CA_IMPLEMENTER_MODEL=qwen2.5-coder:32b</code></div>
    </div>
    <div class="config-layer">
      <div class="layer-num layer-3">3</div>
      <div class="layer-label"><code>ca.config</code></div>
      <div class="layer-desc">Project-level TOML — auto-generated on first launch</div>
    </div>
    <div class="config-layer">
      <div class="layer-num layer-4">4</div>
      <div class="layer-label"><code>~/.code-assistant/config.toml</code></div>
      <div class="layer-desc">Machine-level defaults for all projects</div>
    </div>
    <div class="config-layer">
      <div class="layer-num layer-5">5</div>
      <div class="layer-label">Built-in defaults</div>
      <div class="layer-desc">Sized conservatively for a 32 GB CPU machine</div>
    </div>
  </div>

  <p>
    Sensitive settings — feedback storage, session directories, API keys — are enforced
    at machine scope and silently blocked from appearing in per-project config files.
    You cannot accidentally commit credentials.
  </p>
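
  <p>
    Precedence resolution reduces to merging the five layers from lowest to highest, with
    later layers overwriting earlier ones. A sketch, assuming hypothetical key names and
    Python 3.11+ for the stdlib <code>tomllib</code>:
  </p>

  <pre><code># Sketch: merge the five config layers, highest priority last
import os
import tomllib
from pathlib import Path

def load_toml(path: Path) -> dict:
    return tomllib.loads(path.read_text()) if path.exists() else {}

def resolve_config(cli_overrides: dict) -> dict:
    cfg = {"implementer_model": "qwen2.5-coder:14b"}                   # 5. built-in defaults
    cfg |= load_toml(Path.home() / ".code-assistant" / "config.toml")  # 4. machine-level
    cfg |= load_toml(Path("ca.config"))                                # 3. project-level
    cfg |= {k[3:].lower(): v for k, v in os.environ.items()
            if k.startswith("CA_")}                                    # 2. CA_* env vars
    cfg |= cli_overrides                                               # 1. CLI flags win
    return cfg</code></pre>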


  <!-- ══ SECTION 4 — BENCHMARKS ══ -->
  <h2>Testing the Efficacy — Benchmarks</h2>

  <p>
    Inspired by the pass@k methodology from the Codex paper, I built a benchmark harness
    that runs both the local code-assistant <em>and</em> the Claude API against identical
    requirement documents, then compares the results side-by-side.
  </p>

  <p>
    Three requirements were tested, ranging in complexity:
  </p>

  <ul style="margin: 0.5rem 0 1rem 1.5rem; color: var(--text);">
    <li><strong>req_01</strong> — Python calculator with REPL and expression parsing</li>
    <li><strong>req_02</strong> — REST API for a todo web application (FastAPI + SQLite)</li>
    <li><strong>req_03</strong> — Log analyser CLI with multi-format parsing, aggregation, and alerting</li>
  </ul>

  <h3>Benchmark Results</h3>

  <div class="table-wrap">
    <table>
      <thead>
        <tr>
          <th>Task</th>
          <th>Runner</th>
          <th>Model</th>
          <th>Time (s)</th>
          <th>API Calls</th>
          <th>Lines Written</th>
          <th>Tests Passed</th>
          <th>Syntax Errors</th>
        </tr>
      </thead>
      <tbody>
        <!-- Calculator -->
        <tr>
          <td><strong>Calculator</strong></td>
          <td><span class="model-badge badge-ca">code-assistant</span></td>
          <td class="val-muted">7b + 14b</td>
          <td>1,790</td>
          <td class="val-good">21</td>
          <td class="val-muted">71</td>
          <td class="val-warn">0</td>
          <td class="val-warn">1</td>
        </tr>
        <tr>
          <td></td>
          <td><span class="model-badge badge-claude">Claude API</span></td>
          <td class="val-muted">claude-sonnet-4-6</td>
          <td class="val-good">1,137</td>
          <td>24</td>
          <td class="val-good">2,235</td>
          <td class="val-good">218</td>
          <td class="val-good">0</td>
        </tr>
        <!-- Todo Webapp -->
        <tr style="border-top: 1px solid var(--border);">
          <td><strong>Todo API</strong></td>
          <td><span class="model-badge badge-ca">code-assistant</span></td>
          <td class="val-muted">7b + 14b</td>
          <td>2,163</td>
          <td class="val-good">23</td>
          <td class="val-muted">117</td>
          <td class="val-warn">0</td>
          <td class="val-good">0</td>
        </tr>
        <tr>
          <td></td>
          <td><span class="model-badge badge-claude">Claude API</span></td>
          <td class="val-muted">claude-sonnet-4-6</td>
          <td>2,310</td>
          <td>41</td>
          <td class="val-good">2,803</td>
          <td class="val-good">4</td>
          <td class="val-good">0</td>
        </tr>
        <!-- Log Analyser -->
        <tr style="border-top: 1px solid var(--border);">
          <td><strong>Log Analyser</strong></td>
          <td><span class="model-badge badge-ca">code-assistant</span></td>
          <td class="val-muted">7b + 14b</td>
          <td class="val-good">1,362</td>
          <td class="val-good">17</td>
          <td class="val-muted">166</td>
          <td class="val-warn">0</td>
          <td class="val-good">0</td>
        </tr>
        <tr>
          <td></td>
          <td><span class="model-badge badge-claude">Claude API</span></td>
          <td class="val-muted">claude-sonnet-4-6</td>
          <td class="val-bad">5,771</td>
          <td class="val-bad">55</td>
          <td class="val-good">5,851</td>
          <td class="val-good">329</td>
          <td class="val-good">0</td>
        </tr>
      </tbody>
    </table>
  </div>

  <div class="callout purple">
    <div class="callout-title">Reading the Numbers</div>
    The local models (7b + 14b) consistently used <strong>fewer API calls</strong> and
    finished faster on simpler tasks — but produced significantly less code and no passing
    tests. The Claude API produced comprehensive implementations with full test suites,
    but at the cost of 6–14× more tokens and much longer runtimes on complex tasks.
    Crucially: the local assistant has <strong>zero per-token cost</strong> and runs
    entirely offline.
  </div>

  <p>
    The test-passing gap narrows considerably when the 32B model is used — larger models
    follow tool-use instructions far more reliably and write tests that actually compile
    and run. The architecture is already in place; it just needs a smarter model behind it.
  </p>


  <!-- ══ SECTION 5 — INTELLIGENT TERMINAL ══ -->
  <h2>Testing on a Real Project — Intelligent Terminal</h2>

  <p>
    Portability is a long-standing pain point in software development. The Unix terminal
    is rich and powerful; Windows Command Prompt falls short; PowerShell changed the
    entire command structure. Git Bash and MinGW work but are heavy installs for what
    is essentially a compatibility shim.
  </p>

  <p>
    So I started building something I call <strong>Intelligent Terminal</strong> — a
    cross-platform terminal where every command is implemented from scratch in Rust,
    giving identical behaviour on macOS, Linux, and Windows. Eventually it will connect
    to a local LLM so you can say "list all hidden directories by size and filter those
    matching a pattern" and it just works.
  </p>

  <p>
    This project became the real test bed for code-assistant.
  </p>

  <h3>First Test — Validating Existing Commands</h3>

  <p>
    I asked the assistant to write a Python script that would execute every implemented
    command with its <code>--help</code> flag, capture the output, and compare it against
    the requirement document.
  </p>

  <p>
    It produced the script correctly. When I ran it from the repository root, it failed
    with a "file not found" error — the script assumed a different working directory.
    Once I navigated to the correct path and re-ran, the output matched the requirements
    exactly. The logic was correct; the working-directory assumption was not. A lesson noted.
  </p>

  <h3>Second Test — Implementing <code>nslookup</code></h3>

  <p>
    The real challenge was asking the assistant to implement the <code>nslookup</code>
    command in Rust — a moderately complex task with multiple flags, option parsing,
    and DNS query logic.
  </p>

  <p>
    <strong>Run 1:</strong> The 14B model printed the Rust code as a markdown block
    without calling a single file-writing tool. Nothing was written to disk.
    This was a known limitation of smaller models — they "explain" instead of "act."
  </p>

  <p>
    <strong>Run 2:</strong> I fixed the flag-parsing bug (the <code>--req-file</code>
    flag had been passed with a single dash, so the CLI parser read it as
    <code>-r eq-file</code> — a session resume attempt rather than a file load) and
    upgraded the implementer from 14B to 32B. The 32B model correctly used tool calls,
    wrote the file, and ran <code>cargo build</code>.
  </p>

  <p>
    <strong>Run 3:</strong> The build failed. The model had generated <code>clap</code>
    argument parser code with duplicate short flags: <code>-d</code> was used for both
    <code>debug</code> and <code>ndots</code>; <code>-r</code> for both
    <code>recurse</code> and <code>retry</code>. Clap rejects this at runtime.
    The shell output was truncated by the tool, so the model never saw the actual
    compiler error and eventually lost track of the filename — attempting to edit a
    file called <code>ns.rs</code> that did not exist.
  </p>

  <p>
    I switched to interactive mode, manually resolved the build errors, and also fixed
    two Rust-specific issues that the model had not caught:
  </p>

  <pre><code>// Model wrote:
let matches = Command::new("nslookup").get_matches_from(args);

// Correct (won't panic on bad args):
let matches = match Command::new("nslookup").try_get_matches_from(args) {
    Ok(m) => m,
    Err(e) => return Err(e.to_string()),
};</code></pre>

  <div class="verdict">
    <h3>Honest Assessment</h3>
    <p>
      The assistant reduced my coding effort by roughly <strong>70%</strong>. The
      remaining 30% was troubleshooting — reading compiler errors, fixing edge cases,
      and correcting the occasional hallucinated filename. For someone comfortable reading
      Rust, that trade-off is extremely worthwhile. The architecture and boilerplate were
      generated correctly; only the fine-grained logic needed human intervention.
    </p>
    <p style="margin-bottom:0;">
      Ironically, this tool was built using Claude. I am using Claude to create a tool
      that can eventually replace Claude for me — which reminds me of a tweet where
      someone said "it's time to replace GitHub" and GitHub replied asking them to share
      the GitHub link.
    </p>
  </div>


  <!-- ══ SECTION 6 — LESSONS ══ -->
  <h2>Lessons Learned</h2>

  <p><strong>Model size matters for tool use.</strong> The 7B and 14B models often
  describe what to do in markdown rather than calling the appropriate tool. The 32B model
  reliably uses tools. This aligns with the pass@k finding: bigger models are not just
  smarter, they are more disciplined at following structured instructions.</p>

  <p><strong>Truncated tool output breaks the feedback loop.</strong> If a compiler error
  is cut off, the model cannot fix the bug it cannot see. The tool must surface the
  <em>end</em> of the output (where errors appear), not the beginning.</p>
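
  <p>
    The guard is small. A sketch of the shape of that fix, keeping the tail instead of
    the head:
  </p>

  <pre><code># Sketch: truncate from the front so compiler errors survive
def truncate_tool_output(text: str, budget: int = 4000) -> str:
    if len(text) &lt;= budget:
        return text
    return "...[output truncated]...\n" + text[-budget:]</code></pre>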

  <p><strong>Specialised agents outperform a generalist.</strong> Separating Architect,
  Implementer, Reviewer, and Tester into distinct personas with different system prompts
  produces noticeably better output than asking a single agent to do everything. Each
  persona has a focused objective and fewer distractions.</p>

  <p><strong>Infrastructure beats intelligence, sometimes.</strong> RAG, AST indexing,
  layered config, and per-project memory dramatically improve the quality of local model
  output — not by making the model smarter, but by giving it better context. A 14B model
  with good context often outperforms a 32B model working blind.</p>

  <hr />

  <!-- ══ LINKS ══ -->
  <h2>Try It</h2>

  <p>Both projects are open source. The code-assistant is built to be forked and
  configured for your own hardware and preferred models.</p>

  <div class="github-links">
    <a href="https://github.com/tusharacc/code-assistant" target="_blank">
      ⭐ code-assistant on GitHub
    </a>
    <a href="https://github.com/tusharacc/intelligent_terminal" target="_blank">
      ⭐ intelligent_terminal on GitHub
    </a>
  </div>

  <!-- ══ FOOTER ══ -->
  <footer>
    <p>Written by Tushar Saurabh · Corrected and embellished by Claude · March 2026</p>
    <p style="margin-top: 0.4rem; color: var(--muted);">
      Built with curiosity, Ollama, Rust, and a healthy distrust of API bills.
    </p>
  </footer>]]></content><author><name></name></author><summary type="html"><![CDATA[:root { --bg: #0f1117; --surface: #1a1d27; --surface2: #22263a; --border: #2e3450; --accent: #6c8ef7; --accent2: #a78bfa; --green: #34d399; --yellow: #fbbf24; --red: #f87171; --text: #e2e8f0; --muted: #94a3b8; --code-bg: #141720; --radius: 10px; --max-w: 780px; }]]></summary></entry><entry><title type="html">Claude Orchestrator: Multi-Agent Software Development</title><link href="https://blogs.tusharsaurabh.com/2026/02/13/claude-orchestrator-report.html" rel="alternate" type="text/html" title="Claude Orchestrator: Multi-Agent Software Development" /><published>2026-02-13T00:00:00+00:00</published><updated>2026-02-13T00:00:00+00:00</updated><id>https://blogs.tusharsaurabh.com/2026/02/13/claude-orchestrator-report</id><content type="html" xml:base="https://blogs.tusharsaurabh.com/2026/02/13/claude-orchestrator-report.html"><![CDATA[<div class="container">
    <p style="text-align: center; font-style: italic; color: #8b949e; margin-top: 20px;"><em>Generated by Claude
            Code</em>
    </p>

    <p style="font-size: 1.25em; color: var(--text-muted); margin-bottom: 40px;">
        Building production software with six AI agents, zero human code, and a state machine that refuses to let bad
        code ship.
    </p>
</div>

<div class="container">
    <article>
        <h2>The Problem</h2>
        <p>AI can write code. That's no longer news. What AI struggles with is writing <em>entire
                applications</em>&mdash;the kind with architecture decisions, test suites, build
            pipelines, and documentation that all have to agree with each other.</p>
        <p>Ask a single AI session to build a complex project and you'll hit these walls:</p>

        <ul>
            <li><strong>Context collapse.</strong> By the time you're debugging test #47, the AI
                has forgotten the architectural decisions it made 80,000 tokens ago.</li>
            <li><strong>Role confusion.</strong> The same AI that wrote the code is now reviewing
                it&mdash;and it's not going to challenge its own decisions.</li>
            <li><strong>No quality gates.</strong> There's nobody to reject bad output. The AI
                generates, you receive, and you debug.</li>
            <li><strong>Monolithic sessions.</strong> If anything fails halfway through, you start
                over.</li>
        </ul>
        <p>Claude Orchestrator solves this by splitting the problem across <strong>six specialized
                agents</strong>, each with its own persona, tools, and system prompt. A state
            machine coordinates them, and human approval checkpoints prevent bad output from
            cascading downstream.</p>
        <div class="callout">
            <div class="callout-title">The core idea</div>
            <p>Instead of one AI doing everything,
                six AIs each do one thing well. A Product Owner writes requirements. An Architect designs the
                system. A Story Author writes testable acceptance criteria. A Developer codes. An Executor runs
                tests. A Tester writes integration tests. Each agent reviews the previous agent's work.
            </p>
        </div>

        <h2>How It Works</h2>
        <h3>The Workflow State Machine</h3>
        <p>The orchestrator is a <strong>17-state machine</strong> that drives six agents through
            a structured software development lifecycle. Every transition is
            deterministic&mdash;there's no ambiguity about what happens next.</p>
        <div class="flow"><span class="flow-node">PO</span><span class="flow-arrow">&rarr;
            </span><span class="flow-node">Approve</span><span class="flow-arrow">&rarr;
            </span><span class="flow-node">Architect</span><span class="flow-arrow">&rarr;
            </span><span class="flow-node">Approve</span><span class="flow-arrow">&rarr;
            </span><span class="flow-node active">Stories</span><span class="flow-arrow">&rarr;
            </span><span class="flow-node">Dev</span><span class="flow-arrow">&rarr;
            </span><span class="flow-node">Execute</span><span class="flow-arrow">&rarr;
            </span><span class="flow-node">Review</span><span class="flow-arrow">&rarr;
            </span><span class="flow-node">Test</span><span class="flow-arrow">&rarr;
            </span><span class="flow-node">Execute</span><span class="flow-arrow">&rarr;
            </span><span class="flow-node">Final Review</span></div>
        <p>Each working state maps to exactly one agent. The orchestrator calls
            <code>_execute_working_state()</code>, which resolves artifacts, invokes the agent,
            and transitions to the next state based on the result:</p>
        <p class="code-label">orchestrator.py &mdash; state execution loop</p>
        <pre><span class="kw">def</span> <span class="fn">_execute_working_state</span>(self, state, input_artifacts):
    agent_name = self.workflow.get_next_agent()

    <span class="cmt"># Record git state before Developer touches the project</span>
    <span class="kw">if</span> self.improvement_mode <span class="kw">and</span> state == WorkflowState.DEV_WORKING:
        self._record_git_head()

    result = self._execute_agent(agent_name, input_artifacts)

    <span class="kw">if</span> result[<span class="str">"status"</span>] == <span class="str">"success"</span>:
        <span class="cmt"># Collect what the Developer changed via git diff</span>
        <span class="kw">if</span> self.improvement_mode <span class="kw">and</span> state == WorkflowState.DEV_WORKING:
            self._collect_project_changes()
        next_state = self.workflow.get_next_state(state)
        self.workflow.transition(next_state)</pre>
        <h3>Artifact Passing &mdash; Not Message Passing</h3>
        <p>Agents don't talk to each other through messages. They communicate through
            <strong>versioned files</strong>. The Product Owner writes
            <code>requirements.md</code>. The Architect reads it and writes
            <code>architecture.md</code>. The Developer reads both and writes source code. Each
            artifact is stored with metadata and version history.</p>
        <p>The orchestrator resolves which artifacts each agent needs:</p>
        <p class="code-label">orchestrator.py &mdash; artifact resolution</p>
        <pre><span class="kw">def</span> <span class="fn">_build_agent_input</span>(self, state):
    artifacts = {}

    <span class="kw">if</span> state == WorkflowState.DEV_WORKING:
        <span class="cmt"># Developer needs stories + architecture + skills + constraints</span>
        artifacts[<span class="str">"stories"</span>] = self.artifact_store.list_artifacts(STORIES)[<span class="num">0</span>]
        artifacts[<span class="str">"architecture"</span>] = self.artifact_store.list_artifacts(ARCHITECTURE)[<span class="num">0</span>]
        artifacts[<span class="str">"skills"</span>] = self.artifact_store.list_artifacts(SKILL)[<span class="num">0</span>]
        artifacts[<span class="str">"constraints"</span>] = self.artifact_store.list_artifacts(CONSTRAINTS)[<span class="num">0</span>]

    <span class="kw">return</span> artifacts  <span class="cmt"># Agents receive filenames, not content</span></pre>
        <div class="callout">
            <div class="callout-title">Key design decision</div>
            <p>Agents receive <strong>filenames,
                    not file content</strong>. Each agent reads the files it needs using its own tools. This
                keeps the orchestrator lightweight and lets agents decide how much context to load. </p>
        </div>
        <h3>Phased Builds</h3>
        <p>For complex projects, the Architect splits work into phases. Each phase cycles through
            the full Dev &rarr; Execute &rarr; Review &rarr; Test loop independently. The
            orchestrator tracks phase state and passes cumulative artifacts forward, so Phase 3
            builds on the code from Phases 1 and 2.</p>
        <p>The portable_terminal project was built in <strong>8 phases</strong>, each adding a
            layer of shell functionality.</p>
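        <p>In pseudocode, the per-phase loop looks roughly like this (a sketch with illustrative
            names; the real orchestrator drives it through the state machine shown above):</p>
        <pre># Sketch: cumulative, per-phase build loop
PHASE_CYCLE = ["developer", "executor", "reviewer", "tester", "executor"]

def run_phases(orchestrator, phases):
    artifacts = {}  # cumulative: Phase 3 sees code from Phases 1 and 2
    for phase in phases:
        for agent in PHASE_CYCLE:
            result = orchestrator.run_agent(agent, phase, artifacts)
            if result["status"] != "success":
                orchestrator.fail_workflow(result["message"])
                return artifacts
            artifacts.update(result.get("artifacts", {}))
    return artifacts</pre>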

        <h2>Case Study: Portable Terminal</h2>
        <p>To validate the orchestrator, we pointed it at a non-trivial task: <strong>build a
                cross-platform terminal emulator in Rust</strong> with a Tauri frontend,
            implementing 23+ Unix shell commands from scratch, including piping, globbing,
            environment variables, tab completion, and command history.</p>
        <p>One input description. Zero human-written code. Here's what came out.</p>

        <div class="stats-grid">
            <div class="stat-card">
                <div class="number">23</div>
                <div class="label">Shell Commands</div>
            </div>
            <div class="stat-card">
                <div class="number">38</div>
                <div class="label">Rust Source Files</div>
            </div>
            <div class="stat-card">
                <div class="number">912</div>
                <div class="label">Total Tests</div>
            </div>
            <div class="stat-card">
                <div class="number">87%</div>
                <div class="label">Test Pass Rate</div>
            </div>
        </div>
        <h3>What Was Built</h3>
        <table>
            <thead>
                <tr>
                    <th>Category</th>
                    <th>Commands Implemented</th>
                    <th>Source File</th>
                </tr>
            </thead>
            <tbody>
                <tr>
                    <td>File Operations</td>
                    <td>cat, cp, mv, rm, touch, mkdir, rmdir</td>
                    <td>7 files (3-17 KB each)</td>
                </tr>
                <tr>
                    <td>Text Processing</td>
                    <td>grep, sort, head, tail, wc, diff</td>
                    <td>6 files (11-24 KB each)</td>
                </tr>
                <tr>
                    <td>Navigation</td>
                    <td>ls, cd, pwd, find</td>
                    <td>4 files (2-19 KB each)</td>
                </tr>
                <tr>
                    <td>Environment</td>
                    <td>echo, export, unset, env</td>
                    <td>4 files (2-4.5 KB each)</td>
                </tr>
                <tr>
                    <td>Shell Features</td>
                    <td>help, history</td>
                    <td>2 files (6-32 KB each)</td>
                </tr>
                <tr>
                    <td>Infrastructure</td>
                    <td>parser, pipeline, router, glob, completions</td>
                    <td>14 core .rs files</td>
                </tr>
            </tbody>
        </table>
        <h3>Iteration History</h3>
        <p>The project evolved across 11 orchestrator sessions over 3 days. Here's how it progressed:</p>

        <div class="timeline">
            <div class="timeline-item fail">
                <div class="timeline-label">Feb 10 &middot;
                    Sessions 1-2</div>
                <div class="timeline-title">False starts</div>
                <p class="muted">SSL connectivity issues and early termination. No code generated. Cost: 0
                    tokens wasted.</p>
            </div>
            <div class="timeline-item success">
                <div class="timeline-label">Feb 10-11 &middot;
                    Session 3 (Initial Build)</div>
                <div class="timeline-title">Full 8-phase build: 23 commands,
                    453 unit tests</div>
                <p class="muted">2 hours 50 minutes. Product Owner generated 41K chars of requirements.
                    Architect designed 8 phases. Developer wrote all 38 source files across 8 phases.
                    Executor ran cargo test. 14 files exported to project root. Required one manual resume
                    after a pause.</p>
            </div>
            <div class="timeline-item success">
                <div class="timeline-label">Feb 11 &middot;
                    Session 4 (Improvement #1)</div>
                <div class="timeline-title">Shell infrastructure: piping,
                    redirections,
                    globbing</div>
                <p class="muted">Added pipeline execution,
                    glob expansion,
                    environment variable support,
                    tab completion. Requirements expanded to 70K chars. Generated 8 additional integration
                    test files. Test count jumped from 453 to 912.</p>
            </div>
            <div class="timeline-item warn">
                <div class="timeline-label">Feb 12 &middot;
                    Sessions 5-7</div>
                <div class="timeline-title">Improvement mode debugging</div>
                <p class="muted">Three sessions hit orchestrator bugs: path explosion (filenames exceeding
                    Windows MAX_PATH),
                    stale Claude session IDs,
                    and SSL drops not triggering failure states. Each bug was fixed in the orchestrator
                    code.</p>
            </div>
            <div class="timeline-item success">
                <div class="timeline-label">Feb 12 &middot;
                    Session 8 (Improvement #2)</div>
                <div class="timeline-title">Parser upgrade+infrastructure hardening</div>
                <p class="muted">Largest architecture doc (49K chars). Upgraded command parser,
                    improved piping,
                    added cross-platform build scripts. Generated the most comprehensive skill profiles (12K
                    developer, 15K tester).</p>
            </div>
        </div>
        <h3>Test Results Breakdown</h3>
        <table>
            <thead>
                <tr>
                    <th>Test Category</th>
                    <th>Pass</th>
                    <th>Fail</th>
                    <th>Pass Rate</th>
                </tr>
            </thead>
            <tbody>
                <tr>
                    <td>Library unit tests (inline)</td>
                    <td>453</td>
                    <td>0</td>
                    <td><span class="badge badge-green">100%</span></td>
                </tr>
                <tr>
                    <td>Integration tests (real impl)</td>
                    <td>173</td>
                    <td>0</td>
                    <td><span class="badge badge-green">100%</span></td>
                </tr>
                <tr>
                    <td>Integration tests (stub impl)</td>
                    <td>31</td>
                    <td>70</td>
                    <td><span class="badge badge-red">31%</span></td>
                </tr>
                <tr>
                    <td>Pre-existing failures</td>
                    <td>&mdash;</td>
                    <td>2</td>
                    <td><span class="badge badge-orange">N/A</span></td>
                </tr>
                <tr style="font-weight: 700; border-top: 2px solid var(--border)">
                    <td>Total</td>
                    <td>792</td>
                    <td>106</td>
                    <td><span class="badge badge-green">87%</span></td>
                </tr>
            </tbody>
        </table>
        <p>The 100% pass rate on real unit and integration tests is the headline number. The 106
            failures all trace back to <strong>test quality issues</strong>, not code
            bugs&mdash;stub test files that never called real code, and a few incomplete
            implementations.</p>
        <h3>Defects Leaked to Production</h3>
        <p>After the orchestrator finished, a manual review identified
            <strong>8 defects</strong>:</p>
        <table>
            <thead>
                <tr>
                    <th>ID</th>
                    <th>Defect</th>
                    <th>Severity</th>
                    <th>Root Cause</th>
                    <th>Status</th>
                </tr>
            </thead>
            <tbody>
                <tr>
                    <td>D-001</td>
                    <td>Test stubs not replaced with real implementations</td>
                    <td><span class="badge badge-red">High</span></td>
                    <td>Tester agent</td>
                    <td>Partially fixed</td>
                </tr>
                <tr>
                    <td>D-002</td>
                    <td>Rust lifetime errors in test code</td>
                    <td><span class="badge badge-red">High</span></td>
                    <td>Tester agent</td>
                    <td>Fixed</td>
                </tr>
                <tr>
                    <td>D-003</td>
                    <td>Stdin not forwarded between piped commands</td>
                    <td><span class="badge badge-red">Critical</span></td>
                    <td>Developer TODO</td>
                    <td>Open</td>
                </tr>
                <tr>
                    <td>D-004</td>
                    <td>Variable expansion bypassed for simple commands</td>
                    <td><span class="badge badge-red">High</span></td>
                    <td>Developer shortcut</td>
                    <td>Open</td>
                </tr>
                <tr>
                    <td>D-005</td>
                    <td>No variable expansion in double-quoted strings</td>
                    <td><span class="badge badge-red">High</span></td>
                    <td>Developer incomplete</td>
                    <td>Open</td>
                </tr>
                <tr>
                    <td>D-006</td>
                    <td>$SHELL not read-only</td>
                    <td><span class="badge badge-orange">Medium</span></td>
                    <td>Developer incomplete</td>
                    <td>Open</td>
                </tr>
                <tr>
                    <td>D-007</td>
                    <td>Duplicate test files (stub + real)</td>
                    <td><span class="badge badge-orange">Low</span></td>
                    <td>Multi-phase artifact duplication</td>
                    <td>Open</td>
                </tr>
                <tr>
                    <td>D-008</td>
                    <td>Pre-existing test failures</td>
                    <td><span class="badge badge-orange">Low</span></td>
                    <td>Pre-existing</td>
                    <td>N/A</td>
                </tr>
            </tbody>
        </table>
        <div class="callout danger">
            <div class="callout-title">The critical defect</div>
            <p>D-003 is the most revealing. The Developer agent implemented the entire pipeline
                architecture&mdash;
                parser recognition of <code>|</code>,
                pipeline struct,
                execution loop&mdash;
                but left a <code> // TODO: Pass stdin to router</code> on the function that forwards output
                between piped
                commands. The function accepts <code>stdin</code>as a parameter but silently discards it.
                All 18 piping tests fail because of this single TODO.</p>
        </div>

        <h2>Pitfalls and Lessons Learned</h2>
        <h3>1. AI Will Leave TODOs on Critical Code Paths</h3>
        <p>The Developer agent's most dangerous behavior is implementing <em>around</em> a hard
            problem. It built the entire piping architecture but left the actual stdin forwarding
            as a TODO. The code compiles. Some tests even pass (the ones that don't need piping).
            But the core feature doesn't work.</p>
        <p><strong>Fix applied:</strong> Added a CRITICAL section to the Developer prompt banning TODOs,
            requiring every parameter to be used, and prohibiting "fast path" shortcuts that bypass core
            logic.</p>
        <p class="code-label">developer_base.txt &mdash; the anti-TODO rule</p>
        <pre><span class="cmt" >## CRITICAL: No Incomplete Implementations</span> <span class="num" >1.</span> No TODOs, FIXMEs, or "implement later" comments for required functionality. <span class="num" >2.</span> Every parameter must be used. A function that accepts `stdin` but silently discards it is a critical defect. <span class="num" >3.</span> No shortcut code paths that bypass core logic. <span class="num" >4.</span> No stub functions that return hardcoded values. <span class="num" >5.</span> All code paths must work.</pre>
        <h3>2. The Tester Agent Will Write Fake Tests</h3>
        <p>The Tester generated test files with helper functions like this:</p>
        <pre><span class="cmt" > // BAD: This "tests" nothing&mdash;it always returns empty string</span>

        <span class="kw" >fn</span> <span class="fn" >execute_command</span>(cmd: &amp; <span class="cls" >str</span>) -&gt; <span class="cls" >String</span> {
            <span class="cls" >String</span>::new() <span class="cmt" > // Never calls real code</span>
        }

        <span class="cmt" > // 56 tests used this helper. All "passed." None tested anything.</span></pre>

        <p><strong>Fix applied:</strong> Added a CRITICAL section to the Tester prompt requiring all
            helpers to import and call actual codebase functions, and requiring a compile check before
            marking tests complete.</p>
        <h3>3. The Story Author Approved Failing Code</h3>
        <p>The Story Author's prompt said "reject if any tests fail." But with 792/912 tests
            passing, it approved anyway. The 106 failures were buried in stub test files that it
            couldn't distinguish from real failures.</p>

        <p><strong>Fix applied:</strong> Added failure categorization (code bugs vs. test stubs vs.
            compilation errors) and a 95% pass-rate threshold to the Story Author prompt.</p>
        <h3>4. Windows Path Length Explosion</h3>
        <p>Improvement mode collects changed files via <code>git diff</code> and saves them as
            artifacts. The original code flattened paths by replacing <code>/</code> with
            <code>_</code>:
        </p>
        <pre><span class="cmt" ># Session 1: .orchestrator/sessions/old/code/file.rs</span> <span class="cmt" ># Saved as: .orchestrator_sessions_old_code_file.rs</span> <span class="cmt" ># Session 2 collects Session 1's flattened files:</span>
 <span class="cmt" ># Saved as: .orchestrator_sessions_new_code_.orchestrator_sessions_old_code_file.rs</span> <span class="cmt" ># Session 3: the name doubles again...</span> <span class="cmt" ># Eventually: EXCEEDS WINDOWS 260-CHAR PATH LIMIT</span></pre>
        <p><strong>Fix applied:</strong> Three changes&mdash;filter <code>.orchestrator/</code>
            from git diffs, preserve directory structure instead of flattening, and add a safety
            filter in baseline loading.</p>
        <h3>5. Agent Failure Didn't Stop the Workflow</h3>

        <p>When an SSL connectivity drop caused the Story Author to fail, the orchestrator caught
            the error but didn't transition to FAILED. It continued to the documentation pass,
            wasting API calls on a broken session.</p>
        <p class="code-label">orchestrator.py &mdash; the fix</p>
        <pre><span class="cmt"># Before: retry_current_step() returned False from working states,</span>
<span class="cmt"># but the return value was ignored</span>
retried = (self.workflow.can_retry()
           <span class="kw">and</span> self.workflow.retry_current_step())
<span class="kw">if not</span> retried:
    self.workflow.fail_workflow(result[<span class="str">"message"</span>])</pre>
        <h3>6. Context Window Overflow in Doc Generation</h3>
        <p>The documentation pass inlined all artifacts into the prompt. For an 8-phase project,
            this totaled <strong>5.9 million characters</strong>&mdash;far exceeding Claude's
            context window.</p>
        <p><strong>Fix applied:</strong> In improvement mode, skip artifact inlining (Claude reads files
            directly from the project root). Added a 600K character budget with truncation as a safety
            net for normal mode.</p>
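        <p>The budget itself is a simple guard, roughly like this sketch (hypothetical
            helper; only the 600K figure comes from the actual fix):</p>
        <pre>MAX_PROMPT_CHARS = 600_000

def inline_artifacts(artifacts):
    out, used = [], 0
    for text in artifacts:
        remaining = MAX_PROMPT_CHARS - used
        if remaining &lt;= 0:
            out.append("[... remaining artifacts truncated ...]")
            break
        chunk = text[:remaining]
        out.append(chunk)
        used += len(chunk)
    return "\n\n".join(out)</pre>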

        <h2>Architecture Decisions That Worked</h2>
        <h3>Agents as Isolated Claude CLI Sessions</h3>
        <p>Each agent runs as a separate <code>claude</code> CLI process with its own system prompt
            written to <code>.claude/CLAUDE.md</code>. This means agents don't share
            context&mdash;the Developer doesn't know what the Product Owner thought about, only
            what it wrote down in <code>requirements.md</code>. This forces communication through
            artifacts, which is exactly how real teams work. </p>
        <p class="code-label">base_agent.py &mdash; Claude CLI invocation</p>
        <pre>response = self.claude_cli.call(
    prompt=full_prompt,
    system_prompt=system_prompt,           <span class="cmt"># From .claude/CLAUDE.md</span>
    working_dir=effective_dir,             <span class="cmt"># Session workspace or project root</span>
    model=self.model,                      <span class="cmt"># "sonnet" by default</span>
    allowed_tools=self._register_tools(),  <span class="cmt"># Per-agent tool restrictions</span>
    timeout=<span class="kw">None</span>,                          <span class="cmt"># Wait indefinitely</span>
)</pre>
        <h3>Session Resumability</h3>
        <p>All workflow state persists in SQLite. If the process crashes, the network drops, or you
            close your laptop,
            you can resume from exactly where you left off:</p>
        <pre>$ orchestrator resume session-20260210-200304-22029dec
<span class="cmt"># Reloads state machine from DB, continues from EXECUTOR_WORKING</span></pre>
        <h3>Improvement Mode</h3>
        <p>The orchestrator can improve existing projects, not just build new ones. In improvement
            mode, build agents (Developer, Executor, Tester) work directly in the project root. The
            orchestrator snapshots git HEAD before the Developer runs and collects changes via
            <code>git diff</code> afterward. Regression testing compares new test results against
            the baseline.
        </p>
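        <p>The snapshot-and-diff step is plain git plumbing; a minimal sketch (the function
            names are mine, not the orchestrator's):</p>
        <pre>import subprocess

def snapshot_head(repo):
    return subprocess.check_output(
        ["git", "rev-parse", "HEAD"], cwd=repo, text=True).strip()

def changed_files(repo, base):
    out = subprocess.check_output(
        ["git", "diff", "--name-only", base, "HEAD"], cwd=repo, text=True)
    return [f for f in out.splitlines()
            if f and not f.startswith(".orchestrator/")]</pre>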

        <h2>What's Next</h2>

        <h3>Near-Term</h3>
        <ul>
            <li><strong>TODO scanning as a workflow gate.</strong> After the Developer finishes,
                the orchestrator should scan for TODO/FIXME comments on code paths required by
                acceptance criteria. If found, reject and send back to the Developer&mdash;
                don't wait for tests to fail. (A minimal sketch of such a scan follows this
                list.)</li>
            <li><strong>Compilation gate in the Executor.</strong> The Executor prompt now
                requires a compile check before running tests, but this should be enforced at
                the orchestrator level: if <code>cargo check</code> fails, don't waste time
                running 912 tests.</li>
            <li><strong>Test deduplication.</strong> Multi-phase builds create duplicate test
                files (stubs from Phase 1,
                real tests from Phase 3). The artifact store should track test coverage by
                feature and replace stubs when real implementations arrive.</li>
        </ul>
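        <p>A hypothetical sketch of that TODO gate (not yet in the orchestrator; names
            are mine):</p>
        <pre>import re
from pathlib import Path

TODO_RE = re.compile(r"\b(TODO|FIXME)\b")

def scan_todos(root):
    hits = []
    for path in Path(root).rglob("*.rs"):
        for n, line in enumerate(path.read_text().splitlines(), 1):
            if TODO_RE.search(line):
                hits.append((str(path), n, line.strip()))
    return hits   # a non-empty result would reject the Developer's work</pre>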
        <h3>Longer-Term</h3>
        <ul>
            <li><strong>Parallel agent execution.</strong> The Developer and Tester could work
                in parallel if the test suite is structured correctly. Currently all agents are
                sequential.</li>
            <li><strong>Cost tracking.</strong> Each Claude CLI call returns token usage. The
                orchestrator should aggregate and display total cost per session and per phase.
            </li>
            <li><strong>Self-healing loops.</strong> When the Executor reports test failures,
                automatically route back to the Developer with the failure output. Currently
                this requires a Story Author rejection and manual re-entry.</li>
            <li><strong>Multi-model routing.</strong> Use Opus for architecture decisions and
                Haiku for simple file operations. Currently all agents use the same model.</li>
        </ul>

        <h2>By the Numbers</h2>
        <div class="stats-grid">
            <div class="stat-card">
                <div class="number">11</div>
                <div class="label">Sessions Run</div>
            </div>
            <div class="stat-card">
                <div class="number">6</div>
                <div class="label">AI Agents</div>
            </div>
            <div class="stat-card">
                <div class="number">17</div>
                <div class="label">Workflow States</div>
            </div>
            <div class="stat-card">
                <div class="number">8</div>
                <div class="label">Build Phases</div>
            </div>
        </div>
        <div class="stats-grid">
            <div class="stat-card">
                <div class="number">23</div>
                <div class="label">Commands Built</div>
            </div>
            <div class="stat-card">
                <div class="number">38</div>
                <div class="label">Source Files</div>
            </div>
            <div class="stat-card">
                <div class="number">792</div>
                <div class="label">Tests Passing</div>
            </div>
            <div class="stat-card">
                <div class="number">8</div>
                <div class="label">Defects Found</div>
            </div>
        </div>
        <div class="callout success">
            <div class="callout-title">Bottom line</div>
            <p>A multi-agent orchestrator can build real, working software from a
                natural-language description. The 87% test pass rate isn't perfect&mdash;but
                the 100% pass rate on real (non-stub) tests shows the code itself
                is solid. The remaining defects are in the orchestrator's quality gates, not
                in the generated code's fundamental correctness. Every defect we found led
                to a prompt or workflow fix that prevents it from happening again.</p>
        </div>
    </article>
</div>
<footer>
    <div class="container">
        <p>Claude Orchestrator &middot; Built with Claude Sonnet &amp; Opus &middot; February 2026</p>
        <p style="margin-top: 8px;">23 shell commands. Zero human-written lines of Rust.</p>
    </div>
</footer>]]></content><author><name></name></author><summary type="html"><![CDATA[Generated by Claude Code]]></summary></entry><entry><title type="html">Hiring — One Commit at a Time</title><link href="https://blogs.tusharsaurabh.com/2025/10/07/Hiring-One-Commit-at-a-Time.html" rel="alternate" type="text/html" title="Hiring — One Commit at a Time" /><published>2025-10-07T00:00:00+00:00</published><updated>2025-10-07T00:00:00+00:00</updated><id>https://blogs.tusharsaurabh.com/2025/10/07/Hiring%20-%20One%20Commit%20at%20a%20Time</id><content type="html" xml:base="https://blogs.tusharsaurabh.com/2025/10/07/Hiring-One-Commit-at-a-Time.html"><![CDATA[<p>I have not been happy with the hiring process for a very long time.<br />
It’s time-consuming, repetitive, and worst of all — it doesn’t help a candidate grow.<br />
Somewhere between résumé polishing and HR follow-ups, the whole thing loses its soul.<br />
And let’s be honest — the process is not for the faint-hearted (or the faintly caffeinated).</p>

<p>Given the amount of data available publicly, hiring a developer <em>should</em> be simple.<br />
A LinkedIn profile, a GitHub repo, a few LeetCode submissions, some Stack Overflow karma, maybe a personal blog — that’s practically a developer’s autobiography.<br />
Why ask them to draft a résumé (and definitely not a cover story — we’re hiring devs, not novelists)?</p>

<hr />

<h3 id="the-current-saga-a-ta-tale">The Current Saga: A TA Tale</h3>

<p>Once the hiring requisition is approved, the clock starts ticking.</p>

<ul>
  <li>A JD is posted.</li>
  <li>The TA team is informed.</li>
  <li>They search LinkedIn, Naukri, or the internal database for potential candidates.</li>
  <li>They reach out to a few people, collect résumés, talk to some of them.</li>
  <li>Profiles are shared with the hiring manager.</li>
  <li>Hiring manager approves or rejects.</li>
</ul>

<p>Meanwhile, somewhere out there, a candidate has spent hours writing a résumé, possibly rehearsing answers, and maybe even buying a new shirt.<br />
Then… <em>silence</em>.<br />
No one knows why they were rejected. No one gets better. The process just repeats.</p>

<hr />

<h3 id="enter-smart-hire">Enter: Smart Hire</h3>

<p>I decided to change this loop.<br />
Meet <strong>Smart Hire</strong> — a developer hiring system that doesn’t rely on “gut feeling” but rather on <em>Git activity</em>.</p>

<p>Here’s how it works:<br />
The company posts a job, a candidate applies using their LinkedIn profile.<br />
Smart Hire automatically analyzes the profile against the job description and gives it a <strong>similarity score</strong>.</p>

<p>Now, here’s the fun part — candidates can earn <strong>bonus points</strong>:</p>
<ul>
  <li>Active GitHub repos? ✅</li>
  <li>Consistent LeetCode submissions? ✅</li>
  <li>Helping others on Stack Overflow? ✅</li>
  <li>Writing tech blogs? Double ✅</li>
</ul>

<p>If the final score is above the threshold (set by the hiring manager), the candidate can <em>immediately</em> book an interview slot.<br />
No recruiter ping-pong, no “we’ll get back to you.”</p>
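<p>To make the idea concrete, here’s a toy sketch of the scoring. The weights and the overlap-based similarity are made up for illustration; the real system does semantic matching.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>BONUS = {"github": 5, "leetcode": 5, "stackoverflow": 5, "blog": 10}

def similarity(profile_skills, jd_skills):
    # Stand-in for the real semantic matcher: plain set overlap
    if not profile_skills or not jd_skills:
        return 0.0
    return len(profile_skills &amp; jd_skills) / len(profile_skills | jd_skills)

def final_score(profile_skills, jd_skills, signals):
    base = similarity(profile_skills, jd_skills) * 100
    return base + sum(BONUS.get(s, 0) for s in signals)

# The candidate books a slot only if final_score(...) &gt;= the manager's threshold
</code></pre></div></div>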

<p>And if the score doesn’t meet the mark?<br />
Smart Hire gently (and smartly) explains what’s missing — maybe a specific skill, a side project idea, or a Udemy course to bridge the gap.<br />
So even if you don’t qualify <em>today</em>, you walk away <em>wiser</em>.</p>

<hr />

<h3 id="what-makes-it-smart-without-saying-ai">What Makes It “Smart” (Without Saying AI)</h3>

<p>Smart Hire doesn’t just scrape profiles — it <em>understands</em> them.<br />
It can tell that “React”, “ReactJS”, and “React.js” are the same thing (unlike some job portals that think they’re three separate careers).<br />
It knows when a candidate is diversifying tech stacks or continuously learning.<br />
It even writes empathetic feedback — the kind that says “you’re doing great” <em>before</em> pointing out the gaps.</p>
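
<p>That normalization can be as simple as a small alias table (a toy sketch; the real matching is fuzzier):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ALIASES = {"reactjs": "react", "react.js": "react"}

def canonical_skill(name):
    key = name.strip().lower()
    return ALIASES.get(key, key)

# canonical_skill("ReactJS") == canonical_skill("React.js") == "react"
</code></pre></div></div>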

<p>Under the hood, it’s powered by some serious tech:</p>

<ul>
  <li><strong>Frontend</strong>: Angular</li>
  <li><strong>Backend</strong>: Express.js</li>
  <li><strong>Database</strong>: SQLite (for the POC)</li>
  <li><strong>Brain</strong>: GPT (but we’re not saying that out loud)</li>
  <li><strong>Integrations</strong>: LinkedIn, GitHub, Stack Overflow, LeetCode, Blog</li>
</ul>

<p>Basically, it’s like a hiring assistant that actually reads your code instead of your résumé.</p>

<hr />

<h3 id="why-bother">Why Bother?</h3>

<p>Because rejection shouldn’t feel like a void.<br />
Because a developer’s work should speak louder than bullet points.<br />
Because recruiters deserve tools that save them from copy-pasting JD lines into LinkedIn search bars.</p>

<p>Smart Hire isn’t just automating hiring — it’s making it <em>fair</em>, <em>transparent</em>, and dare I say, a little <em>human</em>.</p>

<p>It’s hiring — one commit at a time.</p>

<hr />]]></content><author><name></name></author><summary type="html"><![CDATA[I have not been happy with the hiring process for a very long time. It’s time-consuming, repetitive, and worst of all — it doesn’t help a candidate grow. Somewhere between résumé polishing and HR follow-ups, the whole thing loses its soul. And let’s be honest — the process is not for the faint-hearted (or the faintly caffeinated).]]></summary></entry><entry><title type="html">PolyGlot AI</title><link href="https://blogs.tusharsaurabh.com/2025/08/20/PolyGlot-AI.html" rel="alternate" type="text/html" title="PolyGlot AI" /><published>2025-08-20T00:00:00+00:00</published><updated>2025-08-20T00:00:00+00:00</updated><id>https://blogs.tusharsaurabh.com/2025/08/20/PolyGlot-AI</id><content type="html" xml:base="https://blogs.tusharsaurabh.com/2025/08/20/PolyGlot-AI.html"><![CDATA[<p>Vibe coding—or pair programming with an AI assistant—is here to stay.</p>

<p>My first experience with <strong>Cursor</strong> was… well, terrible. The very first prompts gave me clean, concise code, which lulled me into a false sense of confidence. I started making lots of changes, only to realize I couldn’t recover from the faulty logic it had produced. In the end, I rewrote the code myself—this time with some help from Claude (via its web app).</p>

<p>That experience left me wary. I stopped using Cursor for a long time. Meanwhile, I kept learning about LLM engineering, agentic AI, and other buzzwords being thrown around. Over time, I picked up a few tricks about prompts and context management.</p>

<p>When I eventually gave Cursor a second chance, it was a hesitant rendezvous. This time, I didn’t finish the project either—but unlike before, I made real progress. That felt like a win.</p>

<p>But then something else caught my attention: <strong>Claude Code</strong>. I decided to build an application with it.</p>

<p>My workflow had always been a patchwork of tools:</p>
<ul>
  <li>For technical or coding tasks, I leaned on Claude.</li>
  <li>If Claude failed me, I turned to Google and used Gemini’s results.</li>
  <li>For everything else, I relied on ChatGPT.</li>
  <li>Occasionally, when I felt bold, I let ChatGPT handle code (though Claude usually came to the rescue when things broke).</li>
</ul>

<p>At one point, I thought: <em>What if I could talk to all three at once? What if they could each contribute, summarize their thoughts, and then I continue the conversation with that collective intelligence?</em></p>

<p>That was the seed of my iPhone app: <strong>PolyGlot AI</strong>.</p>
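
<p>The core idea is easy to sketch in Python, even though the app itself is SwiftUI. Here, <code class="language-plaintext highlighter-rouge">ask_claude</code>, <code class="language-plaintext highlighter-rouge">ask_gpt</code>, and <code class="language-plaintext highlighter-rouge">ask_gemini</code> are hypothetical wrappers around each provider’s API, not the app’s real code:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from concurrent.futures import ThreadPoolExecutor

def ask_all(question, providers):
    # providers: {"claude": ask_claude, "gpt": ask_gpt, "gemini": ask_gemini}
    with ThreadPoolExecutor(max_workers=len(providers)) as pool:
        futures = {name: pool.submit(fn, question)
                   for name, fn in providers.items()}
        return {name: f.result() for name, f in futures.items()}

def summarize(answers, summarizer):
    joined = "\n\n".join(f"[{name}]\n{text}" for name, text in answers.items())
    return summarizer("Summarize these answers into one view:\n" + joined)
</code></pre></div></div>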

<p>Now, I’m not a mobile developer by training. I’ve only been learning iOS development on Udemy. I can navigate Xcode, I understand SwiftUI basics like <code class="language-plaintext highlighter-rouge">HStack</code> and <code class="language-plaintext highlighter-rouge">VStack</code>, but not nearly enough to build something as ambitious as PolyGlot AI without help.</p>

<p>So, I turned to Claude Code—not just for designing the UI, but also for writing the logic. This time, I took it slow. No more reckless vibe coding; just small, careful steps forward.</p>

<p>One of the fun parts was the back-and-forth with Claude. I often asked it to suggest different approaches, and it always obliged. To be polite, I usually agreed with what it recommended—but sometimes I nudged things in my own direction.</p>

<p>Of course, the journey wasn’t smooth. Claude occasionally defaulted to older models, or gave me incorrect API URLs. Maybe it could have fixed those issues if I had pressed harder, but I didn’t want to rely on it blindly. I double-checked the documentation myself, asked questions where needed, and patched the code. And yes—when I asked Claude directly to fix something, more often than not, it actually did.</p>

<hr />

<h2 id="what-i-learned">What I Learned</h2>

<ul>
  <li><strong>Claude Code is a great assistant, not a replacement.</strong> It shines when nudged in the right direction.</li>
  <li><strong>My role is to provide the solutions, not just the code.</strong> The AI handles syntax; I handle clarity of thought.</li>
  <li><strong>The future of programming might really be natural language.</strong> As someone once said, coding may eventually look more like conversation than typing symbols—and after this project, I believe that, at least to some extent.</li>
</ul>

<hr />

<h2 id="whats-next">What’s Next</h2>

<ul>
  <li><strong>Multimodal Expansion:</strong> The next step is to add multimodal support—so text, images, maybe even speech can flow through the app.</li>
  <li><strong>Smarter Coding Conversations:</strong> I’d like to refine how the app handles code-related prompts, letting it act not just as a “vote and summarize” system, but as a true collaborative partner for debugging and design.</li>
</ul>

<p>PolyGlot AI started as a simple <em>“what if”</em> thought. Now, it feels like a glimpse into how we’ll all be coding—and thinking—tomorrow.</p>

<hr />

<h2 id="polyglot-ai">PolyGlot AI</h2>

<p><img src="/what-i-learnt/assets/landing_screen.png" alt="LANDING SCREEN" /></p>

<p><img src="/what-i-learnt/assets/api_keys.png" alt="UPDATE API KEYS" /></p>

<p><img src="/what-i-learnt/assets/ask_question.png" alt="ASK QUESTION" /></p>

<p><img src="/what-i-learnt/assets/response_in_progress.png" alt="WAITING FOR RESPONSE" /></p>

<p><img src="/what-i-learnt/assets/response_complete.png" alt="RESPONSE COMPLETE" /></p>

<p><img src="/what-i-learnt/assets/can_summarise.png" alt="OPTION TO SUMMARIZE" /></p>

<p><img src="/what-i-learnt/assets/summary.png" alt="SUMMARY" /></p>]]></content><author><name></name></author><summary type="html"><![CDATA[Vibe coding—or pair programming with an AI assistant—is here to stay.]]></summary></entry><entry><title type="html">Syntax Trees for Enterprise Code</title><link href="https://blogs.tusharsaurabh.com/2025/06/06/syntax-model.html" rel="alternate" type="text/html" title="Syntax Trees for Enterprise Code" /><published>2025-06-06T00:00:00+00:00</published><updated>2025-06-06T00:00:00+00:00</updated><id>https://blogs.tusharsaurabh.com/2025/06/06/syntax-model</id><content type="html" xml:base="https://blogs.tusharsaurabh.com/2025/06/06/syntax-model.html"><![CDATA[<p>Any enterprise application would have multiple repositories, and each repository would contain lines of code that may not be easy to understand when examined individually.</p>

<p>There was a time when I used to Unit Test the module to understand the logic. Now, it’s not possible.</p>

<p>Is there a way to understand the code and dependencies within the code and across repositories?</p>

<p>That’s the question I have decided to answer.</p>

<p>For this question, I will be working on an ASP.NET code. The toy example is the simplest code I could write.</p>

<p><a href="https://github.com/tusharacc/syntax_analyzer">Link to Code</a></p>

<p>To understand the dependencies, we need to determine if it is possible to create an Abstract Syntax Tree for the code. The answer is a resounding yes. Microsoft has open-sourced its .NET compiler called <a href="https://learn.microsoft.com/en-us/dotnet/csharp/roslyn-sdk/">Roslyn</a>.</p>

<p>The documentation mentions <strong>SYNTAX</strong>, which is fundamental to Roslyn. Roslyn generates a SyntaxTree, from which one can extract methods, variables, and more.</p>

<p>I am using the Microsoft package <code class="language-plaintext highlighter-rouge">CodeAnalysis.</code> The details about SyntaxTree can be found <a href="https://learn.microsoft.com/en-us/dotnet/api/microsoft.codeanalysis.syntaxtree?view=roslyn-dotnet-4.13.0">here</a></p>

<p>The code to generate the syntax tree is <a href="https://github.com/tusharacc/Analyzer">here</a></p>

<h3 id="steps-to-generate-a-syntax-tree">Steps to generate a syntax tree</h3>

<ul>
  <li>The program receives a path to a <code class="language-plaintext highlighter-rouge">repo</code> (local path). It walks through the entire path looking for any file ending in extension <code class="language-plaintext highlighter-rouge">*.cs.</code></li>
  <li>Obviously, I had to create a data structure that holds the entire Syntax tree detail for the given repo.
    <div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>...
public class ClassDetail
{
 public required string ClassName {get; set;}
 public required string FilePath {get; set;}
 public required string SourceCode {get; set;}
 ...
}
</code></pre></div>    </div>
  </li>
</ul>

<p>Refer to file <a href="https://github.com/tusharacc/Analyzer/blob/main/Analyzer/DataStructure.cs">DataStructure.cs</a></p>

<ul>
  <li>
    <p>[Optional] If working on Visual Studio, one can download the plugin <code class="language-plaintext highlighter-rouge">Syntax Visualiser</code> to help view the tree. Since I wrote this on Mac, I used a debugger to view the <code class="language-plaintext highlighter-rouge">Syntax Tree.</code> Refer to <a href="https://learn.microsoft.com/en-us/dotnet/csharp/roslyn-sdk/get-started/media/walkthrough-csharp-syntax-figure1.png">Microsoft Docs</a> for more details.</p>
  </li>
  <li>
    <p>In my case, I am more interested in Classes, Methods, local variables, and using statements. The relevant types included ClassDeclarationSyntax, MethodDeclarationSyntax, and others.</p>
  </li>
  <li>
    <p>Let’s say the WalkRepo function has identified a file called <code class="language-plaintext highlighter-rouge">AddItem.cs</code>. To determine all the class declarations, use the syntax -</p>
  </li>
</ul>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>var code = File.ReadAllText(file);
var syntaxTree = CSharpSyntaxTree.ParseText(code);
var root = syntaxTree.GetRoot();
foreach (var classDecl in root.DescendantNodes().OfType&lt;ClassDeclarationSyntax&gt;().ToList())
{
 var className = classDecl.Identifier.Text;
 ....
}
</code></pre></div></div>

<ul>
  <li>The idea is simple: if I want to get all the methods defined for <code class="language-plaintext highlighter-rouge">ClassDeclarationSyntax classDecl,</code> use the below command</li>
</ul>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>var methods = classDecl.DescendantNodes().OfType&lt;MethodDeclarationSyntax&gt;().ToList();
</code></pre></div></div>

<ul>
  <li>For each Syntax Type, Microsoft has provided the available properties in its document. The second option is to refer to Syntax Visualiser. If both fail, try GPT, Claude, etc.</li>
</ul>

<p>Based on the requirements, one can generate a SyntaxTree according to the expected data model. In my case, the following is a sneak peek into the syntax tree.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code> "ClassName": "ItemController",
 "FilePath": "C:\\Users\\tusharsaurabh\\Documents\\syntax_analyzer\\AddItems\\Controller\\ItemController.cs",
 "SourceCode": "[ApiController]\r\n    [Route(\u0022api/[controller]\u0022)]\r\n    public class ItemController : ControllerBase\r\n    {\r\n        private readonly InsertItem _insertItem;\r\n\r\n        public ItemController()\r\n        {\r\n            _insertItem = new InsertItem(\u0022Data Source=mydatabase.db\u0022);\r\n        }\r\n\r\n        [HttpPost(\u0022add_item\u0022)]\r\n        public IActionResult AddItem([FromBody] ItemModel item)\r\n        {\r\n            try\r\n            {\r\n                bool isInserted = _insertItem.Insert(item);\r\n                if (isInserted)\r\n                {\r\n                    return Ok(new { message = \u0022item Added Successfully\u0022, item });\r\n                }\r\n                else\r\n                {\r\n                    return BadRequest(new { message = \u0022Failed to add item\u0022 });\r\n                }\r\n            }\r\n            catch (Exception ex)\r\n            {\r\n                return BadRequest(new { message = \u0022Error Occured\u0022, error = ex.Message });\r\n            }\r\n        }\r\n    }",
 "Properties": null,
 "Methods": [
 {
 "MethodName": "AddItem",
 "SourceCode": "[HttpPost(\u0022add_item\u0022)]\r\n        public IActionResult AddItem([FromBody] ItemModel item)\r\n        {\r\n            try\r\n            {\r\n                bool isInserted = _insertItem.Insert(item);\r\n                if (isInserted)\r\n                {\r\n                    return Ok(new { message = \u0022item Added Successfully\u0022, item });\r\n                }\r\n                else\r\n                {\r\n                    return BadRequest(new { message = \u0022Failed to add item\u0022 });\r\n                }\r\n            }\r\n            catch (Exception ex)\r\n            {\r\n                return BadRequest(new { message = \u0022Error Occured\u0022, error = ex.Message });\r\n            }\r\n        }",
 "LocalVariables": [
 {
 "VariableType": "bool",
 "VariableName": "bool isInserted = _insertItem.Insert(item)"
 }
 ],
 "Arguments": [
 {
 "VariableType": "ItemModel",
 "VariableName": "item"
 }
 ],
</code></pre></div></div>

<ul>
  <li>
    <p>I have used minimal heuristics in this code, but heuristics will play a significant part in such endeavors. For example, if endpoints live in a folder named <code class="language-plaintext highlighter-rouge">Controller</code>, retrieve all the methods from the files stored there.</p>
  </li>
  <li>
    <p>The Microsoft documentation also covers <code class="language-plaintext highlighter-rouge">SemanticModel</code>, which can be used to extract semantic meaning; I will be using it in the next article. Furthermore, these dependencies can be stored in a graph database such as <code class="language-plaintext highlighter-rouge">Neo4j</code>, and <code class="language-plaintext highlighter-rouge">CypherQueries</code> can be used to explore them. This will be covered in the third installment of the series.</p>
  </li>
</ul>]]></content><author><name></name></author><summary type="html"><![CDATA[Any enterprise application would have multiple repositories, and each repository would contain lines of code that may not be easy to understand when examined individually.]]></summary></entry><entry><title type="html">Time Square — A Raspberry Pi Clock</title><link href="https://blogs.tusharsaurabh.com/2025/01/12/time-square.html" rel="alternate" type="text/html" title="Time Square — A Raspberry Pi Clock" /><published>2025-01-12T00:00:00+00:00</published><updated>2025-01-12T00:00:00+00:00</updated><id>https://blogs.tusharsaurabh.com/2025/01/12/time-square</id><content type="html" xml:base="https://blogs.tusharsaurabh.com/2025/01/12/time-square.html"><![CDATA[<p>I had three LCD screens (16x2), one 3.5” RPI display, and two Raspberry Pis lying around and gathering dust. It had been three years since I had worked on an Arduino/Raspberry Pi project. It was time to see how much I still remembered about making circuits work.</p>

<p>As usual, I decided to challenge myself. I put forth two challenges,</p>

<ol>
  <li>Use Art and Craft materials to make something different</li>
  <li>Use Vim to write/edit the code</li>
</ol>

<p>I was able to use some of my creative juice to beautify my usual digital clock, but I gave up on my second challenge.</p>

<p>A few years back, I read a tweet,</p>

<blockquote>
  <p>I couldn’t exit Vim, so I learned it.</p>
</blockquote>

<p>Well, I did learn to split the screen vertically and horizontally, copy from one file to another, and use commands such as <code class="language-plaintext highlighter-rouge">p, P, d, dd, set nu or set compatible</code>, etc., but it was high time I learned to exit Vim and speed up the development process.</p>

<p>The idea was to develop a digital clock to show the date, time, and temperature.</p>

<p>But it had to be different. I decided to align the LCD screen vertically to show the dates.</p>

<p>Check the image below to get the idea.</p>

<p><img src="/what-i-learnt/assets/time_square.jpg" alt="TIME SQUARE" /></p>

<p>The concepts I learned were -</p>

<ol>
  <li>Not all 16x2 LCD screens are the same</li>
  <li>Creating custom characters</li>
  <li><code class="language-plaintext highlighter-rouge">kivy</code> for desktop app creation</li>
  <li>Connecting the RPI screen not using the GPIO provided at the back but using a minimum number of jumper wires</li>
  <li>Connecting multiple LCDs to Raspberry Pi</li>
  <li>and finally, a few things about Raspberry Pi</li>
</ol>

<p>Let’s take one at a time.</p>

<h3 id="raspberry-pi">RASPBERRY PI</h3>

<p><code class="language-plaintext highlighter-rouge">RTFM</code> has been the mantra for becoming a good programmer, but in the days of 30-second YouTube shorts, it has been reduced to <code class="language-plaintext highlighter-rouge">TL; DR.</code> However, sometimes, it is good to refer to documentation.</p>

<p>For example, I used to Google for the <code class="language-plaintext highlighter-rouge">pin</code> header diagram, but Raspberry Pi has an out-of-the-box command.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pinout

J8:
 3V3  (1) (2)  5V    
 GPIO2  (3) (4)  5V    
 GPIO3  (5) (6)  GND   
 GPIO4  (7) (8)  GPIO14
 GND  (9) (10) GPIO15
GPIO17 (11) (12) GPIO18
GPIO27 (13) (14) GND   
GPIO22 (15) (16) GPIO23
 3V3 (17) (18) GPIO24
GPIO10 (19) (20) GND   
 GPIO9 (21) (22) GPIO25
GPIO11 (23) (24) GPIO8 
 GND (25) (26) GPIO7 
 GPIO0 (27) (28) GPIO1 
 GPIO5 (29) (30) GND   
 GPIO6 (31) (32) GPIO12
GPIO13 (33) (34) GND   
GPIO19 (35) (36) GPIO16
GPIO26 (37) (38) GPIO20
 GND (39) (40) GPIO21

J2:
GLOBAL ENABLE (1)
 GND (2)
 RUN (3)

J14:
TR01 TAP (1) (2) TR00 TAP
TR03 TAP (3) (4) TR02 TAP

</code></pre></div></div>

<p>Secondly, the RPI is loaded with three LCDs and one RPI screen. The board would become hot, and to avoid overheating, I had to put in heat sinks and a fan (5v). This meant I needed a way to check the current voltage, amperage, and temperature.</p>

<p>Here comes <code class="language-plaintext highlighter-rouge">vcgencmd</code>.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>tushar@raspberrypi:~ $ vcgencmd measure_temp
temp=50.6'C
tushar@raspberrypi:~ $ vcgencmd measure_volts core
volt=0.8563V
tushar@raspberrypi:~ $ vcgencmd get_throttled
throttled=0x0
</code></pre></div></div>

<p>There was no way to check the exact amperage without buying additional hardware, so I had to chuck it. I could have used a multimeter, but I didn’t see RPI complaining about overheating, so I abandoned it.</p>

<h3 id="lcd-screens">LCD Screens</h3>

<h4 id="not-all-lcd-screens-are-the-same">Not all LCD screens are the same</h4>

<p>Initially, I was using Adafruit’s <a href="https://docs.circuitpython.org/projects/charlcd/en/latest/">CircuitPython_CharLCD</a>. This works if we connect the LCD’s individual pin to RPI, but it fails when I use an LCD with an <code class="language-plaintext highlighter-rouge">i2c</code> interface. I was on the verge of giving up when I came across <a href="https://rplcd.readthedocs.io/en/stable/">RPLCD</a>. The document mentioned that -</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Supported I²C Port Expanders

PCF8574 (used by a lot of I²C LCD adapters on Ali Express)
MCP23008 (used in Adafruit I²C LCD backpack)
MCP23017
</code></pre></div></div>

<p>That’s it. My LCDs were PCF8574. I found this by using a magnifying glass to check the <code class="language-plaintext highlighter-rouge">i2c</code> adapter.</p>

<p>Check the image below.</p>

<p><img src="/what-i-learnt/assets/i2c.jpeg" alt="TIME SQUARE" /></p>

<h4 id="custom-character">Custom Character</h4>

<p>The LCD has 16 columns and 2 rows. The library has a wrapper to write a character (predefined and baked into the library) into any one of the cells. I used custom characters across all 16 columns and 2 rows so the whole screen displays one large character. For that, I had to know that each cell is a grid of tiny lights: 5 in a row, 8 such rows. Each light can be turned on or off using <code class="language-plaintext highlighter-rouge">0</code> or <code class="language-plaintext highlighter-rouge">1</code>. Using <code class="language-plaintext highlighter-rouge">0b00111</code> lights up the three lights on the right of a row and turns the other two off. There is an excellent website to generate such binary codes for one cell.</p>

<p><a href="https://maxpromer.github.io/LCD-Character-Creator/">LCD Custom Code Generator</a></p>
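
<p>With RPLCD, defining and printing one such custom character looks roughly like this (a minimal sketch, assuming a PCF8574 backpack at address <code class="language-plaintext highlighter-rouge">0x27</code>):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from RPLCD.i2c import CharLCD

lcd = CharLCD('PCF8574', 0x27, cols=16, rows=2)

# One byte per pixel row; 0b00111 lights the three right-hand pixels
block = (0b00111,) * 8
lcd.create_char(0, block)   # store the glyph in CGRAM slot 0
lcd.write_string('\x00')    # print the custom glyph
</code></pre></div></div>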

<h4 id="connecting-multiple-lcd">Connecting Multiple LCD</h4>

<p>Since the LCDs use <code class="language-plaintext highlighter-rouge">i2c</code>, it should be enabled (by default, it is disabled on Raspberry Pi) by navigating to <code class="language-plaintext highlighter-rouge">raspi-config</code> and then to <code class="language-plaintext highlighter-rouge">display</code> and turning on the <code class="language-plaintext highlighter-rouge">i2c</code> option.</p>

<p>By default, the LCD port was <code class="language-plaintext highlighter-rouge">x27</code>; one can check by using the command -</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>tushar@raspberrypi:~ $ i2cdetect -y 1
 0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f
00:                         -- -- -- -- -- -- -- -- 
10: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 
20: -- -- -- 23 -- -- 26 27 -- -- -- -- -- -- -- -- 
30: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 
40: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 
50: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 
60: -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- 
70: -- -- -- -- -- -- -- --                         
tushar@raspberrypi:~ $ 
</code></pre></div></div>

<blockquote>
  <p>With 3 LCDs connected, it shows addresses <code class="language-plaintext highlighter-rouge">0x23, 0x26 &amp; 0x27</code>.</p>
</blockquote>

<p><code class="language-plaintext highlighter-rouge">i2c</code> uses SDA and SCL to transmit data, and the Raspberry Pi has only one such pair of pins (GPIO 2 &amp; GPIO 3). To manage each LCD individually, each one must listen on a different address. This can be done by soldering the <code class="language-plaintext highlighter-rouge">A0, A1 &amp; A2</code> pads. Check out the image.</p>

<p><a href="/what-i-learnt/assets/i2c_port.jpeg">LCD PORT MANIPULATION</a></p>

<p>This is a <code class="language-plaintext highlighter-rouge">3-bit</code> offset: soldering the pads in column <code class="language-plaintext highlighter-rouge">A2</code> sets bit 2 (positions are counted from the right, starting at zero), which is binary <code class="language-plaintext highlighter-rouge">100</code>. Since <code class="language-plaintext highlighter-rouge">100</code> is <code class="language-plaintext highlighter-rouge">4</code> in decimal, the new address is <code class="language-plaintext highlighter-rouge">0x27-4=0x23.</code></p>
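
<p>Once each backpack has its own address, driving all three LCDs is just one <code class="language-plaintext highlighter-rouge">CharLCD</code> object per address (a minimal sketch using the addresses from <code class="language-plaintext highlighter-rouge">i2cdetect</code> above):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from RPLCD.i2c import CharLCD

lcds = [CharLCD('PCF8574', addr, cols=16, rows=2)
        for addr in (0x23, 0x26, 0x27)]

for i, lcd in enumerate(lcds):
    lcd.write_string(f'LCD {i}')
</code></pre></div></div>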

<h3 id="35-rpi-screen">3.5” RPI Screen</h3>

<p>The RPI screen is a plug-and-play screen; the display will latch comfortably onto the RPI GPIO pins. Since I had other wires from the LCDs, I had to use jumper wires to connect them. I could take 26 wires and connect every pin, but then what's the fun in that? Refer to the excellent wiki -</p>

<p><a href="http://www.lcdwiki.com/3.5inch_RPi_Display">LCD Wiki</a></p>

<p>The section on the interface refers to the mapping and function of each pin. I used only one power connection, one ground, and multiple pins except those categorized as <code class="language-plaintext highlighter-rouge">NC</code> or <code class="language-plaintext highlighter-rouge">Not Connected.</code> I didn’t need the touch ability, so I removed the touch-related pins, but the display didn’t work, so I added them back.</p>

<h3 id="kivy"><code class="language-plaintext highlighter-rouge">kivy</code></h3>

<p>I find <code class="language-plaintext highlighter-rouge">tkinter</code> not so fancy. <code class="language-plaintext highlighter-rouge">qt</code> is fancy but too complicated. Here comes <code class="language-plaintext highlighter-rouge">kivy</code>!! I am surprised that I didn’t know about <code class="language-plaintext highlighter-rouge">kivy</code>, an excellent library for creating desktop apps.</p>

<p><a href="https://kivy.org/doc/stable/">Kivy Documentation</a></p>

<p>The project has been a roller coaster ride for almost a month, but it felt worth it when I completed it today.</p>

<p>Now, I’m moving on to another project: converting my LED strips to a grow light specification. Until then, have fun!!</p>

<blockquote>
  <p>Code can be found at - https://github.com/tusharacc/time-square</p>
</blockquote>]]></content><author><name></name></author><summary type="html"><![CDATA[I had three LCD screens (16x2), one 3.5” RPI display, and two Raspberry Pis lying around and gathering dust. It had been three years since I had worked on an Arduino/Raspberry Pi project. It was time to see how much I still remembered about making circuits work.]]></summary></entry><entry><title type="html">Abstractions, Abstractions every where…</title><link href="https://blogs.tusharsaurabh.com/2024/12/20/abstractions.html" rel="alternate" type="text/html" title="Abstractions, Abstractions every where…" /><published>2024-12-20T00:00:00+00:00</published><updated>2024-12-20T00:00:00+00:00</updated><id>https://blogs.tusharsaurabh.com/2024/12/20/abstractions</id><content type="html" xml:base="https://blogs.tusharsaurabh.com/2024/12/20/abstractions.html"><![CDATA[<p>I decided to create a small application with two components: the Chrome extension to trap all the <code class="language-plaintext highlighter-rouge">xhr</code> calls and the desktop that opens a web socket to which the Chrome extension sends all the details.</p>

<p>I wrote the desktop app using Electronjs. For reasons unknown, I decided to use Typescript with Electronjs. That’s what pushed me down the rabbit hole: from transpiling Typescript, to bundlers such as Webpack, and finally to electron-forge with the webpack-typescript template.</p>

<p>Let’s deal with each issue at a time.</p>

<h1 id="plain-typescript-nothing-fancy">Plain Typescript, nothing fancy!!</h1>

<p>When Electronjs loads the <code class="language-plaintext highlighter-rouge">renderer</code> process, it throws an error -</p>

<blockquote>
  <p>Uncaught ReferenceError: exports is not defined
 at renderer.js:5:23
(anonymous) @ renderer.js:5</p>
</blockquote>

<p>On inspection of <code class="language-plaintext highlighter-rouge">renderer.js,</code> the culprit is the first line, which tries to define a property on object <code class="language-plaintext highlighter-rouge">exports</code>.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Object.defineProperty(exports, "__esModule", { value: true });
</code></pre></div></div>

<p>Check more about <a href="https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Object/defineProperty">Object.defineProperty</a></p>

<p><em>Why is this line generated?</em> Well, ChatGPT says it provides ‘interoperability’. The line tries to mark the renderer process as <code class="language-plaintext highlighter-rouge">esm</code>, but there is no object called <code class="language-plaintext highlighter-rouge">exports</code> defined at the top.</p>

<p>Adding the line <code class="language-plaintext highlighter-rouge">var exports = {}</code> at the top of <code class="language-plaintext highlighter-rouge">index.js</code> resolves the issue.</p>

<blockquote>
  <p>The discussion around this issue on the Typescript &amp; Electronjs forums mentioned two solutions: the one above, and commenting out the offending line.</p>
</blockquote>

<p><em>What happens if it’s just plain old typescript?</em> In that case, the <code class="language-plaintext highlighter-rouge">Object.defineProperty</code> line is also present, but <code class="language-plaintext highlighter-rouge">exports</code> is defined as an empty object. So what’s the issue that Electronjs has? Electronjs is a web application encapsulated and presented as a desktop app; it loads the renderer process with a script tag. When a node process executes in debug mode, VS Code shows <code class="language-plaintext highlighter-rouge">exports</code> under local variables.</p>

<p><img src="/what-i-learnt/assets/nodejs_exports.png" alt="Local Variable Exports" /></p>

<p>The global object in a browser is called <code class="language-plaintext highlighter-rouge">window</code>, and <code class="language-plaintext highlighter-rouge">Object.keys(window)</code> lists nothing called <code class="language-plaintext highlighter-rouge">exports</code>.</p>

<blockquote>
  <p>The github discussion (<a href="https://github.com/electron/electron/issues/2863">click here</a>) was very helpful.</p>
</blockquote>

<h1 id="goodol-webpack">Good’ol Webpack</h1>

<p>Well, here’s the twist, when I started writing the article, I didn’t face any issue. In fact, I feel <code class="language-plaintext highlighter-rouge">webpack</code> could be the best option for making an electronjs app. As long as <code class="language-plaintext highlighter-rouge">webpack.config.ts</code> is correctly written, the app works like a charm.</p>

<blockquote>
  <p><code class="language-plaintext highlighter-rouge">webpack</code> has a config key called <code class="language-plaintext highlighter-rouge">electron-main</code> and <code class="language-plaintext highlighter-rouge">electron-renderer</code>. The entry points and target should be correct, and then electron works like a charm.</p>
</blockquote>

<h1 id="electron-forge-the-swiss-knife">Electron Forge, the swiss knife</h1>

<p><code class="language-plaintext highlighter-rouge">webpack</code> has toooo… many features, such as <code class="language-plaintext highlighter-rouge">webpack dev</code> or <code class="language-plaintext highlighter-rouge">hot reloading</code>; configuring it for the first time is no joke, but electron-forge makes it easy. What’s difficult is understanding how electron-forge works. The template generated for <code class="language-plaintext highlighter-rouge">index.ts</code> has a comment which says -</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// This allows TypeScript to pick up the magic constants that's auto-generated by Forge's Webpack
// plugin that tells the Electron app where to look for the Webpack-bundled app code (depending on
// whether you're running in development or production).
</code></pre></div></div>

<p>Initially, I felt it was entirely magic, until I started peeling the onion.</p>

<p>A few things that I learnt during the whole process -</p>

<ol>
  <li>To enable debugging, execute the command <code class="language-plaintext highlighter-rouge">electron-forge start -l</code></li>
  <li>Electron-forge uses the <a href="https://www.npmjs.com/package/debug">debug</a> module to log messages while building. The messages can be read at <code class="language-plaintext highlighter-rouge">http://localhost:9000</code></li>
  <li>The renderer process is served by a web server, so Electronjs’s renderer html can be viewed at <code class="language-plaintext highlighter-rouge">http://localhost:3000</code>. Electron-forge uses <code class="language-plaintext highlighter-rouge">expressjs</code> to serve the artifacts related to the renderer process.</li>
  <li>The standard <code class="language-plaintext highlighter-rouge">index.html</code> doesn’t have the <code class="language-plaintext highlighter-rouge">renderer</code> file path mentioned. Similarly, the main process javascript doesn’t have the <code class="language-plaintext highlighter-rouge">html</code> file path hard-coded. These are generated using webpack. Check out <code class="language-plaintext highlighter-rouge">@electron-forge\webpack-plugin</code></li>
  <li>The <code class="language-plaintext highlighter-rouge">index.js</code> of the main process is anything but English. To understand the code, change the <code class="language-plaintext highlighter-rouge">devtool</code> option to false. It can be done by navigating to <code class="language-plaintext highlighter-rouge">\node_modules\@electron-forge\plugin-webpack\src\WebpackConfig.ts</code></li>
  <li>Debug can be enabled by setting the environment variable <code class="language-plaintext highlighter-rouge">DEBUG=*</code> (source: the readme of the debug package)</li>
  <li>In <code class="language-plaintext highlighter-rouge">forge.config.ts</code>, the renderer option has <code class="language-plaintext highlighter-rouge">entryPoints</code>. This is an array, so one can include multiple renderer processes by defining multiple items in the array.</li>
</ol>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>entryPoints: [
          {
            html: './src/index.html',
            js: './src/renderer.ts',
            name: 'some_window',
            preload: {
              js: './src/preload.ts',
            },
          },
          {
            html: './src/another.html',
            js: './src/another.ts',
            name: 'another_window',
            preload: {
              js: './src/preload.ts',
            },
          },
        ],
</code></pre></div></div>

<p>The magic constants, would be -</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SOME_WINDOW_WEBPACK_ENTRY
ANOTHER_WINDOW_WEBPACK_ENTRY
</code></pre></div></div>]]></content><author><name></name></author><summary type="html"><![CDATA[I decided to create a small application with two components: the Chrome extension to trap all the xhr calls and the desktop that opens a web socket to which the Chrome extension sends all the details.]]></summary></entry><entry><title type="html">The baffling case of Multiprocessing in Python</title><link href="https://blogs.tusharsaurabh.com/2024/12/06/multiprocessing.html" rel="alternate" type="text/html" title="The baffling case of Multiprocessing in Python" /><published>2024-12-06T00:00:00+00:00</published><updated>2024-12-06T00:00:00+00:00</updated><id>https://blogs.tusharsaurabh.com/2024/12/06/multiprocessing</id><content type="html" xml:base="https://blogs.tusharsaurabh.com/2024/12/06/multiprocessing.html"><![CDATA[<p>On a fateful day, I had to analyze 50GB of application logs. Although structured, the application logs were chaotic at best because the request and response for external calls could be XML or JSON, fields could be missing, etc.</p>

<p>Python works best because I can quickly load the request or response as JSON or XML without breaking a sweat. Furthermore, add multiprocessing, and it becomes fast as well. I sweated for almost 2-3 days before ditching multiprocessing and executing each supposed ChildProcess from the console.</p>

<p>The symptom was that the execution seemed stuck, not moving forward. I allowed the code to execute on my first try because it processes 50 GB of files. I went to sleep, expecting it to be complete by the time I woke up. On further analysis, I realized the execution was stuck. Well, let’s get cracking. Below is the minimal code that will get stuck</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import multiprocessing as mp

def foo(q):
    with open('shakespear.txt','r') as f:
        for l in f:
            q.put(l)
    print ("Completed q")

def read(q):
    while not q.empty():
        q.get()

if __name__ == '__main__':
    q = mp.Queue()
    p = mp.Process(target=foo,args=(q,))
    p.start()

    p.join()  # blocks forever: the queue fills up and is never drained
</code></pre></div></div>

<p>When we execute this script, it won’t exit.</p>

<p>My initial assumption was that I would create child processes for each server and read the file simultaneously. While reading, it would serialize the log entries as JSON and put them into a queue. Then, I would read from the queue one at a time and be done. I would go home happy and satisfied.</p>

<p>My first mistake was <code class="language-plaintext highlighter-rouge">p.start()</code>. I assumed the main Thread would stay at this line until the child processes exit. Looking back, it makes no sense. Why will the execution on the main Thread be stuck at <code class="language-plaintext highlighter-rouge">p.start()</code>? A child process has been created; it will move ahead.</p>

<p>My second mistake was assuming <code class="language-plaintext highlighter-rouge">p.join()</code> would return once the function returned. It doesn't work that way, especially when the child shares a pipe/queue with the parent process.</p>

<p>To make the program work, I added additional logic to <code class="language-plaintext highlighter-rouge">p.join()</code> -</p>
<ol>
    <li>adding a timeout value to the <code class="language-plaintext highlighter-rouge">join()</code> method,</li>
    <li>checking if the child process has exited (the Python <code class="language-plaintext highlighter-rouge">Process</code> object has an <code class="language-plaintext highlighter-rouge">exitcode</code> property),</li>
    <li>checking if the queue is fully read.</li>
</ol>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>   while True:
        print (f"Process details {p.pid}, {p.is_alive()}, {p.exitcode}")
        print (p.join(2))
        if p.exitcode != None:
            break
        if q.qsize() &gt; 0:
            read(q)
</code></pre></div></div>
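
<p>For completeness, the more conventional fix is to drain the queue <em>before</em> joining, using a sentinel to mark the end of data, so the child’s queue feeder thread can flush and the process can exit. A minimal sketch (the sentinel pattern is mine, not from the original script):</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import multiprocessing as mp

SENTINEL = None

def foo(q):
    with open('shakespear.txt', 'r') as f:
        for l in f:
            q.put(l)
    q.put(SENTINEL)               # tell the reader we are done

if __name__ == '__main__':
    q = mp.Queue()
    p = mp.Process(target=foo, args=(q,))
    p.start()
    while (item := q.get()) is not SENTINEL:
        pass                      # process each line here
    p.join()                      # queue drained, so the child can exit
</code></pre></div></div>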

<p>I have <code class="language-plaintext highlighter-rouge">procmon</code> (sysinternal suites) running in background. Take a look at the image below</p>

<p><img src="/what-i-learnt/assets/process_created.png" alt="PROCESS CREATED" /></p>

<p>PID 1692 is the parent process. A few lines below, there is a row for Create Process, and in the details section, PID 2612 is mentioned. This is followed by Process Start for 2612, which is the child process created. The immediate next line is Thread create with thread ID 7348.</p>

<p>If we check the properties of row <code class="language-plaintext highlighter-rouge">Process Start</code>, procmon shows the command line to be</p>

<blockquote>
  <p>“python.exe” “-c” “from multiprocessing.spawn import spawn_main; spawn_main(parent_pid=1692, pipe_handle=448)” “–multiprocessing-fork”</p>
</blockquote>

<p>The child process is bootstrapped by the <code class="language-plaintext highlighter-rouge">spawn</code> module in the <code class="language-plaintext highlighter-rouge">multiprocessing</code> package; the function called is <code class="language-plaintext highlighter-rouge">spawn_main</code>.</p>

<p>The code is -</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def spawn_main(pipe_handle, parent_pid=None, tracker_fd=None):
    '''
    Run code specified by data received over pipe
    '''
    assert is_forking(sys.argv), "Not forking"
    if sys.platform == 'win32':
        import msvcrt
        import _winapi

        if parent_pid is not None:
            source_process = _winapi.OpenProcess(
                _winapi.SYNCHRONIZE | _winapi.PROCESS_DUP_HANDLE,
                False, parent_pid)
        else:
            source_process = None
        new_handle = reduction.duplicate(pipe_handle,
                                         source_process=source_process)
        fd = msvcrt.open_osfhandle(new_handle, os.O_RDONLY)
        parent_sentinel = source_process
</code></pre></div></div>

<p>multiprocessing calls <code class="language-plaintext highlighter-rouge">OpenProcess</code> with <code class="language-plaintext highlighter-rouge">PROCESS_DUP_HANDLE</code>, basically creating a bridge between parent process and child process.</p>

<p>Let’s look at the definition of <code class="language-plaintext highlighter-rouge">join</code> method on python doc</p>

<blockquote>
  <p>If the optional argument timeout is None (the default), the method blocks until the process whose join() method is called terminates. If timeout is a positive number, it blocks at most timeout seconds. Note that the method returns None if its process terminates or if the method times out. Check the process’s exitcode to determine if it terminated.</p>
</blockquote>

<blockquote>
  <p>A process can be joined many times.</p>
</blockquote>

<p>If <code class="language-plaintext highlighter-rouge">join</code> is called without a timeout (the default), it waits until the child process exits, blocking the execution of the main thread.</p>

<p><img src="/what-i-learnt/assets/process_exit.png" alt="PROCESS EXIT" /></p>

<p>There is a row for thread exit and process exit. Thread <code class="language-plaintext highlighter-rouge">8868</code> is the main thread, and it issues the <code class="language-plaintext highlighter-rouge">ExitProcess</code>.</p>

<p>However, in the <code class="language-plaintext highlighter-rouge">never-ending</code> version of the program, there is no thread exit for the main thread, and by extension there is no <code class="language-plaintext highlighter-rouge">ExitProcess</code>. This is because the queue is full and has not been read, so a pipe is still open between the child and parent process; as a result, the child cannot exit.</p>

<p>Refer to the image below: the process and its main thread get created, but the main thread never exits, and by extension the process doesn't exit. It is waiting for <code class="language-plaintext highlighter-rouge">p.join()</code> to return, which in this case blocks forever. The console won't even accept <code class="language-plaintext highlighter-rouge">CTRL+C</code>; we need to kill the console or the parent process from another terminal.</p>

<p><img src="/what-i-learnt/assets/no_exit.png" alt="PROCESS DOESNT EXIT" /></p>]]></content><author><name></name></author><summary type="html"><![CDATA[On a fateful day, I had to analyze 50GB of application logs. Although structured, the application logs were chaotic at best because the request and response for external calls could be XML or JSON, fields could be missing, etc.]]></summary></entry><entry><title type="html">Typing, Strong vs Weak, Static vs Dynamic</title><link href="https://blogs.tusharsaurabh.com/2024/11/24/type.html" rel="alternate" type="text/html" title="Typing, Strong vs Weak, Static vs Dynamic" /><published>2024-11-24T00:00:00+00:00</published><updated>2024-11-24T00:00:00+00:00</updated><id>https://blogs.tusharsaurabh.com/2024/11/24/type</id><content type="html" xml:base="https://blogs.tusharsaurabh.com/2024/11/24/type.html"><![CDATA[<blockquote>
  <p>I thought Python was a weakly typed programming language because we didn’t need to define the variable type.</p>
</blockquote>

<p>Let’s define Type Systems!</p>

<p>The type system is a set of rules governing the allocation of memory, operations allowed, etc. Programming language can be -</p>

<ol>
  <li>Statically typed vs Dynamically typed</li>
  <li>Strongly typed vs weakly typed</li>
</ol>

<p><code class="language-plaintext highlighter-rouge">STATICALLY TYPED</code> languages are those where the programmer defines the type of variable, such as in <code class="language-plaintext highlighter-rouge">C\C++.</code></p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>int x = 0
</code></pre></div></div>

<p>My understanding of <code class="language-plaintext highlighter-rouge">statically typed</code> language is wrong because in <code class="language-plaintext highlighter-rouge">Go,</code> someone can define variables as</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>x := 0
</code></pre></div></div>

<p>As per my understanding, this should NOT be statically typed, but <code class="language-plaintext highlighter-rouge">Go</code> is a statically typed language. The correct interpretation of a statically typed language is that the variable type is known at compile time.</p>

<p>This definition of static typing makes the definition of dynamic typing obvious: the type is known at run time, which is the case in <code class="language-plaintext highlighter-rouge">python</code> and <code class="language-plaintext highlighter-rouge">javascript.</code></p>

<p>What about <code class="language-plaintext highlighter-rouge">strong</code> typing? These are the rules governing which operations are allowed on a type. In Python, one cannot add an <code class="language-plaintext highlighter-rouge">int</code> to a <code class="language-plaintext highlighter-rouge">string</code>; hence, Python is strongly typed, while in Javascript one can go crazy, which makes it weakly typed.</p>
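
<p>A quick session makes the contrast concrete. Python refuses the operation outright, while the equivalent expression in Javascript happily coerces the number and returns the string <code class="language-plaintext highlighter-rouge">"12"</code>:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&gt;&gt;&gt; 1 + "2"
Traceback (most recent call last):
  File "&lt;stdin&gt;", line 1, in &lt;module&gt;
TypeError: unsupported operand type(s) for +: 'int' and 'str'
</code></pre></div></div>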

<p>That’s all for now. Have fun!!</p>

<p>PS: The article is an extremely watered-down version of Typing. Type Systems have a chapter on their own in any book related to program analysis or compilers.</p>]]></content><author><name></name></author><summary type="html"><![CDATA[I thought Python was a weakly typed programming language because we didn’t need to define the variable type.]]></summary></entry></feed>