Architecture Guide

A comprehensive, developer-friendly breakdown of the Imposter desktop application. This document explains how every core feature operates under the hood, the major technical challenges faced during development, and solutions that may help you build similar systems.

Electron 33+ • Node.js 20+ • Chromium Renderer • Context Isolation • AssemblyAI V3 • Tesseract.js

Why Electron?

A regular web app can't do what Imposter does. Browsers sandbox JavaScript: they block access to OS-level APIs, enforce CORS on cross-origin HTTP requests, and have no mechanism for content protection or global keyboard shortcuts.

Electron gives us two worlds: a full Node.js runtime (Main Process) for OS-level operations, and a Chromium browser (Renderer Process) for a rich UI. The cost is the ~150MB binary size, but the benefit is complete OS control with a web-tech UI — exactly what a stealth overlay application needs.

  • CORS Overhead: 0ms
  • Renderer Windows: 3
  • Local Data Storage: 100%

1. The Multi-Process Model

Electron uses a multi-process architecture modeled after Chromium. Each process is isolated — a crash in one renderer doesn't take down the main process or other windows. Understanding this separation is key to understanding every feature in Imposter.

Main Process

src/main/ • Node.js Runtime

  • Window Management — Creating and destroying the transparent glass windows (Main Chat, Snipper overlay, Dynamic Island). Each is a separate BrowserWindow instance.
  • Global Shortcuts — Registering system-wide keyboard shortcuts via globalShortcut.register(). These fire even when Imposter is not focused.
  • Network I/O — All external HTTP and WebSocket traffic (Ollama, OpenRouter, AssemblyAI) flows through Main to bypass browser CORS.
  • OCR Engine — Runs tesseract.js in-process for pixel-to-text extraction.
  • Lifecycle — App bootstrap, graceful shutdown, and crash recovery.

Renderer Process

src/renderer/ • Chromium Sandbox

  • Chat Interface — Main window. Markdown rendering (GFM), syntax-highlighted code blocks, conversation history, settings panel.
  • Dynamic Island — A tiny pill-shaped overlay window for real-time transcription text. Separate renderer, separate lifecycle.
  • Snipper Window — Temporary fullscreen transparent overlay for region selection. Spawned on-demand, destroyed after crop.
  • Audio Worklets — Custom AudioWorkletProcessor for slicing raw PCM audio into 100ms buffered chunks.
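  • Local Storage — Personas, resume, job description, and API keys are stored client-side. They are never uploaded to an Imposter server; they leave the machine only inside requests to the providers the user configures.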

The Bridge

preload.js • Context Isolation

  • The Renderer cannot import Node modules directly — contextIsolation: true enforces this boundary.
  • preload.js uses contextBridge.exposeInMainWorld() to inject a safe window.electronAPI object.
  • Each method maps to a specific ipcRenderer.invoke() or ipcRenderer.on() channel.
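  • Channels are hard-coded, so the UI cannot call arbitrary IPC channels. This keeps the IPC attack surface small, while context isolation keeps page scripts from tampering with the preload script's objects and prototypes.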
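
The whitelist idea can be sketched as follows. This is a minimal illustration, not Imposter's actual preload.js: the channel names 'ai-chat' and 'transcript' and all three method names are hypothetical (only 'audio-chunk' is named in this guide), and ipcRenderer is passed in as a parameter so the factory can be exercised outside Electron.

```javascript
// Hypothetical channel table: every channel the UI may touch is listed here.
const CHANNELS = Object.freeze({
  askAI: 'ai-chat',              // request -> response (assumed name)
  sendAudioChunk: 'audio-chunk', // fire & forget (named in this guide)
  onTranscript: 'transcript',    // main -> renderer push (assumed name)
});

// Builds the object that preload.js would expose as window.electronAPI.
function buildElectronAPI(ipcRenderer) {
  return {
    askAI: (payload) => ipcRenderer.invoke(CHANNELS.askAI, payload),
    sendAudioChunk: (chunk) => ipcRenderer.send(CHANNELS.sendAudioChunk, chunk),
    onTranscript: (callback) => {
      // Strip the IPC event object so the page only receives the text.
      const listener = (_event, text) => callback(text);
      ipcRenderer.on(CHANNELS.onTranscript, listener);
      // Hand back an unsubscribe function instead of exposing ipcRenderer.
      return () => ipcRenderer.removeListener(CHANNELS.onTranscript, listener);
    },
  };
}

// In preload.js (Electron only):
//   const { contextBridge, ipcRenderer } = require('electron');
//   contextBridge.exposeInMainWorld('electronAPI', buildElectronAPI(ipcRenderer));
```

Because the page only ever receives these three functions, a channel name never crosses the bridge as data.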

Architecture Diagram

[Figure: Imposter architecture diagram showing the OS layer, Main Process modules, Preload Bridge, Renderer windows, and external services]

Data flows downward from OS-level triggers through the Main Process, across the Preload bridge into Renderer UIs. External services (AssemblyAI, Ollama, OpenRouter) are accessed only from Main Process via native Node.js networking.

IPC Communication Patterns

All cross-process communication in Imposter uses Electron's IPC module. There are three distinct patterns depending on the direction and nature of the message:

Request → Response

Renderer calls ipcRenderer.invoke(channel, data) and awaits a promise from Main.

Used for: AI chat queries, OCR results, model list fetching.

Fire & Forget

Renderer calls ipcRenderer.send(channel, data) with no response expected.

Used for: Audio chunk streaming, window state changes, stopping transcription.

Main → Renderer Push

Main Process calls win.webContents.send(channel, data) to push data into the UI.

Used for: Live transcription text, screenshot delivery, settings sync.
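
The three patterns can be wired on the Main side roughly like this. It is a sketch under assumptions: the channel name 'ai-chat', the services object, and the function name wireIpc are hypothetical, and ipcMain is injected so the wiring can run without Electron.

```javascript
// Wires the three IPC patterns on the Main side.
// `services.chat` and `services.pushAudio` are hypothetical stand-ins.
function wireIpc(ipcMain, getWindows, services) {
  // 1. Request -> response: pairs with ipcRenderer.invoke() in the renderer.
  ipcMain.handle('ai-chat', async (_event, payload) => services.chat(payload));

  // 2. Fire & forget: pairs with ipcRenderer.send(); nothing is returned.
  ipcMain.on('audio-chunk', (_event, chunk) => services.pushAudio(chunk));

  // 3. Main -> renderer push: deliver the same message to every live window.
  return function broadcast(channel, data) {
    for (const win of getWindows()) {
      if (!win.isDestroyed()) win.webContents.send(channel, data);
    }
  };
}

// In the Main Process (Electron only):
//   const { ipcMain } = require('electron');
//   const broadcast = wireIpc(ipcMain, () => [chatWin, islandWin], services);
//   broadcast('transcript', 'partial text');
```

The isDestroyed() check matters for short-lived windows: sending to a window that has already been destroyed throws.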

2. Feature Deep-Dives

Step-by-step breakdowns of how each major subsystem works internally.

A. The Stealth Mode UI System

Goal: Create a UI that floats above everything, doesn't show up in screen sharing, and hides from the taskbar.

1. Window Creation

The Main Process creates a BrowserWindow with transparent: true, frame: false, and skipTaskbar: true. This strips the window's system chrome: no title bar, no frame, and no taskbar entry.

2. Z-Order Pinning

win.setAlwaysOnTop(true, 'screen-saver') pins the window at the screen-saver z-level — above almost every other OS element, including fullscreen applications and system dialogs.

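3. Content Protection (DRM)

win.setContentProtection(true) asks the OS to exclude the window from capture, so screen recording and sharing tools (OBS, Teams, Zoom, Discord) do not see it. On Windows, Electron implements this with SetWindowDisplayAffinity and the WDA_EXCLUDEFROMCAPTURE flag: on Windows 10 version 2004 and later the window is omitted from captures entirely, while older versions capture it as a black rectangle. On macOS it sets the NSWindow sharing type to NSWindowSharingNone. The exclusion is enforced by the OS window compositor, not by a trick inside the page.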

4. Glassmorphism Layer

With opacity: 0.9 and CSS backdrop-filter: blur(), the window blends into whatever is behind it. Combined with transparency, the app feels like a floating glass panel rather than a separate application.
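
Taken together, the four steps amount to a handful of options and two method calls. The sketch below only collects the values named above; the helper names stealthWindowOptions and applyStealth and the preloadPath parameter are hypothetical, and the window is passed in so the code can be checked without Electron.

```javascript
// Constructor options for the glass window (values taken from the steps above).
function stealthWindowOptions(preloadPath) {
  return {
    transparent: true, // no opaque background
    frame: false,      // no title bar or borders
    skipTaskbar: true, // no taskbar entry
    opacity: 0.9,      // slight see-through for the glass effect
    webPreferences: {
      contextIsolation: true,
      nodeIntegration: false,
      preload: preloadPath,
    },
  };
}

// Runtime flags that must be set on the created window.
function applyStealth(win) {
  win.setAlwaysOnTop(true, 'screen-saver'); // pin at the screen-saver z-level
  win.setContentProtection(true);           // exclude from screen capture
  return win;
}

// In the Main Process (Electron only):
//   const { BrowserWindow } = require('electron');
//   const win = applyStealth(new BrowserWindow(stealthWindowOptions(preloadPath)));
```

The blur itself is not a window option; it comes from the CSS backdrop-filter rule in the renderer's stylesheet.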

B. Screen Snipping & OCR Pipeline

Goal: Let the user draw a box on the screen, capture the text inside it, and insert it into the prompt.

1. Global Shortcut Trigger

User presses Ctrl+Shift+S. The Main Process intercepts via globalShortcut.register().

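2. Desktop Capture

Main Process calls desktopCapturer.getSources() with the screen source type; each returned source carries an image of the display as a NativeImage thumbnail, which serves as the full-screen screenshot. This image is immediately passed via IPC to a new temporary fullscreen Snipper Window.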

3. Region Selection

The user draws a rectangle on the frozen screenshot overlay. The Snipper captures coordinates (x, y, width, height) and sends them back to Main via IPC. Pressing Esc cancels the operation.

4. Crop & OCR

Main Process crops the original screenshot with the NativeImage crop() method, then feeds the cropped pixels to tesseract.js, which runs entirely locally: no data leaves the machine.

5. Text Delivery

Extracted text is sent via IPC to the Chat Renderer and auto-appended to the prompt textarea. The Snipper Window is destroyed. Total round-trip: ~2 seconds for most text.
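
One detail the steps above leave implicit is turning a mouse drag into a valid crop rectangle. The sketch below is an assumption about how that could be done, not the project's code: toCropRect is a hypothetical helper, and the scaleFactor parameter assumes the screenshot may be in physical pixels while the drag coordinates are in logical pixels.

```javascript
// Converts two drag corners into an { x, y, width, height } rectangle that is
// safe to crop with. Returns null when the selection has no area.
function toCropRect(start, end, imageSize, scaleFactor = 1) {
  const clamp = (value, lo, hi) => Math.min(Math.max(value, lo), hi);
  // The user may drag in any direction, so sort the corners first.
  const left = clamp(Math.round(Math.min(start.x, end.x) * scaleFactor), 0, imageSize.width);
  const top = clamp(Math.round(Math.min(start.y, end.y) * scaleFactor), 0, imageSize.height);
  const right = clamp(Math.round(Math.max(start.x, end.x) * scaleFactor), 0, imageSize.width);
  const bottom = clamp(Math.round(Math.max(start.y, end.y) * scaleFactor), 0, imageSize.height);
  const rect = { x: left, y: top, width: right - left, height: bottom - top };
  return rect.width > 0 && rect.height > 0 ? rect : null;
}

// In the Main Process (Electron and tesseract.js only):
//   const cropped = screenshot.crop(rect);                  // NativeImage
//   const { data } = await Tesseract.recognize(cropped.toPNG(), 'eng');
//   chatWindow.webContents.send('ocr-text', data.text);     // hypothetical channel
```

Returning null for an empty selection gives the caller one place to treat a stray click the same way as pressing Esc.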

C. Real-Time Voice Transcription

Goal: Capture system audio, stream it to AssemblyAI for live transcription, and display rolling text in the Dynamic Island overlay.

1. Audio Capture

User clicks "Live." The Renderer uses navigator.mediaDevices.getDisplayMedia({ audio: true }) to capture system audio. Unlike getUserMedia(), which records the microphone, this call can capture what the speakers output.

2. AudioWorklet Processing

Raw audio is piped through an AudioContext into a custom AudioWorkletProcessor that buffers 100ms of 16kHz mono PCM (1,600 samples per chunk). This buffer size is critical: AssemblyAI rejects chunks shorter than ~100ms (Error 3007: Input Duration Violation).

3. IPC Relay

The Worklet posts base64-encoded audio chunks to the Renderer, which immediately relays them to Main Process via ipcRenderer.send('audio-chunk', data). Fire-and-forget pattern — no await, no backpressure.

4. WebSocket to AssemblyAI

Main Process manages a persistent WSS connection to AssemblyAI's Streaming Transcription V3 endpoint. Audio chunks are funneled into the socket. A 15-second connection timeout and isConnecting flag guard against hung connections.

5. Dual-Window Broadcast

AssemblyAI replies with partial and final transcription JSON. Main Process broadcasts the text to both the Chat Window and the Dynamic Island Window simultaneously using webContents.send().

Dev Tip: Press F10 to grab the last finalized transcript and send it directly to the AI chat engine as a prompt — voice-to-AI in one keystroke.
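
The base64 hop in the IPC relay step can be illustrated as below. This is a sketch, not the project's encoder: the function names are hypothetical, and it assumes encoding happens where btoa() is available and decoding happens in the Main Process, where Node's Buffer exists.

```javascript
// Encode 16-bit PCM samples as base64 for transport over IPC.
function pcmToBase64(int16) {
  const bytes = new Uint8Array(int16.buffer, int16.byteOffset, int16.byteLength);
  let binary = '';
  const step = 0x8000; // build the string in slices to keep argument lists small
  for (let i = 0; i < bytes.length; i += step) {
    binary += String.fromCharCode(...bytes.subarray(i, i + step));
  }
  return btoa(binary);
}

// Decode on the Main side; the resulting bytes can be written to the socket.
function base64ToPcmBytes(b64) {
  return Buffer.from(b64, 'base64');
}
```

A 100ms chunk at 16kHz is 1,600 samples, or 3,200 bytes, which base64 expands to 4,268 characters.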

D. Multi-Provider AI Orchestration

Goal: Query local models (Ollama) or cloud models (OpenRouter) seamlessly, enriched with contextual data.

1. Context Assembly

The Renderer pulls from LocalStorage: active Persona (12 built-in interview types), System Prompt, user Resume, and target Job Description. These are merged into a hidden system message prepended to every request.

2. Provider Selection

The UI determines which backend to hit based on user settings — Ollama for fully local, CORS-free inference, or OpenRouter for cloud-hosted models. Both are called the same way through IPC.

3. CORS-Free Fetch

The payload is passed to Main Process, which sends it with net.fetch() from Electron's net module. The request is issued from the Main Process rather than from a web page, so the renderer's CORS checks never apply to it. A 120-second AbortController timeout protects against hanging requests.

4. Response Delivery

The AI response (Markdown string) is pushed back to the Renderer. The UI renders it with syntax highlighting, one-click copy for code blocks, and full GFM support including tables and task lists.
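
Context Assembly (step 1) can be pictured as a small pure function. The sketch below is illustrative only: buildMessages, the field names on the context object, and the section labels are hypothetical, and the role/content message shape is the common chat format that OpenRouter and Ollama's chat endpoint both accept.

```javascript
// Merges the stored context into one hidden system message and prepends it
// to the user's prompt. Empty or missing fields are skipped.
function buildMessages(context, userPrompt) {
  const sections = [
    ['Persona', context.persona],
    ['Instructions', context.systemPrompt],
    ['Resume', context.resume],
    ['Job description', context.jobDescription],
  ];
  const system = sections
    .filter(([, text]) => typeof text === 'string' && text.trim() !== '')
    .map(([label, text]) => `## ${label}\n${text.trim()}`)
    .join('\n\n');

  const messages = [];
  if (system) messages.push({ role: 'system', content: system });
  messages.push({ role: 'user', content: userPrompt });
  return messages;
}

// Example:
//   buildMessages({ persona: 'Backend interview', resume: '5 years Node.js' }, 'Explain CORS');
```

Building the messages before the provider is chosen keeps the rest of the pipeline identical for local and cloud backends.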

3. Window Lifecycle Management

Imposter manages three distinct window types, each with different creation timing, persistence, and destruction behavior:

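Window         | Created     | Lifetime   | Content Protection | Always On Top
---------------|-------------|------------|--------------------|-------------------
Chat (Main)    | App boot    | Persistent | Yes (Stealth)      | screen-saver level
Dynamic Island | App boot    | Persistent | Yes                | screen-saver level
Snipper        | On shortcut | Ephemeral  | No                 | Fullscreen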

4. Security Model

Privacy isn't a feature — it's the architectural foundation. Every design choice is made to ensure data stays local.

Context Isolation

Renderers cannot access Node.js. All OS interactions go through hard-coded IPC channels in preload.js. No dynamic channel creation is possible.

Zero Cloud Storage

Personas, resumes, API keys, and conversation history are stored in browser LocalStorage. No telemetry, no analytics, no server-side persistence.

Network Minimalism

Only two kinds of outbound connection exist: LLM API calls (user-configured) and the AssemblyAI WebSocket (optional). The cloud services, OpenRouter and AssemblyAI, require user-provided API keys; Ollama runs locally and needs none.

Volatile Runtime

No GPU cache, no disk-level session persistence. When the app closes, in-memory state is destroyed. Only explicit LocalStorage data survives restarts.

5. Development Challenges & Solutions

Building a robust desktop overlay app is fundamentally different from web development. Here are the real problems encountered and how they were solved:

Silent crashes on API failures or network timeouts

Deep, process-level crash guards: process.on('uncaughtException') and process.on('unhandledRejection').

Every IPC handler and Window creation call is wrapped in try/catch. A 120-second AbortController prevents AI requests from hanging indefinitely.
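
A rough sketch of both layers of protection follows. The helper names installCrashGuards and guarded and the { ok, data, error } result shape are assumptions made for illustration; the process object is a parameter only so the guards can be tested against a stand-in.

```javascript
// Last-resort guards: log instead of letting the Main Process die silently.
function installCrashGuards(log = console.error, proc = process) {
  proc.on('uncaughtException', (err) => log('uncaughtException:', err));
  proc.on('unhandledRejection', (reason) => log('unhandledRejection:', reason));
}

// Wraps an IPC handler so a thrown error becomes a plain result object
// that the renderer can display, rather than a rejected invoke().
function guarded(handler) {
  return async (...args) => {
    try {
      return { ok: true, data: await handler(...args) };
    } catch (err) {
      return { ok: false, error: err instanceof Error ? err.message : String(err) };
    }
  };
}

// Usage (Electron only, channel name hypothetical):
//   installCrashGuards();
//   ipcMain.handle('ai-chat', guarded(async (_event, payload) => askModel(payload)));
```

The process-level guards are a backstop; the per-handler wrapper is what lets the UI show a specific message for a specific failure.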

WebSocket connections to AssemblyAI randomly dropping or hanging

An aggressive isConnecting state flag prevents duplicate connections. A 15-second connection timeout fires if the handshake stalls.

Teardown logic individually cleans up AudioContext, WorkletNode, MediaStream tracks, and WebSocket — preventing cascading resource leaks where one dangling handle blocks garbage collection.
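
The flag-plus-timeout guard might look like the sketch below. It is not the project's implementation: createConnectionGuard is a hypothetical name, openSocket stands for any function that returns a promise of a connected socket with a close() method, and rejecting a duplicate call is one of several reasonable policies.

```javascript
// Guards a connection attempt with an isConnecting flag and a handshake timeout.
function createConnectionGuard(openSocket, timeoutMs = 15_000) {
  let isConnecting = false;
  let socket = null;

  async function connect() {
    if (socket) return socket; // already connected
    if (isConnecting) throw new Error('connection already in progress');
    isConnecting = true;

    let timer;
    let timedOut = false;
    const attempt = Promise.resolve().then(openSocket);
    // If the handshake finishes after we gave up, close the late socket.
    attempt.then((late) => { if (timedOut) late.close(); }, () => {});
    const timeout = new Promise((_, reject) => {
      timer = setTimeout(() => {
        timedOut = true;
        reject(new Error(`handshake timed out after ${timeoutMs} ms`));
      }, timeoutMs);
    });

    try {
      socket = await Promise.race([attempt, timeout]);
      return socket;
    } finally {
      clearTimeout(timer);
      isConnecting = false; // always release the flag, success or failure
    }
  }

  function disconnect() {
    if (socket) socket.close();
    socket = null;
  }

  return { connect, disconnect };
}
```

Resetting the flag in finally is the important part: a failed or timed-out attempt must not leave the guard stuck in the connecting state.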

CORS errors when calling OpenRouter/Ollama from the UI

Browsers enforce CORS on requests made by page scripts; requests issued from the Main Process are not subject to it. Solution: all API traffic is proxied through the Main Process using net.fetch(). The Renderer never makes direct HTTP calls to external services.

This also provides better error handling — Main Process can catch 4xx/5xx responses and transform them into user-friendly error messages before sending to the UI.
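
A minimal sketch of the proxy call, combining the 120-second timeout with status-to-message translation. The function names, the result shape, and the wording of the messages are all hypothetical; the fetch implementation is a parameter so that net.fetch can be supplied in Electron and a stub elsewhere.

```javascript
// Performs one request with a hard timeout and returns a plain result object.
async function requestWithTimeout(fetchImpl, url, init = {}, timeoutMs = 120_000) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(url, { ...init, signal: controller.signal });
    if (!res.ok) return { ok: false, error: describeStatus(res.status) };
    return { ok: true, data: await res.json() };
  } catch (err) {
    const error = controller.signal.aborted
      ? `Request timed out after ${timeoutMs / 1000} s`
      : `Network error: ${err.message}`;
    return { ok: false, error };
  } finally {
    clearTimeout(timer);
  }
}

// Maps HTTP status codes to messages a user can act on (wording is illustrative).
function describeStatus(status) {
  if (status === 401 || status === 403) return 'The API key was rejected. Check it in Settings.';
  if (status === 429) return 'Rate limit reached. Try again shortly.';
  if (status >= 500) return 'The AI provider is having problems. Try again later.';
  return `Request failed with HTTP ${status}.`;
}

// In the Main Process (Electron only):
//   const { net } = require('electron');
//   const result = await requestWithTimeout((u, i) => net.fetch(u, i), url, init);
```

Returning a result object instead of throwing means the renderer never has to interpret a raw rejection coming across IPC.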

White screen on high load (Render Process Gone)

A listener registered with win.webContents.on('render-process-gone') destroys the window whose Chromium renderer process died and spawns a fresh replacement window.

The Main Process is unaffected — it continues managing shortcuts, transcription, and other windows. Only the crashed renderer is recycled.
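
The recycle loop can be sketched as follows. watchRenderer and createReplacement are hypothetical names, and skipping the 'clean-exit' reason is an assumption about the desired policy rather than something this guide specifies.

```javascript
// Replaces a window whenever its renderer process dies unexpectedly.
function watchRenderer(win, createReplacement, log = console.warn) {
  win.webContents.on('render-process-gone', (_event, details) => {
    log(`renderer gone (${details.reason})`);
    if (details.reason === 'clean-exit') return; // normal shutdown, nothing to do
    if (!win.isDestroyed()) win.destroy();       // drop the dead window
    // Watch the replacement too, so repeated crashes keep being handled.
    watchRenderer(createReplacement(), createReplacement, log);
  });
  return win;
}

// In the Main Process (Electron only):
//   watchRenderer(createChatWindow(), createChatWindow);
```

Re-arming the listener on the replacement window is easy to forget; without it, only the first crash is recovered.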

AssemblyAI Error 3007: Input Duration Violation

The ScriptProcessorNode used originally delivered audio frames too short for AssemblyAI's minimum duration requirement.

Replaced with a custom AudioWorkletProcessor that accumulates samples in an internal ring buffer and only flushes when 100ms of audio is available. This eliminated all duration violations.
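
The accumulate-then-flush idea is sketched below as a plain class so it can run outside the audio thread. It is a simplification: it uses a fixed-size fill-and-flush buffer rather than a true ring buffer, the names PcmChunker and floatToInt16 are hypothetical, and it assumes the AudioContext already runs at 16kHz so no resampling is needed. Web Audio delivers 128 samples per callback, so one 1,600-sample chunk spans 12.5 callbacks.

```javascript
const SAMPLE_RATE = 16_000;
const CHUNK_MS = 100;
const CHUNK_SAMPLES = (SAMPLE_RATE * CHUNK_MS) / 1000; // 1600 samples

// Converts float samples in [-1, 1] to signed 16-bit PCM.
function floatToInt16(float32) {
  const out = new Int16Array(float32.length);
  for (let i = 0; i < float32.length; i++) {
    const s = Math.max(-1, Math.min(1, float32[i]));
    out[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return out;
}

class PcmChunker {
  constructor(chunkSamples = CHUNK_SAMPLES) {
    this.chunkSamples = chunkSamples;
    this.buffer = new Float32Array(chunkSamples);
    this.filled = 0;
  }

  // Appends samples and returns zero or more complete Int16Array chunks.
  push(samples) {
    const chunks = [];
    let offset = 0;
    while (offset < samples.length) {
      const n = Math.min(this.chunkSamples - this.filled, samples.length - offset);
      this.buffer.set(samples.subarray(offset, offset + n), this.filled);
      this.filled += n;
      offset += n;
      if (this.filled === this.chunkSamples) {
        chunks.push(floatToInt16(this.buffer)); // copy out, then reuse the buffer
        this.filled = 0;
      }
    }
    return chunks;
  }
}

// Inside the worklet file (browser only), process() would forward inputs[0][0]:
//   for (const chunk of chunker.push(channel)) this.port.postMessage(chunk.buffer, [chunk.buffer]);
```

Leftover samples stay in the buffer for the next callback, so no audio is dropped at chunk boundaries.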

6. Contributing & Local Development

Imposter is open source and welcomes contributions. Here's how to get started:

# Clone and install
git clone https://github.com/Puskar-Roy/Imposter.git
cd Imposter
npm install

# Start in development mode
npm run dev

# Build for production (Windows)
npm run build

Key files to explore
  • src/main/main.js — App entry point
  • src/main/preload.js — IPC bridge
  • src/renderer/app.js — Chat UI logic
  • src/renderer/pcm-processor.js — Audio worklet

Environment variables
  • ASSEMBLYAI_API_KEY — For transcription
  • OPENROUTER_API_KEY — For cloud AI
  • Ollama runs locally — no key needed
