Architecture

How Seminary organizes 10M+ records across 10 scholarly domains into a responsive desktop research tool.

Stack

  ┌─────────────────────────────────────────┐
  │  React + Zustand                        │  Frontend
  │  10 domain components, progressive UI   │
  └──────────────┬──────────────────────────┘
                 │  Tauri IPC (JSON-RPC)
  ┌──────────────▼──────────────────────────┐
  │  Rust / Tauri v2                        │  Backend
  │  Query engine, domain routing, caching  │
  └──────────────┬──────────────────────────┘
                 │  rusqlite
  ┌──────────────▼──────────────────────────┐
  │  SQLite (~2GB)                          │  Data
  │  30+ tables, 10M+ records               │
  │  FTS5 full-text search                  │
  └─────────────────────────────────────────┘

  ┌─────────────────────────────────────────┐
  │  50+ Python ETL Loaders                 │  Ingestion
  │  One per data source, TOML-configured   │
  └─────────────────────────────────────────┘

10 Analytical Domains

Each domain has a dedicated UI component, backend query path, and data model. Domains surface progressively — summary first, depth on demand.

Lexical

Word definitions across 12+ lexicons, including BDB, HALOT, LSJ, and BDAG

Morphological

Parsing and inflection analysis for Hebrew, Greek, Aramaic

Textual-Critical

Manuscript variants — MT, DSS, LXX, Byzantine traditions

Syntactic

Clause structure, word order, grammatical relationships

Discourse

Pericope-level structure, rhetorical patterns, chiasms

Versional

185+ translations compared side-by-side with originals

Diction

Word frequency, hapax legomena, semantic range analysis

Intertextual

2.5M+ cross-references: allusions, quotations, parallels

Onomastic

Proper name analysis: etymology, geography, genealogy

Prosodic

Hebrew poetry — parallelism, meter, accentuation (te'amim)
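
The "dedicated backend query path" per domain, together with the routing and caching named in the stack diagram, can be pictured as a dispatch table in front of a cache. The following is a minimal sketch only, written in Python for brevity even though the real backend is Rust. The handler names, the `ROUTES` table, and the placeholder return strings are all hypothetical and are not taken from Seminary's code.

```python
from functools import lru_cache

# Hypothetical handlers for two of the ten domains.
# They return placeholder strings instead of running real queries.
def lexical_summary(verse_ref: str) -> str:
    return f"lexical summary for {verse_ref}"

def morphological_summary(verse_ref: str) -> str:
    return f"morphological summary for {verse_ref}"

# Assumed routing table: domain name -> handler function.
ROUTES = {
    "lexical": lexical_summary,
    "morphological": morphological_summary,
}

@lru_cache(maxsize=1024)
def query_domain(domain: str, verse_ref: str) -> str:
    """Dispatch to the handler for `domain`; repeated lookups hit the cache."""
    try:
        handler = ROUTES[domain]
    except KeyError:
        raise ValueError(f"unknown domain: {domain}") from None
    return handler(verse_ref)

first = query_domain("lexical", "Gen 1:1")   # computed by the handler
second = query_domain("lexical", "Gen 1:1")  # served from the cache
```

Adding a domain in this sketch means adding one handler and one entry in the table; the caller never changes.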

Data Scale

  Category            Scale
  Database size       ~2 GB
  Total records       10M+
  Tables              30+
  Languages covered   9 ancient languages
  Lexicons            12+
  Bible translations  185+
  Cross-references    2.5M+
  ETL loaders         50+
  Tests               337 (99 Rust + 129 JS + 109 Python)

Design Decisions

Rust/Tauri over Electron

Tauri v2 with Rust backend

Queries over 10M+ SQLite records benefit from native performance. The query engine runs in Tauri's Rust backend, which has no garbage collector and therefore no GC pauses. The application binary is ~15 MB, versus 150 MB+ for Electron.

SQLite over PostgreSQL

Single-file embedded database with FTS5

Seminary is a desktop app, so it has no server dependency: the entire ~2 GB dataset ships with the app as a single SQLite file. FTS5 provides full-text search across all biblical texts without an external search engine.
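
To illustrate how FTS5 search works inside an embedded database, here is a minimal sketch using Python's built-in `sqlite3` module. It assumes the local SQLite build includes FTS5. The table name `verse_fts`, its columns, and the three sample rows are hypothetical and do not reflect Seminary's actual schema. The real queries run in Rust via rusqlite.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE verse_fts USING fts5(ref, body)")
conn.executemany(
    "INSERT INTO verse_fts (ref, body) VALUES (?, ?)",
    [
        ("Gen 1:1", "In the beginning God created the heaven and the earth"),
        ("John 1:1", "In the beginning was the Word"),
        ("Ps 23:1", "The LORD is my shepherd I shall not want"),
    ],
)

def search(conn, query, limit=10):
    """Return (ref, body) rows matching an FTS5 query, best match first."""
    sql = (
        "SELECT ref, body FROM verse_fts "
        "WHERE verse_fts MATCH ? ORDER BY rank LIMIT ?"
    )
    return conn.execute(sql, (query, limit)).fetchall()

hits = search(conn, "beginning")  # matches the Gen 1:1 and John 1:1 rows
```

The index lives in the same database file as the data, which is why no separate search service is needed.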

TOML-Driven Extensibility

Zero-code configuration for adding data sources

New lexicons, translations, and corpora are added via TOML config files — one per source. The ETL pipeline reads the config and handles ingestion, schema mapping, and indexing automatically.

Progressive Disclosure UI

Summary-first, depth on demand

Showing all 10 analytical domains at once would overwhelm the reader. Each verse instead shows a compact summary, and expanding a domain reveals its full detail. The workspace hierarchy (Verse → Pericope → Portfolio → LLM Analysis) supports both quick lookup and deep research.
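
The summary-first pattern amounts to lazy loading: the summary is always available, and the detail is fetched only when a domain is expanded. The following is a minimal sketch of that idea in Python. The actual UI is built with React and Zustand, and the `DomainPanel` class, its fields, and the sample strings are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

@dataclass
class DomainPanel:
    """One domain's view of a verse: summary up front, detail on demand."""
    name: str
    summary: str
    load_detail: Callable[[], str]  # invoked only on first expand
    _detail: Optional[str] = field(default=None, init=False, repr=False)

    def expand(self) -> str:
        # Fetch the detail once, then reuse it on later expands.
        if self._detail is None:
            self._detail = self.load_detail()
        return self._detail

fetch_count = 0

def fetch_lexical_detail() -> str:
    """Stand-in for a backend call; counts how often it runs."""
    global fetch_count
    fetch_count += 1
    return "full lexicon entries"

panel = DomainPanel("Lexical", "3 lemmas", fetch_lexical_detail)
collapsed_view = panel.summary  # no backend call yet
expanded_view = panel.expand()  # first expand triggers the fetch
panel.expand()                  # second expand reuses the stored detail
```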