Architecture
How Seminary organizes 10M+ records across 10 scholarly domains into a responsive desktop research tool.
Stack
┌─────────────────────────────────────────┐
│ React + Zustand                         │ Frontend
│ 10 domain components, progressive UI    │
└──────────────┬──────────────────────────┘
               │ Tauri IPC (JSON-RPC)
┌──────────────▼──────────────────────────┐
│ Rust / Tauri v2                         │ Backend
│ Query engine, domain routing, caching   │
└──────────────┬──────────────────────────┘
               │ rusqlite
┌──────────────▼──────────────────────────┐
│ SQLite (~2GB)                           │ Data
│ 30+ tables, 10M+ records                │
│ FTS5 full-text search                   │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│ 50+ Python ETL Loaders                  │ Ingestion
│ One per data source, TOML-configured    │
└─────────────────────────────────────────┘
10 Analytical Domains
Each domain has a dedicated UI component, backend query path, and data model. Domains surface progressively — summary first, depth on demand.
Lexical
Word definitions across 12+ lexicons (BDB, HALOT, LSJ, BDAG)
Morphological
Parsing and inflection analysis for Hebrew, Greek, Aramaic
Textual-Critical
Manuscript variants — MT, DSS, LXX, Byzantine traditions
Syntactic
Clause structure, word order, grammatical relationships
Discourse
Pericope-level structure, rhetorical patterns, chiasms
Versional
185+ translations compared side-by-side with originals
Diction
Word frequency, hapax legomena, semantic range analysis
Intertextual
2.5M+ cross-references: allusions, quotations, parallels
Onomastic
Proper name analysis: etymology, geography, genealogy
Prosodic
Hebrew poetry — parallelism, meter, accentuation (te'amim)
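The "one backend query path per domain" idea can be sketched as a small registry that maps a domain name to its handler. This is only an illustration of the routing shape: the real backend is Rust, and every name below (`DOMAIN_HANDLERS`, `domain`, `route`, the payload fields) is hypothetical rather than taken from Seminary's code.

```python
from typing import Callable

# Hypothetical registry: domain name -> query function for that domain.
DOMAIN_HANDLERS: dict[str, Callable[[str], dict]] = {}


def domain(name: str):
    """Register a function as the query path for one analytical domain."""
    def register(fn: Callable[[str], dict]) -> Callable[[str], dict]:
        DOMAIN_HANDLERS[name] = fn
        return fn
    return register


@domain("lexical")
def lexical(verse_ref: str) -> dict:
    # Placeholder payload; a real handler would query the lexicon tables.
    return {"domain": "lexical", "verse": verse_ref, "entries": []}


@domain("prosodic")
def prosodic(verse_ref: str) -> dict:
    # Placeholder payload; a real handler would return poetic analysis.
    return {"domain": "prosodic", "verse": verse_ref, "cola": []}


def route(domain_name: str, verse_ref: str) -> dict:
    """Dispatch a request to the handler registered for the domain."""
    try:
        handler = DOMAIN_HANDLERS[domain_name]
    except KeyError:
        raise ValueError(f"unknown domain: {domain_name}") from None
    return handler(verse_ref)
```

Under this assumption, adding an eleventh domain would mean registering one more handler, with no change to `route` itself.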
Data Scale
| Category | Scale |
|---|---|
| Database size | ~2 GB |
| Total records | 10M+ |
| Tables | 30+ |
| Languages covered | 9 ancient languages |
| Lexicons | 12+ |
| Bible translations | 185+ |
| Cross-references | 2.5M+ |
| ETL loaders | 50+ |
| Tests | 337 (99 Rust + 129 JS + 109 Python) |
Design Decisions
Rust/Tauri over Electron
Tauri v2 with Rust backend
SQLite queries over 10M records need native performance. Tauri's Rust backend runs the query engine without garbage-collection pauses. The binary is ~15 MB, versus 150 MB+ for Electron.
SQLite over PostgreSQL
Single-file embedded database with FTS5
This is a desktop app, so there is no server dependency. The entire ~2 GB dataset ships with the app. FTS5 provides full-text search across all biblical texts without an external search engine.
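As a rough illustration of embedded full-text search, the sketch below builds an FTS5 index in an in-memory SQLite database and queries it. It uses Python's standard `sqlite3` module rather than the app's Rust/rusqlite layer, and it assumes the linked SQLite build has FTS5 enabled. The table name `verses_fts` and its columns are hypothetical, not Seminary's actual schema.

```python
import sqlite3

# In-memory database for the sketch; the real app would open its on-disk file.
conn = sqlite3.connect(":memory:")

# Hypothetical table and column names; requires an SQLite build with FTS5.
conn.execute("CREATE VIRTUAL TABLE verses_fts USING fts5(ref, body)")
conn.executemany(
    "INSERT INTO verses_fts (ref, body) VALUES (?, ?)",
    [
        ("Gen 1:1", "In the beginning God created the heavens and the earth"),
        ("John 1:1", "In the beginning was the Word"),
        ("Ps 23:1", "The Lord is my shepherd"),
    ],
)


def search(query: str) -> list[str]:
    """Return references of verses matching an FTS5 query, best match first."""
    rows = conn.execute(
        "SELECT ref FROM verses_fts WHERE verses_fts MATCH ? ORDER BY rank",
        (query,),
    )
    return [ref for (ref,) in rows]
```

Calling `search("beginning AND word")` uses FTS5's boolean query syntax and matches only the verse containing both terms. No separate search service is involved at any point.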
TOML-Driven Extensibility
Zero-code configuration for adding data sources
New lexicons, translations, and corpora are added via TOML config files — one per source. The ETL pipeline reads the config and handles ingestion, schema mapping, and indexing automatically.
Progressive Disclosure UI
Summary-first, depth on demand
Showing all 10 analytical domains at once would overwhelm the reader. Each verse shows a compact summary; expanding a domain reveals full detail. The workspace hierarchy (Verse → Pericope → Portfolio → LLM Analysis) supports both quick lookup and deep research.