Guide: n8n for Auto-Vectorizing PDFs from Google Drive to Zyrabit RAG
Integrate n8n (an open-source workflow automation tool) into your Oracle Zyrabit node to monitor a Google Drive folder, extract text from new PDFs, generate embeddings, and insert them into the local ChromaDB instance behind Zyrabit RAG. Everything stays sovereign: extraction, embedding, and storage run locally, with no data uploaded to external clouds. n8n runs on the same Forge docker-compose stack (observability profile).
Objective: Automate the indexing of contracts and notes so that Zyrabit chat can answer questions such as "Does this PDF violate CNBV?" with automatic citations. Setup time: about 20 minutes.
Phase 1: Install n8n on Forge Stack
Extend docker-compose.yml: add an n8n service under the observability profile.
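Once the service has been added and the stack restarted, a quick reachability check can confirm that n8n came up. The following is a minimal Python sketch, assuming n8n is published on its default port 5678 on localhost (the Forge compose file may map it differently) and relying on n8n's `/healthz` endpoint. The function names are illustrative.

```python
from urllib.request import urlopen


def healthz_url(base_url: str) -> str:
    """Build the health-check URL for an n8n instance."""
    return base_url.rstrip("/") + "/healthz"


def is_healthy(base_url: str, timeout: float = 3.0) -> bool:
    """Return True if the health endpoint answers with HTTP 200."""
    try:
        with urlopen(healthz_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection refused, timeouts, and HTTP error statuses
        # (urllib's URLError and HTTPError are OSError subclasses).
        return False


if __name__ == "__main__":
    # Assumption: default n8n port, reachable from the host.
    base = "http://localhost:5678"
    print("n8n reachable:", is_healthy(base))
```

If this reports False, check that the observability profile was actually enabled when the stack was started.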
Phase 2: Google Drive Credentials
Create a service account in Google Cloud Console and share the "Zyrabit-Ingesta" folder with it.
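The folder has to be shared with the service account's own email address, which is stored in the downloaded key file. The following is a minimal Python sketch for reading it, assuming the standard Google service account key layout (`type`, `client_email`, `private_key`). The file name `service-account.json` and the function name are hypothetical.

```python
import json
from pathlib import Path

# Fields present in a standard Google service account key file.
REQUIRED_FIELDS = ("type", "client_email", "private_key")


def service_account_email(key: dict) -> str:
    """Return the email to share the Drive folder with, after basic checks."""
    missing = [name for name in REQUIRED_FIELDS if name not in key]
    if missing:
        raise ValueError(f"key file is missing fields: {', '.join(missing)}")
    if key["type"] != "service_account":
        raise ValueError(f"expected a service account key, got type={key['type']!r}")
    return key["client_email"]


if __name__ == "__main__":
    # Hypothetical path: wherever the key downloaded from the console was saved.
    key_path = Path("service-account.json")
    if key_path.exists():
        key = json.loads(key_path.read_text(encoding="utf-8"))
        print("Share 'Zyrabit-Ingesta' with:", service_account_email(key))
    else:
        print(f"{key_path} not found; point this at your own key file")
```

Viewer access should be enough if the workflow only reads PDFs, although the guide does not state which role it expects.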
Phase 3: n8n Workflow "PDF → Automatic RAG"
Import the provided workflow JSON in the n8n UI.
n8n Steps:
- Credentials: New > Google Service Account > fill in the values from the service account JSON key (the OAuth2 credential type uses a client ID and secret instead).
- Google Drive Trigger: Watch the "Zyrabit-Ingesta" folder by its folder ID.
- Code Node: Install pdf-parse in the n8n container.
- HTTP Nodes: Point to internal Forge endpoints (LiteLLM/Chroma).
- Save & Activate.
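The Code and HTTP nodes above amount to three steps: split the extracted text into chunks, request one embedding per chunk, and upsert the chunks into Chroma. In the workflow this logic lives in n8n nodes. The Python below only illustrates the data shapes and is a sketch under stated assumptions: LiteLLM is assumed to expose the OpenAI-compatible embeddings format (`model` plus `input`), Chroma's upsert is assumed to take parallel `ids`, `embeddings`, `documents`, and `metadatas` lists, and the chunk sizes, model name, and metadata keys are hypothetical.

```python
import hashlib
import json


def chunk_text(text: str, size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into fixed-size character windows that overlap."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + size]
        if chunk.strip():  # skip whitespace-only windows
            chunks.append(chunk)
        if start + size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


def embeddings_request(chunks: list[str], model: str) -> dict:
    """Body for an OpenAI-compatible embeddings call (assumed LiteLLM format)."""
    return {"model": model, "input": chunks}


def chroma_upsert_body(file_id: str, file_name: str,
                       chunks: list[str], embeddings: list[list[float]]) -> dict:
    """Body for a Chroma upsert: parallel lists, one entry per chunk."""
    if len(chunks) != len(embeddings):
        raise ValueError("one embedding per chunk is required")
    # Deterministic IDs: re-ingesting the same Drive file overwrites its
    # chunks instead of creating duplicates.
    ids = [hashlib.sha256(f"{file_id}:{i}".encode()).hexdigest()[:16]
           for i in range(len(chunks))]
    metadatas = [{"source": file_name, "drive_file_id": file_id, "chunk": i}
                 for i in range(len(chunks))]
    return {"ids": ids, "embeddings": embeddings,
            "documents": chunks, "metadatas": metadatas}


if __name__ == "__main__":
    chunks = chunk_text("Lorem ipsum " * 300)
    # Stand-in vectors; real ones come from the embeddings response.
    fake_vectors = [[0.0, 0.0, 0.0] for _ in chunks]
    request = embeddings_request(chunks[:1], "example-embedding-model")
    print(json.dumps(request)[:80])
    body = chroma_upsert_body("drive-file-123", "contract.pdf", chunks, fake_vectors)
    print(len(body["ids"]), "chunks ready for upsert")
```

Storing the file name in the metadata is what later lets the chat cite the source document. The exact Chroma URL path depends on the server's API version, so it is left out here.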
Phase 4: Validation and Improvements
End-to-end test: Upload a contract PDF to the "Zyrabit-Ingesta" Drive folder. The n8n execution log should show the embeddings being generated, followed by the Chroma upsert. Then ask Zyrabit Chat "What does the CNBV PDF say?"; the answer should include an automatic RAG citation.
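To check the indexed chunks independently of the chat UI, the collection can be queried directly and the hits printed as citations. The following is a minimal Python sketch, assuming Chroma's query format (a `query_embeddings` list in the request, and nested `documents`, `metadatas`, and `distances` lists in the response, one inner list per query). The `source` metadata key and the function names are hypothetical.

```python
def query_body(question_embedding: list[float], n_results: int = 3) -> dict:
    """Body for a Chroma query using a single question embedding."""
    return {
        "query_embeddings": [question_embedding],
        "n_results": n_results,
        "include": ["documents", "metadatas", "distances"],
    }


def format_citations(response: dict) -> list[str]:
    """Turn a Chroma query response into one citation line per hit."""
    # One inner list per query embedding; only one query was sent.
    docs = response["documents"][0]
    metas = response["metadatas"][0]
    dists = response["distances"][0]
    lines = []
    for rank, (doc, meta, dist) in enumerate(zip(docs, metas, dists), start=1):
        source = (meta or {}).get("source", "unknown")
        snippet = " ".join(doc.split())[:80]  # collapse whitespace, trim
        lines.append(f"[{rank}] {source} (distance {dist:.3f}): {snippet}")
    return lines


if __name__ == "__main__":
    # Hand-written sample in the assumed response layout, not real output.
    sample = {
        "ids": [["a1"]],
        "documents": [["Clause 4: the provider must retain records."]],
        "metadatas": [[{"source": "contract.pdf"}]],
        "distances": [[0.25]],
    }
    for line in format_citations(sample):
        print(line)
```

If the freshly uploaded PDF does not appear among the top hits, the upsert step in the n8n execution log is the first place to look.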
Result: Drive → Auto-indexed PDF → RAG ready for sovereign chat. Ideal for PoCs.