⚖️ Legal Ops & Compliance

Contract workflows, forms, document review, archive search, and evidence-oriented legal and compliance support.

Who this is for

  • Legal operations, compliance, and records teams that need document intake, review, redaction, and archive workflows.
  • Teams preparing evidence packets where provenance and review boundaries matter.

Jobs covered

  • Convert scanned PDFs and office files into searchable text.
  • Extract clauses, tables, attachments, and metadata from mixed records.
  • Run cited research and matter knowledge retrieval with source boundaries.
  • Build diligence review tables and route higher-risk agent actions through approval gates.
  • Redact sensitive data before sharing or indexing.
  • Search large archives before manual review.
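
The approval-gate job above can be illustrated with a minimal sketch using only the Python standard library. It is not the implementation of any listed skill: the `ApprovalGate` class, the action names, and the record layout are all hypothetical and chosen only to show the idea of holding higher-risk actions until a named reviewer approves them, while keeping a record of each decision.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ApprovalGate:
    # Hypothetical policy: action names that need a named human approver.
    high_risk: frozenset = frozenset({"send_external_email", "file_with_regulator"})
    log: list = field(default_factory=list)

    def request(self, action: str, payload: dict, approver: Optional[str] = None) -> bool:
        # Low-risk actions pass; high-risk actions pass only with an approver.
        allowed = action not in self.high_risk or approver is not None
        entry = {"action": action, "payload": payload,
                 "approver": approver, "allowed": allowed}
        # Fingerprint the record (canonical JSON) so it can be compared
        # against a copy kept elsewhere.
        canonical = json.dumps(entry, sort_keys=True).encode("utf-8")
        entry["digest"] = hashlib.sha256(canonical).hexdigest()
        self.log.append(entry)
        return allowed


gate = ApprovalGate()
gate.request("search_archive", {"query": "lease"})                    # allowed
gate.request("send_external_email", {"to": "counsel"})                # held
gate.request("send_external_email", {"to": "counsel"}, "reviewer-1")  # allowed
```

Every request is logged whether or not it is allowed, so the log doubles as decision evidence for later review.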

Workflow Stacks

  • Document review packet: OCR → extract text and tables → redact PII → search archive → export review notes
  • Signing and forms: Prepare PDF forms → route signature → store final packet → index metadata
  • Research and diligence support: Search cited sources → ingest matter documents → extract review-table fields → gate external actions → preserve decision evidence
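
As a rough illustration of the document review packet stack, the sketch below chains the redact, search, and export steps over text that is assumed to have already been OCRed and extracted per page. It uses only the Python standard library. The regular expressions are a simplified stand-in for a dedicated redaction tool and will miss many kinds of PII, and the names `ReviewNote`, `redact`, and `build_review_notes` are hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical patterns: a stand-in for a dedicated redaction tool,
# not a complete PII detector.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


@dataclass
class ReviewNote:
    source: str   # file the hit came from
    page: int     # 1-based page number, kept for provenance
    snippet: str  # redacted page text shown to the reviewer


def redact(text: str) -> str:
    """Replace email addresses and SSN-like numbers with placeholders."""
    text = EMAIL.sub("{{EMAIL}}", text)
    return SSN.sub("{{SSN}}", text)


def build_review_notes(pages_by_file: dict, term: str) -> list:
    """Redact every page first, then keep pages that mention the term."""
    notes = []
    for source, pages in pages_by_file.items():
        for number, raw in enumerate(pages, start=1):
            clean = redact(raw)  # redact before anything is searched or exported
            if term.lower() in clean.lower():
                notes.append(ReviewNote(source, number, clean))
    return notes


pages = {
    "exhibit_a.pdf": [
        "Indemnification applies. Contact jane@example.com.",
        "Signature page.",
    ]
}
notes = build_review_notes(pages, "indemnification")
```

Redaction runs before search so that unredacted text never reaches the exported notes, and each note keeps its source file and page number for provenance.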

Recommended Picks

| Skill | What it does here | Persona | Install | Stars |
| --- | --- | --- | --- | --- |
| Documenso Open Source Document Signing Platform | Adds an auditable signing path for contract and approval packets. | Legal ops / contract admin | High | 12.6k |
| DocuSeal Open Source Document Signing and PDF Form Platform | Combines PDF form preparation and signatures for document-heavy approval flows. | Legal ops / forms administrator | Medium | 11.7k |
| OCRmyPDF Searchable PDF OCR Pipeline | Turns scanned evidence and records into searchable PDFs before review. | Records manager / compliance analyst | Medium | 33.2k |
| Apache Tika Document Extractor | Provides broad-format document extraction when matter files include Office docs, PDFs, and attachments. | eDiscovery engineer / records ops | High | 3.7k |
| Apache Tika Document Parser | Extracts metadata and embedded objects from heterogeneous files for archive triage. | Compliance engineer / archive specialist | High | 3.7k |
| Extract structured text, metadata, tables, and images from mixed documents through an MCP server with Kreuzberg | Adds an MCP-accessible extraction layer for PDFs, Office files, images, HTML, and other mixed matter inputs before review or indexing. | Matter knowledge engineer / eDiscovery ops | High | 7.6k |
| pdfplumber Python PDF Text and Table Extraction Library | Pulls tables, text, and layout clues from contract exhibits and regulatory PDFs. | Legal analyst / data wrangler | Medium | 10.1k |
| Parse local PDFs into agent-ready text, JSON, and screenshots with LiteParse | Creates text, spatial JSON, and screenshots so reviewers can inspect what an agent saw. | Document review lead / AI ops | Medium | 5.1k |
| Search PDFs, Office files, ebooks, and archives with one query before manual review | Finds relevant records across mixed archives before humans spend time opening files one by one. | Investigator / records analyst | Low | 9.6k |
| Paperless-ngx Document OCR and Archive Management System | Provides a durable archive system for scanned paperwork, tags, correspondents, and retrieval. | Compliance ops / records manager | High | 38.1k |
| LangExtract LLM-Powered Structured Text Extraction | Extracts named entities, obligations, dates, and clauses into auditable structured outputs. | Legal analyst / compliance reviewer | Medium | 35k |
| Turn messy document collections into structured rows with DocETL | Turns large contract, diligence, or evidence sets into repeatable structured rows with failure review across the corpus. | Diligence lead / legal data analyst | High | 3.7k |
| Redact PII from text before sharing or indexing with scrubadub | Redacts sensitive identifiers before content enters search, summarization, or external review. | Privacy analyst / compliance ops | Low | 421 |
| Search large PDFs and read only the relevant pages before answering | Limits review to relevant pages of long PDFs instead of pushing full documents through an agent. | Legal researcher / review analyst | Medium | 17 |
| Run local deep research workflows with Local Deep Research | Runs private cited research across web, academic, and local document sources while preserving source links and a controlled knowledge base. | Legal researcher / knowledge manager | High | 7.9k |
| Process, redact, OCR, and sign documents with Nutrient Agent Skill | Bundles OCR, redaction, form filling, conversion, and signing for governed document operations. | Document automation lead | High | 5 |
| Convert dense PDFs into LLM-ready text and page-aligned markdown with olmOCR | Converts dense scanned or layout-heavy PDFs into page-aligned text for cited review. | eDiscovery analyst / knowledge engineer | High | 17.1k |
| Turn documents into validated knowledge graphs with Docling Graph | Extracts schema-checked entities and relationships when matters need structured fact maps. | Knowledge engineer / compliance analyst | High | 134 |
| Use RAGFlow as a retrieval and context layer for agent workflows | Provides a supervised RAG layer for matter document knowledge bases with traceable source support before agent answers are reviewed. | Matter knowledge manager / legal AI ops | High | 79.8k |
| Extract structured markdown, JSON, and tagged-PDF-ready outputs from PDFs with OpenDataLoader PDF | Produces markdown, coordinate-aware JSON, and accessibility-oriented outputs from PDF packets. | Document processing engineer | High | 19.1k |
| Enrich Paperless-ngx documents with AI-generated titles tags and correspondents using paperless-gpt | Improves archive metadata after ingestion so humans can search and route records faster. | Records manager / knowledge ops | High | 2.3k |
| Capture a live webpage as a clean PDF or readable archive for offline review with Percollate | Preserves web evidence as readable offline artifacts for citation and handoff. | Investigator / compliance analyst | Low | 4.6k |
| Extract structured data and attachments from raw email with MailParser | Normalizes raw email evidence and attachments before archive search or review. | Legal ops / mailbox reviewer | Medium | 1.7k |
| Strip quoted email history and signatures before summarizing inbound replies | Separates the newest human reply from long threads so summaries do not duplicate history. | Case manager / legal assistant | Low | 78 |
| Load .mbox mail archives into SQLite for offline search, audits, and dataset joins | Turns mailbox archives into queryable SQLite evidence stores for offline audit work. | Investigator / data analyst | Medium | 39 |
| Put approval gates and audit-ready policy checks between agents and external actions with DashClaw | Adds approval gates and replayable decision evidence when legal AI workflows need human review before external actions. | Legal AI governance lead / compliance ops | High | 241 |
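
To make the mailbox-to-SQLite idea from the table concrete, here is a minimal sketch that uses only Python's standard `mailbox` and `sqlite3` modules. It is not the listed skill's implementation; the `load_mbox` function, the `messages` table, and its columns are hypothetical, and only four headers are kept.

```python
import mailbox
import sqlite3


def load_mbox(mbox_path: str, db_path: str = ":memory:") -> sqlite3.Connection:
    """Copy selected headers from an .mbox file into a SQLite table."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS messages ("
        "message_id TEXT, sender TEXT, subject TEXT, sent TEXT)"
    )

    def header(msg, name):
        # Missing headers become NULL; everything else is stored as text.
        value = msg.get(name)
        return None if value is None else str(value)

    # create=False raises an error instead of silently creating an empty mailbox.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        rows = [
            (header(m, "Message-ID"), header(m, "From"),
             header(m, "Subject"), header(m, "Date"))
            for m in box
        ]
    finally:
        box.close()

    conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn
```

Once loaded, ordinary SQL handles the audit queries, for example `conn.execute("SELECT sender, COUNT(*) FROM messages GROUP BY sender")`.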

Editorial Notes

  • The collection avoids legal-advice framing; these are intake, evidence, and operations tools.
  • Document-centric entries are favored over general security scanners unless they support compliance evidence work directly.
  • Research and RAG picks are framed as source-grounded support for legal operations and human review, not automated legal advice.
  • Do not let infra-policy scanners take over this collection. Keep v1 document-centric.
