PDF Text Extractor
Extract text from PDFs with OCR support. Perfect for digitizing documents, processing invoices, or analyzing content. No external dependencies to install; the required libraries are bundled.
Why this rating
Deterministic checks triggered by the tool's capabilities and the supporting evidence.
- Locality: Local
Parses PDFs locally and runs OCR using bundled libraries (no external services required).
- Data access: Sensitive
Processes the full contents of PDFs (e.g., invoices, contracts) and can extract metadata like author/title/creation date.
- Action surface: Execute
Runs local extraction/OCR routines and can batch-process many PDFs.
Best practices
Follow these steps to reduce risk when using this skill.
- Keep extracted text and OCR outputs in a protected folder and avoid posting raw transcripts publicly.
- Be cautious when parsing untrusted PDFs; scan files first and keep the parsing and OCR libraries updated to reduce PDF parsing risk.
- Spot-check OCR results (especially numbers/dates) before using them in downstream automations.
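The first and last practices above can be sketched in a few lines of Python. This is a minimal illustration, not part of the skill itself: the helper names `save_private` and `lines_to_review` are hypothetical, the permission modes assume a POSIX system, and the extracted text is assumed to be available as a plain string.

```python
import os
import re

# Any line containing a digit is worth comparing against the source PDF,
# since OCR errors in amounts and dates are easy to miss.
HAS_DIGIT = re.compile(r"\d")


def save_private(text: str, directory: str, name: str) -> str:
    """Hypothetical helper: write text so only the current user can read it (POSIX modes)."""
    os.makedirs(directory, mode=0o700, exist_ok=True)
    path = os.path.join(directory, name)
    # Create the file with owner-only read/write permissions.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(text)
    return path


def lines_to_review(text: str) -> list[tuple[int, str]]:
    """Hypothetical helper: return (line number, line) pairs that contain digits."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if HAS_DIGIT.search(line)
    ]
```

Reviewing only the flagged lines keeps the manual check short while still covering the values most likely to break a downstream automation.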
Evidence links
Public sources backing the indicator assignments.
Max-risk rule
If any single capability reaches a higher level, the whole indicator is rated at that higher level. This keeps ratings deterministic and easy to scan.
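The rule amounts to taking the maximum over a tool's capabilities. The sketch below is only an illustration: the three-step scale and its level names other than Sensitive are assumptions, since the full scale is not listed here, and `indicator_level` is a hypothetical function name.

```python
from enum import IntEnum


class DataAccess(IntEnum):
    # Assumed ordering for illustration; only "Sensitive" appears in the rating above.
    PUBLIC = 0
    INTERNAL = 1
    SENSITIVE = 2


def indicator_level(capability_levels: list[DataAccess]) -> DataAccess:
    """Return the highest level among the capabilities (the max-risk rule)."""
    if not capability_levels:
        raise ValueError("at least one capability level is required")
    return max(capability_levels)


# One sensitive capability is enough to rate the whole indicator Sensitive.
levels = [DataAccess.PUBLIC, DataAccess.SENSITIVE, DataAccess.INTERNAL]
overall = indicator_level(levels)
```

Because the result depends only on the highest level present, the same set of capabilities always produces the same rating regardless of order.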