PyMuPDF PDF Parser Clawdbot Skill
Fast local PDF parsing with PyMuPDF (fitz), producing Markdown or JSON output and optionally extracting images and tables. Use it when speed matters more than robustness, or as a fallback when heavier parsers are unavailable. By default it parses a single PDF and writes to a per-document output folder.
Why this rating
Each level below comes from deterministic checks on the tool's capabilities and the supporting evidence.
- Locality: Local
Runs `pymupdf_parse.py` against local PDF files without external services.
- Data access: Sensitive
Reads full PDF contents and can extract embedded images and tables into output files.
- Action surface: Execute
Runs a local parsing script and writes results under `./pymupdf-output/<pdf-basename>/`.
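The per-document output layout described above can be sketched roughly as follows. This is a minimal illustration, not the contents of `pymupdf_parse.py`: the function names `output_dir_for` and `parse_pdf` and the `pages.json` file name are hypothetical, and only the `./pymupdf-output/<pdf-basename>/` folder convention comes from the listing. It assumes PyMuPDF is installed and importable as `fitz`.

```python
import json
from pathlib import Path


def output_dir_for(pdf_path, root="pymupdf-output"):
    """Return the per-document output folder: <root>/<pdf-basename>/."""
    return Path(root) / Path(pdf_path).stem


def parse_pdf(pdf_path, root="pymupdf-output"):
    """Extract plain text page by page and write it as JSON.

    The pages.json name and record shape are assumptions for illustration.
    """
    import fitz  # PyMuPDF; imported lazily so output_dir_for works without it

    out_dir = output_dir_for(pdf_path, root)
    out_dir.mkdir(parents=True, exist_ok=True)

    with fitz.open(pdf_path) as doc:
        pages = [
            {"page": number, "text": page.get_text("text")}
            for number, page in enumerate(doc, start=1)
        ]

    out_file = out_dir / "pages.json"
    out_file.write_text(
        json.dumps(pages, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return out_file
```

Calling `parse_pdf("reports/invoice.pdf")` would, under these assumptions, write `pymupdf-output/invoice/pages.json`. Keeping one folder per PDF means outputs from different documents never overwrite each other, provided the basenames differ.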
Best practices
Follow these steps to reduce risk when using this skill.
- Keep `pymupdf-output/` out of version control and restrict permissions on extracted text/images.
- Be cautious when parsing untrusted PDFs: scan them with antivirus software or parse them in a sandbox, and keep PyMuPDF up to date.
- Clean up temporary outputs for sensitive documents and avoid sharing extracted content without redaction.
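The permission and cleanup advice above might look like this in practice. It is a sketch that assumes a POSIX system, where `os.chmod` mode bits are fully honored; the helper names `lock_down` and `clean_up` are hypothetical and not part of the skill.

```python
import os
import shutil
import stat
from pathlib import Path


def lock_down(out_dir):
    """Make the output tree accessible to the owning user only."""
    out_dir = Path(out_dir)
    os.chmod(out_dir, stat.S_IRWXU)  # 0o700 on the top-level folder
    for path in out_dir.rglob("*"):
        if path.is_dir():
            os.chmod(path, stat.S_IRWXU)  # 0o700 for subfolders
        else:
            os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)  # 0o600 for files


def clean_up(out_dir):
    """Delete the output tree once the extracted content is no longer needed."""
    shutil.rmtree(out_dir, ignore_errors=True)
```

For version control, a single `pymupdf-output/` line in `.gitignore` is usually enough to keep extracted text and images out of commits.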
Evidence links
Public sources backing the indicator assignments.
Max-risk rule
If any capability reaches a higher level, the entire indicator is raised to that level. Taking the maximum keeps ratings deterministic and easy to scan.
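The rule amounts to taking the maximum over an ordered scale. A minimal sketch follows, assuming a hypothetical three-step scale; the level names in `LEVELS` are placeholders and are not the labels this listing actually uses.

```python
# Hypothetical ordered scale, lowest risk first; the real level names may differ.
LEVELS = ["low", "medium", "high"]


def indicator_level(capability_levels):
    """Return the highest level reached by any capability of an indicator."""
    return max(capability_levels, key=LEVELS.index)
```

With this scale, an indicator whose capabilities rate `"low"`, `"medium"`, and `"low"` would be reported as `"medium"`: one higher-rated capability is enough to raise the whole indicator.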