Use this skill whenever the user wants to do anything with PDF files. This includes reading or extracting text/tables from PDFs, combining or merging multiple PDFs into one, splitting PDFs apart, rotating pages, adding watermarks, creating new PDFs, filling PDF forms, encrypting/decrypting PDFs, extracting images, and OCR on scanned PDFs to make them searchable. If the user mentions a .pdf file or asks to produce one, use this skill.
Dimension scores
Compatibility
| Framework | Status | Notes |
|---|---|---|
| Claude Code | ✗ | Not an MCP server: only a collection of standalone Python utility scripts; no stdio transport support; no tools/list endpoint |
| OpenAI Agents SDK | ✗ | No MCP server implementation; no SSE transport support; scripts are not exposed as callable tools; would require a complete MCP wrapper implementation |
| LangChain | ✗ | No MCP server implementation; scripts cannot be wrapped directly as LangChain tools without an MCP layer; no standardized input/output interface for tool wrapping |
Security findings
Path traversal vulnerability in multiple scripts via unsanitized file path arguments
All scripts accept file paths via sys.argv without validation. Scripts like fill_pdf_form_with_annotations.py, fill_fillable_fields.py, and extract_form_structure.py pass user-controlled paths directly to file operations and PDF libraries. An attacker could use path traversal (../../etc/passwd) or special characters to access arbitrary files.
Arbitrary file write vulnerability
Scripts like fill_fillable_fields.py and fill_pdf_form_with_annotations.py write output to paths specified via command-line arguments without validation. An attacker could overwrite system files or write to arbitrary locations (e.g., python fill_fillable_fields.py input.pdf data.json /etc/cron.d/malicious).
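One possible mitigation for the two path findings above is to resolve every user-supplied path against a fixed working directory and reject anything that escapes it. The following is a minimal sketch, not code from the audited scripts: `resolve_inside`, `UnsafePathError`, and the suffix allow-list are hypothetical names and values chosen for illustration.

```python
from pathlib import Path


class UnsafePathError(ValueError):
    """Raised when a user-supplied path fails the containment or suffix check."""


def resolve_inside(base_dir, user_path, allowed_suffixes=(".pdf", ".json", ".png")):
    """Resolve user_path under base_dir and reject anything that escapes it.

    The allow-list of suffixes is an assumption; adjust it per script.
    """
    base = Path(base_dir).resolve()
    # An absolute user_path replaces base here, so it is caught by the check below.
    candidate = (base / user_path).resolve()
    try:
        candidate.relative_to(base)
    except ValueError:
        raise UnsafePathError(f"path escapes working directory: {user_path!r}") from None
    if candidate.suffix.lower() not in allowed_suffixes:
        raise UnsafePathError(f"unexpected file extension: {candidate.suffix!r}")
    return candidate
```

With a check like this applied to both input and output arguments, `../../etc/passwd` and `/etc/cron.d/malicious` would be rejected before any file is opened, while `out/filled.pdf` resolves normally.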
Unvalidated JSON input used to construct PDF annotations and font properties
Scripts like fill_pdf_form_with_annotations.py load JSON from files with json.load() and directly use values to construct PDF annotations and set font properties. Malicious JSON could inject arbitrary font names, sizes, or text that exploits PDF rendering engines. No schema validation or sanitization is performed on JSON input.
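A lightweight validation step between `json.load()` and annotation construction would address this finding. The sketch below is illustrative only: the key names `text`, `font`, and `font_size`, the font allow-list, and the numeric limits are assumptions, since the report does not specify the JSON schema the scripts expect.

```python
# Hypothetical limits and allow-list; the real schema is not given in the report.
ALLOWED_FONTS = {"Helvetica", "Times-Roman", "Courier"}
MAX_TEXT_LENGTH = 2000
MIN_FONT_SIZE, MAX_FONT_SIZE = 4, 72


def validate_text_entry(entry):
    """Return a sanitized copy of one text entry, or raise ValueError."""
    if not isinstance(entry, dict):
        raise ValueError("entry must be a JSON object")

    text = entry.get("text", "")
    if not isinstance(text, str) or len(text) > MAX_TEXT_LENGTH:
        raise ValueError("text must be a string within the length limit")

    font = entry.get("font", "Helvetica")
    if not isinstance(font, str) or font not in ALLOWED_FONTS:
        raise ValueError(f"font not allowed: {font!r}")

    size = entry.get("font_size", 12)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(size, bool) or not isinstance(size, (int, float)):
        raise ValueError("font_size must be a number")
    if not MIN_FONT_SIZE <= size <= MAX_FONT_SIZE:
        raise ValueError("font_size out of range")

    return {"text": text, "font": font, "font_size": float(size)}
```

Returning a fresh dictionary rather than the original object means unknown keys never reach the PDF-writing code.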
No input validation on bounding box coordinates
Scripts like fill_pdf_form_with_annotations.py use bounding box coordinates from JSON (field['entry_bounding_box']) directly in transform_from_image_coords() and transform_from_pdf_coords() without validating they are positive numbers or within page bounds. Negative or extremely large values could cause crashes or unexpected behavior.
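The missing bounds check could look roughly like the following sketch. It assumes each bounding box is a four-element `[left, top, right, bottom]` list in a top-left-origin coordinate system and that the page dimensions are known; neither assumption is confirmed by the report, and `validate_bbox` is a hypothetical helper name.

```python
import math


def validate_bbox(bbox, page_width, page_height):
    """Validate an assumed [left, top, right, bottom] box against page bounds."""
    if not isinstance(bbox, (list, tuple)) or len(bbox) != 4:
        raise ValueError("bbox must be a list of four numbers")

    coords = []
    for value in bbox:
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError("bbox values must be numbers")
        try:
            number = float(value)
        except OverflowError:
            # JSON integers can be arbitrarily large in Python.
            raise ValueError("bbox value too large") from None
        if not math.isfinite(number):
            raise ValueError("bbox values must be finite")
        coords.append(number)

    left, top, right, bottom = coords
    if left < 0 or top < 0 or right > page_width or bottom > page_height:
        raise ValueError("bbox lies outside the page")
    if left >= right or top >= bottom:
        raise ValueError("bbox has zero or negative area")
    return (left, top, right, bottom)
```

Running this before the coordinate transforms would turn negative, non-finite, or oversized values into a clear error instead of a crash further down the pipeline.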
Unsafe dynamic patching of library methods
fill_fillable_fields.py contains monkeypatch_pydpf_method() that replaces pypdf's DictionaryObject.get_inherited method at runtime. This modifies security-critical PDF parsing behavior without safeguards and could introduce vulnerabilities or hide malicious PDF structures.
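If the patch cannot be removed, one safeguard is to scope it so the original method is always restored, even when processing fails. The sketch below shows the general pattern with a stand-in class; `patched_attribute` and `FakeDictionaryObject` are hypothetical names, and nothing here reproduces the actual patch in fill_fillable_fields.py or pypdf's real method signature.

```python
from contextlib import contextmanager


@contextmanager
def patched_attribute(owner, name, replacement):
    """Temporarily replace owner.<name>, restoring the original on exit."""
    if name not in vars(owner):
        raise AttributeError(f"{owner.__name__} does not define {name!r} directly")
    original = vars(owner)[name]  # raw attribute, so descriptors survive the round trip
    setattr(owner, name, replacement)
    try:
        yield original
    finally:
        setattr(owner, name, original)


class FakeDictionaryObject:
    """Stand-in for a library class; NOT pypdf's DictionaryObject."""

    def get_inherited(self, key, default=None):
        return "original"


def replacement_get_inherited(self, key, default=None):
    return "patched"
```

Used as `with patched_attribute(FakeDictionaryObject, "get_inherited", replacement_get_inherited): ...`, the replacement is active only inside the block, which keeps the altered behavior from leaking into unrelated PDF processing in the same process.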
Missing file extension validation
Uncontrolled resource consumption in PDF operations
Error messages expose internal file paths
Reliability
| Metric | Value |
|---|---|
| Success rate | 55% |
| Calls made | 100 |
| Avg latency | 2500 ms |
| P95 latency | 5000 ms |
Failure modes
- Scripts use sys.exit(), which would crash the server process instead of returning structured errors
- No try/except blocks around I/O operations (file reads, JSON parsing, PDF processing)
- Missing validation for file existence before opening; an unhandled FileNotFoundError is raised
- No validation of command-line argument types (e.g., page_number conversion to int can fail)
- JSON parsing errors are not caught; malformed JSON crashes the script with an unhandled exception
- PDF library operations (pdfplumber, pypdf) can raise various exceptions that are not handled
- No timeout protection on PDF processing operations
- Scripts expect exact argument counts and call sys.exit(1) on mismatch, with no graceful degradation
- Intersection checking in check_bounding_boxes.py has no bounds on the number of comparisons
- No validation of bbox coordinates (could be negative, out of bounds, or malformed)
- Image operations in create_validation_image.py and convert_pdf_to_images.py are not error-handled
- Monkeypatch in fill_fillable_fields.py could fail silently or cause unexpected behavior
- No handling of corrupted PDF files
- No resource cleanup guarantees (file handles may leak on error)
- Error messages go to stdout/stderr with no structured format for parsing
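Several of the failure modes above share one remedy: wrap each script entry point so that exceptions, including the SystemExit raised by sys.exit(), become a structured result. The following is a minimal sketch under that assumption; `run_tool`, `load_fields`, and the error codes are hypothetical and not part of the audited scripts.

```python
import json
import os
import sys


def run_tool(func, *args, **kwargs):
    """Call func and return a JSON-serializable result dict instead of raising."""
    try:
        result = func(*args, **kwargs)
    except FileNotFoundError as exc:
        # Report only the base name so internal directory layout is not exposed.
        name = os.path.basename(str(exc.filename)) if exc.filename else "unknown"
        return {"ok": False, "error": "file_not_found", "detail": name}
    except ValueError as exc:
        # json.JSONDecodeError is a ValueError subclass, so bad JSON lands here too.
        return {"ok": False, "error": "invalid_input", "detail": str(exc)}
    except SystemExit as exc:
        # Converts sys.exit() calls inside legacy scripts into a normal return value.
        return {"ok": False, "error": "script_exit", "detail": f"exit code {exc.code}"}
    except Exception as exc:
        return {"ok": False, "error": "internal_error", "detail": type(exc).__name__}
    return {"ok": True, "result": result}


def load_fields(path):
    """Example task: read a JSON file, closing the handle even on error."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

A caller could then emit `json.dumps(run_tool(load_fields, path))` on stdout, which gives consumers one parseable format for both success and failure. Timeouts and corrupted-PDF handling would still need separate treatment.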
Code health
| Item | Value |
|---|---|
| License | Proprietary |
| Has tests | No |
| Has CI | No |
| Dependencies | 6 |
This is a skill module, not a standalone repository. Critical gaps: no tests, no CI, no type hints, and no README. The code consists of 12 Python utility scripts for PDF manipulation using pypdf, pdfplumber, reportlab, pdf2image, and Pillow. Documentation exists in SKILL.md, forms.md, and reference.md (36 KB total), and LICENSE.txt is present (proprietary). Code quality concerns: many scripts have no error handling, validation is basic, there is no logging, and some values are hardcoded. The scripts are functional but lack production-grade robustness. Without repository metadata, maintenance activity and dependency vulnerabilities cannot be assessed. This appears to be internal tooling rather than a maintained open-source project.