xlsx

SKILL

Use this skill any time a spreadsheet file is the primary input or output. This means any task where the user wants to: open, read, edit, or fix an existing .xlsx, .xlsm, .csv, or .tsv file (e.g., adding columns, computing formulas, formatting, charting, cleaning messy data); create a new spreadsheet from scratch or from other data sources; or convert between tabular file formats. Trigger especially when the user references a spreadsheet file by name or path — even casually (like "the xlsx in my downloads") — and wants something done to it or produced from it. Also trigger for cleaning or restructuring messy tabular data files (malformed rows, misplaced headers, junk data) into proper spreadsheets. The deliverable must be a spreadsheet file. Do NOT trigger when the primary deliverable is a Word document, HTML report, standalone Python script, database pipeline, or Google Sheets API integration, even if tabular data is involved.

v1.0.0 Tested 8 Feb 2026

3.0

Security gate triggered — critical vulnerabilities found. Overall score capped at 3.0.

Dimension scores

Security 3.0

Reliability 5.0

Agent usability 3.0

Compatibility 3.0

Code health 4.0

Compatibility

Framework	Status	Notes
Claude Code	✗	No MCP server implementation found (Python scripts only); no stdio transport implementation; no tools/list endpoint; no JSON-RPC protocol handling. This is a collection of utility scripts, not an MCP server.
OpenAI Agents SDK	✗	No MCP server implementation found; no SSE transport support; no server protocol implementation. The scripts would need to be wrapped in an MCP server to be usable.
LangChain	✗	No MCP server implementation; the scripts are standalone Python modules, not MCP tools; no tool schema definitions in MCP format. An MCP server wrapper would have to be built first.

Security findings

CRITICAL

Command injection vulnerability in recalc.py

Lines 67-77: The user-controlled 'filename' parameter is passed to a shell command via subprocess. The abs_path variable (derived from filename) is inserted into the cmd array without sanitization. An attacker could inject shell commands via specially crafted filenames such as 'file.xlsx; rm -rf /' or 'file.xlsx && malicious_command'.
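
A common mitigation for this class of finding is to pass the path as a single element of an argument list and never enable shell=True, so that no shell ever parses the filename. The sketch below is illustrative only: echo_argument is a hypothetical helper, and the Python interpreter stands in for whatever program recalc.py actually launches, which this report does not show.

```python
import subprocess
import sys


def echo_argument(filename: str) -> str:
    """Run a child process with `filename` as one argv entry; return what it received."""
    # Argument-list form: no shell is involved, so ';' and '&&' stay literal characters.
    cmd = [
        sys.executable,  # stand-in for the real external program (assumption)
        "-c",
        "import sys; sys.stdout.write(sys.argv[1])",
        filename,
    ]
    result = subprocess.run(cmd, capture_output=True, text=True, check=True, timeout=30)
    return result.stdout


# The hostile filename from the finding arrives in the child as plain data.
hostile = "file.xlsx; rm -rf /"
received = echo_argument(hostile)
```

If the command must be a list anyway, the remaining risk is usually a later step that joins the list into a string or runs it with shell=True, so that is the place to look in lines 67-77.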

CRITICAL

Arbitrary file read/write via path traversal

Multiple scripts (unpack.py, pack.py, recalc.py) accept file paths from user input without sanitization against '../' traversal patterns. Line 44 in unpack.py: 'output_path.mkdir(parents=True, exist_ok=True)' creates directories at arbitrary locations. Line 95 in pack.py: 'output_path.parent.mkdir(parents=True, exist_ok=True)' enables writing files outside intended scope.
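
One way to contain the traversal described above is to resolve every user-supplied path against an allowed base directory before any mkdir or write call. This is a minimal sketch under assumptions: resolve_inside and the idea of a single base directory are hypothetical and do not appear in unpack.py or pack.py.

```python
from pathlib import Path


def resolve_inside(base_dir: str, user_path: str) -> Path:
    """Return the absolute path for `user_path`, refusing anything outside `base_dir`."""
    base = Path(base_dir).resolve()
    # resolve() collapses '..' segments and follows symlinks before the check.
    candidate = (base / user_path).resolve()
    try:
        candidate.relative_to(base)
    except ValueError:
        raise ValueError(f"path escapes working directory: {user_path!r}") from None
    return candidate
```

A caller would then run `resolve_inside(workdir, requested).mkdir(parents=True, exist_ok=True)` rather than calling mkdir on the raw input. Absolute paths are rejected as well, because joining an absolute path onto the base discards the base.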

HIGH

Unsafe XML parsing allows XML entity expansion attacks

While defusedxml.minidom is used in some places (unpack.py line 28, pack.py line 13), xml.etree.ElementTree is used unsafely in validators/redlining.py lines 50-52 and 79-85 with ET.parse(). This allows XXE (XML External Entity) attacks and billion laughs attacks when parsing user-controlled XML content.
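
Since the skill already depends on defusedxml, switching these calls to defusedxml.ElementTree is probably the simplest fix. For illustration, the standard-library sketch below shows the underlying idea of refusing any document that declares a DOCTYPE, which is where entity definitions live. The names parse_untrusted_xml and DoctypeForbidden are hypothetical and not part of validators/redlining.py.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # Entity definitions (including billion-laughs payloads) require a DOCTYPE.
    raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")


def parse_untrusted_xml(data: bytes) -> ET.Element:
    """Pre-scan `data` with expat, then parse it only if no DOCTYPE was seen."""
    scanner = expat.ParserCreate()
    scanner.StartDoctypeDeclHandler = _reject_doctype
    scanner.Parse(data, True)  # raises DoctypeForbidden or expat.ExpatError
    return ET.fromstring(data)
```

The pre-scan stops at the DOCTYPE declaration, before any entity is expanded. Office Open XML parts do not normally need a DOCTYPE, so rejecting it outright is a reasonable default here.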

HIGH

Unsafe deserialization of XML without input validation

validators/pptx.py and validators/docx.py parse XML files with lxml.etree without validating structure first. An attacker could craft malicious XML that exploits parser vulnerabilities or causes resource exhaustion.

MEDIUM

Verbose error messages expose internal paths

MEDIUM

No file size limits on uploaded/processed files
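
Because .xlsx files are ZIP archives, a size gate can be applied before any part is extracted or loaded. The following is a sketch under assumptions: check_archive_size and both limits are hypothetical values chosen for illustration, not settings taken from the skill.

```python
import zipfile

MAX_UNCOMPRESSED_BYTES = 200 * 1024 * 1024  # hypothetical limit
MAX_MEMBERS = 10_000  # hypothetical limit


def check_archive_size(source, max_bytes=MAX_UNCOMPRESSED_BYTES, max_members=MAX_MEMBERS):
    """Return the declared uncompressed size of a ZIP, or raise ValueError if over limit.

    `source` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        members = archive.infolist()
        if len(members) > max_members:
            raise ValueError(f"too many archive members: {len(members)}")
        # file_size is the size declared in the ZIP headers; a hostile archive can
        # understate it, so extraction code should still cap the bytes it reads.
        total = sum(info.file_size for info in members)
        if total > max_bytes:
            raise ValueError(f"declared uncompressed size too large: {total} bytes")
    return total
```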

MEDIUM

Temporary files not securely cleaned up

MEDIUM

Shell command construction vulnerable to injection

Reliability

Success rate

65%

Calls made

100

Avg latency

2500ms

P95 latency

8000ms

Failure modes

• LibreOffice macro setup failure - no error recovery, returns generic error string
• Missing file paths - basic existence check but returns dict with 'error' key inconsistently
• Timeout on long-running operations - uses timeout command but no graceful handling of partial completion
• File corruption/invalid xlsx - openpyxl may throw uncaught exceptions during load_workbook
• Permission errors on file operations - no explicit handling of OS-level permission denied
• LibreOffice process hangs - timeout kills process but leaves zombie processes, no cleanup
• Concurrent access to same file - no file locking, race conditions possible
• Very large files - no size limits, may cause memory exhaustion or extreme latency
• Special characters in file paths - Path handling exists but not validated against shell injection in subprocess calls
• Missing openpyxl dependency - ImportError not caught at module level
• AF_UNIX socket shim compilation failure - gcc error messages not parsed or returned clearly
• LD_PRELOAD conflicts with other libraries - no detection or handling
• XML parsing failures in validators - some try/except blocks exist but inconsistent error messages
• LibreOffice not installed - subprocess calls will fail with FileNotFoundError but error message assumes it exists
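
Several of the failure modes above (missing binary, timeouts, permission errors, inconsistent error dicts) come down to subprocess calls without a uniform result shape. A minimal sketch of one possible wrapper follows. The run_tool name and the result keys are hypothetical, and the skill's real return format is not shown in this report.

```python
from __future__ import annotations

import subprocess


def run_tool(cmd: list[str], timeout_s: float) -> dict:
    """Run an external command and always return the same dict shape."""
    try:
        # subprocess.run kills the child and waits for it when the timeout expires.
        done = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except FileNotFoundError:
        return {"ok": False, "error": "not_installed", "detail": cmd[0]}
    except PermissionError as exc:
        return {"ok": False, "error": "permission_denied", "detail": str(exc)}
    except subprocess.TimeoutExpired:
        return {"ok": False, "error": "timeout", "detail": f"exceeded {timeout_s}s"}
    if done.returncode != 0:
        return {"ok": False, "error": "nonzero_exit", "detail": done.stderr.strip()}
    return {"ok": True, "error": None, "detail": done.stdout.strip()}
```

Callers can branch on the "error" code instead of parsing message strings. Note that the timeout only reaps the direct child. If LibreOffice spawns helper processes of its own, those may need process-group handling that this sketch does not attempt.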

Code health

License

Apache-2.0

Has tests

No

Has CI

No

Dependencies

openpyxl, defusedxml, lxml (implied); no lockfile

This is a skill module (not a standalone tool) for spreadsheet manipulation. Code quality signals are mixed: it has a license (Apache-2.0) and extensive documentation (SKILL.md, ~11 KB), but lacks critical development infrastructure. There are no tests, no CI/CD, and no type checking (type annotations are present, but neither mypy nor runtime checking is configured). The codebase is well structured, with modular helpers and validators, defensive XML parsing via defusedxml in places, and comprehensive Office format handling. Maintenance metrics are unavailable because the provided snapshot contains no git metadata. Dependencies are minimal (openpyxl and defusedxml, with lxml implied), but no lockfile is present. The code appears production-ready for its context (an internal skill module) but would benefit from test coverage and CI for reliability. The major gap is the absence of automated testing for the complex XML manipulation and validation logic.
