Compare commits

...

39 Commits

Author SHA1 Message Date
Lina Tawfik
2a9592678e Revert .gitignore changes 2025-05-28 18:16:13 -07:00
Lina Tawfik
7d1773e98f Remove test-markdown.json and update .gitignore 2025-05-28 18:14:13 -07:00
Lina Tawfik
019043f2fb Remove rendered.html from repository 2025-05-28 18:13:49 -07:00
Lina Tawfik
4ed7e5538d Fix prettier formatting 2025-05-28 18:13:39 -07:00
Lina Tawfik
cf04e19dbc Refactor tests to remove redundancy and improve structure
- Remove redundant 'mixed input patterns' test from sanitizer.test.ts
- Consolidate integration tests into 2 focused real-world scenarios
- Add HTML comment stripping to sanitizeContent function
- Update test expectations to match sanitization behavior
- Maintain full coverage with fewer, more focused tests
2025-05-28 18:12:07 -07:00
Lina Tawfik
046ef964a9 Format code with prettier 2025-05-28 17:30:42 -07:00
Lina Tawfik
61cd297c18 Add enhanced text sanitization 2025-05-28 17:29:09 -07:00
Ashwin Bhat
176dbc369d bump base action to 0.0.6 (#79) 2025-05-28 13:19:10 -07:00
Erjan K
8ae72a97c6 Fix readme typo (#58) 2025-05-28 10:20:00 -07:00
Ashwin Bhat
0eb34ae441 Add shallow fetch to improve performance for large repositories (#53)
* Add shallow fetch to improve performance for large repositories

This change adds `--depth=1` to git fetch operations to perform shallow
fetches instead of full history downloads. This significantly reduces
checkout time for large repositories as reported in issue #52.

Changes:
- Line 55: Added --depth=1 to PR branch fetch
- Line 102: Added --depth=1 to new branch fetch

Fixes #52

Co-authored-by: ashwin-ant <ashwin-ant@users.noreply.github.com>

* fetch 50 commits for PRs

---------

Co-authored-by: claude[bot] <209825114+claude[bot]@users.noreply.github.com>
Co-authored-by: ashwin-ant <ashwin-ant@users.noreply.github.com>
2025-05-27 16:31:06 -07:00
Ashwin Bhat
804959ac41 add issue triage workflow (#70) 2025-05-27 14:04:41 -07:00
Ashwin Bhat
21e17bd590 remove .DS_Store (#69) 2025-05-27 13:26:03 -07:00
Ashwin Bhat
4b925ddf0c Update issue templates (#51) 2025-05-27 13:18:29 -07:00
Ashwin Bhat
253f2c6796 Pin GitHub Action dependencies to commit SHAs for security (#66)
Pin oven-sh/setup-bun and anthropics/claude-code-base-action to specific commit SHAs instead of version tags to ensure reproducible builds and improve supply chain security.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Claude <noreply@anthropic.com>
2025-05-27 10:14:11 -07:00
Ashwin Bhat
3c6a85b54b Improve error messages for GitHub Action authentication failures (#50)
- Add helpful hint about workflow permissions when OIDC token is not found
- Include response body in app token exchange failure errors for better debugging

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Claude <noreply@anthropic.com>
2025-05-25 18:43:54 -07:00
Lina Tawfik
cbc3ca285d Merge pull request #39 from anthropics/fix-mcp-undefined-error
Fix MCP file operations server errors
2025-05-23 12:30:01 -07:00
Lina Tawfik
6ce69a1db5 Remove test files to fix typecheck 2025-05-23 11:32:15 -07:00
Lina Tawfik
5b025a2e43 Fix prettier formatting 2025-05-23 11:31:08 -07:00
Lina Tawfik
a29981fe38 Remove inline comments from code 2025-05-23 11:22:47 -07:00
Lina Tawfik
c60a8fb69b Fix MCP server undefined error and file path resolution
- Add error field to MCP error responses to fix 'undefined' errors
- Add REPO_DIR environment variable to fix file path resolution
- Use GITHUB_WORKSPACE for correct repository directory
- Simplify path processing logic in commit_files tool

This fixes the issue where mcp__github_file_ops__commit_files would fail
with 'Error calling tool commit_files: undefined' by ensuring error messages
are properly formatted and files are read from the correct directory.
2025-05-23 11:17:05 -07:00
Lina Tawfik
f3bfb2a9ad Merge pull request #34 from anthropics/update-claude-workflow-v2
Update Claude workflow
2025-05-22 21:51:16 -07:00
Lina Tawfik
36c5ee33cd Update Claude workflow
🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-05-22 21:43:54 -07:00
Lina Tawfik
8e84799f37 Merge pull request #25 from anthropics/update-to-use-model-parameter
Udpate claude model to default -p model
2025-05-22 11:02:14 -07:00
Lina Tawfik
57ae256d38 Run prettier formatting on README.md
Prettier adjusted the table column spacing for consistency.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-05-22 10:58:29 -07:00
Lina Tawfik
d3bb4afed5 Fix table formatting for anthropic_model parameter
The table row was broken across two lines which caused markdown rendering issues.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-05-22 10:57:32 -07:00
Lina Tawfik
17cc868124 formatting readme 2025-05-22 10:55:31 -07:00
Lina Tawfik
d822994da0 udpate claude model to default 2025-05-22 10:54:11 -07:00
Lina Tawfik
b129b800c5 Merge pull request #23 from anthropics/np-anthropic-patch-1
Add graphic to readme
2025-05-22 09:26:50 -07:00
Lina Tawfik
80dbb4a5aa Merge pull request #24 from anthropics/update-to-use-model-parameter
Update to use model parameter in claude-code-base-action
2025-05-22 09:19:03 -07:00
Lina Tawfik
1e9ea49f7a Update README example to use model parameter instead of anthropic_model 2025-05-22 09:15:14 -07:00
Lina Tawfik
08e084156a Revert unintended model change in test/mockContext.ts 2025-05-22 09:12:59 -07:00
Lina Tawfik
e67f992a13 Update to use model parameter in claude-code-base-action
This updates claude-code-action to pass the model parameter to claude-code-base-action using the new primary `model` parameter instead of the deprecated `anthropic_model`.

This change is made in conjunction with https://github.com/anthropics/claude-code-base-action/pull/4 which adds the `model` parameter to claude-code-base-action.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-05-22 09:10:44 -07:00
Nate Parrott
be7f75d65a fix formatting 2025-05-22 11:12:51 -04:00
np-anthropic
e3d126d058 Add graphic to readme 2025-05-22 10:57:20 -04:00
Lina Tawfik
11f5812e28 Merge pull request #22 from anthropics/lina/rename-anthropic-model-to-model
Rename anthropic_model input to model with backward compatibility
2025-05-22 07:26:42 -07:00
Lina Tawfik
d15de3a8e3 docs: update README examples to use 'model' parameter correctly
- Show model parameter as optional comment for direct API examples
- Keep model parameter required for Bedrock and Vertex AI examples
- Demonstrates the default behavior when model is not specified

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-05-22 07:22:13 -07:00
Lina Tawfik
9e23f6d9ed feat: rename anthropic_model input to model with backward compatibility
- Add new 'model' input parameter as the preferred way to specify the model
- Keep 'anthropic_model' for backward compatibility with deprecation notice
- Use expression syntax to prioritize 'model' over 'anthropic_model'
- Update README documentation to reflect the change

This allows existing workflows to continue working while encouraging migration to the cleaner 'model' parameter name.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
2025-05-22 07:17:18 -07:00
Lina Tawfik
48b327f164 Merge pull request #17 from anthropics/lina/html_comments_strip
Strip HTML comments from GitHub content
2025-05-21 13:29:25 -07:00
Lina Tawfik
dd5e8c974a feat: strip HTML comments from GitHub content
- Add stripHtmlComments function to remove HTML comments from text
- Apply to all GitHub content (bodies, comments, reviews, triggers)
- Add comprehensive tests for comment stripping functionality
2025-05-21 13:23:32 -07:00
16 changed files with 731 additions and 84 deletions

36
.github/ISSUE_TEMPLATE/bug_report.md vendored Normal file

@@ -0,0 +1,36 @@
---
name: Bug report
about: Create a report to help us improve
title: ""
labels: bug
assignees: ""
---
**Describe the bug**
A clear and concise description of what the bug is.
**To Reproduce**
Steps to reproduce the behavior:
1. Go to '...'
2. Click on '....'
3. Scroll down to '....'
4. See error
**Expected behavior**
A clear and concise description of what you expected to happen.
**Screenshots**
If applicable, add screenshots to help explain your problem.
**Workflow yml file**
If it's not sensitive, consider including a paste of your full Claude workflow.yml file.
**API Provider**
[ ] Anthropic First-Party API (default)
[ ] AWS Bedrock
[ ] GCP Vertex
**Additional context**
Add any other context about the problem here.

View File

@@ -1,4 +1,4 @@
name: Claude
name: Claude Code
on:
issue_comment:
@@ -11,12 +11,12 @@ on:
types: [submitted]
jobs:
claude-pr:
claude:
if: |
(github.event_name == 'issue_comment' && contains(github.event.comment.body, '@claude')) ||
(github.event_name == 'pull_request_review_comment' && contains(github.event.comment.body, '@claude')) ||
(github.event_name == 'pull_request_review' && contains(github.event.review.body, '@claude')) ||
(github.event_name == 'issues' && contains(github.event.issue.body, '@claude'))
(github.event_name == 'issues' && (contains(github.event.issue.body, '@claude') || contains(github.event.issue.title, '@claude')))
runs-on: ubuntu-latest
permissions:
contents: read
@@ -29,10 +29,8 @@ jobs:
with:
fetch-depth: 1
- name: Run Claude PR Agent
uses: anthropics/claude-code-action@main
- name: Run Claude Code
id: claude
uses: anthropics/claude-code-action@beta
with:
timeout_minutes: "60"
anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
allowed_tools: "Bash(bun install),Bash(bun test:*),Bash(bun run format),Bash(bun typecheck)"
custom_instructions: "You have also been granted tools for editing files and running bun commands (install, run, test) for testing your changes."
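The updated `if:` condition above can be read as a small predicate over the event payload. The following is a minimal sketch of that logic, not code from the action: the `TriggerEvent` type and `shouldRun` function are hypothetical names, and the comparison is lowercased because GitHub's `contains()` is documented as case-insensitive for strings.

```typescript
// Hypothetical, simplified view of the four event payloads the workflow listens to.
type TriggerEvent =
  | { name: "issue_comment" | "pull_request_review_comment"; commentBody: string }
  | { name: "pull_request_review"; reviewBody: string }
  | { name: "issues"; issueBody: string; issueTitle: string };

// Case-insensitive substring check, standing in for the workflow's contains().
function mentions(text: string, phrase: string): boolean {
  return text.toLowerCase().includes(phrase.toLowerCase());
}

// Mirrors the workflow condition: issues now match on the body OR the title.
function shouldRun(event: TriggerEvent, phrase = "@claude"): boolean {
  switch (event.name) {
    case "issue_comment":
    case "pull_request_review_comment":
      return mentions(event.commentBody, phrase);
    case "pull_request_review":
      return mentions(event.reviewBody, phrase);
    case "issues":
      return mentions(event.issueBody, phrase) || mentions(event.issueTitle, phrase);
  }
}

const titleOnly = shouldRun({ name: "issues", issueBody: "", issueTitle: "@claude fix this" });
const noMention = shouldRun({ name: "issue_comment", commentBody: "looks good" });
```

With the old condition, an issue that mentions the trigger phrase only in its title (the `titleOnly` case) would not have started the job.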

104
.github/workflows/issue-triage.yml vendored Normal file

@@ -0,0 +1,104 @@
name: Claude Issue Triage
description: Run Claude Code for issue triage in GitHub Actions
on:
issues:
types: [opened]
jobs:
triage-issue:
runs-on: ubuntu-latest
timeout-minutes: 10
permissions:
contents: read
issues: write
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Setup GitHub MCP Server
run: |
mkdir -p /tmp/mcp-config
cat > /tmp/mcp-config/mcp-servers.json << 'EOF'
{
"github": {
"command": "docker",
"args": [
"run",
"-i",
"--rm",
"-e",
"GITHUB_PERSONAL_ACCESS_TOKEN",
"ghcr.io/github/github-mcp-server:sha-7aced2b"
],
"env": {
"GITHUB_PERSONAL_ACCESS_TOKEN": "${{ secrets.GITHUB_TOKEN }}"
}
}
}
EOF
- name: Create triage prompt
run: |
mkdir -p /tmp/claude-prompts
cat > /tmp/claude-prompts/triage-prompt.txt << 'EOF'
You're an issue triage assistant for GitHub issues. Your task is to analyze the issue and select appropriate labels from the provided list.
IMPORTANT: Don't post any comments or messages to the issue. Your only action should be to apply labels.
Issue Information:
- REPO: ${{ github.repository }}
- ISSUE_NUMBER: ${{ github.event.issue.number }}
TASK OVERVIEW:
1. First, fetch the list of labels available in this repository by running: `gh label list`. Run exactly this command with nothing else.
2. Next, use the GitHub tools to get context about the issue:
- You have access to these tools:
- mcp__github__get_issue: Use this to retrieve the current issue's details including title, description, and existing labels
- mcp__github__get_issue_comments: Use this to read any discussion or additional context provided in the comments
- mcp__github__update_issue: Use this to apply labels to the issue (do not use this for commenting)
- mcp__github__search_issues: Use this to find similar issues that might provide context for proper categorization and to identify potential duplicate issues
- mcp__github__list_issues: Use this to understand patterns in how other issues are labeled
- Start by using mcp__github__get_issue to get the issue details
3. Analyze the issue content, considering:
- The issue title and description
- The type of issue (bug report, feature request, question, etc.)
- Technical areas mentioned
- Severity or priority indicators
- User impact
- Components affected
4. Select appropriate labels from the available labels list provided above:
- Choose labels that accurately reflect the issue's nature
- Be specific but comprehensive
- Select priority labels if you can determine urgency (high-priority, med-priority, or low-priority)
- Consider platform labels (android, ios) if applicable
- If you find similar issues using mcp__github__search_issues, consider using a "duplicate" label if appropriate. Only do so if the issue is a duplicate of another OPEN issue.
5. Apply the selected labels:
- Use mcp__github__update_issue to apply your selected labels
- DO NOT post any comments explaining your decision
- DO NOT communicate directly with users
- If no labels are clearly applicable, do not apply any labels
IMPORTANT GUIDELINES:
- Be thorough in your analysis
- Only select labels from the provided list above
- DO NOT post any comments to the issue
- Your ONLY action should be to apply labels using mcp__github__update_issue
- It's okay to not add any labels if none are clearly applicable
EOF
- name: Run Claude Code for Issue Triage
uses: anthropics/claude-code-base-action@beta
with:
prompt_file: /tmp/claude-prompts/triage-prompt.txt
allowed_tools: "Bash(gh label list),mcp__github__get_issue,mcp__github__get_issue_comments,mcp__github__update_issue,mcp__github__search_issues,mcp__github__list_issues"
mcp_config_file: /tmp/mcp-config/mcp-servers.json
timeout_minutes: "5"
anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}

1
.gitignore vendored

@@ -1,3 +1,4 @@
.DS_Store
node_modules
**/.claude/settings.local.json

View File

@@ -1,3 +1,5 @@
![Claude Code Action responding to a comment](https://github.com/user-attachments/assets/1d60c2e9-82ed-4ee5-b749-f9e021c85f4d)
# Claude Code Action
A general-purpose [Claude Code](https://claude.ai/code) action for GitHub PRs and issues that can answer questions and implement code changes. This action listens for a trigger phrase in comments and activates Claude to act on the request. It supports multiple authentication methods including Anthropic direct API, Amazon Bedrock, and Google Vertex AI.
@@ -63,20 +65,21 @@ jobs:
## Inputs
| Input | Description | Required | Default |
| --------------------- | -------------------------------------------------------------------------------------------------------------------- | -------- | ---------------------------- |
| `anthropic_api_key` | Anthropic API key (required for direct API, not needed for Bedrock/Vertex) | No\* | - |
| `direct_prompt` | Direct prompt for Claude to execute automatically without needing a trigger (for automated workflows) | No | - |
| `timeout_minutes` | Timeout in minutes for execution | No | `30` |
| `github_token` | GitHub token for Claude to operate with. **Only include this if you're connecting a custom GitHub app of your own!** | No | - |
| `anthropic_model` | Model to use (provider-specific format required for Bedrock/Vertex) | No | `claude-3-7-sonnet-20250219` |
| `use_bedrock` | Use Amazon Bedrock with OIDC authentication instead of direct Anthropic API | No | `false` |
| `use_vertex` | Use Google Vertex AI with OIDC authentication instead of direct Anthropic API | No | `false` |
| `allowed_tools` | Additional tools for Claude to use (the base GitHub tools will always be included) | No | "" |
| `disallowed_tools` | Tools that Claude should never use | No | "" |
| `custom_instructions` | Additional custom instructions to include in the prompt for Claude | No | "" |
| `assignee_trigger` | The assignee username that triggers the action (e.g. @claude). Only used for issue assignment | No | - |
| `trigger_phrase` | The trigger phrase to look for in comments, issue/PR bodies, and issue titles | No | `@claude` |
| Input | Description | Required | Default |
| --------------------- | -------------------------------------------------------------------------------------------------------------------- | -------- | --------- |
| `anthropic_api_key` | Anthropic API key (required for direct API, not needed for Bedrock/Vertex) | No\* | - |
| `direct_prompt` | Direct prompt for Claude to execute automatically without needing a trigger (for automated workflows) | No | - |
| `timeout_minutes` | Timeout in minutes for execution | No | `30` |
| `github_token` | GitHub token for Claude to operate with. **Only include this if you're connecting a custom GitHub app of your own!** | No | - |
| `model` | Model to use (provider-specific format required for Bedrock/Vertex) | No | - |
| `anthropic_model` | **DEPRECATED**: Use `model` instead. Kept for backward compatibility. | No | - |
| `use_bedrock` | Use Amazon Bedrock with OIDC authentication instead of direct Anthropic API | No | `false` |
| `use_vertex` | Use Google Vertex AI with OIDC authentication instead of direct Anthropic API | No | `false` |
| `allowed_tools` | Additional tools for Claude to use (the base GitHub tools will always be included) | No | "" |
| `disallowed_tools` | Tools that Claude should never use | No | "" |
| `custom_instructions` | Additional custom instructions to include in the prompt for Claude | No | "" |
| `assignee_trigger` | The assignee username that triggers the action (e.g. @claude). Only used for issue assignment | No | - |
| `trigger_phrase` | The trigger phrase to look for in comments, issue/PR bodies, and issue titles | No | `@claude` |
\*Required when using direct Anthropic API (default and when not using Bedrock or Vertex)
@@ -255,7 +258,7 @@ Use a specific Claude model:
```yaml
- uses: anthropics/claude-code-action@beta
with:
anthropic_model: "claude-3-7-sonnet-20250219"
# model: "claude-3-5-sonnet-20241022" # Optional: specify a different model
# ... other inputs
```
@@ -283,21 +286,20 @@ Use provider-specific model names based on your chosen provider:
# For direct Anthropic API (default)
- uses: anthropics/claude-code-action@beta
with:
anthropic_model: "claude-3-7-sonnet-20250219"
anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
# ... other inputs
# For Amazon Bedrock with OIDC
- uses: anthropics/claude-code-action@beta
with:
anthropic_model: "anthropic.claude-3-7-sonnet-20250219-beta:0" # Cross-region inference
model: "anthropic.claude-3-7-sonnet-20250219-beta:0" # Cross-region inference
use_bedrock: "true"
# ... other inputs
# For Google Vertex AI with OIDC
- uses: anthropics/claude-code-action@beta
with:
anthropic_model: "claude-3-7-sonnet@20250219"
model: "claude-3-7-sonnet@20250219"
use_vertex: "true"
# ... other inputs
```
@@ -323,7 +325,7 @@ Both AWS Bedrock and GCP Vertex AI require OIDC authentication.
- uses: anthropics/claude-code-action@beta
with:
anthropic_model: "anthropic.claude-3-7-sonnet-20250219-beta:0"
model: "anthropic.claude-3-7-sonnet-20250219-beta:0"
use_bedrock: "true"
# ... other inputs
@@ -348,7 +350,7 @@ Both AWS Bedrock and GCP Vertex AI require OIDC authentication.
- uses: anthropics/claude-code-action@beta
with:
anthropic_model: "claude-3-7-sonnet@20250219"
model: "claude-3-7-sonnet@20250219"
use_vertex: "true"
# ... other inputs
@@ -444,7 +446,7 @@ anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
```
This applies to all sensitive values including API keys, access tokens, and credentials.
We also reccomend that you always use short-lived tokens when possible
We also recommend that you always use short-lived tokens when possible
## License

View File

@@ -14,10 +14,12 @@ inputs:
required: false
# Claude Code configuration
anthropic_model:
model:
description: "Model to use (provider-specific format required for Bedrock/Vertex)"
required: false
default: "claude-3-7-sonnet-20250219"
anthropic_model:
description: "DEPRECATED: Use 'model' instead. Model to use (provider-specific format required for Bedrock/Vertex)"
required: false
allowed_tools:
description: "Additional tools for Claude to use (the base GitHub tools will always be included)"
required: false
@@ -65,7 +67,7 @@ runs:
using: "composite"
steps:
- name: Install Bun
uses: oven-sh/setup-bun@v2
uses: oven-sh/setup-bun@735343b667d3e6f658f44d0eca948eb6282f2b76 # https://github.com/oven-sh/setup-bun/releases/tag/v2.0.2
with:
bun-version: 1.2.11
@@ -92,20 +94,20 @@ runs:
- name: Run Claude Code
id: claude-code
if: steps.prepare.outputs.contains_trigger == 'true'
uses: anthropics/claude-code-base-action@beta
uses: anthropics/claude-code-base-action@266585c92dd90d61d3806a3367582c4f6224e892 # https://github.com/anthropics/claude-code-base-action/releases/tag/v0.0.6
with:
prompt_file: /tmp/claude-prompts/claude-prompt.txt
allowed_tools: ${{ env.ALLOWED_TOOLS }}
disallowed_tools: ${{ env.DISALLOWED_TOOLS }}
timeout_minutes: ${{ inputs.timeout_minutes }}
anthropic_model: ${{ inputs.anthropic_model }}
model: ${{ inputs.model || inputs.anthropic_model }}
mcp_config: ${{ steps.prepare.outputs.mcp_config }}
use_bedrock: ${{ inputs.use_bedrock }}
use_vertex: ${{ inputs.use_vertex }}
anthropic_api_key: ${{ inputs.anthropic_api_key }}
env:
# Model configuration
ANTHROPIC_MODEL: ${{ inputs.anthropic_model }}
ANTHROPIC_MODEL: ${{ inputs.model || inputs.anthropic_model }}
GITHUB_TOKEN: ${{ steps.prepare.outputs.GITHUB_TOKEN }}
# AWS configuration
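The expression `inputs.model || inputs.anthropic_model` used twice above picks the first non-empty value. As a rough sketch of that precedence (the `ModelInputs` type and `resolveModel` function are hypothetical, and it assumes unset inputs arrive as empty strings, which is how GitHub Actions usually presents them):

```typescript
// Hypothetical shape for the two action inputs; an unset input is an empty string.
type ModelInputs = { model: string; anthropic_model: string };

// Mirrors `inputs.model || inputs.anthropic_model`: the first non-empty value wins.
function resolveModel(inputs: ModelInputs): string {
  return inputs.model || inputs.anthropic_model;
}

// New-style workflow: only `model` is set.
const newStyle = resolveModel({ model: "claude-3-7-sonnet-20250219", anthropic_model: "" });
// Legacy workflow: only the deprecated input is set, and it is still honored.
const legacy = resolveModel({ model: "", anthropic_model: "claude-3-7-sonnet@20250219" });
// Both set: `model` takes priority.
const both = resolveModel({ model: "a", anthropic_model: "b" });
// Neither set: an empty string is passed on, leaving the default to the base action.
const neither = resolveModel({ model: "", anthropic_model: "" });
```

Because the hard-coded default was removed from `action.yml`, the `neither` case no longer pins a model in this action.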

View File

@@ -10,6 +10,7 @@ import {
formatReviewComments,
formatChangedFilesWithSHA,
} from "../github/data/formatter";
import { sanitizeContent } from "../github/utils/sanitizer";
import {
isIssuesEvent,
isIssueCommentEvent,
@@ -418,14 +419,14 @@ ${
eventData.eventName === "pull_request_review") &&
eventData.commentBody
? `<trigger_comment>
${eventData.commentBody}
${sanitizeContent(eventData.commentBody)}
</trigger_comment>`
: ""
}
${
context.directPrompt
? `<direct_prompt>
${context.directPrompt}
${sanitizeContent(context.directPrompt)}
</direct_prompt>`
: ""
}
@@ -433,9 +434,27 @@ ${
eventData.eventName === "pull_request_review_comment"
? `<comment_tool_info>
IMPORTANT: For this inline PR review comment, you have been provided with ONLY the mcp__github__update_pull_request_comment tool to update this specific review comment.
Tool usage example for mcp__github__update_pull_request_comment:
{
"owner": "${context.repository.split("/")[0]}",
"repo": "${context.repository.split("/")[1]}",
"commentId": ${eventData.commentId || context.claudeCommentId},
"body": "Your comment text here"
}
All four parameters (owner, repo, commentId, body) are required.
</comment_tool_info>`
: `<comment_tool_info>
IMPORTANT: For this event type, you have been provided with ONLY the mcp__github__update_issue_comment tool to update comments.
Tool usage example for mcp__github__update_issue_comment:
{
"owner": "${context.repository.split("/")[0]}",
"repo": "${context.repository.split("/")[1]}",
"commentId": ${context.claudeCommentId},
"body": "Your comment text here"
}
All four parameters (owner, repo, commentId, body) are required.
</comment_tool_info>`
}
@@ -546,6 +565,9 @@ Important Notes:
- Use this spinner HTML when work is in progress: <img src="https://github.com/user-attachments/assets/5ac382c7-e004-429b-8e35-7feb3e8f9c6f" width="14px" height="14px" style="vertical-align: middle; margin-left: 4px;" />
${eventData.isPR && !eventData.claudeBranch ? `- Always push to the existing branch when triggered on a PR.` : `- IMPORTANT: You are already on the correct branch (${eventData.claudeBranch || "the created branch"}). Never create new branches when triggered on issues or closed/merged PRs.`}
- Use mcp__github_file_ops__commit_files for making commits (works for both new and existing files, single or multiple). Use mcp__github_file_ops__delete_files for deleting files (supports deleting single or multiple files atomically), or mcp__github__delete_file for deleting a single file. Edit files locally, and the tool will read the content from the same path on disk.
Tool usage examples:
- mcp__github_file_ops__commit_files: {"files": ["path/to/file1.js", "path/to/file2.py"], "message": "feat: add new feature"}
- mcp__github_file_ops__delete_files: {"files": ["path/to/old.js"], "message": "chore: remove deprecated file"}
- Display the todo list as a checklist in the GitHub comment and mark things off as you go.
- REPOSITORY SETUP INSTRUCTIONS: The repository's CLAUDE.md file(s) contain critical repo-specific setup instructions, development guidelines, and preferences. Always read and follow these files, particularly the root CLAUDE.md, as they provide essential context for working with the codebase effectively.
- Use h3 headers (###) for section titles in your comments, not h1 headers (#).

View File

@@ -6,6 +6,7 @@ import type {
GitHubReview,
} from "../types";
import type { GitHubFileWithSHA } from "./fetcher";
import { sanitizeContent } from "../utils/sanitizer";
export function formatContext(
contextData: GitHubPullRequest | GitHubIssue,
@@ -35,11 +36,12 @@ export function formatBody(
): string {
let processedBody = body;
// Replace image URLs with local paths
for (const [originalUrl, localPath] of imageUrlMap) {
processedBody = processedBody.replaceAll(originalUrl, localPath);
}
processedBody = sanitizeContent(processedBody);
return processedBody;
}
@@ -51,13 +53,14 @@ export function formatComments(
.map((comment) => {
let body = comment.body;
// Replace image URLs with local paths if we have a mapping
if (imageUrlMap && body) {
for (const [originalUrl, localPath] of imageUrlMap) {
body = body.replaceAll(originalUrl, localPath);
}
}
body = sanitizeContent(body);
return `[${comment.author.login} at ${comment.createdAt}]: ${body}`;
})
.join("\n\n");
@@ -74,6 +77,19 @@ export function formatReviewComments(
const formattedReviews = reviewData.nodes.map((review) => {
let reviewOutput = `[Review by ${review.author.login} at ${review.submittedAt}]: ${review.state}`;
if (review.body && review.body.trim()) {
let body = review.body;
if (imageUrlMap) {
for (const [originalUrl, localPath] of imageUrlMap) {
body = body.replaceAll(originalUrl, localPath);
}
}
const sanitizedBody = sanitizeContent(body);
reviewOutput += `\n${sanitizedBody}`;
}
if (
review.comments &&
review.comments.nodes &&
@@ -83,13 +99,14 @@ export function formatReviewComments(
.map((comment) => {
let body = comment.body;
// Replace image URLs with local paths if we have a mapping
if (imageUrlMap) {
for (const [originalUrl, localPath] of imageUrlMap) {
body = body.replaceAll(originalUrl, localPath);
}
}
body = sanitizeContent(body);
return ` [Comment on ${comment.path}:${comment.line || "?"}]: ${body}`;
})
.join("\n");

View File

@@ -51,8 +51,9 @@ export async function setupBranch(
const branchName = prData.headRefName;
// Execute git commands to checkout PR branch
await $`git fetch origin ${branchName}`;
// Execute git commands to checkout PR branch (shallow fetch for performance)
// Fetch the branch with a depth of 20 to avoid fetching too much history, while still allowing for some context
await $`git fetch origin --depth=20 ${branchName}`;
await $`git checkout ${branchName}`;
console.log(`Successfully checked out PR branch for PR #${entityNumber}`);
@@ -98,8 +99,8 @@ export async function setupBranch(
sha: currentSHA,
});
// Checkout the new branch
await $`git fetch origin ${newBranch}`;
// Checkout the new branch (shallow fetch for performance)
await $`git fetch origin --depth=1 ${newBranch}`;
await $`git checkout ${newBranch}`;
console.log(
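The two fetches above differ only in depth: 20 commits for an existing PR branch, 1 for a freshly created branch. A small sketch of how that choice could be expressed as data follows; `shallowFetchArgs` is a hypothetical helper and is not part of this change.

```typescript
// Hypothetical helper: builds the argument list for a shallow `git fetch`.
function shallowFetchArgs(branch: string, depth: number): string[] {
  if (!Number.isInteger(depth) || depth < 1) {
    throw new RangeError(`depth must be a positive integer, got ${depth}`);
  }
  return ["fetch", "origin", `--depth=${depth}`, branch];
}

// PR branch: keep some history so recent commits are available as context.
const prArgs = shallowFetchArgs("feature/example", 20);
// Newly created branch: only the tip commit is needed.
const newBranchArgs = shallowFetchArgs("claude/issue-1", 1);

console.log(["git", ...prArgs].join(" "));
```

The branch names here are placeholders; in the diff they come from `prData.headRefName` and `newBranch`.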

View File

@@ -39,25 +39,19 @@ async function retryWithBackoff<T>(
}
}
throw new Error(
`Operation failed after ${maxAttempts} attempts. Last error: ${
lastError?.message ?? "Unknown error"
}`,
);
console.error(`Operation failed after ${maxAttempts} attempts`);
throw lastError;
}
async function getOidcToken(): Promise<string> {
try {
const oidcToken = await core.getIDToken("claude-code-github-action");
if (!oidcToken) {
throw new Error("OIDC token not found");
}
return oidcToken;
} catch (error) {
console.error("Failed to get OIDC token:", error);
throw new Error(
`Failed to get OIDC token: ${error instanceof Error ? error.message : String(error)}`,
"Could not fetch an OIDC token. Did you remember to add `id-token: write` to your workflow permissions?",
);
}
}
@@ -74,9 +68,15 @@ async function exchangeForAppToken(oidcToken: string): Promise<string> {
);
if (!response.ok) {
throw new Error(
`App token exchange failed: ${response.status} ${response.statusText}`,
const responseJson = (await response.json()) as {
error?: {
message?: string;
};
};
console.error(
`App token exchange failed: ${response.status} ${response.statusText} - ${responseJson?.error?.message ?? "Unknown error"}`,
);
throw new Error(`${responseJson?.error?.message ?? "Unknown error"}`);
}
const appTokenData = (await response.json()) as {
@@ -117,7 +117,9 @@ export async function setupGitHubToken(): Promise<string> {
core.setOutput("GITHUB_TOKEN", appToken);
return appToken;
} catch (error) {
core.setFailed(`Failed to setup GitHub token: ${error}`);
core.setFailed(
`Failed to setup GitHub token: ${error}.\n\nIf you instead wish to use this action with a custom GitHub token or custom GitHub app, provide a \`github_token\` in the \`uses\` section of the app in your workflow yml file.`,
);
process.exit(1);
}
}
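The token-exchange change above surfaces the server's own error message instead of only the HTTP status. A minimal sketch of that extraction, operating on an already-parsed body so it can run without a network call (`describeFailure` is a hypothetical name; the `ErrorBody` shape is copied from the type assertion in the diff):

```typescript
// Error body shape, as asserted in the diff above.
type ErrorBody = { error?: { message?: string } };

// Hypothetical helper: formats the log line from the status and the parsed body.
function describeFailure(status: number, statusText: string, body: unknown): string {
  // Optional chaining tolerates null bodies and bodies without an `error` field.
  const message = (body as ErrorBody | null)?.error?.message ?? "Unknown error";
  return `App token exchange failed: ${status} ${statusText} - ${message}`;
}

const detailed = describeFailure(401, "Unauthorized", {
  error: { message: "Workflow not authorized" },
});
const fallback = describeFailure(500, "Internal Server Error", {});
```

One thing this sketch does not cover: in the diff, `response.json()` is awaited directly, so a non-JSON error body would throw before the message is built.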

View File

@@ -0,0 +1,65 @@
export function stripInvisibleCharacters(content: string): string {
content = content.replace(/[\u200B\u200C\u200D\uFEFF]/g, "");
content = content.replace(
/[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F-\u009F]/g,
"",
);
content = content.replace(/\u00AD/g, "");
content = content.replace(/[\u202A-\u202E\u2066-\u2069]/g, "");
return content;
}
export function stripMarkdownImageAltText(content: string): string {
return content.replace(/!\[[^\]]*\]\(/g, "![](");
}
export function stripMarkdownLinkTitles(content: string): string {
content = content.replace(/(\[[^\]]*\]\([^)]+)\s+"[^"]*"/g, "$1");
content = content.replace(/(\[[^\]]*\]\([^)]+)\s+'[^']*'/g, "$1");
return content;
}
export function stripHiddenAttributes(content: string): string {
content = content.replace(/\salt\s*=\s*["'][^"']*["']/gi, "");
content = content.replace(/\salt\s*=\s*[^\s>]+/gi, "");
content = content.replace(/\stitle\s*=\s*["'][^"']*["']/gi, "");
content = content.replace(/\stitle\s*=\s*[^\s>]+/gi, "");
content = content.replace(/\saria-label\s*=\s*["'][^"']*["']/gi, "");
content = content.replace(/\saria-label\s*=\s*[^\s>]+/gi, "");
content = content.replace(/\sdata-[a-zA-Z0-9-]+\s*=\s*["'][^"']*["']/gi, "");
content = content.replace(/\sdata-[a-zA-Z0-9-]+\s*=\s*[^\s>]+/gi, "");
content = content.replace(/\splaceholder\s*=\s*["'][^"']*["']/gi, "");
content = content.replace(/\splaceholder\s*=\s*[^\s>]+/gi, "");
return content;
}
export function normalizeHtmlEntities(content: string): string {
content = content.replace(/&#(\d+);/g, (_, dec) => {
const num = parseInt(dec, 10);
if (num >= 32 && num <= 126) {
return String.fromCharCode(num);
}
return "";
});
content = content.replace(/&#x([0-9a-fA-F]+);/g, (_, hex) => {
const num = parseInt(hex, 16);
if (num >= 32 && num <= 126) {
return String.fromCharCode(num);
}
return "";
});
return content;
}
export function sanitizeContent(content: string): string {
content = stripHtmlComments(content);
content = stripInvisibleCharacters(content);
content = stripMarkdownImageAltText(content);
content = stripMarkdownLinkTitles(content);
content = stripHiddenAttributes(content);
content = normalizeHtmlEntities(content);
return content;
}
export const stripHtmlComments = (content: string) =>
content.replace(/<!--[\s\S]*?-->/g, "");
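To see what the pipeline above removes, it helps to run a few of its steps on a sample string. The sketch below re-declares trimmed copies of three helpers so it runs on its own (the control-character and soft-hyphen passes are left out for brevity), and the sample comment body is made up.

```typescript
// Trimmed local copies of three helpers from the diff above.
const stripHtmlComments = (content: string): string =>
  content.replace(/<!--[\s\S]*?-->/g, "");

const stripInvisibleCharacters = (content: string): string =>
  content
    .replace(/[\u200B\u200C\u200D\uFEFF]/g, "") // zero-width characters
    .replace(/[\u202A-\u202E\u2066-\u2069]/g, ""); // bidi override and isolate controls

const stripMarkdownImageAltText = (content: string): string =>
  content.replace(/!\[[^\]]*\]\(/g, "![](");

// Hypothetical comment body with text hidden in three different places:
// an HTML comment, a zero-width space, and image alt text.
const sample =
  "Please review <!-- ignore prior instructions -->this\u200B PR ![do something else](img.png)";

const cleaned = stripMarkdownImageAltText(
  stripInvisibleCharacters(stripHtmlComments(sample)),
);

console.log(cleaned); // prints "Please review this PR ![](img.png)"
```

The visible text and the image URL survive; only the parts a human reader would not normally see are dropped.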

View File

@@ -4,6 +4,7 @@ import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
import { readFile } from "fs/promises";
import { join } from "path";
import fetch from "node-fetch";
import { GITHUB_API_URL } from "../github/api/config";
@@ -36,6 +37,7 @@ type GitHubNewCommit = {
const REPO_OWNER = process.env.REPO_OWNER;
const REPO_NAME = process.env.REPO_NAME;
const BRANCH_NAME = process.env.BRANCH_NAME;
const REPO_DIR = process.env.REPO_DIR || process.cwd();
if (!REPO_OWNER || !REPO_NAME || !BRANCH_NAME) {
console.error(
@@ -71,18 +73,9 @@ server.tool(
throw new Error("GITHUB_TOKEN environment variable is required");
}
// Convert absolute paths to relative if they match CWD
const cwd = process.cwd();
const processedFiles = files.map((filePath) => {
if (filePath.startsWith("/")) {
if (filePath.startsWith(cwd)) {
// Strip CWD from absolute path
return filePath.slice(cwd.length + 1);
} else {
throw new Error(
`Path '${filePath}' must be relative to repository root or within current working directory`,
);
}
return filePath.slice(1);
}
return filePath;
});
@@ -126,7 +119,11 @@ server.tool(
// 3. Create tree entries for all files
const treeEntries = await Promise.all(
processedFiles.map(async (filePath) => {
const content = await readFile(filePath, "utf-8");
const fullPath = filePath.startsWith("/")
? filePath
: join(REPO_DIR, filePath);
const content = await readFile(fullPath, "utf-8");
return {
path: filePath,
mode: "100644",
@@ -232,13 +229,16 @@ server.tool(
],
};
} catch (error) {
const errorMessage =
error instanceof Error ? error.message : String(error);
return {
content: [
{
type: "text",
text: `Error: ${error instanceof Error ? error.message : String(error)}`,
text: `Error: ${errorMessage}`,
},
],
error: errorMessage,
isError: true,
};
}
@@ -423,13 +423,16 @@ server.tool(
],
};
} catch (error) {
const errorMessage =
error instanceof Error ? error.message : String(error);
return {
content: [
{
type: "text",
text: `Error: ${error instanceof Error ? error.message : String(error)}`,
text: `Error: ${errorMessage}`,
},
],
error: errorMessage,
isError: true,
};
}
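
The path handling changed by the hunks above can be sketched as a standalone helper. This is a minimal sketch, assuming the stripped diff markers are read as follows: the CWD check was removed, a leading slash is now dropped from each path, and files are read relative to `REPO_DIR`. The helper name `resolveForRead` and the `/repo` directory are hypothetical. `posix.join` is used here only to keep the result platform-independent; the diff itself imports `join` from `path`.

```typescript
import { posix } from "path";

// Hypothetical helper, not part of the diff: mirrors the two steps shown above.
function resolveForRead(
  filePath: string,
  repoDir: string,
): { treePath: string; fullPath: string } {
  // Step 1 (first hunk): drop a leading "/" so the tree path is repo-relative.
  const treePath = filePath.startsWith("/") ? filePath.slice(1) : filePath;
  // Step 2 (second hunk): read from repoDir unless the path is still absolute.
  const fullPath = treePath.startsWith("/")
    ? treePath
    : posix.join(repoDir, treePath);
  return { treePath, fullPath };
}

console.log(resolveForRead("/src/a.ts", "/repo"));
console.log(resolveForRead("src/a.ts", "/repo"));
```

Both calls resolve to the tree path `src/a.ts` and the read path `/repo/src/a.ts`, so the commit's tree entry keeps a repo-relative path while the file is read from the checkout directory.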


@@ -34,6 +34,7 @@ export async function prepareMcpConfig(
REPO_OWNER: owner,
REPO_NAME: repo,
BRANCH_NAME: branch,
REPO_DIR: process.env.GITHUB_WORKSPACE || process.cwd(),
},
},
},


@@ -98,9 +98,9 @@ Some more text.`;
const result = formatBody(body, imageUrlMap);
expect(result)
.toBe(`Here is some text with an image: ![screenshot](/tmp/github-images/image-1234-0.png)
.toBe(`Here is some text with an image: ![](/tmp/github-images/image-1234-0.png)
And another one: ![another](/tmp/github-images/image-1234-1.jpg)
And another one: ![](/tmp/github-images/image-1234-1.jpg)
Some more text.`);
});
@@ -123,7 +123,7 @@ Some more text.`);
]);
const result = formatBody(body, imageUrlMap);
expect(result).toBe("![image](https://example.com/image.png)");
expect(result).toBe("![](https://example.com/image.png)");
});
test("handles multiple occurrences of same image", () => {
@@ -138,8 +138,8 @@ Second: ![img](https://github.com/user-attachments/assets/test.png)`;
]);
const result = formatBody(body, imageUrlMap);
expect(result).toBe(`First: ![img](/tmp/github-images/image-1234-0.png)
Second: ![img](/tmp/github-images/image-1234-0.png)`);
expect(result).toBe(`First: ![](/tmp/github-images/image-1234-0.png)
Second: ![](/tmp/github-images/image-1234-0.png)`);
});
});
@@ -204,7 +204,7 @@ describe("formatComments", () => {
const result = formatComments(comments, imageUrlMap);
expect(result).toBe(
`[user1 at 2023-01-01T00:00:00Z]: Check out this screenshot: ![screenshot](/tmp/github-images/image-1234-0.png)\n\n[user2 at 2023-01-02T00:00:00Z]: Here's another image: ![bug](/tmp/github-images/image-1234-1.jpg)`,
`[user1 at 2023-01-01T00:00:00Z]: Check out this screenshot: ![](/tmp/github-images/image-1234-0.png)\n\n[user2 at 2023-01-02T00:00:00Z]: Here's another image: ![](/tmp/github-images/image-1234-1.jpg)`,
);
});
@@ -232,7 +232,7 @@ describe("formatComments", () => {
const result = formatComments(comments, imageUrlMap);
expect(result).toBe(
`[user1 at 2023-01-01T00:00:00Z]: Two images: ![first](/tmp/github-images/image-1234-0.png) and ![second](/tmp/github-images/image-1234-1.png)`,
`[user1 at 2023-01-01T00:00:00Z]: Two images: ![](/tmp/github-images/image-1234-0.png) and ![](/tmp/github-images/image-1234-1.png)`,
);
});
@@ -249,7 +249,7 @@ describe("formatComments", () => {
const result = formatComments(comments);
expect(result).toBe(
`[user1 at 2023-01-01T00:00:00Z]: Image: ![test](https://github.com/user-attachments/assets/test.png)`,
`[user1 at 2023-01-01T00:00:00Z]: Image: ![](https://github.com/user-attachments/assets/test.png)`,
);
});
});
@@ -293,7 +293,7 @@ describe("formatReviewComments", () => {
const result = formatReviewComments(reviewData);
expect(result).toBe(
`[Review by reviewer1 at 2023-01-01T00:00:00Z]: APPROVED\n [Comment on src/index.ts:42]: Nice implementation\n [Comment on src/utils.ts:?]: Consider adding error handling`,
`[Review by reviewer1 at 2023-01-01T00:00:00Z]: APPROVED\nThis is a great PR! LGTM.\n [Comment on src/index.ts:42]: Nice implementation\n [Comment on src/utils.ts:?]: Consider adding error handling`,
);
});
@@ -316,7 +316,7 @@ describe("formatReviewComments", () => {
const result = formatReviewComments(reviewData);
expect(result).toBe(
`[Review by reviewer1 at 2023-01-01T00:00:00Z]: APPROVED`,
`[Review by reviewer1 at 2023-01-01T00:00:00Z]: APPROVED\nLooks good to me!`,
);
});
@@ -383,7 +383,7 @@ describe("formatReviewComments", () => {
const result = formatReviewComments(reviewData);
expect(result).toBe(
`[Review by reviewer1 at 2023-01-01T00:00:00Z]: CHANGES_REQUESTED\n\n[Review by reviewer2 at 2023-01-02T00:00:00Z]: APPROVED`,
`[Review by reviewer1 at 2023-01-01T00:00:00Z]: CHANGES_REQUESTED\nNeeds changes\n\n[Review by reviewer2 at 2023-01-02T00:00:00Z]: APPROVED\nLGTM`,
);
});
@@ -437,7 +437,7 @@ describe("formatReviewComments", () => {
const result = formatReviewComments(reviewData, imageUrlMap);
expect(result).toBe(
`[Review by reviewer1 at 2023-01-01T00:00:00Z]: APPROVED\n [Comment on src/index.ts:42]: Comment with image: ![comment-img](/tmp/github-images/image-1234-1.png)`,
`[Review by reviewer1 at 2023-01-01T00:00:00Z]: APPROVED\nReview with image: ![](/tmp/github-images/image-1234-0.png)\n [Comment on src/index.ts:42]: Comment with image: ![](/tmp/github-images/image-1234-1.png)`,
);
});
@@ -481,7 +481,7 @@ describe("formatReviewComments", () => {
const result = formatReviewComments(reviewData, imageUrlMap);
expect(result).toBe(
`[Review by reviewer1 at 2023-01-01T00:00:00Z]: APPROVED\n [Comment on src/main.ts:15]: Two issues: ![issue1](/tmp/github-images/image-1234-0.png) and ![issue2](/tmp/github-images/image-1234-1.png)`,
`[Review by reviewer1 at 2023-01-01T00:00:00Z]: APPROVED\nGood work\n [Comment on src/main.ts:15]: Two issues: ![](/tmp/github-images/image-1234-0.png) and ![](/tmp/github-images/image-1234-1.png)`,
);
});
@@ -514,7 +514,7 @@ describe("formatReviewComments", () => {
const result = formatReviewComments(reviewData);
expect(result).toBe(
`[Review by reviewer1 at 2023-01-01T00:00:00Z]: APPROVED\n [Comment on src/index.ts:42]: Image: ![test](https://github.com/user-attachments/assets/test.png)`,
`[Review by reviewer1 at 2023-01-01T00:00:00Z]: APPROVED\nReview body\n [Comment on src/index.ts:42]: Image: ![](https://github.com/user-attachments/assets/test.png)`,
);
});
});


@@ -0,0 +1,134 @@
import { describe, expect, it } from "bun:test";
import { formatBody, formatComments } from "../src/github/data/formatter";
import type { GitHubComment } from "../src/github/types";
describe("Sanitization Integration", () => {
it("should sanitize complete issue/PR body with various hidden content patterns", () => {
const issueBody = `
# Feature Request: Add user dashboard
## Description
We need a new dashboard for users to track their activity.
<!-- HTML comment that should be removed -->
## Technical Details
The dashboard should display:
- User statistics ![dashboard mockup with hiddentext](dashboard.png)
- Activity graphs <img alt="example graph description" src="graph.jpg">
- Recent actions
## Implementation Notes
See [documentation](https://docs.example.com "internal docs title") for API details.
<div data-instruction="example instruction" aria-label="dashboard label" title="hover text">
The implementation should follow our standard patterns.
</div>
Additional notes: Text­with­soft­hyphens and &#72;&#105;&#100;&#100;&#101;&#110; encoded content.
<input placeholder="search placeholder" type="text" />
Direction override test: reversed text should be normalized.`;
const imageUrlMap = new Map<string, string>();
const result = formatBody(issueBody, imageUrlMap);
// Verify hidden content is removed
expect(result).not.toContain("<!-- HTML comment");
expect(result).not.toContain("hiddentext");
expect(result).not.toContain("example graph description");
expect(result).not.toContain("internal docs title");
expect(result).not.toContain("example instruction");
expect(result).not.toContain("dashboard label");
expect(result).not.toContain("hover text");
expect(result).not.toContain("search placeholder");
expect(result).not.toContain("\u200B");
expect(result).not.toContain("\u200C");
expect(result).not.toContain("\u200D");
expect(result).not.toContain("\u00AD");
expect(result).not.toContain("\u202E");
expect(result).not.toContain("&#72;");
// Verify legitimate content is preserved
expect(result).toContain("# Feature Request: Add user dashboard");
expect(result).toContain("## Description");
expect(result).toContain("We need a new dashboard");
expect(result).toContain("User statistics");
expect(result).toContain("![](dashboard.png)");
expect(result).toContain('<img src="graph.jpg">');
expect(result).toContain("[documentation](https://docs.example.com)");
expect(result).toContain(
"The implementation should follow our standard patterns",
);
expect(result).toContain("Hidden encoded content");
expect(result).toContain('<input type="text" />');
});
it("should sanitize GitHub comments preserving discussion flow", () => {
const comments: GitHubComment[] = [
{
id: "1",
databaseId: "100001",
body: `Great idea! Here are my thoughts:
1. We should consider the performance impact
2. The UI mockup looks good: ![ui design](mockup.png)
3. Check the [API docs](https://api.example.com "api reference") for rate limits
<div aria-label="comment metadata" data-comment-type="review">
This change would affect multiple systems.
</div>
Note: Implementationshouldfollowbestpractices.`,
author: { login: "reviewer1" },
createdAt: "2023-01-01T10:00:00Z",
},
{
id: "2",
databaseId: "100002",
body: `Thanks for the feedback!
<!-- Internal note: discussed with team -->
I've updated the proposal based on your suggestions.
&#84;&#101;&#115;&#116; &#110;&#111;&#116;&#101;: All systems checked.
<span title="status update" data-status="approved">Ready for implementation</span>`,
author: { login: "author1" },
createdAt: "2023-01-01T12:00:00Z",
},
];
const result = formatComments(comments);
// Verify hidden content is removed
expect(result).not.toContain("<!-- Internal note");
expect(result).not.toContain("api reference");
expect(result).not.toContain("comment metadata");
expect(result).not.toContain('data-comment-type="review"');
expect(result).not.toContain("status update");
expect(result).not.toContain('data-status="approved"');
expect(result).not.toContain("\u200B");
expect(result).not.toContain("&#84;");
// Verify discussion flow is preserved
expect(result).toContain("Great idea! Here are my thoughts:");
expect(result).toContain("1. We should consider the performance impact");
expect(result).toContain("2. The UI mockup looks good: ![](mockup.png)");
expect(result).toContain(
"3. Check the [API docs](https://api.example.com)",
);
expect(result).toContain("This change would affect multiple systems.");
expect(result).toContain("Implementationshouldfollowbestpractices");
expect(result).toContain("Thanks for the feedback!");
expect(result).toContain(
"I've updated the proposal based on your suggestions.",
);
expect(result).toContain("Test note: All systems checked.");
expect(result).toContain("Ready for implementation");
expect(result).toContain("[reviewer1 at");
expect(result).toContain("[author1 at");
});
});

test/sanitizer.test.ts Normal file

@@ -0,0 +1,259 @@
import { describe, expect, it } from "bun:test";
import {
stripInvisibleCharacters,
stripMarkdownImageAltText,
stripMarkdownLinkTitles,
stripHiddenAttributes,
normalizeHtmlEntities,
sanitizeContent,
stripHtmlComments,
} from "../src/github/utils/sanitizer";
describe("stripInvisibleCharacters", () => {
it("should remove zero-width characters", () => {
expect(stripInvisibleCharacters("Hello\u200BWorld")).toBe("HelloWorld");
expect(stripInvisibleCharacters("Text\u200C\u200D")).toBe("Text");
expect(stripInvisibleCharacters("\uFEFFStart")).toBe("Start");
});
it("should remove control characters", () => {
expect(stripInvisibleCharacters("Hello\u0000World")).toBe("HelloWorld");
expect(stripInvisibleCharacters("Text\u001F\u007F")).toBe("Text");
});
it("should preserve common whitespace", () => {
expect(stripInvisibleCharacters("Hello\nWorld")).toBe("Hello\nWorld");
expect(stripInvisibleCharacters("Tab\there")).toBe("Tab\there");
expect(stripInvisibleCharacters("Carriage\rReturn")).toBe(
"Carriage\rReturn",
);
});
it("should remove soft hyphens", () => {
expect(stripInvisibleCharacters("Soft\u00ADHyphen")).toBe("SoftHyphen");
});
it("should remove Unicode direction overrides", () => {
expect(stripInvisibleCharacters("Text\u202A\u202BMore")).toBe("TextMore");
expect(stripInvisibleCharacters("\u2066Isolated\u2069")).toBe("Isolated");
});
});
describe("stripMarkdownImageAltText", () => {
it("should remove alt text from markdown images", () => {
expect(stripMarkdownImageAltText("![example alt text](image.png)")).toBe(
"![](image.png)",
);
expect(
stripMarkdownImageAltText("Text ![description](pic.jpg) more text"),
).toBe("Text ![](pic.jpg) more text");
});
it("should handle multiple images", () => {
expect(stripMarkdownImageAltText("![one](1.png) ![two](2.png)")).toBe(
"![](1.png) ![](2.png)",
);
});
it("should handle empty alt text", () => {
expect(stripMarkdownImageAltText("![](image.png)")).toBe("![](image.png)");
});
});
describe("stripMarkdownLinkTitles", () => {
it("should remove titles from markdown links", () => {
expect(stripMarkdownLinkTitles('[Link](url.com "example title")')).toBe(
"[Link](url.com)",
);
expect(stripMarkdownLinkTitles("[Link](url.com 'example title')")).toBe(
"[Link](url.com)",
);
});
it("should handle multiple links", () => {
expect(
stripMarkdownLinkTitles('[One](1.com "first") [Two](2.com "second")'),
).toBe("[One](1.com) [Two](2.com)");
});
it("should preserve links without titles", () => {
expect(stripMarkdownLinkTitles("[Link](url.com)")).toBe("[Link](url.com)");
});
});
describe("stripHiddenAttributes", () => {
it("should remove alt attributes", () => {
expect(
stripHiddenAttributes('<img alt="example text" src="pic.jpg">'),
).toBe('<img src="pic.jpg">');
expect(stripHiddenAttributes("<img alt='example' src=\"pic.jpg\">")).toBe(
'<img src="pic.jpg">',
);
expect(stripHiddenAttributes('<img alt=example src="pic.jpg">')).toBe(
'<img src="pic.jpg">',
);
});
it("should remove title attributes", () => {
expect(
stripHiddenAttributes('<a title="example text" href="#">Link</a>'),
).toBe('<a href="#">Link</a>');
expect(stripHiddenAttributes("<div title='example'>Content</div>")).toBe(
"<div>Content</div>",
);
});
it("should remove aria-label attributes", () => {
expect(
stripHiddenAttributes('<button aria-label="example">Click</button>'),
).toBe("<button>Click</button>");
});
it("should remove data-* attributes", () => {
expect(
stripHiddenAttributes(
'<div data-test="example" data-info="more example">Text</div>',
),
).toBe("<div>Text</div>");
});
it("should remove placeholder attributes", () => {
expect(
stripHiddenAttributes('<input placeholder="example text" type="text">'),
).toBe('<input type="text">');
});
it("should handle multiple attributes", () => {
expect(
stripHiddenAttributes(
'<img alt="example" title="test" src="pic.jpg" class="image">',
),
).toBe('<img src="pic.jpg" class="image">');
});
});
describe("normalizeHtmlEntities", () => {
it("should decode numeric entities", () => {
expect(normalizeHtmlEntities("&#72;&#101;&#108;&#108;&#111;")).toBe(
"Hello",
);
expect(normalizeHtmlEntities("&#65;&#66;&#67;")).toBe("ABC");
});
it("should decode hex entities", () => {
expect(normalizeHtmlEntities("&#x48;&#x65;&#x6C;&#x6C;&#x6F;")).toBe(
"Hello",
);
expect(normalizeHtmlEntities("&#x41;&#x42;&#x43;")).toBe("ABC");
});
it("should remove non-printable entities", () => {
expect(normalizeHtmlEntities("&#0;&#31;")).toBe("");
expect(normalizeHtmlEntities("&#x00;&#x1F;")).toBe("");
});
it("should preserve normal text", () => {
expect(normalizeHtmlEntities("Normal text")).toBe("Normal text");
});
});
describe("sanitizeContent", () => {
it("should apply all sanitization measures", () => {
const testContent = `
<!-- This is a comment -->
<img alt="example alt text" src="image.jpg">
![example image description](screenshot.png)
[click here](https://example.com "example title")
<div data-prompt="example data" aria-label="example label">
Normal text with hidden\u200Bcharacters
</div>
&#72;&#105;&#100;&#100;&#101;&#110; message
`;
const sanitized = sanitizeContent(testContent);
expect(sanitized).not.toContain("<!-- This is a comment -->");
expect(sanitized).not.toContain("example alt text");
expect(sanitized).not.toContain("example image description");
expect(sanitized).not.toContain("example title");
expect(sanitized).not.toContain("example data");
expect(sanitized).not.toContain("example label");
expect(sanitized).not.toContain("\u200B");
expect(sanitized).not.toContain("alt=");
expect(sanitized).not.toContain("data-prompt=");
expect(sanitized).not.toContain("aria-label=");
expect(sanitized).toContain("Normal text with hiddencharacters");
expect(sanitized).toContain("Hidden message");
expect(sanitized).toContain('<img src="image.jpg">');
expect(sanitized).toContain("![](screenshot.png)");
expect(sanitized).toContain("[click here](https://example.com)");
});
it("should handle complex nested patterns", () => {
const complexContent = `
Text with ![alt \u200B text](image.png) and more.
<a href="#" title="example\u00ADtitle">Link</a>
<div data-x="&#72;&#105;">Content</div>
`;
const sanitized = sanitizeContent(complexContent);
expect(sanitized).not.toContain("\u200B");
expect(sanitized).not.toContain("\u00AD");
expect(sanitized).not.toContain("alt ");
expect(sanitized).not.toContain('title="');
expect(sanitized).not.toContain('data-x="');
expect(sanitized).toContain("![](image.png)");
expect(sanitized).toContain('<a href="#">Link</a>');
});
it("should preserve legitimate markdown and HTML", () => {
const legitimateContent = `
# Heading
This is **bold** and *italic* text.
Here's a normal image: ![](normal.jpg)
And a normal link: [Click here](https://example.com)
<div class="container">
<p id="para">Normal paragraph</p>
<input type="text" name="field">
</div>
`;
const sanitized = sanitizeContent(legitimateContent);
expect(sanitized).toBe(legitimateContent);
});
it("should handle entity-encoded text", () => {
const encodedText = `
&#72;&#105;&#100;&#100;&#101;&#110; &#109;&#101;&#115;&#115;&#97;&#103;&#101;
<div title="&#101;&#120;&#97;&#109;&#112;&#108;&#101;">Test</div>
`;
const sanitized = sanitizeContent(encodedText);
expect(sanitized).toContain("Hidden message");
expect(sanitized).not.toContain('title="');
expect(sanitized).toContain("<div>Test</div>");
});
});
describe("stripHtmlComments (legacy)", () => {
it("should remove HTML comments", () => {
expect(stripHtmlComments("Hello <!-- example -->World")).toBe(
"Hello World",
);
expect(stripHtmlComments("<!-- comment -->Text")).toBe("Text");
expect(stripHtmlComments("Text<!-- comment -->")).toBe("Text");
});
it("should handle multiline comments", () => {
expect(stripHtmlComments("Hello <!-- \nexample\n -->World")).toBe(
"Hello World",
);
});
});