fix mistake in FAQ

switch to opus for this repo's claude workflow (#97 )
* switch to opus for this repo's claude workflow * prettier
2026-01-23 23:14:13 +08:00 · 2025-05-30 10:56:54 -07:00 · 2025-05-30 08:14:11 -07:00 · 2025-05-29 16:45:44 -07:00 · 2025-05-29 16:35:50 -07:00 · 2025-05-29 12:57:57 -07:00
10 changed files with 665 additions and 176 deletions
--- a/.github/workflows/claude.yml
+++ b/.github/workflows/claude.yml
@@ -36,3 +36,4 @@ jobs:
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
          allowed_tools: "Bash(bun install),Bash(bun test:*),Bash(bun run format),Bash(bun typecheck)"
          custom_instructions: "You have also been granted tools for editing files and running bun commands (install, run, test, typecheck) for testing your changes: bun install, bun test, bun run format, bun typecheck."
+          model: "claude-opus-4-20250514"
--- a/.github/workflows/issue-triage.yml
+++ b/.github/workflows/issue-triage.yml
@@ -99,6 +99,6 @@ jobs:
        with:
          prompt_file: /tmp/claude-prompts/triage-prompt.txt
          allowed_tools: "Bash(gh label list),mcp__github__get_issue,mcp__github__get_issue_comments,mcp__github__update_issue,mcp__github__search_issues,mcp__github__list_issues"
-          mcp_config_file: /tmp/mcp-config/mcp-servers.json
+          mcp_config: /tmp/mcp-config/mcp-servers.json
          timeout_minutes: "5"
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
--- a/FAQ.md
+++ b/FAQ.md
@@ -0,0 +1,156 @@
+# Frequently Asked Questions (FAQ)
+
+This FAQ addresses common questions and gotchas when using the Claude Code GitHub Action.
+
+## Triggering and Authentication
+
+### Why doesn't tagging @claude from my automated workflow work?
+
+The `github-actions` user cannot trigger subsequent GitHub Actions workflows. This is a GitHub security feature to prevent infinite loops. To make this work, you need to use a Personal Access Token (PAT) instead, which will act as a regular user, or use a separate app token of your own. When posting a comment on an issue or PR from your workflow, use your PAT instead of the `GITHUB_TOKEN` generated in your workflow.
+
+### Why does Claude say I don't have permission to trigger it?
+
+Only users with **write permissions** to the repository can trigger Claude. This is a security feature to prevent unauthorized use. Make sure the user commenting has at least write access to the repository.
+
+### Why am I getting OIDC authentication errors?
+
+If you're using the default GitHub App authentication, you must add the `id-token: write` permission to your workflow:
+
+```yaml
+permissions:
+  contents: read
+  id-token: write # Required for OIDC authentication
+```
+
+The OIDC token is required in order for the Claude GitHub app to function. If you wish to not use the GitHub app, you can instead provide a `github_token` input to the action for Claude to operate with. See the [Claude Code permissions documentation][perms] for more.
+
+## Claude's Capabilities and Limitations
+
+### Why won't Claude update workflow files when I ask it to?
+
+The GitHub App for Claude doesn't have workflow write access for security reasons. This prevents Claude from modifying CI/CD configurations that could potentially create unintended consequences. This is something we may reconsider in the future.
+
+### Why won't Claude rebase my branch?
+
+By default, Claude only uses commit tools for non-destructive changes to the branch. Claude is configured to:
+
+- Never push to branches other than where it was invoked (either its own branch or the PR branch)
+- Never force push or perform destructive operations
+
+You can grant additional tools via the `allowed_tools` input if needed:
+
+```yaml
+allowed_tools: "Bash(git rebase:*)" # Use with caution
+```
+
+### Why won't Claude create a pull request?
+
+Claude doesn't create PRs by default. Instead, it pushes commits to a branch and provides a link to a pre-filled PR submission page. This approach ensures your repository's branch protection rules are still adhered to and gives you final control over PR creation.
+
+### Why can't Claude run my tests or see CI results?
+
+Claude cannot access GitHub Actions logs, test results, or other CI/CD outputs by default. It only has access to the repository files. If you need Claude to see test results, you can either:
+
+1. Instruct Claude to run tests before making commits
+2. Copy and paste CI results into a comment for Claude to analyze
+
+This limitation exists for security reasons but may be reconsidered in the future based on user feedback.
+
+### Why does Claude only update one comment instead of creating new ones?
+
+Claude is configured to update a single comment to avoid cluttering PR/issue discussions. All of Claude's responses, including progress updates and final results, will appear in the same comment with checkboxes showing task progress.
+
+## Branch and Commit Behavior
+
+### Why did Claude create a new branch when commenting on a closed PR?
+
+Claude's branch behavior depends on the context:
+
+- **Open PRs**: Pushes directly to the existing PR branch
+- **Closed/Merged PRs**: Creates a new branch (cannot push to closed PR branches)
+- **Issues**: Always creates a new branch with a timestamp
+
+### Why are my commits shallow/missing history?
+
+For performance, Claude uses shallow clones:
+
+- PRs: `--depth=20` (last 20 commits)
+- New branches: `--depth=1` (single commit)
+
+If you need full history, you can configure this in your workflow before calling Claude in the `actions/checkout` step.
+
+```
+- uses: actions/checkout@v4
+  depth: 0 # will fetch full repo history
+```
+
+## Configuration and Tools
+
+### What's the difference between `direct_prompt` and `custom_instructions`?
+
+These inputs serve different purposes in how Claude responds:
+
+- **`direct_prompt`**: Bypasses trigger detection entirely. When provided, Claude executes this exact instruction regardless of comments or mentions. Perfect for automated workflows where you want Claude to perform a specific task on every run (e.g., "Update the API documentation based on changes in this PR").
+
+- **`custom_instructions`**: Additional context added to Claude's system prompt while still respecting normal triggers. These instructions modify Claude's behavior but don't replace the triggering comment. Use this to give Claude standing instructions like "You have been granted additional tools for ...".
+
+Example:
+
+```yaml
+# Using direct_prompt - runs automatically without @claude mention
+direct_prompt: "Review this PR for security vulnerabilities"
+
+# Using custom_instructions - still requires @claude trigger
+custom_instructions: "Focus on performance implications and suggest optimizations"
+```
+
+### Why doesn't Claude execute my bash commands?
+
+The Bash tool is **disabled by default** for security. To enable individual bash commands:
+
+```yaml
+allowed_tools: "Bash(npm:*),Bash(git:*)" # Allows only npm and git commands
+```
+
+### Can Claude work across multiple repositories?
+
+No, Claude's GitHub app token is sandboxed to the current repository only. It cannot push to any other repositories. It can, however, read public repositories, but to get access to this, you must configure it with tools to do so.
+
+## MCP Servers and Extended Functionality
+
+### What MCP servers are available by default?
+
+Claude Code Action automatically configures two MCP servers:
+
+1. **GitHub MCP server**: For GitHub API operations
+2. **File operations server**: For advanced file manipulation
+
+However, tools from these servers still need to be explicitly allowed via `allowed_tools`.
+
+## Troubleshooting
+
+### How can I debug what Claude is doing?
+
+Check the GitHub Action log for Claude's run for the full execution trace.
+
+### Why can't I trigger Claude with `@claude-mention` or `claude!`?
+
+The trigger uses word boundaries, so `@claude` must be a complete word. Variations like `@claude-bot`, `@claude!`, or `claude@mention` won't work unless you customize the `trigger_phrase`.
+
+## Best Practices
+
+1. **Always specify permissions explicitly** in your workflow file
+2. **Use GitHub Secrets** for API keys - never hardcode them
+3. **Be specific with `allowed_tools`** - only enable what's necessary
+4. **Test in a separate branch** before using on important PRs
+5. **Monitor Claude's token usage** to avoid hitting API limits
+6. **Review Claude's changes** carefully before merging
+
+## Getting Help
+
+If you encounter issues not covered here:
+
+1. Check the [GitHub Issues](https://github.com/anthropics/claude-code-action/issues)
+2. Review the [example workflows](https://github.com/anthropics/claude-code-action#examples)
+
+[perms]: https://docs.anthropic.com/en/docs/claude-code/settings#permissions
--- a/README.md
+++ b/README.md
@@ -33,6 +33,10 @@ This command will guide you through setting up the GitHub app and required secre
 2. Add `ANTHROPIC_API_KEY` to your repository secrets ([Learn how to use secrets in GitHub Actions](https://docs.github.com/en/actions/security-for-github-actions/security-guides/using-secrets-in-github-actions))
 3. Copy the workflow file from [`examples/claude.yml`](./examples/claude.yml) into your repository's `.github/workflows/`

+## 📚 FAQ
+
+Having issues or questions? Check out our [Frequently Asked Questions](./FAQ.md) for solutions to common problems and detailed explanations of Claude's capabilities and limitations.
+
 ## Usage

 Add a workflow file to your repository (e.g., `.github/workflows/claude.yml`):
--- a/src/create-prompt/index.ts
+++ b/src/create-prompt/index.ts
@@ -9,8 +9,8 @@ import {
  formatComments,
  formatReviewComments,
  formatChangedFilesWithSHA,
-  stripHtmlComments,
 } from "../github/data/formatter";
+import { sanitizeContent } from "../github/utils/sanitizer";
 import {
  isIssuesEvent,
  isIssueCommentEvent,
@@ -436,14 +436,14 @@ ${
    eventData.eventName === "pull_request_review") &&
  eventData.commentBody
    ? `<trigger_comment>
-${stripHtmlComments(eventData.commentBody)}
+${sanitizeContent(eventData.commentBody)}
 </trigger_comment>`
    : ""
 }
 ${
  context.directPrompt
    ? `<direct_prompt>
-${stripHtmlComments(context.directPrompt)}
+${sanitizeContent(context.directPrompt)}
 </direct_prompt>`
    : ""
 }
@@ -611,6 +611,11 @@ What You CANNOT Do:
 - Execute commands outside the repository context
 - Run arbitrary Bash commands (unless explicitly allowed via allowed_tools configuration)
 - Perform branch operations (cannot merge branches, rebase, or perform other git operations beyond pushing commits)
+- Modify files in the .github/workflows directory (GitHub App permissions do not allow workflow modifications)
+- View CI/CD results or workflow run outputs (cannot access GitHub Actions logs or test results)
+
+When users ask you to perform actions you cannot do, politely explain the limitation and, when applicable, direct them to the FAQ for more information and workarounds:
+"I'm unable to [specific action] due to [reason]. You can find more information and potential workarounds in the [FAQ](https://github.com/anthropics/claude-code-action/blob/main/FAQ.md)."

 If a user asks for something outside these capabilities (and you have no other tools provided), politely explain that you cannot perform that action and suggest an alternative approach if possible.

--- a/src/github/data/formatter.ts
+++ b/src/github/data/formatter.ts
@@ -6,10 +6,7 @@ import type {
  GitHubReview,
 } from "../types";
 import type { GitHubFileWithSHA } from "./fetcher";
-
-export function stripHtmlComments(text: string): string {
-  return text.replace(/<!--[\s\S]*?-->/g, "");
-}
+import { sanitizeContent } from "../utils/sanitizer";

 export function formatContext(
  contextData: GitHubPullRequest | GitHubIssue,
@@ -37,13 +34,14 @@ export function formatBody(
  body: string,
  imageUrlMap: Map<string, string>,
 ): string {
-  let processedBody = stripHtmlComments(body);
+  let processedBody = body;

-  // Replace image URLs with local paths
  for (const [originalUrl, localPath] of imageUrlMap) {
    processedBody = processedBody.replaceAll(originalUrl, localPath);
  }

+  processedBody = sanitizeContent(processedBody);
+
  return processedBody;
 }

@@ -53,15 +51,16 @@ export function formatComments(
 ): string {
  return comments
    .map((comment) => {
-      let body = stripHtmlComments(comment.body);
+      let body = comment.body;

-      // Replace image URLs with local paths if we have a mapping
      if (imageUrlMap && body) {
        for (const [originalUrl, localPath] of imageUrlMap) {
          body = body.replaceAll(originalUrl, localPath);
        }
      }

+      body = sanitizeContent(body);
+
      return `[${comment.author.login} at ${comment.createdAt}]: ${body}`;
    })
    .join("\n\n");
@@ -78,6 +77,19 @@ export function formatReviewComments(
  const formattedReviews = reviewData.nodes.map((review) => {
    let reviewOutput = `[Review by ${review.author.login} at ${review.submittedAt}]: ${review.state}`;

+    if (review.body && review.body.trim()) {
+      let body = review.body;
+
+      if (imageUrlMap) {
+        for (const [originalUrl, localPath] of imageUrlMap) {
+          body = body.replaceAll(originalUrl, localPath);
+        }
+      }
+
+      const sanitizedBody = sanitizeContent(body);
+      reviewOutput += `\n${sanitizedBody}`;
+    }
+
    if (
      review.comments &&
      review.comments.nodes &&
@@ -85,15 +97,16 @@ export function formatReviewComments(
    ) {
      const comments = review.comments.nodes
        .map((comment) => {
-          let body = stripHtmlComments(comment.body);
+          let body = comment.body;

-          // Replace image URLs with local paths if we have a mapping
          if (imageUrlMap) {
            for (const [originalUrl, localPath] of imageUrlMap) {
              body = body.replaceAll(originalUrl, localPath);
            }
          }

+          body = sanitizeContent(body);
+
          return `  [Comment on ${comment.path}:${comment.line || "?"}]: ${body}`;
        })
        .join("\n");
--- a/src/github/utils/sanitizer.ts
+++ b/src/github/utils/sanitizer.ts
@@ -0,0 +1,65 @@
+export function stripInvisibleCharacters(content: string): string {
+  content = content.replace(/[\u200B\u200C\u200D\uFEFF]/g, "");
+  content = content.replace(
+    /[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F-\u009F]/g,
+    "",
+  );
+  content = content.replace(/\u00AD/g, "");
+  content = content.replace(/[\u202A-\u202E\u2066-\u2069]/g, "");
+  return content;
+}
+
+export function stripMarkdownImageAltText(content: string): string {
+  return content.replace(/!\[[^\]]*\]\(/g, "![](");
+}
+
+export function stripMarkdownLinkTitles(content: string): string {
+  content = content.replace(/(\[[^\]]*\]\([^)]+)\s+"[^"]*"/g, "$1");
+  content = content.replace(/(\[[^\]]*\]\([^)]+)\s+'[^']*'/g, "$1");
+  return content;
+}
+
+export function stripHiddenAttributes(content: string): string {
+  content = content.replace(/\salt\s*=\s*["'][^"']*["']/gi, "");
+  content = content.replace(/\salt\s*=\s*[^\s>]+/gi, "");
+  content = content.replace(/\stitle\s*=\s*["'][^"']*["']/gi, "");
+  content = content.replace(/\stitle\s*=\s*[^\s>]+/gi, "");
+  content = content.replace(/\saria-label\s*=\s*["'][^"']*["']/gi, "");
+  content = content.replace(/\saria-label\s*=\s*[^\s>]+/gi, "");
+  content = content.replace(/\sdata-[a-zA-Z0-9-]+\s*=\s*["'][^"']*["']/gi, "");
+  content = content.replace(/\sdata-[a-zA-Z0-9-]+\s*=\s*[^\s>]+/gi, "");
+  content = content.replace(/\splaceholder\s*=\s*["'][^"']*["']/gi, "");
+  content = content.replace(/\splaceholder\s*=\s*[^\s>]+/gi, "");
+  return content;
+}
+
+export function normalizeHtmlEntities(content: string): string {
+  content = content.replace(/&#(\d+);/g, (_, dec) => {
+    const num = parseInt(dec, 10);
+    if (num >= 32 && num <= 126) {
+      return String.fromCharCode(num);
+    }
+    return "";
+  });
+  content = content.replace(/&#x([0-9a-fA-F]+);/g, (_, hex) => {
+    const num = parseInt(hex, 16);
+    if (num >= 32 && num <= 126) {
+      return String.fromCharCode(num);
+    }
+    return "";
+  });
+  return content;
+}
+
+export function sanitizeContent(content: string): string {
+  content = stripHtmlComments(content);
+  content = stripInvisibleCharacters(content);
+  content = stripMarkdownImageAltText(content);
+  content = stripMarkdownLinkTitles(content);
+  content = stripHiddenAttributes(content);
+  content = normalizeHtmlEntities(content);
+  return content;
+}
+
+export const stripHtmlComments = (content: string) =>
+  content.replace(/<!--[\s\S]*?-->/g, "");
--- a/test/data-formatter.test.ts
+++ b/test/data-formatter.test.ts
@@ -6,7 +6,6 @@ import {
  formatReviewComments,
  formatChangedFiles,
  formatChangedFilesWithSHA,
-  stripHtmlComments,
 } from "../src/github/data/formatter";
 import type {
  GitHubPullRequest,
@@ -99,9 +98,9 @@ Some more text.`;

    const result = formatBody(body, imageUrlMap);
    expect(result)
-      .toBe(`Here is some text with an image: ![screenshot](/tmp/github-images/image-1234-0.png)
+      .toBe(`Here is some text with an image: ![](/tmp/github-images/image-1234-0.png)
    
-And another one: ![another](/tmp/github-images/image-1234-1.jpg)
+And another one: ![](/tmp/github-images/image-1234-1.jpg)

 Some more text.`);
  });
@@ -124,7 +123,7 @@ Some more text.`);
    ]);

    const result = formatBody(body, imageUrlMap);
-    expect(result).toBe("![image](https://example.com/image.png)");
+    expect(result).toBe("![](https://example.com/image.png)");
  });

  test("handles multiple occurrences of same image", () => {
@@ -139,8 +138,8 @@ Second: ![img](https://github.com/user-attachments/assets/test.png)`;
    ]);

    const result = formatBody(body, imageUrlMap);
-    expect(result).toBe(`First: ![img](/tmp/github-images/image-1234-0.png)
-Second: ![img](/tmp/github-images/image-1234-0.png)`);
+    expect(result).toBe(`First: ![](/tmp/github-images/image-1234-0.png)
+Second: ![](/tmp/github-images/image-1234-0.png)`);
  });
 });

@@ -205,7 +204,7 @@ describe("formatComments", () => {

    const result = formatComments(comments, imageUrlMap);
    expect(result).toBe(
-      `[user1 at 2023-01-01T00:00:00Z]: Check out this screenshot: ![screenshot](/tmp/github-images/image-1234-0.png)\n\n[user2 at 2023-01-02T00:00:00Z]: Here's another image: ![bug](/tmp/github-images/image-1234-1.jpg)`,
+      `[user1 at 2023-01-01T00:00:00Z]: Check out this screenshot: ![](/tmp/github-images/image-1234-0.png)\n\n[user2 at 2023-01-02T00:00:00Z]: Here's another image: ![](/tmp/github-images/image-1234-1.jpg)`,
    );
  });

@@ -233,7 +232,7 @@ describe("formatComments", () => {

    const result = formatComments(comments, imageUrlMap);
    expect(result).toBe(
-      `[user1 at 2023-01-01T00:00:00Z]: Two images: ![first](/tmp/github-images/image-1234-0.png) and ![second](/tmp/github-images/image-1234-1.png)`,
+      `[user1 at 2023-01-01T00:00:00Z]: Two images: ![](/tmp/github-images/image-1234-0.png) and ![](/tmp/github-images/image-1234-1.png)`,
    );
  });

@@ -250,7 +249,7 @@ describe("formatComments", () => {

    const result = formatComments(comments);
    expect(result).toBe(
-      `[user1 at 2023-01-01T00:00:00Z]: Image: ![test](https://github.com/user-attachments/assets/test.png)`,
+      `[user1 at 2023-01-01T00:00:00Z]: Image: ![](https://github.com/user-attachments/assets/test.png)`,
    );
  });
 });
@@ -294,7 +293,7 @@ describe("formatReviewComments", () => {

    const result = formatReviewComments(reviewData);
    expect(result).toBe(
-      `[Review by reviewer1 at 2023-01-01T00:00:00Z]: APPROVED\n  [Comment on src/index.ts:42]: Nice implementation\n  [Comment on src/utils.ts:?]: Consider adding error handling`,
+      `[Review by reviewer1 at 2023-01-01T00:00:00Z]: APPROVED\nThis is a great PR! LGTM.\n  [Comment on src/index.ts:42]: Nice implementation\n  [Comment on src/utils.ts:?]: Consider adding error handling`,
    );
  });

@@ -317,7 +316,7 @@ describe("formatReviewComments", () => {

    const result = formatReviewComments(reviewData);
    expect(result).toBe(
-      `[Review by reviewer1 at 2023-01-01T00:00:00Z]: APPROVED`,
+      `[Review by reviewer1 at 2023-01-01T00:00:00Z]: APPROVED\nLooks good to me!`,
    );
  });

@@ -384,7 +383,7 @@ describe("formatReviewComments", () => {

    const result = formatReviewComments(reviewData);
    expect(result).toBe(
-      `[Review by reviewer1 at 2023-01-01T00:00:00Z]: CHANGES_REQUESTED\n\n[Review by reviewer2 at 2023-01-02T00:00:00Z]: APPROVED`,
+      `[Review by reviewer1 at 2023-01-01T00:00:00Z]: CHANGES_REQUESTED\nNeeds changes\n\n[Review by reviewer2 at 2023-01-02T00:00:00Z]: APPROVED\nLGTM`,
    );
  });

@@ -438,7 +437,7 @@ describe("formatReviewComments", () => {

    const result = formatReviewComments(reviewData, imageUrlMap);
    expect(result).toBe(
-      `[Review by reviewer1 at 2023-01-01T00:00:00Z]: APPROVED\n  [Comment on src/index.ts:42]: Comment with image: ![comment-img](/tmp/github-images/image-1234-1.png)`,
+      `[Review by reviewer1 at 2023-01-01T00:00:00Z]: APPROVED\nReview with image: ![](/tmp/github-images/image-1234-0.png)\n  [Comment on src/index.ts:42]: Comment with image: ![](/tmp/github-images/image-1234-1.png)`,
    );
  });

@@ -482,7 +481,7 @@ describe("formatReviewComments", () => {

    const result = formatReviewComments(reviewData, imageUrlMap);
    expect(result).toBe(
-      `[Review by reviewer1 at 2023-01-01T00:00:00Z]: APPROVED\n  [Comment on src/main.ts:15]: Two issues: ![issue1](/tmp/github-images/image-1234-0.png) and ![issue2](/tmp/github-images/image-1234-1.png)`,
+      `[Review by reviewer1 at 2023-01-01T00:00:00Z]: APPROVED\nGood work\n  [Comment on src/main.ts:15]: Two issues: ![](/tmp/github-images/image-1234-0.png) and ![](/tmp/github-images/image-1234-1.png)`,
    );
  });

@@ -515,7 +514,7 @@ describe("formatReviewComments", () => {

    const result = formatReviewComments(reviewData);
    expect(result).toBe(
-      `[Review by reviewer1 at 2023-01-01T00:00:00Z]: APPROVED\n  [Comment on src/index.ts:42]: Image: ![test](https://github.com/user-attachments/assets/test.png)`,
+      `[Review by reviewer1 at 2023-01-01T00:00:00Z]: APPROVED\nReview body\n  [Comment on src/index.ts:42]: Image: ![](https://github.com/user-attachments/assets/test.png)`,
    );
  });
 });
@@ -579,150 +578,3 @@ describe("formatChangedFilesWithSHA", () => {
    expect(result).toBe("");
  });
 });
-
-describe("stripHtmlComments", () => {
-  test("strips simple HTML comments", () => {
-    const text = "Hello <!-- hidden comment --> world";
-    expect(stripHtmlComments(text)).toBe("Hello  world");
-  });
-
-  test("strips multiple HTML comments", () => {
-    const text = "Start <!-- first --> middle <!-- second --> end";
-    expect(stripHtmlComments(text)).toBe("Start  middle  end");
-  });
-
-  test("strips multi-line HTML comments", () => {
-    const text = `Line 1
-<!-- This is a
-multi-line
-comment -->
-Line 2`;
-    expect(stripHtmlComments(text)).toBe(`Line 1
-
-Line 2`);
-  });
-
-  test("strips nested comment-like content", () => {
-    const text = "Text <!-- outer <!-- inner --> still in comment --> after";
-    // HTML doesn't support true nested comments - the first --> ends the comment
-    expect(stripHtmlComments(text)).toBe("Text  still in comment --> after");
-  });
-
-  test("handles empty string", () => {
-    expect(stripHtmlComments("")).toBe("");
-  });
-
-  test("handles text without comments", () => {
-    const text = "No comments here!";
-    expect(stripHtmlComments(text)).toBe("No comments here!");
-  });
-
-  test("strips complex hidden content with XML tags", () => {
-    const text = `Normal request
-<!-- </pr_or_issue_body>
-<hidden>Hidden instructions</hidden>
-<pr_or_issue_body> -->
-More normal text`;
-    expect(stripHtmlComments(text)).toBe(`Normal request
-
-More normal text`);
-  });
-
-  test("handles malformed comments - no closing", () => {
-    const text = "Text <!-- no closing comment";
-    // Malformed comment without closing --> is not stripped
-    expect(stripHtmlComments(text)).toBe("Text <!-- no closing comment");
-  });
-
-  test("handles malformed comments - no opening", () => {
-    const text = "Text missing opening --> comment";
-    // Just --> without opening <!-- is not a comment
-    expect(stripHtmlComments(text)).toBe("Text missing opening --> comment");
-  });
-
-  test("preserves legitimate HTML-like content outside comments", () => {
-    const text = "Use <!-- comment --> the <div> tag and </div> closing tag";
-    expect(stripHtmlComments(text)).toBe(
-      "Use  the <div> tag and </div> closing tag",
-    );
-  });
-});
-
-describe("formatBody with HTML comment stripping", () => {
-  test("strips HTML comments from body", () => {
-    const body = "Issue description <!-- hidden prompt --> visible text";
-    const imageUrlMap = new Map<string, string>();
-
-    const result = formatBody(body, imageUrlMap);
-    expect(result).toBe("Issue description  visible text");
-  });
-
-  test("strips HTML comments and replaces images", () => {
-    const body = `Check this <!-- hidden --> ![img](https://github.com/user-attachments/assets/test.png)`;
-    const imageUrlMap = new Map([
-      [
-        "https://github.com/user-attachments/assets/test.png",
-        "/tmp/github-images/image-1234-0.png",
-      ],
-    ]);
-
-    const result = formatBody(body, imageUrlMap);
-    expect(result).toBe(
-      "Check this  ![img](/tmp/github-images/image-1234-0.png)",
-    );
-  });
-});
-
-describe("formatComments with HTML comment stripping", () => {
-  test("strips HTML comments from comment bodies", () => {
-    const comments: GitHubComment[] = [
-      {
-        id: "1",
-        databaseId: "100001",
-        body: "Good work <!-- inject prompt --> on this PR",
-        author: { login: "user1" },
-        createdAt: "2023-01-01T00:00:00Z",
-      },
-    ];
-
-    const result = formatComments(comments);
-    expect(result).toBe(
-      "[user1 at 2023-01-01T00:00:00Z]: Good work  on this PR",
-    );
-  });
-});
-
-describe("formatReviewComments with HTML comment stripping", () => {
-  test("strips HTML comments from review comment bodies", () => {
-    const reviewData = {
-      nodes: [
-        {
-          id: "review1",
-          databaseId: "300001",
-          author: { login: "reviewer1" },
-          body: "LGTM",
-          state: "APPROVED",
-          submittedAt: "2023-01-01T00:00:00Z",
-          comments: {
-            nodes: [
-              {
-                id: "comment1",
-                databaseId: "200001",
-                body: "Nice work <!-- malicious --> here",
-                author: { login: "reviewer1" },
-                createdAt: "2023-01-01T00:00:00Z",
-                path: "src/index.ts",
-                line: 42,
-              },
-            ],
-          },
-        },
-      ],
-    };
-
-    const result = formatReviewComments(reviewData);
-    expect(result).toBe(
-      `[Review by reviewer1 at 2023-01-01T00:00:00Z]: APPROVED\n  [Comment on src/index.ts:42]: Nice work  here`,
-    );
-  });
-});
--- a/test/integration-sanitization.test.ts
+++ b/test/integration-sanitization.test.ts
@@ -0,0 +1,134 @@
+import { describe, expect, it } from "bun:test";
+import { formatBody, formatComments } from "../src/github/data/formatter";
+import type { GitHubComment } from "../src/github/types";
+
+describe("Sanitization Integration", () => {
+  it("should sanitize complete issue/PR body with various hidden content patterns", () => {
+    const issueBody = `
+# Feature Request: Add user dashboard
+
+## Description
+We need a new dashboard for users to track their activity.
+
+<!-- HTML comment that should be removed -->
+
+## Technical Details
+The dashboard should display:
+- User statistics ![dashboard mockup with hidden‌‍text](dashboard.png)
+- Activity graphs <img alt="example graph description" src="graph.jpg">
+- Recent actions
+
+## Implementation Notes
+See [documentation](https://docs.example.com "internal docs title") for API details.
+
+<div data-instruction="example instruction" aria-label="dashboard label" title="hover text">
+  The implementation should follow our standard patterns.
+</div>
+
+Additional notes: Textwithsofthyphens and &#72;&#105;&#100;&#100;&#101;&#110; encoded content.
+
+<input placeholder="search placeholder" type="text" />
+
+Direction override test: ‮reversed‬ text should be normalized.`;
+
+    const imageUrlMap = new Map<string, string>();
+    const result = formatBody(issueBody, imageUrlMap);
+
+    // Verify hidden content is removed
+    expect(result).not.toContain("<!-- HTML comment");
+    expect(result).not.toContain("hidden‌‍text");
+    expect(result).not.toContain("example graph description");
+    expect(result).not.toContain("internal docs title");
+    expect(result).not.toContain("example instruction");
+    expect(result).not.toContain("dashboard label");
+    expect(result).not.toContain("hover text");
+    expect(result).not.toContain("search placeholder");
+    expect(result).not.toContain("\u200B");
+    expect(result).not.toContain("\u200C");
+    expect(result).not.toContain("\u200D");
+    expect(result).not.toContain("\u00AD");
+    expect(result).not.toContain("\u202E");
+    expect(result).not.toContain("&#72;");
+
+    // Verify legitimate content is preserved
+    expect(result).toContain("# Feature Request: Add user dashboard");
+    expect(result).toContain("## Description");
+    expect(result).toContain("We need a new dashboard");
+    expect(result).toContain("User statistics");
+    expect(result).toContain("![](dashboard.png)");
+    expect(result).toContain('<img src="graph.jpg">');
+    expect(result).toContain("[documentation](https://docs.example.com)");
+    expect(result).toContain(
+      "The implementation should follow our standard patterns",
+    );
+    expect(result).toContain("Hidden encoded content");
+    expect(result).toContain('<input type="text" />');
+  });
+
+  it("should sanitize GitHub comments preserving discussion flow", () => {
+    const comments: GitHubComment[] = [
+      {
+        id: "1",
+        databaseId: "100001",
+        body: `Great idea! Here are my thoughts:
+
+1. We should consider the performance impact
+2. The UI mockup looks good: ![ui design](mockup.png)
+3. Check the [API docs](https://api.example.com "api reference") for rate limits
+
+<div aria-label="comment metadata" data-comment-type="review">
+  This change would affect multiple systems.
+</div>
+
+Note: Implementationshouldfollowbestpractices.`,
+        author: { login: "reviewer1" },
+        createdAt: "2023-01-01T10:00:00Z",
+      },
+      {
+        id: "2",
+        databaseId: "100002",
+        body: `Thanks for the feedback! 
+
+<!-- Internal note: discussed with team -->
+
+I've updated the proposal based on your suggestions.
+
+&#84;&#101;&#115;&#116; &#110;&#111;&#116;&#101;: All systems checked.
+
+<span title="status update" data-status="approved">Ready for implementation</span>`,
+        author: { login: "author1" },
+        createdAt: "2023-01-01T12:00:00Z",
+      },
+    ];
+
+    const result = formatComments(comments);
+
+    // Verify hidden content is removed
+    expect(result).not.toContain("<!-- Internal note");
+    expect(result).not.toContain("api reference");
+    expect(result).not.toContain("comment metadata");
+    expect(result).not.toContain('data-comment-type="review"');
+    expect(result).not.toContain("status update");
+    expect(result).not.toContain('data-status="approved"');
+    expect(result).not.toContain("\u200B");
+    expect(result).not.toContain("&#84;");
+
+    // Verify discussion flow is preserved
+    expect(result).toContain("Great idea! Here are my thoughts:");
+    expect(result).toContain("1. We should consider the performance impact");
+    expect(result).toContain("2. The UI mockup looks good: ![](mockup.png)");
+    expect(result).toContain(
+      "3. Check the [API docs](https://api.example.com)",
+    );
+    expect(result).toContain("This change would affect multiple systems.");
+    expect(result).toContain("Implementationshouldfollowbestpractices");
+    expect(result).toContain("Thanks for the feedback!");
+    expect(result).toContain(
+      "I've updated the proposal based on your suggestions.",
+    );
+    expect(result).toContain("Test note: All systems checked.");
+    expect(result).toContain("Ready for implementation");
+    expect(result).toContain("[reviewer1 at");
+    expect(result).toContain("[author1 at");
+  });
+});
--- a/test/sanitizer.test.ts
+++ b/test/sanitizer.test.ts
@@ -0,0 +1,259 @@
+import { describe, expect, it } from "bun:test";
+import {
+  stripInvisibleCharacters,
+  stripMarkdownImageAltText,
+  stripMarkdownLinkTitles,
+  stripHiddenAttributes,
+  normalizeHtmlEntities,
+  sanitizeContent,
+  stripHtmlComments,
+} from "../src/github/utils/sanitizer";
+
+describe("stripInvisibleCharacters", () => {
+  it("should remove zero-width characters", () => {
+    expect(stripInvisibleCharacters("Hello\u200BWorld")).toBe("HelloWorld");
+    expect(stripInvisibleCharacters("Text\u200C\u200D")).toBe("Text");
+    expect(stripInvisibleCharacters("\uFEFFStart")).toBe("Start");
+  });
+
+  it("should remove control characters", () => {
+    expect(stripInvisibleCharacters("Hello\u0000World")).toBe("HelloWorld");
+    expect(stripInvisibleCharacters("Text\u001F\u007F")).toBe("Text");
+  });
+
+  it("should preserve common whitespace", () => {
+    expect(stripInvisibleCharacters("Hello\nWorld")).toBe("Hello\nWorld");
+    expect(stripInvisibleCharacters("Tab\there")).toBe("Tab\there");
+    expect(stripInvisibleCharacters("Carriage\rReturn")).toBe(
+      "Carriage\rReturn",
+    );
+  });
+
+  it("should remove soft hyphens", () => {
+    expect(stripInvisibleCharacters("Soft\u00ADHyphen")).toBe("SoftHyphen");
+  });
+
+  it("should remove Unicode direction overrides", () => {
+    expect(stripInvisibleCharacters("Text\u202A\u202BMore")).toBe("TextMore");
+    expect(stripInvisibleCharacters("\u2066Isolated\u2069")).toBe("Isolated");
+  });
+});
+
+describe("stripMarkdownImageAltText", () => {
+  it("should remove alt text from markdown images", () => {
+    expect(stripMarkdownImageAltText("![example alt text](image.png)")).toBe(
+      "![](image.png)",
+    );
+    expect(
+      stripMarkdownImageAltText("Text ![description](pic.jpg) more text"),
+    ).toBe("Text ![](pic.jpg) more text");
+  });
+
+  it("should handle multiple images", () => {
+    expect(stripMarkdownImageAltText("![one](1.png) ![two](2.png)")).toBe(
+      "![](1.png) ![](2.png)",
+    );
+  });
+
+  it("should handle empty alt text", () => {
+    expect(stripMarkdownImageAltText("![](image.png)")).toBe("![](image.png)");
+  });
+});
+
+describe("stripMarkdownLinkTitles", () => {
+  it("should remove titles from markdown links", () => {
+    expect(stripMarkdownLinkTitles('[Link](url.com "example title")')).toBe(
+      "[Link](url.com)",
+    );
+    expect(stripMarkdownLinkTitles("[Link](url.com 'example title')")).toBe(
+      "[Link](url.com)",
+    );
+  });
+
+  it("should handle multiple links", () => {
+    expect(
+      stripMarkdownLinkTitles('[One](1.com "first") [Two](2.com "second")'),
+    ).toBe("[One](1.com) [Two](2.com)");
+  });
+
+  it("should preserve links without titles", () => {
+    expect(stripMarkdownLinkTitles("[Link](url.com)")).toBe("[Link](url.com)");
+  });
+});
+
+describe("stripHiddenAttributes", () => {
+  it("should remove alt attributes", () => {
+    expect(
+      stripHiddenAttributes('<img alt="example text" src="pic.jpg">'),
+    ).toBe('<img src="pic.jpg">');
+    expect(stripHiddenAttributes("<img alt='example' src=\"pic.jpg\">")).toBe(
+      '<img src="pic.jpg">',
+    );
+    expect(stripHiddenAttributes('<img alt=example src="pic.jpg">')).toBe(
+      '<img src="pic.jpg">',
+    );
+  });
+
+  it("should remove title attributes", () => {
+    expect(
+      stripHiddenAttributes('<a title="example text" href="#">Link</a>'),
+    ).toBe('<a href="#">Link</a>');
+    expect(stripHiddenAttributes("<div title='example'>Content</div>")).toBe(
+      "<div>Content</div>",
+    );
+  });
+
+  it("should remove aria-label attributes", () => {
+    expect(
+      stripHiddenAttributes('<button aria-label="example">Click</button>'),
+    ).toBe("<button>Click</button>");
+  });
+
+  it("should remove data-* attributes", () => {
+    expect(
+      stripHiddenAttributes(
+        '<div data-test="example" data-info="more example">Text</div>',
+      ),
+    ).toBe("<div>Text</div>");
+  });
+
+  it("should remove placeholder attributes", () => {
+    expect(
+      stripHiddenAttributes('<input placeholder="example text" type="text">'),
+    ).toBe('<input type="text">');
+  });
+
+  it("should handle multiple attributes", () => {
+    expect(
+      stripHiddenAttributes(
+        '<img alt="example" title="test" src="pic.jpg" class="image">',
+      ),
+    ).toBe('<img src="pic.jpg" class="image">');
+  });
+});
+
+describe("normalizeHtmlEntities", () => {
+  it("should decode numeric entities", () => {
+    expect(normalizeHtmlEntities("&#72;&#101;&#108;&#108;&#111;")).toBe(
+      "Hello",
+    );
+    expect(normalizeHtmlEntities("&#65;&#66;&#67;")).toBe("ABC");
+  });
+
+  it("should decode hex entities", () => {
+    expect(normalizeHtmlEntities("&#x48;&#x65;&#x6C;&#x6C;&#x6F;")).toBe(
+      "Hello",
+    );
+    expect(normalizeHtmlEntities("&#x41;&#x42;&#x43;")).toBe("ABC");
+  });
+
+  it("should remove non-printable entities", () => {
+    expect(normalizeHtmlEntities("&#0;&#31;")).toBe("");
+    expect(normalizeHtmlEntities("&#x00;&#x1F;")).toBe("");
+  });
+
+  it("should preserve normal text", () => {
+    expect(normalizeHtmlEntities("Normal text")).toBe("Normal text");
+  });
+});
+
+describe("sanitizeContent", () => {
+  it("should apply all sanitization measures", () => {
+    const testContent = `
+      <!-- This is a comment -->
+      <img alt="example alt text" src="image.jpg">
+      ![example image description](screenshot.png)
+      [click here](https://example.com "example title")
+      <div data-prompt="example data" aria-label="example label">
+        Normal text with hidden\u200Bcharacters
+      </div>
+      &#72;&#105;&#100;&#100;&#101;&#110; message
+    `;
+
+    const sanitized = sanitizeContent(testContent);
+
+    expect(sanitized).not.toContain("<!-- This is a comment -->");
+    expect(sanitized).not.toContain("example alt text");
+    expect(sanitized).not.toContain("example image description");
+    expect(sanitized).not.toContain("example title");
+    expect(sanitized).not.toContain("example data");
+    expect(sanitized).not.toContain("example label");
+    expect(sanitized).not.toContain("\u200B");
+    expect(sanitized).not.toContain("alt=");
+    expect(sanitized).not.toContain("data-prompt=");
+    expect(sanitized).not.toContain("aria-label=");
+
+    expect(sanitized).toContain("Normal text with hiddencharacters");
+    expect(sanitized).toContain("Hidden message");
+    expect(sanitized).toContain('<img src="image.jpg">');
+    expect(sanitized).toContain("![](screenshot.png)");
+    expect(sanitized).toContain("[click here](https://example.com)");
+  });
+
+  it("should handle complex nested patterns", () => {
+    const complexContent = `
+      Text with ![alt \u200B text](image.png) and more.
+      <a href="#" title="example\u00ADtitle">Link</a>
+      <div data-x="&#72;&#105;">Content</div>
+    `;
+
+    const sanitized = sanitizeContent(complexContent);
+
+    expect(sanitized).not.toContain("\u200B");
+    expect(sanitized).not.toContain("\u00AD");
+    expect(sanitized).not.toContain("alt ");
+    expect(sanitized).not.toContain('title="');
+    expect(sanitized).not.toContain('data-x="');
+    expect(sanitized).toContain("![](image.png)");
+    expect(sanitized).toContain('<a href="#">Link</a>');
+  });
+
+  it("should preserve legitimate markdown and HTML", () => {
+    const legitimateContent = `
+      # Heading
+      
+      This is **bold** and *italic* text.
+      
+      Here's a normal image: ![](normal.jpg)
+      And a normal link: [Click here](https://example.com)
+      
+      <div class="container">
+        <p id="para">Normal paragraph</p>
+        <input type="text" name="field">
+      </div>
+    `;
+
+    const sanitized = sanitizeContent(legitimateContent);
+
+    expect(sanitized).toBe(legitimateContent);
+  });
+
+  it("should handle entity-encoded text", () => {
+    const encodedText = `
+      &#72;&#105;&#100;&#100;&#101;&#110; &#109;&#101;&#115;&#115;&#97;&#103;&#101;
+      <div title="&#101;&#120;&#97;&#109;&#112;&#108;&#101;">Test</div>
+    `;
+
+    const sanitized = sanitizeContent(encodedText);
+
+    expect(sanitized).toContain("Hidden message");
+    expect(sanitized).not.toContain('title="');
+    expect(sanitized).toContain("<div>Test</div>");
+  });
+});
+
+describe("stripHtmlComments (legacy)", () => {
+  it("should remove HTML comments", () => {
+    expect(stripHtmlComments("Hello <!-- example -->World")).toBe(
+      "Hello World",
+    );
+    expect(stripHtmlComments("<!-- comment -->Text")).toBe("Text");
+    expect(stripHtmlComments("Text<!-- comment -->")).toBe("Text");
+  });
+
+  it("should handle multiline comments", () => {
+    expect(stripHtmlComments("Hello <!-- \nexample\n -->World")).toBe(
+      "Hello World",
+    );
+  });
+});
Author	SHA1	Message	Date
Ashwin Bhat	ddef0ddd29	fix mistake in FAQ	2025-05-30 10:56:54 -07:00
Ashwin Bhat	180a1b6680	switch to opus for this repo's claude workflow (#97 ) * switch to opus for this repo's claude workflow * prettier	2025-05-30 08:14:11 -07:00
Ashwin Bhat	8da47815ec	docs: add comprehensive FAQ covering common gotchas and limitations (#92 ) - Add FAQ.md with sections on triggering, authentication, capabilities, and troubleshooting - Document key limitations including workflow access, PR creation, and CI results visibility - Include workarounds for common issues like automated workflows and test result access - Cover security considerations and best practices for safe usage 🤖 Generated with [Claude Code](https://claude.ai/code) Co-authored-by: Claude <noreply@anthropic.com>	2025-05-29 16:45:44 -07:00
Lina Tawfik	35ad5fc467	Add enhanced text sanitization (#83 ) * Add enhanced text sanitization * Format code with prettier * Refactor tests to remove redundancy and improve structure - Remove redundant 'mixed input patterns' test from sanitizer.test.ts - Consolidate integration tests into 2 focused real-world scenarios - Add HTML comment stripping to sanitizeContent function - Update test expectations to match sanitization behavior - Maintain full coverage with fewer, more focused tests * Fix prettier formatting * Remove rendered.html from repository * Remove test-markdown.json and update .gitignore * Revert .gitignore changes	2025-05-29 16:35:50 -07:00
zenmush	fb7365fba9	fix: Correct mcp_config_file parameter to mcp_config in issue-triage workflow (#89 ) The workflow was using 'mcp_config_file' which is not a valid parameter for the claude-code-base-action. The correct parameter name is 'mcp_config' as defined in the action.yml file. This fix ensures that the MCP server configuration is properly passed to the action, allowing the GitHub MCP server to be correctly initialized. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-authored-by: Claude <noreply@anthropic.com>	2025-05-29 12:57:57 -07:00