[CI] Add PR/Issue Auto Labeler#13380

Open
DN6 wants to merge 7 commits into main from labeler

Conversation

@DN6
Collaborator

@DN6 DN6 commented Apr 1, 2026

What does this PR do?

Addresses suggestions brought up in #13343 by adding scripts/workflows to auto-label issues and PRs.

For Issues:
Use Qwen/Qwen3.5-35B-A3B via Inference Providers to parse issue titles and content and assign appropriate labels from a fixed set.
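
Assigning labels "from a fixed set" usually also means filtering whatever the model returns against that set. A minimal sketch of that step, not the PR's actual code: the `ALLOWED_LABELS` set and the `filter_labels` helper are hypothetical, and only `needs-code-example` is a label name confirmed in this thread.

```python
# Hypothetical label set: only "needs-code-example" is mentioned in this PR thread;
# the other names are placeholders for illustration.
ALLOWED_LABELS = {"bug", "feature-request", "needs-code-example"}


def filter_labels(predicted, allowed=ALLOWED_LABELS):
    """Keep only predicted labels that are in the fixed set.

    Order is preserved, duplicates are dropped, and non-string items are ignored.
    """
    seen = set()
    kept = []
    for label in predicted:
        if not isinstance(label, str):
            continue
        name = label.strip().lower()
        if name in allowed and name not in seen:
            seen.add(name)
            kept.append(name)
    return kept
```

Filtering after the call means a hallucinated label name is dropped rather than created on the repository.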

For PRs:

  • Add labels based on which files are changed
  • Add labels in case tests are missing for new model additions and pipelines
  • Add a size label to PRs so maintainers can prioritise reviews by size.
    - S (<50 LOC diff)
    - M (<200 LOC diff)
    - L (>200 LOC diff)
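
The size thresholds above can be sketched as a small function. This is an illustration under stated assumptions, not the PR's implementation: the function name `size_label` is hypothetical, the real label names may differ from the bare letters, and a diff of exactly 200 lines is treated as L because the description leaves that case open.

```python
def size_label(loc_changed: int) -> str:
    """Map a diff size (added plus deleted lines) to a size label."""
    if loc_changed < 50:
        return "S"
    if loc_changed < 200:
        return "M"
    # Assumption: exactly 200 lines counts as L; the PR description
    # only specifies "<200" for M and ">200" for L.
    return "L"
```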

Fixes #13343

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@DN6 DN6 requested review from sayakpaul and yiyixuxu April 1, 2026 13:23
@yiyixuxu
Collaborator

yiyixuxu commented Apr 1, 2026

@claude are you here, can you do a review too?

@github-actions
Contributor

github-actions bot commented Apr 1, 2026

Claude finished @yiyixuxu's task in 1m 28s

PR Review

  • Read PR diff and changed files
  • Analyze changes and identify issues
  • Post review feedback

All 5 changed files in this PR are outside src/diffusers/ (they are in .github/ and utils/), which is the only directory within my configured review scope. There are no src/diffusers/ changes to report correctness issues on.

That said, here are a few observations worth noting on the utility scripts and CI files:

utils/label_issues.py

  • [utils/label_issues.py:414] json.loads(response) will raise if the LLM returns non-JSON (e.g. thinking tokens or markdown fences). Qwen3 may emit <think>…</think> before the JSON output, and the bare except Exception: pass on line 425 silently swallows the error and produces no labels. Consider stripping thinking tags or using response_format={"type":"json_object"} if the endpoint supports it.

  • [utils/label_issues.py:416] The single-letter variable name l is easy to confuse with 1 and I, and PEP 8 advises against it; minor, but worth renaming to label for clarity.

  • [utils/label_issues.py:421] any(model_name.lower() in name for name in existing) does substring matching. A model named "wan" would match a directory named "wanvideo" correctly, but a very short name like "flux" would also match "reflux" if such a directory existed. This is likely fine in practice, but worth being aware of.
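
If the substring match ever does cause trouble, one stricter alternative is prefix matching, which keeps the "wan" / "wanvideo" case working while rejecting "flux" / "reflux". A minimal sketch, assuming `existing` is an iterable of directory names; the helper name `matches_existing` is hypothetical and not part of the PR.

```python
def matches_existing(model_name: str, existing) -> bool:
    """Return True if any existing directory name starts with model_name.

    The comparison is case-insensitive. An empty model name never matches,
    because every string starts with the empty string.
    """
    target = model_name.strip().lower()
    if not target:
        return False
    return any(name.lower().startswith(target) for name in existing)
```

Prefix matching is still a heuristic: it would match "wan" against an unrelated directory such as "wander", so an exact comparison may be preferable if directory names are known to equal model names.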

utils/check_test_missing.py

  • [utils/check_test_missing.py:276] The file is opened at f["filename"] (a repo-relative path). This only works if the script is run from the repository root. That's assumed by the workflow (it does actions/checkout first), so it's fine — but fragile if run locally from a different cwd.

  • [utils/check_test_missing.py:283-287] Only added test files are checked, not modified ones. A PR that adds a new model class and extends an existing test file (rather than adding a new one) will incorrectly be labeled missing-tests. This could lead to false positives.
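
Both observations can be addressed with small helpers. The sketch below is an illustration rather than the script's actual code: the function names are hypothetical, the `tests/` prefix is an assumption about the repository layout, and the `filename` / `status` keys follow the shape of the GitHub "list pull request files" API response.

```python
from pathlib import Path


def resolve_repo_path(filename: str, repo_root: Path) -> Path:
    """Resolve a repo-relative filename against an explicit root instead of the cwd."""
    return (repo_root / filename).resolve()


def touches_tests(files) -> bool:
    """Return True if the PR adds or modifies any Python file under tests/.

    `files` is a list of dicts with "filename" and "status" keys.
    """
    for f in files:
        name = f["filename"]
        if (
            f.get("status") in {"added", "modified"}
            and name.startswith("tests/")  # assumption: tests live under tests/
            and name.endswith(".py")
        ):
            return True
    return False
```

Counting modified test files avoids labeling a PR as missing-tests when it extends an existing test module instead of adding a new one.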

.github/workflows/issue_labeler.yml

  • [.github/workflows/issue_labeler.yml:131-132] ISSUE_TITLE and ISSUE_BODY are passed as environment variables from github.event.issue.*. This is the safe pattern (avoids shell injection), so this is correct.

  • [.github/workflows/issue_labeler.yml:143] The shell loop for label in $(echo "$LABELS" | python -c ...) will break on label names containing spaces. The current label set has no spaces, but it's a latent bug if labels are ever changed.
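
One way to sidestep shell word splitting is to build each command as an argument list, so a label containing spaces stays a single argument. A minimal sketch, assuming the labels arrive as a JSON array of strings and that labels are applied with the GitHub CLI (`gh issue edit <number> --add-label <name>`); the workflow's actual commands are not shown in this thread, and `build_label_commands` is a hypothetical helper.

```python
import json


def build_label_commands(issue_number: int, labels_json: str) -> list:
    """Build one argv list per label so names with spaces stay a single argument."""
    labels = json.loads(labels_json)
    return [
        ["gh", "issue", "edit", str(issue_number), "--add-label", label]
        for label in labels
        if isinstance(label, str) and label.strip()
    ]
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` without going through a shell, so no quoting is needed.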

.github/workflows/pr_labeler.yml

  • [.github/workflows/pr_labeler.yml:154] Uses pull_request_target with actions/checkout@v4 (default, checks out base branch). This is correct and safe — it does not check out the PR head in a privileged context.

  • [.github/workflows/pr_labeler.yml:168] sync-labels: true on the labeler action means labels will be removed from PRs if the corresponding files are no longer changed. This is intentional but worth documenting so maintainers understand why labels disappear on synchronize events.

Overall the PR is well-structured. The main concern worth addressing before merge is the silent JSON parse failure in label_issues.py when Qwen3 produces thinking-token output.
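
A tolerant parser for the model response could look like the sketch below. It is an illustration under assumptions, not the code in label_issues.py: the response format (a JSON list, or an object with a "labels" key) is assumed, and `parse_labels` is a hypothetical name. It strips <think>…</think> blocks and a surrounding markdown fence, then reports a parse failure instead of swallowing it.

```python
import json
import re

FENCE = "`" * 3  # markdown code fence marker
THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)


def parse_labels(response: str) -> list:
    """Extract a list of label strings from a model response."""
    text = THINK_RE.sub("", response).strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (with optional language tag) and the closing fence.
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit(FENCE, 1)[0].strip()
    try:
        data = json.loads(text)
    except json.JSONDecodeError as err:
        # Surface the failure instead of silently producing no labels.
        print(f"Could not parse model response as JSON: {err}")
        return []
    if isinstance(data, dict):
        # Assumption: labels live under a "labels" key when an object is returned.
        data = data.get("labels", [])
    if not isinstance(data, list):
        return []
    return [item for item in data if isinstance(item, str)]
```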

Collaborator

@yiyixuxu yiyixuxu left a comment

this is awesome! thanks
I left one comment

from huggingface_hub import InferenceClient


SYSTEM_PROMPT = """\

Collaborator

I think we have a missing-info/ or missing-code-example label, can we add it here too?

Collaborator Author

Reusing our existing needs-code-example label for that.
https://github.com/huggingface/diffusers/pull/13380/changes#diff-ed09b43f8018cb6694ae7bc3d686d34f5267f7bba506ca6b650d1ddb2b54a7b9R36

Did you also mean missing info like env details?

Collaborator

nice!

oh i just meant needs-code-example, but not sure if you want to re-use name. we can add one for missing env details too, but not as important I think!

@iwr-redmond

A few suggestions:

Collaborator

@yiyixuxu yiyixuxu left a comment

thanks for working on this!
super excited to see it in action
