Improve dataset handling with adaptive folder detection #331
Draft
yyq19990828 wants to merge 1 commit into roboflow:develop
…st dataset fallback

- Add adaptive val/valid folder detection to support both YOLO (val) and COCO (valid) dataset structures
- Implement a fallback mechanism for the test dataset, since most datasets don't include a test split
- Reduce the need for manual dataset modification by automatically handling different folder naming conventions
- Add proper error handling with FileNotFoundError for missing validation folders
- Convert Chinese comments to English for better internationalization

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
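For readers who want to see the idea concretely, here is a minimal sketch of the val/valid detection and test fallback described in the commit message. It is not the code from this PR: the helper names `resolve_val_dir` and `resolve_test_dir`, the `VAL_CANDIDATES` tuple, and the choice to prefer `valid` when both folders exist are all assumptions made for illustration.

```python
from pathlib import Path

# Hypothetical names for illustration; not taken from rfdetr/datasets/coco.py.
# Assumption: when both folders exist, "valid" is preferred over "val".
VAL_CANDIDATES = ("valid", "val")


def resolve_val_dir(root: Path) -> Path:
    """Return the first existing validation folder under root."""
    for name in VAL_CANDIDATES:
        candidate = root / name
        if candidate.is_dir():
            return candidate
    raise FileNotFoundError(
        f"No validation folder found in {root}; expected one of {VAL_CANDIDATES}"
    )


def resolve_test_dir(root: Path) -> Path:
    """Return root/test if it exists, otherwise fall back to the validation folder."""
    test_dir = root / "test"
    if test_dir.is_dir():
        return test_dir
    # Fallback: metrics computed here are validation metrics, not held-out test metrics.
    return resolve_val_dir(root)
```

With a dataset root that contains only `train/` and `val/`, both helpers would return the `val` folder. Logging a warning in the fallback branch may be worth considering so that users know no separate test split was found.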
fcd7819 to 2c9e689
I think this is interesting, but I will say that datasets SHOULD include a test set that's not the val set ;) Nowadays people often benchmark on COCO val, but that's not because that practice is fine; it's because the test set is hidden away on a private server. Ultralytics notably does NOT report test set numbers on datasets they train on, but IMO that gives a misleading measure of final accuracy: they're also picking the best checkpoint based on val score, so the result is biased. So I would still like it to be clear when folks train a model that they SHOULD have a test set. But handling val vs valid seems logical to me.
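To make the bias argument concrete, here is a small sketch of the protocol being asked for: pick the checkpoint on val, then report the number from a separate test split. The function name `select_and_report`, the dictionary keys, and the scores are all hypothetical toy values invented for illustration, not results from any real training run.

```python
def select_and_report(checkpoints):
    """Pick the checkpoint with the best val score and return its test score.

    The val score of the winner is optimistically biased because it was used
    for selection; the test score was never used to choose anything.
    """
    best = max(checkpoints, key=lambda c: c["val_map"])
    return best["name"], best["test_map"]


# Toy, made-up numbers purely to show the mechanics.
toy_checkpoints = [
    {"name": "epoch_10", "val_map": 0.48, "test_map": 0.46},
    {"name": "epoch_20", "val_map": 0.52, "test_map": 0.47},
    {"name": "epoch_30", "val_map": 0.50, "test_map": 0.48},
]

name, reported = select_and_report(toy_checkpoints)
```

If the test folder silently falls back to val, the "test" number reported this way is the same val score that drove checkpoint selection, which is exactly the bias described above.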
Borda (Member) reviewed on Jan 28, 2026:
We have added YOLO dataset support in 1.4, so is this still needed?
60b16c1 to 523f9df
a6e6ca0 to 0485141
Summary
This PR improves the dataset handling functionality in `rfdetr/datasets/coco.py` by adding adaptive folder detection and better fallback mechanisms for test datasets.

Key improvements:
- … `val` (YOLO format) or `valid` (COCO format) folder naming

Why this change?
- … `val` folder while COCO datasets use `valid` folder

Test plan
- … `val` and `valid` folder structures

🤖 Generated with Claude Code