Extend CvDTree branch coverage (regression / 2-D / pruning / sampleIdx) #28
Merged
Summary
Adds 4 direct unit tests for CvDTree::train that drive code paths the cascade trainer never touches (it always runs depth-1 classification without surrogates, regression, or CV folds). All tests follow the project's Arrange / Act / Assert convention with comments per step.
Changes
Modified: test_dtree.cpp
- regression mode learns a step-function on a 1-D dataset: varType = (ORDERED, ORDERED) + regression_accuracy = 0.01 → find_split_ord_reg, regression branch of calc_node_value
- 2-D classification picks an axis-aligned split: find_best_split (DTreeBestSplitFinder)
- cv_folds pruning produces a usable tree on a noisy dataset: cv_folds = 3, use_1se_rule = true, truncate_pruned_tree = true → prune_cv / free_prune_data / fold-allocation paths
- sampleIdx mask trains on the selected subset only: sample_count (selects 6 / 10 rows) → sample-mask preprocessing in CvDTreeTrainData::set_data
No production source modified.
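As a rough illustration of what the regression test checks, the sketch below finds the best ordered split on a 1-D step function by minimising the summed squared error of the two leaves. It is a standalone approximation, not CvDTree's find_split_ord_reg: the Split struct and both function names are hypothetical and exist only for this example, and it assumes the feature values are already sorted in ascending order.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical result type for this sketch; not part of CvDTree.
struct Split {
    double threshold;   // samples with x <= threshold go left
    double left_mean;   // value predicted by the left leaf
    double right_mean;  // value predicted by the right leaf
    double sse;         // summed squared error of both leaves
};

// Exhaustive search over split positions of a 1-D feature sorted ascending.
// Uses SSE = sum(y^2) - (sum y)^2 / n for each side of the candidate split.
Split best_ordered_regression_split(const std::vector<double>& x,
                                    const std::vector<double>& y) {
    assert(x.size() == y.size() && x.size() >= 2);
    const std::size_t n = x.size();
    double total = 0.0, total_sq = 0.0;
    for (double v : y) { total += v; total_sq += v * v; }

    Split best{0.0, 0.0, 0.0, std::numeric_limits<double>::infinity()};
    double left = 0.0, left_sq = 0.0;
    for (std::size_t i = 0; i + 1 < n; ++i) {
        left += y[i];
        left_sq += y[i] * y[i];
        if (x[i] == x[i + 1]) continue;  // no threshold separates equal values
        const double nl = static_cast<double>(i + 1);
        const double nr = static_cast<double>(n - i - 1);
        const double right = total - left;
        const double sse = (left_sq - left * left / nl) +
                           ((total_sq - left_sq) - right * right / nr);
        if (sse < best.sse) {
            best = {0.5 * (x[i] + x[i + 1]), left / nl, right / nr, sse};
        }
    }
    return best;
}

// Arrange / Act / Assert, mirroring the convention described in the summary.
bool step_function_is_learned() {
    // Arrange: y = 1 for x < 5 and y = 9 for x >= 5.
    std::vector<double> x, y;
    for (int i = 0; i < 10; ++i) {
        x.push_back(i);
        y.push_back(i < 5 ? 1.0 : 9.0);
    }
    // Act: search for the best single split.
    const Split s = best_ordered_regression_split(x, y);
    // Assert: the threshold sits between 4 and 5 and both leaves are exact.
    return std::fabs(s.threshold - 4.5) < 1e-9 &&
           std::fabs(s.left_mean - 1.0) < 1e-9 &&
           std::fabs(s.right_mean - 9.0) < 1e-9 &&
           s.sse < 1e-9;
}
```

A noise-free step function is a convenient fixture because exactly one split drives the error to zero, so the assertion on the threshold does not depend on tie-breaking between candidate splits.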
Verification
Coverage impact
Net gain: +215 lines and +180 branches covered, concentrated in the previously cold legacy CART machinery:
- o_cvdtree.cpp
- o_cvboostree.cpp
- o_cvcascadeboosttree.cpp
(Per-file deltas: regression test alone exercises the regression branch of calc_node_value and find_split_ord_reg; the cv_folds test exercises the prune / fold-allocation paths; the sampleIdx mask test exercises the sample-preprocessing branch in set_data.)
Risk
Test-only change. No production source modified.
Notes
- CvDTreeTrainData rejects cv_folds when the per-fold sample count is too low; the pruning test uses 60 samples / 3 folds for that reason (documented inline).
- The sampleIdx test uses an 8U mask rather than a 32S index array. Both paths were considered; the mask form additionally walks the mask-expansion branch in set_data.
Checklist