T6 M0: Technical plan + analysis notebook for multi-objective vector …#61
carlosrod723 wants to merge 7 commits into AgentOpt:experimental
…, evaluate_vector, BasicSearch integration, 59 tests
…k, add weight-sensitivity demo
        """
        score, _ = self.get_feedback(query, response, reference, **kwargs)
        if isinstance(score, dict):
            return float(np.mean(list(score.values())))
We should leave this behavior to be configurable from the Objective side.
It should not be hard coded here.
Also why do we need this method from the Guide to begin with? I guess the question is whether we would require passing objective into Guide?
Or asked differently, should the Guide be the one who creates the Objective and sends them around? @allenanie what do you think?
        """
        ...


    def aggregate_vector_scores(scores: list) -> Union[float, Dict[str, float]]:
As above, the logic should be implemented by Objective.
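One way to read this suggestion, as a minimal sketch only: batch aggregation becomes a standalone function on the objective side instead of a Guide method. The name `aggregate_score_dicts`, the per-metric mean, and the error handling are all assumptions made for illustration; they are not taken from the PR.

```python
from typing import Dict, List, Union

Score = Union[float, Dict[str, float]]


def aggregate_score_dicts(scores: List[Score]) -> Score:
    """Hypothetical objective-side helper: average a batch of scores.

    An all-scalar batch yields a scalar mean; an all-dict batch yields a
    per-metric mean. Mixed or empty batches are rejected (an assumption).
    """
    if not scores:
        raise ValueError("cannot aggregate an empty list of scores")
    if all(isinstance(s, dict) for s in scores):
        keys = set(scores[0])
        if any(set(s) != keys for s in scores):
            raise ValueError("all score dicts must share the same metric names")
        # Per-metric mean, with keys emitted in sorted order for determinism.
        return {k: sum(s[k] for s in scores) / len(scores) for k in sorted(keys)}
    if any(isinstance(s, dict) for s in scores):
        raise ValueError("cannot mix scalar and dict scores")
    return sum(float(s) for s in scores) / len(scores)
```

Keeping this as a pure function means the Guide never has to know which aggregation rule the trainer picked.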
docs/T6_technical_plan.md
Outdated
    Isolate all multi-objective logic into one new module (`opto/trainer/objectives.py`) containing **pure functions**:

    ```
    normalize_score() → scalar ↔ dict conversion
Let's use a different name. `normalize_score` implies that some sort of scaling or shifting is done.
Let's use something explicit like `to_score_dict`, or some other more neutral term.
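For illustration, a conversion under the suggested name could look like the sketch below. Only the name `to_score_dict` comes from the comment above; the `default_key` parameter and its value `"score"` are assumptions, and no rescaling or shifting is performed.

```python
from typing import Dict, Union


def to_score_dict(
    score: Union[float, int, Dict[str, float]],
    default_key: str = "score",  # assumed key name for wrapped scalars
) -> Dict[str, float]:
    """Wrap a scalar into a one-entry dict; pass dicts through as floats."""
    if isinstance(score, dict):
        return {str(k): float(v) for k, v in score.items()}
    return {default_key: float(score)}
```

The values are only cast to `float`, which is the point of the rename: the function changes the container, not the numbers.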
Hi @chinganc I propose to address your comments by moving all dict -> scalar + aggregation policy into Concretely:
This keeps the Guide responsible for producing raw metrics and keeps ObjectiveConfig (trainer-side) responsible for aggregation/scalarization/selection, without passing ObjectiveConfig into the Guide.
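A rough sketch of that split, under stated assumptions: the class name `ObjectiveConfigSketch`, its `mode` and `weights` fields, and the `toy_guide_metrics` stand-in are hypothetical and do not reflect the actual `ObjectiveConfig` in the PR.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class ObjectiveConfigSketch:
    """Hypothetical trainer-side config that owns the dict -> scalar policy."""

    mode: str = "mean"  # assumed options: "mean" or "weighted"
    weights: Optional[Dict[str, float]] = None

    def scalarize(self, metrics: Dict[str, float]) -> float:
        if not metrics:
            raise ValueError("metrics must not be empty")
        if self.mode == "mean":
            return sum(metrics.values()) / len(metrics)
        if self.mode == "weighted":
            if not self.weights:
                raise ValueError("weighted mode requires weights")
            # Metrics without a weight contribute nothing (an assumption).
            return sum(self.weights.get(k, 0.0) * v for k, v in metrics.items())
        raise ValueError(f"unknown mode: {self.mode!r}")


def toy_guide_metrics(response: str, reference: str) -> Dict[str, float]:
    """Stand-in for a Guide: returns raw metrics and never aggregates them."""
    return {
        "exact_match": float(response == reference),
        "brevity": 1.0 / (1 + len(response)),
    }
```

In this arrangement the trainer would call something like `ObjectiveConfigSketch(mode="weighted", weights={...}).scalarize(toy_guide_metrics(resp, ref))`, so the config object never needs to be passed into the Guide.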
…larize_dict, aggregate to objectives.py
M0 delivery for T6 Multi-Objective Vector Scores.
Deliverables:
- `docs/T6_technical_plan.md`: Refined tech plan with API signatures, edge cases, test plan
- `examples/notebooks/t6_m0_analysis.ipynb`: Colab notebook (no API keys needed)

The notebook demonstrates current baseline behavior and a working prototype of weighted vs Pareto selection with deterministic tie-break validation.
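To make the weighted vs Pareto distinction concrete, here is a small sketch that is independent of the notebook. It assumes that every candidate reports the same metric names, that higher is better for every metric, and that ties are broken by candidate name; the function names and the tie-break rule are illustrative assumptions, not the notebook's code.

```python
from typing import Dict, List, Tuple

Candidate = Tuple[str, Dict[str, float]]  # (name, metrics)


def weighted_best(cands: List[Candidate], weights: Dict[str, float]) -> str:
    """Pick the candidate with the highest weighted sum; ties go to the
    lexicographically smallest name so the result is deterministic."""
    def key(c: Candidate):
        name, metrics = c
        total = sum(weights.get(k, 0.0) * v for k, v in metrics.items())
        return (-total, name)
    return min(cands, key=key)[0]


def dominates(a: Dict[str, float], b: Dict[str, float]) -> bool:
    """True if a is at least as good as b everywhere and strictly better once."""
    return all(a[k] >= b[k] for k in a) and any(a[k] > b[k] for k in a)


def pareto_front(cands: List[Candidate]) -> List[str]:
    """Names of non-dominated candidates, sorted for a stable order."""
    front = [
        name for name, m in cands
        if not any(dominates(other, m) for _, other in cands)
    ]
    return sorted(front)
```

Weighted selection always returns a single winner that depends on the chosen weights, while the Pareto front can keep several candidates that trade one metric against another.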