Modifying score metric for segments of driving by mpragnay · Pull Request #319 · Emerge-Lab/PufferDrive

mpragnay · 2026-02-27T03:10:13Z

Added a score that checks, for each goal (driving segment), whether the agent completed the goal without colliding.
Also added a heuristic check for the score of the last goal: whether the agent attempted its last goal by making some progress at a good speed.


Copilot

Pull request overview

This PR updates Drive episode scoring/completion metrics to evaluate performance per driving segment/goal, including a heuristic for the final (possibly incomplete) goal segment, and switches completion_rate to be based on “attempted” goals rather than “sampled” goals.

Changes:

Replace completion_rate denominator from goals_sampled_this_episode to goals_attempted_this_episode.
Add goal-segment scoring accumulation and a last-segment heuristic to award partial credit.
Track additional per-agent state needed for last-segment evaluation (displacement since last goal, last-goal timestep, previous goal position).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File	Description
pufferlib/ocean/env_binding.h	Updates aggregated `completion_rate` computation to use attempted goals.
pufferlib/ocean/drive/drive.h	Implements attempted-goal counting, per-segment score logic, and last-goal heuristics; updates log aggregation accordingly.
pufferlib/ocean/drive/datatypes.h	Extends `Agent` with new fields to support attempted-goal accounting and last-segment heuristics.

Comments suppressed due to low confidence (4)

pufferlib/ocean/drive/drive.h:1354

agent->goals_attempted_this_episode can be 0 here (e.g., GOAL_GENERATE_NEW, no goals reached, no collision, and time_since_last_goal < MIN_GOAL_SEGMENT_TIME_TO_ANALYZE_AGENT), which makes both frac_goal_reached and the later score computation divide by zero. Add a guard (e.g., treat 0 attempts as 0 completion/score, or clamp denominator to >= 1) before doing these divisions.

        env->log.goals_reached_this_episode += agent->goals_reached_this_episode;
        env->log.goals_sampled_this_episode += agent->goals_sampled_this_episode;
        env->log.goals_attempted_this_episode += agent->goals_attempted_this_episode;

        int offroad = env->logs[i].offroad_rate;
        env->log.offroad_rate += offroad;
        int collided = env->logs[i].collision_rate;
        env->log.collision_rate += collided;
        float offroad_per_agent = env->logs[i].offroad_per_agent;
        env->log.offroad_per_agent += offroad_per_agent;
        float collisions_per_agent = env->logs[i].collisions_per_agent;

pufferlib/ocean/drive/drive.h:2486

last_goal_reached_timestep is initialized to 0 on reset, but env->timestep is set to env->init_steps. This makes time_since_last_goal = (env->timestep - last_goal_reached_timestep) * dt incorrectly include the pre-episode offset and can skew the last-segment heuristics/attempt counting. Initialize last_goal_reached_timestep to the current timestep at reset (or to env->init_steps) instead of 0.

        checkNeighbors(env, agent->sim_x, agent->sim_y, z_offsets, Z_RANGE * Z_RANGE, &list_size);
    if (list_size > 0) {
        DepthPoint road_neighbours[list_size];
        int valid_count = 0;
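
The reset issue described for drive.h:2486 can be illustrated with a minimal sketch. The `EnvSketch` and `AgentSketch` structs and both helper functions are hypothetical stand-ins, not the actual PufferDrive definitions; only the field names `timestep`, `init_steps`, `dt`, and `last_goal_reached_timestep` come from the comment above.

```c
/* Hypothetical, trimmed-down stand-ins for the real env and agent structs. */
typedef struct {
    int timestep;   /* current simulation step */
    int init_steps; /* step at which the episode starts */
    float dt;       /* seconds per step */
} EnvSketch;

typedef struct {
    int last_goal_reached_timestep;
} AgentSketch;

/* Assumed fix: time the first segment from the episode start, not from step 0. */
static void reset_goal_timer(const EnvSketch *env, AgentSketch *agent) {
    agent->last_goal_reached_timestep = env->init_steps;
}

/* Seconds elapsed since the agent last reached a goal (or since reset). */
static float time_since_last_goal(const EnvSketch *env, const AgentSketch *agent) {
    return (float)(env->timestep - agent->last_goal_reached_timestep) * env->dt;
}
```

With `init_steps = 10`, `dt = 0.1` and the agent at step 15, a zero-initialized timer reports about 1.5 s, while a timer reset to `init_steps` reports about 0.5 s, which is the time actually spent in the episode.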

pufferlib/ocean/drive/drive.h:1319

The heuristic awards only +0.5 to env->logs[i].score, but it also increments agent->goals_reached_this_episode by 1. This makes completion_rate/frac_goal_reached treat the last segment as fully completed even though the score is partial, and it changes the meaning of the "goals_reached" counter. Consider not modifying goals_reached_this_episode for this heuristic (or, if partial completion is intended, update the metric naming/formula so completion_rate remains consistent with the scoring).

        agent->goals_attempted_this_episode += 1; // Count as attempt if agent had enough time
        float displacement_from_last_goal =
            sqrtf((agent->sim_x - agent->prev_goal_x) * (agent->sim_x - agent->prev_goal_x) +
                  (agent->sim_y - agent->prev_goal_y) * (agent->sim_y - agent->prev_goal_y) +
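
One possible way to keep completion_rate consistent with the score, as the comment on drive.h:1319 suggests, is to leave goals_reached_this_episode untouched when awarding partial credit. The sketch below assumes a hypothetical `SegmentStats` struct and helper names; it is not the PR's implementation.

```c
/* Hypothetical per-agent counters; the struct and helper names are illustrative. */
typedef struct {
    float goals_reached_this_episode;   /* goals actually reached */
    float goals_attempted_this_episode; /* reached goals + judged last segment */
    float score;                        /* accumulated per-segment score */
} SegmentStats;

/* Award partial credit for the last segment without touching goals_reached. */
static void credit_last_segment(SegmentStats *s, float partial_credit) {
    s->goals_attempted_this_episode += 1.0f;
    s->score += partial_credit;
}

/* Fraction of attempted segments whose goal was actually reached. */
static float completion_rate(const SegmentStats *s) {
    return s->goals_attempted_this_episode > 0.0f
               ? s->goals_reached_this_episode / s->goals_attempted_this_episode
               : 0.0f;
}

/* Mean score over attempted segments, including any partial credit. */
static float mean_segment_score(const SegmentStats *s) {
    return s->goals_attempted_this_episode > 0.0f
               ? s->score / s->goals_attempted_this_episode
               : 0.0f;
}
```

Under this split, an agent with two clean goals and a half-credit last segment would report a completion rate of 2/3 and a mean score of 2.5/3, so the two metrics no longer disagree about what "reached" means.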

pufferlib/ocean/drive/datatypes.h:195

Minor grammar/clarity: the comment reads "goals reached + last goal(if this segment can be judged as an attempt)"; add a space before the parenthesis and consider rephrasing to make it clearer that this is a count of attempted segments (not necessarily sampled goals).

    float goals_attempted_this_episode; // goals reached + last goal(if this segment can be judged as an attempt)


pufferlib/ocean/env_binding.h

greptile-apps · 2026-02-27T03:14:13Z

Greptile Summary

This PR refactors the scoring system to track goal completion on a per-segment basis rather than using episode-level thresholds. For GOAL_GENERATE_NEW mode, each goal segment now receives a score of 1.0 if completed without collision or 0.0 if collision occurred. A heuristic was added to give partial credit (0.5) to the last goal if the agent made sufficient progress (moved with avg speed > 2.0 m/s for at least 1.0 second).

Key changes:

Added tracking for goals_attempted_this_episode to distinguish between sampled goals and actual attempts
Changed final score calculation from threshold-based to average score across attempted goals
Introduced heuristics MIN_GOAL_SEGMENT_TIME_TO_ANALYZE_AGENT (1.0s) and MIN_AVG_SPEED_TO_CONSIDER_AS_ATTEMPTING_GOAL (2.0 m/s)
Tracks cumulative_displacement_since_last_goal, last_goal_reached_timestep, and previous goal positions
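
As a rough illustration of the last-segment heuristic summarized above, the sketch below uses the two threshold names and values from this summary. The `LAST_SEGMENT_PARTIAL_CREDIT` macro, the function names, and the choice to compute average speed as cumulative displacement divided by elapsed time are assumptions; the PR may also apply a separate displacement threshold that is not reproduced here.

```c
#include <stdbool.h>

#define MIN_GOAL_SEGMENT_TIME_TO_ANALYZE_AGENT 1.0f       /* seconds */
#define MIN_AVG_SPEED_TO_CONSIDER_AS_ATTEMPTING_GOAL 2.0f /* m/s */
#define LAST_SEGMENT_PARTIAL_CREDIT 0.5f                  /* hypothetical name */

/* A segment is only judged if the agent had enough time to act on it. */
static bool segment_is_judgeable(float time_since_last_goal) {
    return time_since_last_goal >= MIN_GOAL_SEGMENT_TIME_TO_ANALYZE_AGENT;
}

/* Partial credit for an unfinished, collision-free last segment.
 * Assumes avg speed = cumulative displacement / elapsed time. */
static float last_segment_credit(float cumulative_displacement,
                                 float time_since_last_goal) {
    if (!segment_is_judgeable(time_since_last_goal)) {
        return 0.0f; /* too short to analyze; not counted as an attempt */
    }
    float avg_speed = cumulative_displacement / time_since_last_goal;
    return avg_speed > MIN_AVG_SPEED_TO_CONSIDER_AS_ATTEMPTING_GOAL
               ? LAST_SEGMENT_PARTIAL_CREDIT
               : 0.0f;
}
```

For example, 5 m covered in 2 s (2.5 m/s) would earn the 0.5 credit, while 1 m in 2 s or any segment shorter than 1 s would earn none.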

Issues found:

Critical division by zero bug when goals_attempted_this_episode is 0 (lines 1344, 1353)
Variable naming could be clearer since goals_reached_this_episode now includes partial credit for attempts

Confidence Score: 2/5

This PR contains a critical division by zero bug that will cause runtime errors
The division by zero bug at lines 1344 and 1353 will cause NaN/Inf values in very short episodes, which is a critical issue that must be fixed before merging
pufferlib/ocean/drive/drive.h requires immediate attention for division by zero fixes

Important Files Changed

Filename	Overview
pufferlib/ocean/drive/datatypes.h	Added new agent tracking fields for goal attempts, displacement since last goal, and previous goal positions
pufferlib/ocean/drive/drive.h	Refactored scoring to per-segment basis with heuristics for last goal; contains division by zero bug
pufferlib/ocean/env_binding.h	Updated completion rate calculation to use attempted goals instead of sampled goals

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    Start[Episode End] --> ComputeMetrics[compute_metrics_for_last_goal_segment]
    ComputeMetrics --> CheckMode{goal_behavior?}
    
    CheckMode -->|GOAL_GENERATE_NEW| InitAttempted[goals_attempted = goals_reached]
    CheckMode -->|Other modes| SetSampled[goals_attempted = goals_sampled]
    
    InitAttempted --> CheckCollision{collided_before_goal?}
    CheckCollision -->|Yes| IncrementBad[goals_attempted += 1<br/>Bad Attempt]
    CheckCollision -->|No| CheckTime{time_since_last_goal<br/>>= 1.0s?}
    
    CheckTime -->|No| SkipAnalysis[Skip heuristic analysis]
    CheckTime -->|Yes| IncrementAttempt[goals_attempted += 1]
    
    IncrementAttempt --> CheckProgress{displacement > threshold<br/>AND avg_speed > 2.0 m/s?}
    CheckProgress -->|Yes| PartialCredit[score += 0.5<br/>goals_reached += 1]
    CheckProgress -->|No| NoCredit[No partial credit]
    
    IncrementBad --> AddLog
    SkipAnalysis --> AddLog
    PartialCredit --> AddLog
    NoCredit --> AddLog
    SetSampled --> AddLog
    
    AddLog[add_log: Compute final score] --> CalcScore{goal_behavior?}
    CalcScore -->|GOAL_RESPAWN/STOP| SimpleScore[score = 1 if reached<br/>without collision]
    CalcScore -->|GOAL_GENERATE_NEW| AvgScore[score = total_score /<br/>goals_attempted]
    
    SimpleScore --> End[Aggregate metrics]
    AvgScore --> End
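
The final-score branch at the bottom of the flowchart could look roughly like the following, with the zero-attempt guard included. The enum, its `SKETCH_` prefixed values, and the `final_score` signature are hypothetical and only mirror the modes named in the flowchart.

```c
/* Hypothetical mirror of the goal behavior modes named in the flowchart. */
typedef enum {
    SKETCH_GOAL_RESPAWN,
    SKETCH_GOAL_GENERATE_NEW,
    SKETCH_GOAL_STOP
} GoalBehaviorSketch;

/* Final per-agent score.
 * - Respawn/stop modes: 1 if the goal was reached without collision, else 0.
 * - Generate-new mode: mean segment score over attempted goals,
 *   defined as 0 when nothing was attempted (avoids dividing by zero). */
static float final_score(GoalBehaviorSketch mode, int reached_goal, int collided,
                         float total_score, float goals_attempted) {
    if (mode != SKETCH_GOAL_GENERATE_NEW) {
        return (reached_goal && !collided) ? 1.0f : 0.0f;
    }
    if (goals_attempted <= 0.0f) {
        return 0.0f;
    }
    return total_score / goals_attempted;
}
```

Treating zero attempts as a score of 0 is one of the two options the reviews mention; clamping the denominator to at least 1 would give the same result here because the accumulated score is also 0 in that case.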

Last reviewed commit: b80b8c1

greptile-apps

3 files reviewed, 3 comments


greptile-apps · 2026-02-27T03:14:16Z

pufferlib/ocean/drive/drive.h

        int agent_idx = env->active_agent_indices[i];
        Agent *agent = &env->agents[agent_idx];

        env->log.goals_reached_this_episode += agent->goals_reached_this_episode;


Division by zero when goals_attempted_this_episode is 0 (e.g., very short episodes in GOAL_GENERATE_NEW mode where the agent hasn't collided and the time since the last goal is < 1 s).

Suggested change

env->log.goals_reached_this_episode += agent->goals_reached_this_episode;

float frac_goal_reached = agent->goals_attempted_this_episode > 0 ? agent->goals_reached_this_episode / agent->goals_attempted_this_episode : 0.0f;

greptile-apps · 2026-02-27T03:14:20Z

pufferlib/ocean/drive/drive.h

+
+        agent->goals_attempted_this_episode += 1; // Count as attempt if agent had enough time
+        float displacement_from_last_goal =
+            sqrtf((agent->sim_x - agent->prev_goal_x) * (agent->sim_x - agent->prev_goal_x) +


Incrementing goals_reached_this_episode when the goal wasn't actually reached is semantically confusing. Consider renaming the variable to reflect that it includes partial credit.


greptile-apps · 2026-02-27T03:14:22Z

Additional Comments (1)

pufferlib/ocean/drive/drive.h
division by zero when goals_attempted_this_episode is 0

            score = agent->goals_attempted_this_episode > 0 ? env->logs[i].score / agent->goals_attempted_this_episode : 0.0f;

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>


mpragnay added 2 commits February 26, 2026 14:41

Updated score for each goal segment

0654af0

Added Heuristic for last goal segment

63ad5c1

Copilot AI review requested due to automatic review settings February 27, 2026 03:10

Copilot started reviewing on behalf of mpragnay February 27, 2026 03:10

Merge branch '3.0_beta' into Modifying-Score-Metric-for-Segments-of-D…

b80b8c1


Copilot AI reviewed Feb 27, 2026

pufferlib/ocean/env_binding.h (outdated, resolved)

greptile-apps bot reviewed Feb 27, 2026

mpragnay and others added 2 commits February 27, 2026 12:40

Update pufferlib/ocean/env_binding.h

bb93b61

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Merge branch '3.0_beta' into Modifying-Score-Metric-for-Segments-of-D…

83c968a

mpragnay wants to merge 5 commits into 3.0_beta from Modifying-Score-Metric-for-Segments-of-Driving


2 participants
