Skip to content

Modifying score metric for segments of driving#319

Open
mpragnay wants to merge 5 commits into3.0_betafrom
Modifying-Score-Metric-for-Segments-of-Driving
Open

Modifying score metric for segments of driving#319
mpragnay wants to merge 5 commits into3.0_betafrom
Modifying-Score-Metric-for-Segments-of-Driving

Conversation

@mpragnay
Copy link

Added a score that checks if the agent completed its goal without colliding for each goal(driving segment).
Also added heuristic based check for the score of the last goal, has the agent attempted it's last goal, making some progress at a good speed.

Copilot AI review requested due to automatic review settings February 27, 2026 03:10
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates Drive episode scoring/completion metrics to evaluate performance per driving segment/goal, including a heuristic for the final (possibly incomplete) goal segment, and switches completion_rate to be based on “attempted” goals rather than “sampled” goals.

Changes:

  • Replace completion_rate denominator from goals_sampled_this_episode to goals_attempted_this_episode.
  • Add goal-segment scoring accumulation and a last-segment heuristic to award partial credit.
  • Track additional per-agent state needed for last-segment evaluation (displacement since last goal, last-goal timestep, previous goal position).

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
pufferlib/ocean/env_binding.h Updates aggregated completion_rate computation to use attempted goals.
pufferlib/ocean/drive/drive.h Implements attempted-goal counting, per-segment score logic, and last-goal heuristics; updates log aggregation accordingly.
pufferlib/ocean/drive/datatypes.h Extends Agent with new fields to support attempted-goal accounting and last-segment heuristics.
Comments suppressed due to low confidence (4)

pufferlib/ocean/drive/drive.h:1354

  • agent->goals_attempted_this_episode can be 0 here (e.g., GOAL_GENERATE_NEW, no goals reached, no collision, and time_since_last_goal < MIN_GOAL_SEGMENT_TIME_TO_ANALYZE_AGENT), which makes both frac_goal_reached and the later score computation divide by zero. Add a guard (e.g., treat 0 attempts as 0 completion/score, or clamp denominator to >= 1) before doing these divisions.
        env->log.goals_reached_this_episode += agent->goals_reached_this_episode;
        env->log.goals_sampled_this_episode += agent->goals_sampled_this_episode;
        env->log.goals_attempted_this_episode += agent->goals_attempted_this_episode;

        int offroad = env->logs[i].offroad_rate;
        env->log.offroad_rate += offroad;
        int collided = env->logs[i].collision_rate;
        env->log.collision_rate += collided;
        float offroad_per_agent = env->logs[i].offroad_per_agent;
        env->log.offroad_per_agent += offroad_per_agent;
        float collisions_per_agent = env->logs[i].collisions_per_agent;

pufferlib/ocean/drive/drive.h:2486

  • last_goal_reached_timestep is initialized to 0 on reset, but env->timestep is set to env->init_steps. This makes time_since_last_goal = (env->timestep - last_goal_reached_timestep) * dt incorrectly include the pre-episode offset and can skew the last-segment heuristics/attempt counting. Initialize last_goal_reached_timestep to the current timestep at reset (or to env->init_steps) instead of 0.
        checkNeighbors(env, agent->sim_x, agent->sim_y, z_offsets, Z_RANGE * Z_RANGE, &list_size);
    if (list_size > 0) {
        DepthPoint road_neighbours[list_size];
        int valid_count = 0;

pufferlib/ocean/drive/drive.h:1319

  • The heuristic awards only +0.5 to env->logs[i].score, but it also increments agent->goals_reached_this_episode by 1. This makes completion_rate/frac_goal_reached treat the last segment as fully completed even though the score is partial, and it changes the meaning of the "goals_reached" counter. Consider not modifying goals_reached_this_episode for this heuristic (or, if partial completion is intended, update the metric naming/formula so completion_rate remains consistent with the scoring).
        agent->goals_attempted_this_episode += 1; // Count as attempt if agent had enough time
        float displacement_from_last_goal =
            sqrtf((agent->sim_x - agent->prev_goal_x) * (agent->sim_x - agent->prev_goal_x) +
                  (agent->sim_y - agent->prev_goal_y) * (agent->sim_y - agent->prev_goal_y) +

pufferlib/ocean/drive/datatypes.h:195

  • Minor grammar/clarity: the comment reads "goals reached + last goal(if this segment can be judged as an attempt)"; add a space before the parenthesis and consider rephrasing to make it clearer that this is a count of attempted segments (not necessarily sampled goals).
    float goals_attempted_this_episode; // goals reached + last goal(if this segment can be judged as an attempt)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@greptile-apps
Copy link

greptile-apps bot commented Feb 27, 2026

Greptile Summary

This PR refactors the scoring system to track goal completion on a per-segment basis rather than using episode-level thresholds. For GOAL_GENERATE_NEW mode, each goal segment now receives a score of 1.0 if completed without collision or 0.0 if collision occurred. A heuristic was added to give partial credit (0.5) to the last goal if the agent made sufficient progress (moved with avg speed > 2.0 m/s for at least 1.0 second).

Key changes:

  • Added tracking for goals_attempted_this_episode to distinguish between sampled goals and actual attempts
  • Changed final score calculation from threshold-based to average score across attempted goals
  • Introduced heuristics MIN_GOAL_SEGMENT_TIME_TO_ANALYZE_AGENT (1.0s) and MIN_AVG_SPEED_TO_CONSIDER_AS_ATTEMPTING_GOAL (2.0 m/s)
  • Tracks cumulative_displacement_since_last_goal, last_goal_reached_timestep, and previous goal positions

Issues found:

  • Critical division by zero bug when goals_attempted_this_episode is 0 (lines 1344, 1353)
  • Variable naming could be clearer since goals_reached_this_episode now includes partial credit for attempts

Confidence Score: 2/5

  • This PR contains a critical division by zero bug that will cause runtime errors
  • The division by zero bug at lines 1344 and 1353 will cause NaN/Inf values in very short episodes, which is a critical issue that must be fixed before merging
  • pufferlib/ocean/drive/drive.h requires immediate attention for division by zero fixes

Important Files Changed

Filename Overview
pufferlib/ocean/drive/datatypes.h Added new agent tracking fields for goal attempts, displacement since last goal, and previous goal positions
pufferlib/ocean/drive/drive.h Refactored scoring to per-segment basis with heuristics for last goal; contains division by zero bug
pufferlib/ocean/env_binding.h Updated completion rate calculation to use attempted goals instead of sampled goals

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    Start[Episode End] --> ComputeMetrics[compute_metrics_for_last_goal_segment]
    ComputeMetrics --> CheckMode{goal_behavior?}
    
    CheckMode -->|GOAL_GENERATE_NEW| InitAttempted[goals_attempted = goals_reached]
    CheckMode -->|Other modes| SetSampled[goals_attempted = goals_sampled]
    
    InitAttempted --> CheckCollision{collided_before_goal?}
    CheckCollision -->|Yes| IncrementBad[goals_attempted += 1<br/>Bad Attempt]
    CheckCollision -->|No| CheckTime{time_since_last_goal<br/>>= 1.0s?}
    
    CheckTime -->|No| SkipAnalysis[Skip heuristic analysis]
    CheckTime -->|Yes| IncrementAttempt[goals_attempted += 1]
    
    IncrementAttempt --> CheckProgress{displacement > threshold<br/>AND avg_speed > 2.0 m/s?}
    CheckProgress -->|Yes| PartialCredit[score += 0.5<br/>goals_reached += 1]
    CheckProgress -->|No| NoCredit[No partial credit]
    
    IncrementBad --> AddLog
    SkipAnalysis --> AddLog
    PartialCredit --> AddLog
    NoCredit --> AddLog
    SetSampled --> AddLog
    
    AddLog[add_log: Compute final score] --> CalcScore{goal_behavior?}
    CalcScore -->|GOAL_RESPAWN/STOP| SimpleScore[score = 1 if reached<br/>without collision]
    CalcScore -->|GOAL_GENERATE_NEW| AvgScore[score = total_score /<br/>goals_attempted]
    
    SimpleScore --> End[Aggregate metrics]
    AvgScore --> End
Loading

Last reviewed commit: b80b8c1

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 files reviewed, 3 comments

Edit Code Review Agent Settings | Greptile

int agent_idx = env->active_agent_indices[i];
Agent *agent = &env->agents[agent_idx];

env->log.goals_reached_this_episode += agent->goals_reached_this_episode;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

division by zero when goals_attempted_this_episode is 0 (e.g., very short episodes in GOAL_GENERATE_NEW mode where agent hasn't collided and time < 1s)

Suggested change
env->log.goals_reached_this_episode += agent->goals_reached_this_episode;
float frac_goal_reached = agent->goals_attempted_this_episode > 0 ? agent->goals_reached_this_episode / agent->goals_attempted_this_episode : 0.0f;


agent->goals_attempted_this_episode += 1; // Count as attempt if agent had enough time
float displacement_from_last_goal =
sqrtf((agent->sim_x - agent->prev_goal_x) * (agent->sim_x - agent->prev_goal_x) +
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

incrementing goals_reached_this_episode when goal wasn't actually reached is semantically confusing - consider renaming variable to better reflect it includes partial credit

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@greptile-apps
Copy link

greptile-apps bot commented Feb 27, 2026

Additional Comments (1)

pufferlib/ocean/drive/drive.h
division by zero when goals_attempted_this_episode is 0

            score = agent->goals_attempted_this_episode > 0 ? env->logs[i].score / agent->goals_attempted_this_episode : 0.0f;

mpragnay and others added 2 commits February 27, 2026 12:40
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants