Skip to content

Mohit/organized scripts#29

Open
m2kulkarni wants to merge 11 commits intosync-upstreamfrom
mohit/organized-scripts
Open

Mohit/organized scripts#29
m2kulkarni wants to merge 11 commits intosync-upstreamfrom
mohit/organized-scripts

Conversation

@m2kulkarni
Copy link

No description provided.

m2kulkarni and others added 10 commits February 24, 2026 01:10
- Fixed buffer overflow in init_grid_map (grid_index bounds check)
- Fixed truncated binary file detection in load_map_binary
- Fixed large ID overflow in save_map_binary for nuplan
- Changed VLA to heap allocation in init_grid_map
- Updated drive.ini settings

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@greptile-apps
Copy link

greptile-apps bot commented Feb 27, 2026

Greptile Summary

This PR reorganizes training scripts and unifies configuration terminology across the codebase.

Key Changes

  • Script Organization: Created structured directories (adaptive/, baselines/, coplayers/) with separate scripts for WOMD/NuPlan datasets and Recurrent/Transformer architectures (24 new scripts)
  • Terminology Unification: Replaced inconsistent bptt_horizon, context_window, and context_length parameters with unified horizon parameter throughout codebase
  • Architecture Detection: Improved from config-string-based to policy-object-based detection (checking for transformer/lstm attributes)
  • C Code Improvements: Added robust error handling in binary file reading, grid index validation, and switched from VLA to heap allocation for large maps
  • Configuration Updates: Adjusted agent counts, enabled human replay evaluation, and updated architecture naming from transformer_name to rnn_name

Issues Found

  • ID truncation in drive.py:880 uses modulo which creates collision risk for large IDs
  • Map existence check changed from map_000.bin to map_001.bin breaks datasets starting at index 0

Confidence Score: 4/5

  • This PR is generally safe to merge with two logical issues that should be addressed
  • The refactoring improves code organization and consistency significantly. However, two bugs were identified: (1) ID collision risk from modulo truncation and (2) map file check regression. The C code improvements add valuable error handling. Most changes are low-risk script organization and config updates.
  • Pay close attention to pufferlib/ocean/drive/drive.py (ID truncation and map check issues)

Important Files Changed

Filename Overview
pufferlib/config/ocean/adaptive.ini Updated config: unified horizon terminology, adjusted agent counts, changed architecture naming from transformer_name to rnn_name
pufferlib/config/ocean/drive.ini Updated config: unified horizon terminology, enabled human replay eval, adjusted agent counts
pufferlib/models.py Renamed context_length parameter to horizon throughout TransformerWrapper for consistency
pufferlib/ocean/benchmark/evaluator.py Improved architecture detection by inspecting policy object instead of config strings
pufferlib/ocean/drive/drive.h Added error handling for file reads, grid_index validation, and heap allocation for large arrays
pufferlib/ocean/drive/drive.py Changed map existence check to map_001.bin, added ID truncation for large IDs, made agent count adjustment more lenient
pufferlib/pufferl.py Major refactor: unified bptt_horizon, context_window into single horizon parameter; improved architecture detection
pufferlib/utils.py Updated eval to use rnn_name instead of policy_architecture; added render_human_replay_videos function

Last reviewed commit: be81b52

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

31 files reviewed, 2 comments

Edit Code Review Agent Settings | Greptile

Comment on lines +877 to +881
# Truncate large IDs to fit in int32 range
obj_id = obj.get("id", 0)
if isinstance(obj_id, int) and (obj_id > 2147483647 or obj_id < -2147483648):
obj_id = obj_id % 2147483647
f.write(struct.pack("i", obj_id)) # id
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ID truncation with modulo risks collisions - two large IDs (e.g., 2147483648 and 4295967295) both map to 1

Suggested change
# Truncate large IDs to fit in int32 range
obj_id = obj.get("id", 0)
if isinstance(obj_id, int) and (obj_id > 2147483647 or obj_id < -2147483648):
obj_id = obj_id % 2147483647
f.write(struct.pack("i", obj_id)) # id
# Truncate large IDs to fit in int32 range - use hash to avoid collisions
obj_id = obj.get("id", 0)
if isinstance(obj_id, int) and (obj_id > 2147483647 or obj_id < -2147483648):
obj_id = hash(obj_id) % 2147483647

Comment on lines 228 to +233

# Check if resources directory exists
binary_path = f"{map_dir}/map_000.bin"
# Check if resources directory exists (check map_001 since some datasets start at 001)
binary_path = f"{map_dir}/map_001.bin"
if not os.path.exists(binary_path):
raise FileNotFoundError(
f"Required directory {binary_path} not found. Please ensure the Drive maps are downloaded and installed correctly per docs."
f"Required file {binary_path} not found. Please ensure the Drive maps are downloaded and installed correctly per docs."
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed check from map_000.bin to map_001.bin - breaks datasets that start at map_000

Suggested change
# Check if resources directory exists
binary_path = f"{map_dir}/map_000.bin"
# Check if resources directory exists (check map_001 since some datasets start at 001)
binary_path = f"{map_dir}/map_001.bin"
if not os.path.exists(binary_path):
raise FileNotFoundError(
f"Required directory {binary_path} not found. Please ensure the Drive maps are downloaded and installed correctly per docs."
f"Required file {binary_path} not found. Please ensure the Drive maps are downloaded and installed correctly per docs."
# Check if resources directory exists (try both map_000 and map_001)
binary_path = f"{map_dir}/map_000.bin"
if not os.path.exists(binary_path):
binary_path = f"{map_dir}/map_001.bin"
if not os.path.exists(binary_path):
raise FileNotFoundError(
f"Required files not found in {map_dir}. Please ensure the Drive maps are downloaded and installed correctly per docs."
)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant