Skip to content

fix: osw export bug and memory view bug#200

Merged
singjc merged 4 commits intoPyProphet:masterfrom
singjc:master
Mar 26, 2026
Merged

fix: osw export bug and memory view bug#200
singjc merged 4 commits intoPyProphet:masterfrom
singjc:master

Conversation

@singjc
Copy link
Contributor

@singjc singjc commented Mar 26, 2026

This pull request introduces improvements to logging and data handling in both the pyprophet/io/export/osw.py and pyprophet/scoring/data_handling.py modules. The main changes include more descriptive log messages, enhanced debug output for data inspection, and a new utility function to ensure numpy arrays are writable and C-contiguous for compatibility with Cython routines. This utility function is now used in ranking operations to prevent issues with read-only or non-contiguous arrays, and it is covered by new tests.

Logging and Debugging Improvements:

  • Replaced references to "Parquet file" with "OSW file" in log messages for clarity, and added debug output for the number of rows read from OSW files in _read_sqlite and _read_standard_data. Additional trace logs were added after merging alignment features to help with debugging data processing steps. (pyprophet/io/export/osw.py) [1] [2] [3]

Data Handling Enhancements:

  • Introduced _to_writable_c_array, a utility function to ensure numpy arrays passed to Cython routines are writable and C-contiguous, preventing potential issues with read-only or misaligned arrays. (pyprophet/scoring/data_handling.py)
  • Updated ranking-related methods (rank_by, add_peak_group_rank) to use _to_writable_c_array for preparing input arrays, improving robustness and compatibility with Cython. (pyprophet/scoring/data_handling.py) [1] [2]

Testing:

  • Added a new test to verify that _to_writable_c_array correctly handles read-only input arrays, ensuring the output is writable and C-contiguous. (tests/test_data_handling.py) [1] [2]

singjc added 4 commits March 8, 2026 11:00
…ing methods

- Introduced `_to_writable_c_array` to create writable, C-contiguous numpy arrays for Cython memoryviews.
- Updated `Experiment` class methods to use the new utility function for handling `tg_num_id` and score arrays.
- Enhanced tests to verify the functionality of `_to_writable_c_array` with read-only input.
Copilot AI review requested due to automatic review settings March 26, 2026 03:00
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves robustness around Cython-backed ranking operations by ensuring arrays passed into Cython memoryviews are writable and C-contiguous, and it adds additional logging/trace output for OSW (SQLite) export reads to aid debugging.

Changes:

  • Added _to_writable_c_array() helper and applied it to ranking paths that call Cython routines (find_top_ranked, rank).
  • Added a unit test validating _to_writable_c_array() behavior for read-only arrays.
  • Refined OSW reader log messages and added extra trace/debug statements for row/column visibility; regenerated the Cython-generated _optimized.c.

Reviewed changes

Copilot reviewed 3 out of 4 changed files in this pull request and generated 2 comments.

File Description
pyprophet/scoring/data_handling.py Adds _to_writable_c_array() and uses it to precondition arrays for Cython ranking calls.
tests/test_data_handling.py Adds coverage for _to_writable_c_array() with read-only inputs.
pyprophet/io/export/osw.py Updates OSW reader log wording and adds row/column trace logs to aid debugging.
pyprophet/scoring/_optimized.c Large regenerated Cython output (header indicates Cython 3.2.4).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +80 to +83
def _to_writable_c_array(values, dtype):
"""Return a writable, C-contiguous numpy array for Cython memoryviews."""
arr = np.asarray(values, dtype=dtype)
return np.require(arr, dtype=dtype, requirements=["C", "W"])
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider adding a companion test for non-contiguous inputs (e.g., a sliced array with a stride or a Fortran-ordered array) to confirm _to_writable_c_array() reliably returns a C-contiguous writable array in addition to the read-only case. This function is now a key precondition for Cython memoryview calls, and non-contiguity was one of the reported failure modes.

Copilot uses AI. Check for mistakes.
Comment on lines +1 to 4
/* Generated by Cython 3.2.4 */

/* BEGIN: Cython Metadata
{
Copy link

Copilot AI Mar 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_optimized.c appears to have been regenerated with a newer Cython (header now says 3.2.4). The repo’s pinned dependencies include cython==3.1.1 (requirements.txt), so regenerating with a different version can cause persistent diffs and confusion about the intended toolchain. Suggest either regenerating this file with the pinned Cython version or updating the pin/tooling so the generated C matches the expected build environment.

Copilot uses AI. Check for mistakes.
@singjc singjc enabled auto-merge March 26, 2026 03:29
@singjc singjc disabled auto-merge March 26, 2026 03:30
@singjc singjc merged commit a0af440 into PyProphet:master Mar 26, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants