fix: pin pandas version in docker image #202
Pull request overview
This PR pins the pandas version in the Docker image build and tweaks data table construction to use dtype-appropriate default values for newly added columns, improving consistency during scoring/analysis.
Changes:
- Pin `pandas==2.2.3` in the Docker image dependency install step.
- Use boolean/float placeholder columns in `prepare_data_table` for `is_top_peak` and `classifier_score`.
- Reformat `pyprophet/glyco/report.py` (mostly style/formatting changes).
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `pyprophet/scoring/data_handling.py` | Uses typed placeholder columns (`False`/`0.0`) for new dataframe fields to keep dtypes consistent. |
| `pyprophet/glyco/report.py` | Large reformat (quotes/wrapping); no intentional functional changes apparent, but includes a small typo in a colormap name. |
| `Dockerfile` | Pins pandas to a specific version during image build. |

    - cmap.name + '_transperent',
    - y, cmap.N
    - )
    + return LinearSegmentedColormap.from_list(cmap.name + "_transperent", y, cmap.N)

The colormap name suffix is spelled "_transperent"; this looks like a typo and can make debugging/plot inspection harder. Consider renaming it to "_transparent" (and update any downstream references if they exist).
Suggested change:

    - return LinearSegmentedColormap.from_list(cmap.name + "_transperent", y, cmap.N)
    + return LinearSegmentedColormap.from_list(cmap.name + "_transparent", y, cmap.N)

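For context, the pattern around this call can be sketched as below. This is a minimal illustration, not the code in `pyprophet/glyco/report.py`: the helper name `transparent_cmap`, the `alpha_max` parameter, and the linear alpha ramp are all assumptions, since the diff does not show how `y` is built. Only the `LinearSegmentedColormap.from_list(name, colors, N)` call shape comes from the reviewed line.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend, as in the reviewed file

import numpy as np
from matplotlib import colormaps
from matplotlib.colors import LinearSegmentedColormap


def transparent_cmap(cmap, alpha_max=1.0):
    """Hypothetical helper: copy `cmap` with alpha ramping from 0 to `alpha_max`."""
    # Integer inputs index the colormap's lookup table directly,
    # giving an (N, 4) array of RGBA rows.
    colors = cmap(np.arange(cmap.N))
    # Assumption: a linear alpha ramp stands in for however `y` is computed.
    colors[:, 3] = np.linspace(0.0, alpha_max, cmap.N)
    # The corrected suffix makes the derived map easy to recognise by name.
    return LinearSegmentedColormap.from_list(
        cmap.name + "_transparent", colors, cmap.N
    )


cm = transparent_cmap(colormaps["viridis"])
```

With the corrected suffix, searching logs or figure metadata for "transparent" finds the derived colormap, which is the debugging benefit the comment refers to.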

      try:
          import matplotlib
    -     matplotlib.use('Agg')
    +     matplotlib.use("Agg")
          from matplotlib.backends.backend_pdf import PdfPages
          import matplotlib.pyplot as plt
          import mpl_toolkits.mplot3d as plt3d

This PR is described as focusing on pandas pinning and small dtype fixes, but this file also includes a large, mostly formatting-only refactor (quote style, line wrapping, etc.). To reduce review risk and avoid noisy diffs, please either (a) split these formatting changes into a separate PR or (b) explicitly call out the reformat in the PR description.
This pull request makes minor improvements to the data preparation logic in `pyprophet/scoring/data_handling.py` and updates the Python dependencies in the `Dockerfile`. The main focus is on ensuring the correct data types are used for newly created columns, which improves consistency and prevents potential issues during data processing.

Dependency update:
- Added `pandas==2.2.3` to the list of Python dependencies in the `Dockerfile` to ensure the required version is installed.

Data type consistency improvements in `prepare_data_table`:
- Replaced `empty_col` (a list of zeros) with more type-appropriate placeholders: `empty_bool_col` (list of `False` for boolean columns) and `empty_float_col` (list of `0.0` for float columns) for new columns in the data table.
- Updated `is_top_peak` to use `empty_bool_col` for boolean consistency.
- Updated `classifier_score` to use `empty_float_col` for float consistency.
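The effect of the typed placeholders can be sketched in isolation. The snippet below is an illustration under assumptions, not the body of `prepare_data_table`: the helper `add_placeholder_columns` and the `tg_id` column are hypothetical, and only the placeholder names `empty_bool_col` and `empty_float_col` and the target columns `is_top_peak` and `classifier_score` come from the description above.

```python
import pandas as pd


def add_placeholder_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: add new columns with dtype-appropriate defaults."""
    n = len(df)
    empty_bool_col = [False] * n   # infers a bool column
    empty_float_col = [0.0] * n    # infers a float64 column
    out = df.copy()                # leave the caller's frame untouched
    out["is_top_peak"] = empty_bool_col
    out["classifier_score"] = empty_float_col
    return out


# `tg_id` is an illustrative column name, not taken from the PR.
df = pd.DataFrame({"tg_id": [1, 1, 2]})
typed = add_placeholder_columns(df)

# For comparison: a single list of integer zeros for both columns
# makes pandas infer an integer dtype for each of them.
untyped = df.assign(
    is_top_peak=[0] * len(df),
    classifier_score=[0] * len(df),
)

print(typed.dtypes)
print(untyped.dtypes)
```

Starting from the intended dtype means later assignments of real booleans or scores do not have to change the column's type, which is the consistency the description refers to.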