Add per-location CSV splitting and reorganise csv directory#88
david-mears-2 merged 2 commits into main from
Conversation
Add a Node script (splitCsvsByLocation.ts) to split summary table CSVs into per-country and per-subregion files under a nested folder structure. Move source CSVs into csv/source/ to separate them from generated output. Hardcode paths in both Node scripts using import.meta.dirname, removing the need for CLI arguments. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
fcb9987 to 01c8d82
Codecov Report ✅ All modified and coverable lines are covered by tests.

@@           Coverage Diff           @@
##             main      #88   +/-   ##
=======================================
  Coverage   98.03%   98.03%
=======================================
  Files          36       36
  Lines         813      813
  Branches      230      230
=======================================
  Hits          797      797
  Misses         10       10
  Partials        6       6
M-Kusumgar
left a comment
Looks pretty good! Had a couple of small questions about the script and CSVs, but happy to approve after that!
const content = fs.readFileSync(filepath, "utf8");
const lines = content.split("\n");
const header = lines[0];
const headers = header.split(",").map((h) => h.replace(/"/g, ""));
Delimiters can differ for CSVs from different regions. Is this all static and guaranteed to use the , delimiter?
Also, I'm assuming the CSVs we are given always have a header, right?
Yes, the CSVs are static and will stop changing as soon as the paper gets published, so we can hardcode the delimiter and the headers.
Just a little 7000-liner ;) Actually, it's just one data-processing script being added.
We have an existing script for processing the data source into nice vaxviz-friendly JSONs, which calls out to a few sub-scripts. It is for updating the static data, which will stop being necessary, one imagines, after paper publication; instructions for running the script are in the README.
In this PR I add another sub-script that splits the summary table CSVs into many small chunks, chunked by location value. This enables future work on downloading location-specific data.
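The chunking-by-location step can be sketched roughly as follows (the column name `location` and the in-memory grouping shape are illustrative assumptions, not the PR's actual code, which also handles the nested country/subregion folders):

```typescript
// Sketch: group CSV data rows by the value in a "location" column.
// The column name and return shape are assumptions for illustration.
function chunkByLocation(csv: string): Map<string, string[]> {
  const lines = csv.trim().split("\n");
  const header = lines[0];
  const headers = header.split(",").map((h) => h.replace(/"/g, ""));
  const locIdx = headers.indexOf("location");
  const chunks = new Map<string, string[]>();
  for (const line of lines.slice(1)) {
    const loc = line.split(",")[locIdx].replace(/"/g, "");
    if (!chunks.has(loc)) chunks.set(loc, [header]); // each chunk keeps the header row
    chunks.get(loc)!.push(line);
  }
  return chunks;
}
```

Each map entry would then be joined with `"\n"` and written out as its own per-location CSV file.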
AI use
We said it would be helpful to demarcate which code contributions were AI-generated.
I made this change before the Easter break in a local session with Claude Code as an agent. I think CC wrote the code, but I of course had a hand in it (like when your baby cousin says 'I cooked the dinner and Mummy helped'), and I've read each line.
Tests??
I haven't got tests on these scripts because they only have one job and it's easy to check that they do it. The proof is in the pudding: did it produce the CSVs? Well, there you go then.