We are dealing with a long but small volumetric data set (~50k-100k time points, ~200 KB per time point).
Currently, saving this `images` object using `tobinary()` produces a large number of small files, which is slow to read, write, and especially delete with our storage back-end.
I suggest adding a parameter that would group `n` time points together, in order to reduce the number of files written to disk.
A few points to think about are:
- What grouping to use: would a list work, or do we need to stack the `n`-dimensional data along an `n+1`-th dimension?
- What would the equivalent `series` implementation be: a grouping factor for each axis?
- How to retrieve the original `images` or `series` object: change the `conf.json` file to include this parameter, add a similar parameter to `frombinary()`, or maybe both.
@freeman-lab, @jwittenbach, I would like to hear what you think before I start playing around with this.