Add support for DICOM (aka Supp 145) #157

Mathieu Malaterre mathieu.malaterre at gmail.com
Fri Oct 7 02:14:57 EDT 2016


Hi,

On Fri, Oct 7, 2016 at 5:58 AM, Benjamin Gilbert via openslide-users
<openslide-users at lists.andrew.cmu.edu> wrote:
> On Mon, Oct 03, 2016 at 02:18:14PM +0200, Mathieu Malaterre via openslide-users wrote:
>> Case 2: A single file contains a single JPEG stream
>>
>> I did not have any dataset, so I used GDCM to split the above dataset
>> into individual files. In this case the header is 104K (x 4824 files).
>> This is nasty mostly because the ICC profile is repeated in every
>> single file (that may explain why no vendor chose to implement this
>> option).
>
> As I understand the spec, all tiles from a given level, Z-plane, and
> wavelength must be stored in a single image object, and thus in a single
> file.  (I think that's also what David is saying?)  Am I misunderstanding
> the spec, or is there another reason that there could be a large number of
> files?

There is no requirement for the tiles of a given level to be
stored in a single file. Technically I also missed that; Alvaro did
point out the missing functionality in his original post:

https://lists.andrew.cmu.edu/pipermail/openslide-users/2015-March/001029.html

>> So even in Case 2, this represents ~512 MB in memory. Does that
>> correspond to other slide formats?
>
> It's significantly more memory than other formats.  If we ship our own
> parser, that shouldn't really be an issue: we'd free the DICOM parse trees
> before the end of openslide_open(), and we'd probably only have one file's
> parse tree in RAM at a time.

Fair enough.
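
For reference, the ~512 MB figure follows directly from the numbers quoted
above. A rough check, assuming "104K" means 104 KiB per file header (the
function and constant names below are only illustrative):

```python
# Rough check of the header overhead when every tile is its own file.
# Assumption: "104K" is read as 104 KiB; with 104 * 1000 bytes the total
# is slightly smaller, but still about half a gigabyte.
HEADER_BYTES = 104 * 1024
FILE_COUNT = 4824


def total_header_bytes(header_bytes=HEADER_BYTES, file_count=FILE_COUNT):
    """Return the memory needed to hold every file's header at once."""
    return header_bytes * file_count


if __name__ == "__main__":
    total = total_header_bytes()
    print(f"{total / 1e6:.0f} MB ({total / 2**20:.0f} MiB)")
```

That total only applies if all the headers are kept at the same time;
with one parse tree in RAM at a time, the peak is a single ~104K header.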

>> All DICOM toolkits read the entire file, using their own memory buffer...
>
> To be clear: you're saying that GDCM and DCMTK need to load all of the image
> *data* into RAM, not just the metadata?

Correct. Keep in mind that DICOM really is just like XML: you need a
properly defined DTD and a specifically tailored parser if you want to
do anything smart.
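
To illustrate what "specifically tailored" could mean here, below is a
minimal sketch of a reader that walks the data elements and stops at the
Pixel Data tag (7FE0,0010) without reading any pixel bytes. This is not
GDCM or DCMTK code. It assumes Explicit VR Little Endian throughout and
defined lengths only, and read_header_elements is a made-up name. Real
Supp 145 files typically contain undefined-length sequences, which this
sketch does not handle.

```python
import struct

# VRs that, in Explicit VR encoding, have 2 reserved bytes followed by a
# 4-byte length (DICOM PS3.5); all other VRs use a 2-byte length.
LONG_VRS = {b"OB", b"OD", b"OF", b"OL", b"OV", b"OW", b"SQ",
            b"SV", b"UC", b"UN", b"UR", b"UT", b"UV"}
PIXEL_DATA = (0x7FE0, 0x0010)
UNDEFINED_LENGTH = 0xFFFFFFFF


def read_header_elements(stream):
    """Yield (tag, vr, value) for each element before Pixel Data.

    Hypothetical helper: assumes Explicit VR Little Endian and defined
    lengths. The pixel bytes themselves are never read.
    """
    stream.seek(128)  # skip the 128-byte preamble
    if stream.read(4) != b"DICM":
        raise ValueError("not a DICOM Part 10 file")
    while True:
        head = stream.read(8)
        if len(head) < 8:
            return  # end of file without Pixel Data
        group, element, vr, short_len = struct.unpack("<HH2sH", head)
        if vr in LONG_VRS:
            # short_len was the reserved field; the real length follows.
            (length,) = struct.unpack("<I", stream.read(4))
        else:
            length = short_len
        if (group, element) == PIXEL_DATA:
            return  # stop here: the stream now sits at the pixel payload
        if length == UNDEFINED_LENGTH:
            raise NotImplementedError("undefined length not handled here")
        yield (group, element), vr.decode("ascii"), stream.read(length)
```

Used on an open file object (binary mode), this reads only the header
elements and leaves the stream positioned at the start of the pixel data.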

>> I assumed a DICOMDIR would be available but if you tell me how to handle
>> the other case in the openslide framework, I can adapt the code.
>
> For the multiple-file case, we can't go hunting around the filesystem for
> "related files" so we'll need a DICOMDIR.  If it's likely that users will
> want to open individual Supp 145 files, though, we should be able to handle
> those directly.

OK.

I'll return to the review page and address your remaining comments.
Note that we are close to the Debian freeze, so it's my turn to
be dragged onto other shiny things.

-M

