Question about NDPI

Benjamin Gilbert bgilbert+openslide at cs.cmu.edu
Sat May 11 16:06:37 EDT 2024


On Sat, May 11, 2024 at 2:15 PM Martin Weihrauch
<m.weihrauch at smartinmedia.com> wrote:
> If I see it correctly, it (ab)uses the TIFF format and stores a single JPEG in each zoom level, (ab)using the JPEG rules, e.g. by exceeding the 65,535-pixel dimension limit, etc.

Right.

> My question: how is it possible to quickly extract a tile from the large JPEG? Does it internally have multiple frames (the stripes)? If so, how is it possible to locate the internal MCUs/8x8 blocks without reading the entire stripe, or does each stripe have to be "parsed" completely at least once? Is there something like a directory involved?

An NDPI tile is actually a sequence of MCUs between two JPEG restart
markers.  Restart markers (RST0 through RST7, bytes 0xFFD0 to 0xFFD7)
are a JPEG feature that isn't normally used much; they allow the
decoder to recover from data corruption.  Restart markers can be
searched for without decoding the image data, because any 0xFF byte
within the entropy-coded data is followed by a stuffed 0x00.  They
also reset the state of the encoder/decoder when encountered, so it's
possible to start decoding at any restart marker without knowledge of
previous MCUs.  OpenSlide reads a tile by concatenating the JPEG
header with the tile's MCUs, fixing up the trailer marker and the
header width/height fields as necessary, and passing the result to the
JPEG decoder.
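
To make the tile assembly concrete, here is a minimal Python sketch of
the idea.  It is not OpenSlide's actual code (OpenSlide is written in
C); the function names, the assumption of a baseline SOF0 frame
header, and the choice to drop a trailing restart marker and append
EOI are all illustrative assumptions.

```python
import struct

EOI = b"\xff\xd9"  # JPEG end-of-image marker


def patch_sof_dimensions(header: bytes, width: int, height: int) -> bytes:
    """Return a copy of a JPEG header with the SOF0 height/width replaced.

    Assumes `header` starts with SOI and contains a baseline SOF0
    (0xFFC0) segment; both are assumptions of this sketch.
    """
    buf = bytearray(header)
    i = 2  # skip the two-byte SOI marker
    while i + 4 <= len(buf):
        if buf[i] != 0xFF:
            raise ValueError("expected a marker at offset %d" % i)
        marker = buf[i + 1]
        (seg_len,) = struct.unpack_from(">H", buf, i + 2)
        if marker == 0xC0:
            # SOF0 layout: marker(2) length(2) precision(1) height(2) width(2)
            struct.pack_into(">HH", buf, i + 5, height, width)
            return bytes(buf)
        i += 2 + seg_len  # the length field counts itself, not the marker
    raise ValueError("no SOF0 segment found")


def build_tile_jpeg(header: bytes, jpeg: bytes, start: int, end: int,
                    tile_w: int, tile_h: int) -> bytes:
    """Assemble a standalone JPEG for the MCUs in jpeg[start:end].

    `header` is everything up to and including the SOS header, and
    start/end are byte offsets of the tile's entropy-coded data.
    """
    body = jpeg[start:end]
    # Drop a trailing restart marker, if present, so EOI can follow.
    if len(body) >= 2 and body[-2] == 0xFF and 0xD0 <= body[-1] <= 0xD7:
        body = body[:-2]
    return patch_sof_dimensions(header, tile_w, tile_h) + body + EOI
```

The resulting bytes could then be handed to any ordinary JPEG decoder,
since they describe a small, self-contained image.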

Restart markers are placed after every N MCUs (the restart interval),
and NDPI TIFF tag 65426 lists the byte offset (within the JPEG) of
each MCU that immediately follows a restart marker.  OpenSlide can
also scan for restart markers if that tag is missing.
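
As a rough illustration of the fallback scan, the Python sketch below
walks the entropy-coded data looking for restart markers.  It is only
a sketch, not OpenSlide's implementation: the function name and the
`scan_start` parameter (the offset of the first byte after the SOS
header) are assumptions, and recording `scan_start` as the first
offset is a convenience of this sketch.

```python
def find_restart_offsets(jpeg: bytes, scan_start: int) -> list[int]:
    """Return byte offsets of the data following each restart marker.

    The first entry is scan_start itself, so every entry marks the
    beginning of an independently decodable run of MCUs.
    """
    offsets = [scan_start]
    i = scan_start
    while True:
        i = jpeg.find(b"\xff", i)
        if i < 0 or i + 1 >= len(jpeg):
            break  # no more 0xFF bytes, or data is truncated
        marker = jpeg[i + 1]
        if 0xD0 <= marker <= 0xD7:
            # RST0..RST7: the next MCU starts right after the marker
            offsets.append(i + 2)
            i += 2
        elif marker == 0xD9:
            break  # EOI: end of the image data
        else:
            # 0xFF00 is a stuffed data byte and 0xFFFF is fill; keep going
            i += 1
    return offsets
```

No Huffman decoding is needed here, because a 0xFF byte inside the
compressed data is always followed by 0x00, so 0xFF followed by
0xD0 to 0xD7 can only be a restart marker.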

Best,
--Benjamin Gilbert

