A portable OpenSlide viewer for Windows: Smart Zoom Viewer

Mathieu Malaterre mathieu.malaterre at gmail.com
Tue Nov 17 03:10:11 EST 2015


On Tue, Nov 17, 2015 at 4:33 AM, Benjamin Gilbert via openslide-users
<openslide-users at lists.andrew.cmu.edu> wrote:
> On 2015-11-12 12:49, John Cupitt via openslide-users wrote:
>>
>> It's just deepzoom in an uncompressed zip container. The idea is that
>> deepzoom is great, because it can be served extremely quickly, but
>> also very annoying, since each 256x256 tile is a separate JPEG file.
>> If your slide is 100,000 x 100,000 pixels, your highest-resolution
>> directory will contain 150,000 files.
>
>
> So SZI is intended to hold a single pyramidal image with no metadata, rather
> than an entire slide?
>
>> This isn't so bad on Linux hosts, but Windows really struggles with
>> large directories. Very large numbers of small files can also be
>> rather inefficient in disk usage: many filesystems will allocate a
>> separate 4 KB page for each file, so for 1 KB JPEGs, 3 KB will be wasted per
>> tile.
>> Huge directory trees are also rather slow to copy about between hosts,
>> especially on Windows.
>
>
> What advantages does this format have over pyramidal tiled TIFF?  I'm seeing
> a couple disadvantages:
>
> - File size.  I used VIPS d88304a2 to make an SZI file and a TIFF file like
> this:
>
> vips dzsave "CMU-3-40x - 2010-01-12 13.57.09.vms" cmu-3.szi --suffix .jpeg[Q=80] --overlap=0
> vips extract_band "CMU-3-40x - 2010-01-12 13.57.09.vms" cmu-3.tiff[tile,pyramid,tile-width=256,tile-height=256,compression=jpeg,bigtiff,Q=80] 0 --n 3
>
> The output files break down as follows:
>
> TIFF:
>    38.0 MB - JPEG quantization tables [*]
>   827.2 MB - Other JPEG headers and data
>     4.5 MB - Remainder of file (TIFF metadata)
>   869.7 MB - Total size
>
> ZIP:
>     4.5 MB - JPEG JFIF headers [*]
>    49.9 MB - JPEG EXIF headers [*]
>    38.0 MB - JPEG quantization tables
>   950.7 MB - Other JPEG headers and data
>    18.3 MB - ZIP member filenames (two copies per member)
>    32.3 MB - Remainder of file (ZIP metadata)
>  1093.8 MB - Total size
>
> [*] entries are caused by bugs, and are not fundamental requirements of the
> formats.  ZIP metadata is 50.6 MB (4.9% excluding bugs) vs. 4.5 MB (0.5%)
> for TIFF.  Also, the ZIP has to include JPEG quantization tables with every
> tile, while TIFF can consolidate these into one set of tables per pyramid
> level.  (Supporting that would require a little extra code in the tile
> server, but not much, I think.)  This costs 38.0 MB for this sample, so in
> total the ZIP has 88.6 MB (8.5%) of overhead.
>
> - From an interoperability perspective, ZIP is not ideal.  The spec is
> large, occasionally ambiguous, and has many optional features.  For SZI to
> be well-defined, a profile of ZIP would need to be specified.  (E.g., is
> central directory encryption allowed?  Should ZIP64 be enabled conditionally
> or unconditionally?)  The ZIP format also contains redundancy (between the
> local file headers and the central directory) which tends to lead to
> implementation errors.  I have, on several different occasions, encountered
> interoperability problems between widely-deployed ZIP writers and readers.

Thanks Benjamin for this enlightening summary!
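
For anyone who wants to re-check the arithmetic in the quoted messages, here
is a rough Python sketch. The sizes are copied from the breakdown above; the
helper names (tile_count, slack_bytes, percent) are made up for illustration,
and the 4096-byte allocation unit is only the assumption stated in John's
message, not something measured on a real filesystem.

```python
import math

def tile_count(width, height, tile=256):
    """Number of tiles in the full-resolution level (partial edge tiles count)."""
    return math.ceil(width / tile) * math.ceil(height / tile)

def slack_bytes(file_size, block=4096):
    """Bytes wasted when a file is stored in whole allocation units (assumed size)."""
    return (-file_size) % block

def percent(part, whole):
    return 100.0 * part / whole

# Sizes in MB, copied from the quoted breakdown.
tiff_total, tiff_bug, tiff_meta = 869.7, 38.0, 4.5
zip_total = 1093.8
zip_bug = 4.5 + 49.9            # JFIF + EXIF headers, marked [*]
zip_meta = 18.3 + 32.3          # member filenames + remaining ZIP metadata
zip_qtables = 38.0              # per-tile quantization tables

zip_meta_pct = percent(zip_meta, zip_total - zip_bug)
tiff_meta_pct = percent(tiff_meta, tiff_total - tiff_bug)
zip_overhead = zip_meta + zip_qtables
zip_overhead_pct = percent(zip_overhead, zip_total - zip_bug)

print(tile_count(100_000, 100_000))      # 152881, i.e. roughly 150,000 files
print(slack_bytes(1024))                 # 3072 bytes wasted per 1 KB tile
print(f"{zip_meta_pct:.1f}%")            # 4.9%
print(f"{tiff_meta_pct:.1f}%")           # 0.5%
print(f"{zip_overhead:.1f} MB, {zip_overhead_pct:.1f}%")  # 88.6 MB, 8.5%
```

The percentages are taken against each total with the [*] bug entries
subtracted, which is how the "excluding bugs" figures appear to be derived.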

This reminded me of https://xkcd.com/927/

-M

