A portable OpenSlide viewer for Windows: Smart Zoom Viewer
Mathieu Malaterre
mathieu.malaterre at gmail.com
Tue Nov 17 03:10:11 EST 2015
On Tue, Nov 17, 2015 at 4:33 AM, Benjamin Gilbert via openslide-users
<openslide-users at lists.andrew.cmu.edu> wrote:
> On 2015-11-12 12:49, John Cupitt via openslide-users wrote:
>>
>> It's just deepzoom in an uncompressed zip container. The idea is that
>> deepzoom is great, because it can be served extremely quickly, but
>> also very annoying, since each 256x256 tile is a separate JPEG file.
>> If your slide is 100,000 x 100,000 pixels, your highest-resolution
>> directory will contain 150,000 files.
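To make the 150,000 figure concrete, here is a minimal sketch of the tile arithmetic. It assumes 256x256 tiles with no overlap and a pyramid that halves each dimension (rounding up) until it reaches 1x1; the function name `deepzoom_tile_counts` is made up for illustration and is not part of any library.

```python
import math


def deepzoom_tile_counts(width, height, tile_size=256):
    """Return the number of tiles per level, full resolution first.

    Assumes zero overlap and that each level halves the previous one,
    rounding up, until the image is 1x1.
    """
    counts = []
    w, h = width, height
    while True:
        counts.append(math.ceil(w / tile_size) * math.ceil(h / tile_size))
        if w == 1 and h == 1:
            break
        w, h = max(1, math.ceil(w / 2)), max(1, math.ceil(h / 2))
    return counts


counts = deepzoom_tile_counts(100_000, 100_000)
print(counts[0])    # tiles in the highest-resolution level: 152881
print(len(counts))  # number of pyramid levels
print(sum(counts))  # tiles in the whole pyramid
```

A 100,000 x 100,000 slide needs 391 x 391 = 152,881 tiles at full resolution, which matches the "150,000 files" estimate above.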
>
>
> So SZI is intended to hold a single pyramidal image with no metadata, rather
> than an entire slide?
>
>> This isn't so bad on Linux hosts, but Windows really struggles with
>> large directories. Very large numbers of small files can also be
>> rather inefficient in disk usage: many filesystems allocate space in
>> 4 KB blocks, so a 1 KB JPEG wastes 3 KB per tile.
>> Huge directory trees are also rather slow to copy about between hosts,
>> especially on Windows.
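The wasted-space estimate can be sketched the same way. This assumes a filesystem that allocates whole 4096-byte blocks per file and ignores inline-data or tail-packing optimizations that some filesystems provide; `slack_bytes` is a hypothetical helper, and the uniform 1 KB tile size is only the illustrative case from the text, not a measurement.

```python
def slack_bytes(file_size, block_size=4096):
    """Bytes allocated but unused when a file occupies whole blocks."""
    if file_size == 0:
        return 0
    blocks = -(-file_size // block_size)  # ceiling division
    return blocks * block_size - file_size


# Illustrative case from the text: a 1 KB JPEG in a 4 KB block.
print(slack_bytes(1024))  # 3072

# Hypothetical worst case: every tile of a 391 x 391 level is 1 KB.
tiles = 391 * 391
print(tiles * slack_bytes(1024) / 2**20)  # wasted space in MiB
```

Real tiles vary in size, so the actual waste depends on the distribution of tile sizes modulo the block size.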
>
>
> What advantages does this format have over pyramidal tiled TIFF? I'm seeing
> a couple disadvantages:
>
> - File size. I used VIPS d88304a2 to make an SZI file and a TIFF file like
> this:
>
> vips dzsave "CMU-3-40x - 2010-01-12 13.57.09.vms" cmu-3.szi --suffix .jpeg[Q=80] --overlap=0
> vips extract_band "CMU-3-40x - 2010-01-12 13.57.09.vms" cmu-3.tiff[tile,pyramid,tile-width=256,tile-height=256,compression=jpeg,bigtiff,Q=80] 0 --n 3
>
> The output files break down as follows:
>
> TIFF:
>     38.0 MB - JPEG quantization tables [*]
>    827.2 MB - Other JPEG headers and data
>      4.5 MB - Remainder of file (TIFF metadata)
>    869.7 MB - Total size
>
> ZIP:
>      4.5 MB - JPEG JFIF headers [*]
>     49.9 MB - JPEG EXIF headers [*]
>     38.0 MB - JPEG quantization tables
>    950.7 MB - Other JPEG headers and data
>     18.3 MB - ZIP member filenames (two copies per member)
>     32.3 MB - Remainder of file (ZIP metadata)
>   1093.8 MB - Total size
>
> [*] entries are caused by bugs, and are not fundamental requirements of the
> formats. ZIP metadata is 50.6 MB (4.9% excluding bugs) vs. 4.5 MB (0.5%)
> for TIFF. Also, the ZIP has to include JPEG quantization tables with every
> tile, while TIFF can consolidate these into one set of tables per pyramid
> level. (Supporting that would require a little extra code in the tile
> server, but not much, I think.) This costs 38.0 MB for this sample, so in
> total the ZIP has 88.6 MB (8.5%) of overhead.
>
> - From an interoperability perspective, ZIP is not ideal. The spec is
> large, occasionally ambiguous, and has many optional features. For SZI to
> be well-defined, a profile of ZIP would need to be specified. (E.g., is
> central directory encryption allowed? Should ZIP64 be enabled conditionally
> or unconditionally?) The ZIP format also contains redundancy (between the
> local file headers and the central directory) which tends to lead to
> implementation errors. I have, on several different occasions, encountered
> interoperability problems between widely-deployed ZIP writers and readers.
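For anyone who wants to check the overhead percentages quoted above, the following sketch recomputes them from the listed component sizes. The dictionary keys are labels chosen here for illustration; the numbers are copied from the breakdown, and "excluding bugs" is taken to mean dropping the entries marked [*]. The listed ZIP components sum to 1093.7 MB rather than the stated 1093.8 MB, presumably because of rounding.

```python
# Component sizes in MB, copied from the breakdown above.
tiff = {
    "quant_tables_bug": 38.0,   # marked [*]
    "jpeg_data": 827.2,
    "container_metadata": 4.5,
}
zip_ = {
    "jfif_bug": 4.5,            # marked [*]
    "exif_bug": 49.9,           # marked [*]
    "quant_tables": 38.0,
    "jpeg_data": 950.7,
    "filenames": 18.3,
    "container_metadata": 32.3,
}


def total_without_bugs(parts):
    """Sum all components except those labelled as bug-induced."""
    return sum(v for k, v in parts.items() if not k.endswith("_bug"))


zip_meta = zip_["filenames"] + zip_["container_metadata"]
zip_meta_pct = 100 * zip_meta / total_without_bugs(zip_)
tiff_meta_pct = 100 * tiff["container_metadata"] / total_without_bugs(tiff)
zip_overhead = zip_meta + zip_["quant_tables"]
zip_overhead_pct = 100 * zip_overhead / total_without_bugs(zip_)

print(round(zip_meta, 1), round(zip_meta_pct, 1))          # 50.6 4.9
print(round(tiff_meta_pct, 1))                             # 0.5
print(round(zip_overhead, 1), round(zip_overhead_pct, 1))  # 88.6 8.5
```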
Thanks, Benjamin, for this enlightening summary!
This reminded me of https://xkcd.com/927/
-M