LEICA BigTIFF

Adam Goode adam at spicenitz.org
Sat Jul 7 14:13:51 EDT 2012


http://0pointer.de/blog/projects/beware-of-xmlCleanupParser.html

On Sat, Jul 7, 2012 at 12:14 PM, Agelos Pappas <agelos at smartcode.gr> wrote:
> Hi again Benjamin,
>
> I wrote a function that uses libxml2 to parse the description of LEICA
> files, in the new vendor file I'm writing. I use the view tag's size to
> decide which image tag is the thumbnail and which is the main image. Here is
> an extract from a LEICA file description:
>
> <scn ...>
>   <collection sizeX="26564529" sizeY="76734666">
>     <barcode>Q2FzZSAxLEhFUjIsLCw=</barcode>
>     <image>
>
>       <creationDate>2011-10-18T09:20:15.067Z</creationDate>
>       <device model="Leica SCN400;Leica SCN" version="1.2.5.8691 2010/07/15 06:56:41;1.4.0.9708" />
>       <pixels sizeX="1616" sizeY="4668">
>         <dimension sizeX="1616" sizeY="4668" r="0" ifd="0" />
>         <dimension sizeX="404" sizeY="1167" r="1" ifd="1" />
>         <dimension sizeX="101" sizeY="291" r="2" ifd="2" />
>       </pixels>
>       <view sizeX="26564529" sizeY="76734666" offsetX="0" offsetY="0" spacingZ="0" />
>         .....
>     </image>
>     <image>
>
>       <creationDate>2011-10-18T09:23:28.277Z</creationDate>
>       <device model="Leica SCN400;Leica SCN" version="1.2.5.8691 2010/07/15 06:56:41;1.4.0.9708" />
>       <pixels sizeX="22112" sizeY="13696">
>         <dimension sizeX="22112" sizeY="13696" r="0" ifd="3" />
>         <dimension sizeX="5528" sizeY="3424" r="1" ifd="4" />
>         <dimension sizeX="1382" sizeY="856" r="2" ifd="5" />
>         <dimension sizeX="346" sizeY="214" r="3" ifd="6" />
>       </pixels>
>       <view sizeX="11056000" sizeY="6848000" offsetX="7359742" offsetY="27020003" spacingZ="400" />
>         .....
>     </image>
>   </collection>
> </scn>
>
> As you can see, one of the two image tags, the thumbnail, has the same
> dimensions as the collection. The code will fail if neither of the two
> tags meets this condition.
> I'm adding the lowest resolution of the thumbnail as an associated image. I
> thought it would be quicker to not use the largest one.
>
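The size-matching rule described above can be sketched in a few lines. This is only an illustration in Python with the standard xml.etree module, not the C/libxml2 code from the vendor file. The sample XML is trimmed from the excerpt above, the helper names (size, find_thumbnail, lowest_resolution_ifd) are made up for the example, and it assumes the elements carry no XML namespace, which may not hold for real files.

```python
import xml.etree.ElementTree as ET

# Trimmed sample modelled on the excerpt above (assumption: no namespace).
SAMPLE = """\
<scn>
  <collection sizeX="26564529" sizeY="76734666">
    <image>
      <pixels sizeX="1616" sizeY="4668">
        <dimension sizeX="1616" sizeY="4668" r="0" ifd="0" />
        <dimension sizeX="404" sizeY="1167" r="1" ifd="1" />
        <dimension sizeX="101" sizeY="291" r="2" ifd="2" />
      </pixels>
      <view sizeX="26564529" sizeY="76734666" offsetX="0" offsetY="0" />
    </image>
    <image>
      <pixels sizeX="22112" sizeY="13696">
        <dimension sizeX="22112" sizeY="13696" r="0" ifd="3" />
        <dimension sizeX="346" sizeY="214" r="3" ifd="6" />
      </pixels>
      <view sizeX="11056000" sizeY="6848000" offsetX="7359742" offsetY="27020003" />
    </image>
  </collection>
</scn>
"""


def size(elem):
    """Return the (sizeX, sizeY) attributes of an element as integers."""
    return int(elem.get("sizeX")), int(elem.get("sizeY"))


def find_thumbnail(xml_text):
    """Return the single <image> whose <view> size equals the collection size."""
    root = ET.fromstring(xml_text)
    collection = root.find("collection")
    if collection is None:
        raise ValueError("no <collection> element")
    matches = []
    for image in collection.findall("image"):
        view = image.find("view")
        if view is not None and size(view) == size(collection):
            matches.append(image)
    # Fail rather than guess when zero or several images match.
    if len(matches) != 1:
        raise ValueError("expected exactly one thumbnail, found %d" % len(matches))
    return matches[0]


def lowest_resolution_ifd(image):
    """Return the IFD number of the smallest <dimension> (by pixel area)."""
    dims = image.findall("pixels/dimension")
    if not dims:
        raise ValueError("image has no <dimension> elements")
    smallest = min(dims, key=lambda d: size(d)[0] * size(d)[1])
    return int(smallest.get("ifd"))


thumbnail = find_thumbnail(SAMPLE)
print(lowest_resolution_ifd(thumbnail))
```

Picking the smallest dimension by pixel area rather than by the r attribute is a choice made for the sketch; the thread does not say which the vendor file uses.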
> There is a big issue with libxml2 however, which I found out the hard way.
> There's a function called xmlCleanupParser. This function should be called
> once we are done using the library, so that it cleans up memory used by the
> library itself. Initially I called the function right after I finished
> parsing the file's description, pretty sure that I was doing the right
> thing. The problem though, is that I use OpenSlide as a dll in a
> multi-threaded application where parallel calls to the library take place. I
> noticed that I was getting segmentation faults since I added the parsing
> functionality. I commented it out, rebuilt and everything went back to
> normal. Then I read the function's documentation:
>
> http://xmlsoft.org/html/libxml-parser.html#xmlCleanupParser
>
> This function name is somewhat misleading. It does not clean up parser
> state, it cleans up memory allocated by the library itself. It is a cleanup
> function for the XML library. It tries to reclaim all related global memory
> allocated for the library processing. It doesn't deallocate any document
> related memory. One should call xmlCleanupParser() only when the process has
> finished using the library and all XML/HTML documents built with it. See
> also xmlInitParser() which has the opposite function of preparing the
> library for operations. WARNING: if your application is multithreaded or has
> plugin support calling this may crash the application if another thread or a
> plugin is still using libxml2. It's sometimes very hard to guess if libxml2
> is in use in the application, some libraries or plugins may use it without
> notice. In case of doubt abstain from calling this function or do it just
> before calling exit() to avoid leak reports from valgrind !
>
> Does this mean that whenever you call the library's functions, new memory is
> allocated? Or does initialization take place only once?
> Let me know if you want me to send you the Leica vendor file I've written.
>
> Regards
> Agelos
>
>
>
> On 6/7/2012 1:32 AM, Benjamin Gilbert wrote:
>
> On 07/05/2012 05:30 PM, Agelos Pappas wrote:
>
> My question here is: Would it be safe / acceptable to assume that the
> first tag always refers to the thumbnail IFDs and the second to the
> main image?
>
>
> I'd rather not if we can avoid it.  It looks as though there are a few
> things you could use to distinguish the images:
>
> 1a. The <view> coordinates for the slide image describe a rectangle inside
> the view for the thumbnail.
>
> 1b. The <view> for the thumbnail has its origin at (0, 0).
>
> 2. The <pixels> width and height are larger for the slide.
>
> 3. The <objective> is lower for the thumbnail.
>
>
> It's a judgment call.  #3 is probably too obscure.  The paranoid approach
> would be to check all of #1a, #1b, and #2, and fail the open if we find any
> inconsistencies.  (OpenSlide should always fail the open when it gets
> confused, because that way we'll get a bug report that will help us better
> understand the format.)
>
> Also, a slide could contain more than two images, e.g. two completely
> separate slide pyramids from different regions of the slide.  Your sanity
> checks should catch this.
>
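The "paranoid" combination of checks #1a, #1b and #2 could look roughly like the following. It is a Python sketch using the standard xml.etree module, not OpenSlide code. The function names (rect, classify_images) are invented for the example, the sample is trimmed from the XML quoted earlier in the thread, and the sketch simply rejects anything other than exactly two images, which is stricter than a real reader might want.

```python
import xml.etree.ElementTree as ET

# Trimmed sample based on the XML quoted earlier (assumption: no namespace).
SAMPLE = """\
<scn>
  <collection sizeX="26564529" sizeY="76734666">
    <image>
      <pixels sizeX="1616" sizeY="4668" />
      <view sizeX="26564529" sizeY="76734666" offsetX="0" offsetY="0" />
    </image>
    <image>
      <pixels sizeX="22112" sizeY="13696" />
      <view sizeX="11056000" sizeY="6848000" offsetX="7359742" offsetY="27020003" />
    </image>
  </collection>
</scn>
"""


def rect(view):
    """Return a <view> as (left, top, right, bottom) in its own units."""
    x, y = int(view.get("offsetX")), int(view.get("offsetY"))
    w, h = int(view.get("sizeX")), int(view.get("sizeY"))
    return x, y, x + w, y + h


def classify_images(xml_text):
    """Return (thumbnail, slide) or raise ValueError on any inconsistency."""
    images = ET.fromstring(xml_text).findall("collection/image")
    if len(images) != 2:
        raise ValueError("expected 2 images, found %d" % len(images))
    for image in images:
        if image.find("view") is None or image.find("pixels") is None:
            raise ValueError("image lacks <view> or <pixels>")

    # Check 1b: exactly one view has its origin at (0, 0); call it the thumbnail.
    at_origin = [i for i in images if rect(i.find("view"))[:2] == (0, 0)]
    if len(at_origin) != 1:
        raise ValueError("cannot identify thumbnail by view origin")
    thumb = at_origin[0]
    slide = images[1] if images[0] is thumb else images[0]

    # Check 1a: the slide view lies inside the thumbnail view.
    tl, tt, tr, tb = rect(thumb.find("view"))
    sl, st, sr, sb = rect(slide.find("view"))
    if not (tl <= sl and tt <= st and sr <= tr and sb <= tb):
        raise ValueError("slide view is not inside thumbnail view")

    # Check 2: the slide has more pixels than the thumbnail in both axes.
    tp, sp = thumb.find("pixels"), slide.find("pixels")
    if not (int(sp.get("sizeX")) > int(tp.get("sizeX"))
            and int(sp.get("sizeY")) > int(tp.get("sizeY"))):
        raise ValueError("slide pixels are not larger than thumbnail pixels")
    return thumb, slide


thumb, slide = classify_images(SAMPLE)
print(thumb.find("pixels").get("sizeX"), slide.find("pixels").get("sizeX"))
```

Raising on every mismatch follows the advice above to fail the open when the file is confusing, instead of silently guessing.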
> Thanks
> --Benjamin Gilbert
>