LEICA BigTIFF

Sat Jul 7 12:14:11 EDT 2012

Hi again Benjamin,

I wrote a function that uses libxml2 to parse the description of LEICA 
files, in the new vendor file I'm writing. I use the view tag's size to 
decide which image tag is the thumbnail and which is the main image. 
Here is an extract from a LEICA file description:

<scn ...>
   <collection *sizeX="26564529" sizeY="76734666"*>
     <barcode>Q2FzZSAxLEhFUjIsLCw=</barcode>
     <image>
       <creationDate>2011-10-18T09:20:15.067Z</creationDate>
       <device model="Leica SCN400;Leica SCN" version="1.2.5.8691 2010/07/15 06:56:41;1.4.0.9708" />
       <pixels sizeX="1616" sizeY="4668">
         <dimension sizeX="1616" sizeY="4668" r="0" ifd="0" />
         <dimension sizeX="404" sizeY="1167" r="1" ifd="1" />
         <dimension sizeX="101" sizeY="291" r="2" ifd="2" />
       </pixels>
       <view *sizeX="26564529" sizeY="76734666"* offsetX="0" offsetY="0" spacingZ="0" />
         .....
     </image>
     <image>
       <creationDate>2011-10-18T09:23:28.277Z</creationDate>
       <device model="Leica SCN400;Leica SCN" version="1.2.5.8691 2010/07/15 06:56:41;1.4.0.9708" />
       <pixels sizeX="22112" sizeY="13696">
         <dimension sizeX="22112" sizeY="13696" r="0" ifd="3" />
         <dimension sizeX="5528" sizeY="3424" r="1" ifd="4" />
         <dimension sizeX="1382" sizeY="856" r="2" ifd="5" />
         <dimension sizeX="346" sizeY="214" r="3" ifd="6" />
       </pixels>
       <view *sizeX="11056000" sizeY="6848000"* offsetX="7359742" offsetY="27020003" spacingZ="400" />
         .....
     </image>
   </collection>
</scn>

As you can see, one of the two image tags, the thumbnail, has a view 
with the same dimensions as the collection. The code fails if neither 
of the two tags meets this condition.
I'm adding the lowest resolution of the thumbnail as an associated 
image; I thought that would be quicker than using the largest one.
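
The rule described above can be sketched as follows. This is only an illustration in Python with the standard xml.etree.ElementTree module, not the C/libxml2 code of the vendor file. The helper names (size, classify_images, lowest_resolution_ifd) are hypothetical, the sample is a trimmed copy of the extract, and it assumes the elements carry no XML namespace (the <scn ...> attributes are elided above, so a real file may need namespace-aware lookups).

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the extract above; assumes no XML namespace on <scn>.
SAMPLE = """\
<scn>
  <collection sizeX="26564529" sizeY="76734666">
    <image>
      <pixels sizeX="1616" sizeY="4668">
        <dimension sizeX="1616" sizeY="4668" r="0" ifd="0" />
        <dimension sizeX="404" sizeY="1167" r="1" ifd="1" />
        <dimension sizeX="101" sizeY="291" r="2" ifd="2" />
      </pixels>
      <view sizeX="26564529" sizeY="76734666" offsetX="0" offsetY="0" />
    </image>
    <image>
      <pixels sizeX="22112" sizeY="13696">
        <dimension sizeX="22112" sizeY="13696" r="0" ifd="3" />
        <dimension sizeX="5528" sizeY="3424" r="1" ifd="4" />
        <dimension sizeX="1382" sizeY="856" r="2" ifd="5" />
        <dimension sizeX="346" sizeY="214" r="3" ifd="6" />
      </pixels>
      <view sizeX="11056000" sizeY="6848000" offsetX="7359742" offsetY="27020003" />
    </image>
  </collection>
</scn>
"""


def size(element):
    """Return the (sizeX, sizeY) attributes of an element as ints."""
    return int(element.get("sizeX")), int(element.get("sizeY"))


def classify_images(xml_text):
    """Return (thumbnail, main): the thumbnail's <view> spans the collection."""
    collection = ET.fromstring(xml_text).find("collection")
    if collection is None:
        raise ValueError("no <collection> element")
    images = collection.findall("image")
    if len(images) != 2:
        raise ValueError(f"expected 2 <image> tags, found {len(images)}")
    full = size(collection)
    thumbs = [img for img in images if size(img.find("view")) == full]
    if len(thumbs) != 1:
        # Fail rather than guess when zero or both images match.
        raise ValueError("cannot tell the thumbnail from the main image")
    thumbnail = thumbs[0]
    main = next(img for img in images if img is not thumbnail)
    return thumbnail, main


def lowest_resolution_ifd(image):
    """Return the IFD of the smallest <dimension> level (highest r)."""
    dims = image.find("pixels").findall("dimension")
    return int(max(dims, key=lambda d: int(d.get("r"))).get("ifd"))


thumbnail, main = classify_images(SAMPLE)
print(lowest_resolution_ifd(thumbnail))  # IFD for the associated image
print(size(main.find("pixels")))         # pixel size of the main image
```

Raising an error when the match is ambiguous keeps the open from silently picking the wrong image.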

There is a big issue with libxml2, however, which I found out the hard 
way. There's a function called *xmlCleanupParser*. It should be called 
once we are done using the library, so that it frees the memory used by 
the library itself. Initially I called it right after I finished parsing 
the file's description, pretty sure that I was doing the right thing. 
The problem, though, is that I use OpenSlide as a DLL in a 
multi-threaded application where parallel calls to the library take 
place. I started getting segmentation faults as soon as I added the 
parsing functionality. I commented it out, rebuilt, and everything went 
back to normal. Then I read the function's documentation:

http://xmlsoft.org/html/libxml-parser.html#xmlCleanupParser

"This function name is somewhat misleading. It does not clean up parser 
state, it cleans up memory allocated by the library itself. It is a 
cleanup function for the XML library. It tries to reclaim all related 
global memory allocated for the library processing. It doesn't 
deallocate any document related memory. One should call 
xmlCleanupParser() only when the process has finished using the library 
and all XML/HTML documents built with it. See also xmlInitParser() which 
has the opposite function of preparing the library for operations. 
WARNING: if your application is multithreaded or has plugin support 
calling this may crash the application if another thread or a plugin is 
still using libxml2. It's sometimes very hard to guess if libxml2 is in 
use in the application, some libraries or plugins may use it without 
notice. In case of doubt abstain from calling this function or do it 
just before calling exit() to avoid leak reports from valgrind !"
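
To illustrate the failure mode in that warning, here is a minimal sketch in Python of the general pattern: set up process-wide library state once, never tear it down after an individual call, and leave cleanup for process exit. It does not call libxml2. FakeXmlLibrary, ensure_initialized and parse_description are hypothetical stand-ins, and the sketch says nothing about how libxml2 allocates memory internally.

```python
import atexit
import threading


class FakeXmlLibrary:
    """Hypothetical stand-in for a library that keeps process-wide state."""

    def __init__(self):
        self.init_calls = 0
        self.ready = False

    def init(self):
        self.init_calls += 1
        self.ready = True

    def parse(self, text):
        if not self.ready:
            # The analogue of a crash after another thread cleaned up.
            raise RuntimeError("library used after cleanup")
        return len(text)

    def cleanup(self):
        self.ready = False


LIB = FakeXmlLibrary()
_init_lock = threading.Lock()


def ensure_initialized():
    """Initialize the shared state exactly once, even under contention."""
    with _init_lock:
        if not LIB.ready:
            LIB.init()


def parse_description(text):
    """Parse without any per-call cleanup; other threads may be mid-parse."""
    ensure_initialized()
    return LIB.parse(text)


# Cleanup runs only when the whole process exits.
atexit.register(LIB.cleanup)

results = []


def worker():
    results.append(parse_description("<scn/>"))


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(LIB.init_calls, len(results))
```

Calling LIB.cleanup() at the end of parse_description instead would let one thread invalidate the state while another is still parsing, which is the situation the warning describes.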

Does this mean that whenever you call the library's functions, new 
memory is allocated? Or does initialization take place only once?
Let me know if you want me to send you the Leica vendor file I've written.

Regards
Agelos

On 6/7/2012 1:32, Benjamin Gilbert wrote:
> On 07/05/2012 05:30 PM, Agelos Pappas wrote:
>> My question here is: Would it be safe / acceptable to assume that the
>> first tag always refers to the thumbnail IFDs and the second to the
>> main image?
>
> I'd rather not if we can avoid it.  It looks as though there are a few 
> things you could use to distinguish the images:
>
> 1a. The <view> coordinates for the slide image describe a rectangle 
> inside the view for the thumbnail.
>
> 1b. The <view> for the thumbnail has its origin at (0, 0).
>
> 2. The <pixels> width and height are larger for the slide.
>
> 3. The <objective> is lower for the thumbnail.
>
>
> It's a judgment call.  #3 is probably too obscure.  The paranoid 
> approach would be to check all of #1a, #1b, and #2, and fail the open 
> if we find any inconsistencies.  (OpenSlide should always fail the 
> open when it gets confused, because that way we'll get a bug report 
> that will help us better understand the format.)
>
> Also, a slide could contain more than two images, e.g. two completely 
> separate slide pyramids from different regions of the slide.  Your 
> sanity checks should catch this.
>
> Thanks
> --Benjamin Gilbert