<html>

  <head>

    <meta content="text/html; charset=ISO-8859-1"

      http-equiv="Content-Type">

  </head>

  <body bgcolor="#FFFFFF" text="#000000">

    Hi again Benjamin,<br>

    <br>

    I wrote a function that uses libxml2 to parse the description of

    LEICA files, in the new vendor file I'm writing. I use the view

    tag's size to decide which image tag is the thumbnail and which is

    the main image. Here is an extract from a LEICA file description:<br>

    <br>

    <font color="#000099"><tt>&lt;scn ...&gt;<br>

        &nbsp; &lt;collection <font color="#cc0000"><b>sizeX="26564529"

            sizeY="76734666"</b></font>&gt;<br>

        &nbsp;&nbsp;&nbsp; &lt;barcode&gt;Q2FzZSAxLEhFUjIsLCw=&lt;/barcode&gt;<br>

        &nbsp;&nbsp;&nbsp; &lt;image&gt;<br>

        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;

        &lt;creationDate&gt;2011-10-18T09:20:15.067Z&lt;/creationDate&gt;<br>

        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;device model="Leica SCN400;Leica SCN"

        version="1.2.5.8691 2010/07/15 06:56:41;1.4.0.9708" /&gt;<br>

        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;pixels sizeX="1616" sizeY="4668"&gt;<br>

        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;dimension sizeX="1616" sizeY="4668" r="0" ifd="0"

        /&gt;<br>

        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;dimension sizeX="404" sizeY="1167" r="1" ifd="1"

        /&gt;<br>

        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;dimension sizeX="101" sizeY="291" r="2" ifd="2"

        /&gt;<br>

        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;/pixels&gt;<br>

        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;view <font color="#cc0000"><b>sizeX="26564529"

            sizeY="76734666"</b></font> offsetX="0" offsetY="0"

        spacingZ="0" /&gt;<br>

      </tt></font><font color="#000099"><tt>&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; .....<br>

      </tt></font><font color="#000099"><tt>&nbsp;&nbsp;&nbsp; &lt;/image&gt;<br>

        &nbsp;&nbsp;&nbsp; &lt;image&gt;<br>

        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;

        &lt;creationDate&gt;2011-10-18T09:23:28.277Z&lt;/creationDate&gt;<br>

        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;device model="Leica SCN400;Leica SCN"

        version="1.2.5.8691 2010/07/15 06:56:41;1.4.0.9708" /&gt;<br>

        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;pixels sizeX="22112" sizeY="13696"&gt;<br>

        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;dimension sizeX="22112" sizeY="13696" r="0" ifd="3"

        /&gt;<br>

        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;dimension sizeX="5528" sizeY="3424" r="1" ifd="4"

        /&gt;<br>

        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;dimension sizeX="1382" sizeY="856" r="2" ifd="5"

        /&gt;<br>

        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;dimension sizeX="346" sizeY="214" r="3" ifd="6"

        /&gt;<br>

        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;/pixels&gt;<br>

        &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &lt;view <font color="#cc0000"><b>sizeX="11056000"

            sizeY="6848000"</b></font> offsetX="7359742"

        offsetY="27020003" spacingZ="400" /&gt;<br>

        &nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp; .....<br>

        &nbsp;&nbsp;&nbsp; &lt;/image&gt;<br>

        &nbsp; &lt;/collection&gt;<br>

        &lt;/scn&gt;</tt></font><br>

    <br>

    As you can see, one of the two image tags, the thumbnail, has a
    view with the same dimensions as the collection. The code will fail
    if neither of the two tags meets this condition. <br>
    I'm adding the lowest-resolution level of the thumbnail as an
    associated image. I thought it would be quicker than using the
    largest one.<br>
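    <br>
    As a rough illustration of the check described above (not the
    actual vendor code, which is written in C against libxml2), here is
    a minimal Python sketch using the standard library's ElementTree.
    The function name <tt>classify_images</tt> and the helper
    <tt>size</tt> are hypothetical, the sample XML is a trimmed copy of
    the extract above, and it assumes the elements carry no XML
    namespace, which may not hold for real files.<br>
    <br>

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the extract above; only the attributes the check needs.
# Assumption: no XML namespace on the elements.
SCN_XML = """
<scn>
  <collection sizeX="26564529" sizeY="76734666">
    <image>
      <pixels sizeX="1616" sizeY="4668">
        <dimension sizeX="1616" sizeY="4668" r="0" ifd="0" />
        <dimension sizeX="404" sizeY="1167" r="1" ifd="1" />
        <dimension sizeX="101" sizeY="291" r="2" ifd="2" />
      </pixels>
      <view sizeX="26564529" sizeY="76734666" offsetX="0" offsetY="0" />
    </image>
    <image>
      <pixels sizeX="22112" sizeY="13696">
        <dimension sizeX="22112" sizeY="13696" r="0" ifd="3" />
        <dimension sizeX="5528" sizeY="3424" r="1" ifd="4" />
        <dimension sizeX="1382" sizeY="856" r="2" ifd="5" />
        <dimension sizeX="346" sizeY="214" r="3" ifd="6" />
      </pixels>
      <view sizeX="11056000" sizeY="6848000" offsetX="7359742" offsetY="27020003" />
    </image>
  </collection>
</scn>
"""


def size(elem):
    """Return the (sizeX, sizeY) attributes of an element as integers."""
    return int(elem.get("sizeX")), int(elem.get("sizeY"))


def classify_images(xml_text):
    """Return (associated_ifd, main_ifds) for an SCN description.

    The thumbnail is the image whose view has the same size as the
    collection. Raises ValueError unless exactly one image matches.
    """
    root = ET.fromstring(xml_text)
    collection = root.find("collection")
    if collection is None:
        raise ValueError("no collection element")

    thumbnails, others = [], []
    for image in collection.findall("image"):
        view = image.find("view")
        if view is None:
            raise ValueError("image without a view element")
        if size(view) == size(collection):
            thumbnails.append(image)
        else:
            others.append(image)

    if len(thumbnails) != 1:
        raise ValueError("expected exactly one thumbnail image")

    # Use the smallest level of the thumbnail as the associated image.
    levels = thumbnails[0].find("pixels").findall("dimension")
    smallest = min(levels, key=lambda d: size(d)[0] * size(d)[1])

    # Collect the IFDs of every level of the remaining image(s).
    main_ifds = [
        int(d.get("ifd"))
        for image in others
        for d in image.find("pixels").findall("dimension")
    ]
    return int(smallest.get("ifd")), main_ifds
```

    <br>
    Calling <tt>classify_images(SCN_XML)</tt> on the trimmed sample
    selects IFD 2 as the associated image and IFDs 3 to 6 as the levels
    of the main image. Raising an error when no image, or more than
    one, matches the collection size keeps the failure visible instead
    of guessing.<br>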

    <br>

    There is a big issue with libxml2 however, which I found out the

    hard way. There's a function called <b>xmlCleanupParser</b>. This

    function should be called once we are done using the library, so

    that it cleans up memory used by the library itself. Initially I

    called the function right after I finished parsing the file's

    description, pretty sure that I was doing the right thing. The
    problem, though, is that I use OpenSlide as a DLL in a multi-threaded
    application where parallel calls to the library take place. I
    noticed that I had been getting segmentation faults ever since I
    added the parsing functionality. I commented it out, rebuilt, and
    everything went back to normal. Then I read the function's
    documentation:<br>

    <br>

    <a class="moz-txt-link-freetext" href="http://xmlsoft.org/html/libxml-parser.html#xmlCleanupParser">http://xmlsoft.org/html/libxml-parser.html#xmlCleanupParser</a><br>

    <br>

    <tt><i><font color="#000099">This function name is somewhat

          misleading. It does not clean up parser state, it cleans up

          memory allocated by the library itself. It is a cleanup

          function for the XML library. It tries to reclaim all related

          global memory allocated for the library processing. It doesn't

          deallocate any document related memory. One should call

          xmlCleanupParser() only when the process has finished using

          the library and all XML/HTML documents built with it. See also

          xmlInitParser() which has the opposite function of preparing

          the library for operations. WARNING: if your application is

          multithreaded or has plugin support calling this may crash the

          application if another thread or a plugin is still using

          libxml2. It's sometimes very hard to guess if libxml2 is in

          use in the application, some libraries or plugins may use it

          without notice. In case of doubt abstain from calling this

          function or do it just before calling exit() to avoid leak

          reports from valgrind !<br>

        </font><br>

      </i></tt>Does this mean that new memory is allocated whenever you
    call the library's functions? Or does initialization take place
    only once? <br>
    Let me know if you want me to send you the LEICA vendor file I've
    written.<br>

    <br>

    Regards<br>

    Agelos<br>

    <br>

    <br>

    <div class="moz-cite-prefix">On 6/7/2012 1:32 PM, Benjamin Gilbert
      wrote:<br>

    </div>

    <blockquote cite="mid:4FF6161A.4020404@cs.cmu.edu" type="cite">On

      07/05/2012 05:30 PM, Agelos Pappas wrote:

      <br>

      <blockquote type="cite">My question here is: Would it be safe /

        acceptable to assume that the

        <br>

        first tag always refers to the thumbnail IFDs and the second

        to the

        <br>

        main image?

        <br>

      </blockquote>

      <br>

      I'd rather not if we can avoid it.&nbsp; It looks as though there are a

      few things you could use to distinguish the images:

      <br>

      <br>

      1a. The &lt;view&gt; coordinates for the slide image describe a

      rectangle inside the view for the thumbnail.

      <br>

      <br>

      1b. The &lt;view&gt; for the thumbnail has its origin at (0, 0).

      <br>

      <br>

      2. The &lt;pixels&gt; width and height are larger for the slide.

      <br>

      <br>

      3. The &lt;objective&gt; is lower for the thumbnail.

      <br>

      <br>

      <br>

      It's a judgment call.&nbsp; #3 is probably too obscure.&nbsp; The paranoid

      approach would be to check all of #1a, #1b, and #2, and fail the

      open if we find any inconsistencies.&nbsp; (OpenSlide should always

      fail the open when it gets confused, because that way we'll get a

      bug report that will help us better understand the format.)

      <br>

      <br>

      Also, a slide could contain more than two images, e.g. two

      completely separate slide pyramids from different regions of the

      slide.&nbsp; Your sanity checks should catch this.

      <br>

      <br>

      Thanks

      <br>

      --Benjamin Gilbert

      <br>

      <br>

      <br>


    </blockquote>

    <br>

    <br>

  </body>

</html>