Color-/Lineshifts under Windows?

Adam Goode agoode at andrew.cmu.edu
Tue Aug 10 15:20:22 EDT 2010


On 08/10/2010 03:06 AM, Hauke Heibel wrote:
> On Mon, Aug 9, 2010 at 5:52 PM, Adam Goode <agoode at andrew.cmu.edu> wrote:
>> When it comes time to downsample the images, things get bizarre. To
>> downsample, MIRAX concatenates 4 tiles together into 1 new JPEG tile.
>> This is fine, except that it does this without stitching together the
>> overlaps. So within a JPEG tile, you'll get overlaps. You can't treat
>> the JPEG file as an atomic unit, instead you have to cut parts out of it
>> and draw it just as you would the full sized tiles.
> 
> Ok, this is definitely "crazy". How did you find this out? Are there
> indications in the Index.dat file leading to this conclusion? The
> Slidedat.ini at least holds no clues for this.
> 

Glad you agree; I was worried that I was a little crazy. I found it out
through many months of trial and error: I'd make some assumptions about
how things worked, try to render things, see that it was wrong, then
look for clues in the opaque files that would help me render it
correctly. I repeated this over and over until it looked like it does
today. "hexdump" and my HP-48 calculator also helped.

The Index.dat file is a sort of linked-list tree file; you can run
misc/print-mirax.py on it to see the structure. Most of these binary
files have a string header at the beginning. Once you remove it, you see
that the remaining size is divisible by 4: this is a clue that the data
is 32-bit values.

With print-mirax, each 32-bit value is a single row. The first column is
the row number, the second is the decoded 32-bit value, the third is the
second column divided by 4 if it divides evenly (a heuristic for
spotting pointers), and the fourth is the offset of that target from the
current row.
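
To make the column layout concrete, here is a minimal sketch of that
kind of dump. It is not the actual print-mirax.py. The names dump_rows
and pointer_guess are hypothetical, little-endian byte order is an
assumption, and so is subtracting the header length before dividing by
4. That last assumption is only inferred from the sample rows later in
this message, which fit a 37-byte header (4217 - 37 = 4 * 1045); the
header length is not stated anywhere in this thread.

```python
import struct


def pointer_guess(value, row, header_len):
    """Guess whether a value is a pointer to another row.

    Assumption: pointers are absolute byte offsets into the file, so the
    header length is subtracted before dividing by 4.
    Returns (target_row, target_row - row) or None.
    """
    rel = value - header_len
    if rel < 0 or rel % 4 != 0:
        return None
    target = rel // 4
    return target, target - row


def dump_rows(data, header_len):
    """Return (row, value, target_row, offset) for each 32-bit value."""
    body = data[header_len:]
    if len(body) % 4 != 0:
        raise ValueError("body is not a whole number of 32-bit values")
    rows = []
    # Assumption: values are little-endian signed 32-bit integers.
    for row, (value,) in enumerate(struct.iter_unpack("<i", body)):
        guess = pointer_guess(value, row, header_len)
        target, offset = guess if guess is not None else (None, None)
        rows.append((row, value, target, offset))
    return rows
```

Because the third column is only a heuristic, ordinary values that
happen to land on a 4-byte boundary will also show up as "pointers".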

You'll see the first 2 entries are pointers elsewhere. The 0 entry is
the "hierarchical" and the 1 entry is "non-hierarchical". Follow the
pointers and you start to see entries that match up with things in
Slidedat.ini.

Note that if you follow everything, you'll see some detached entries:
these don't have any clear path from the first 2 entries to them. They
point to some super-broken XML which doesn't appear anywhere in the
official MIRAX viewer, so I ignored them in OpenSlide.

The way Index.dat links back to the dat files is in entries like these:

   1875      195748
   1876     1598241
   1877        4217       1045    ->       -832
   1878           4

Once you follow enough pointers, you'll get to these kinds of records
(called "pages" in Slidedat.ini). Each record is four values:
tile_index, offset, length, fileno. Anyway, that's enough detail for
now; I can give more if you like.
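
A minimal sketch of reading one such record and using it to cut a tile
out of a data file, assuming four consecutive little-endian 32-bit
values in the order given above. PageRecord, parse_page_record and
read_tile are hypothetical names, and how fileno maps to a file on disk
is left to the caller here.

```python
import struct
from collections import namedtuple

# Field order follows the description above.
PageRecord = namedtuple("PageRecord", "tile_index offset length fileno")


def parse_page_record(buf, pos=0):
    """Decode four consecutive 32-bit values starting at byte pos.

    Assumption: little-endian signed integers.
    """
    return PageRecord(*struct.unpack_from("<4i", buf, pos))


def read_tile(record, data_files):
    """Return the bytes of one tile.

    data_files is a dict mapping fileno to that file's contents
    (hypothetical; a real reader would seek in the file instead).
    """
    blob = data_files[record.fileno]
    end = record.offset + record.length
    if record.offset < 0 or end > len(blob):
        raise ValueError("record points outside the data file")
    return blob[record.offset:end]
```

Read this way, the four sample rows above describe tile 195748 as 4217
bytes starting at offset 1598241 in data file number 4.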


> Regarding the overlaps, you mention in the code that the MIRAX camera
> takes a photo and splits that one into image_division^2 JPEG tiles. I
> played with the split-mirax.py script and took a look at some pictures
> and found some with no overlap, like these
> 
> Data0004_0000143002.jpg (left)
> Data0004_0000143001.jpg (center)
> Data0004_0000143000.jpg (right)
> 
> and some with a huge overlap like these
> 
> Data0004_0000142995.jpg (right)
> Data0004_0000142996.jpg (right)
> 
> Am I hitting here such an example? Probably. What is interesting, is
> that the latter pair of pictures requires horizontal and vertical
> (~10px) shifting to be perfectly aligned.
> 

Yes, the MIRAX hardware seems to be very good at recording the position
of its camera, but a little sloppy in actually positioning it.

> Anyways, now I am confused, that it is possible to have a constant
> tile shift for painting because some tiles do not overlap at all (at
> least on the first levels until they are merged) and some others do.
> 

There is not really a constant tile shift; that is a bit of a hack that
I use to avoid doing a full 2D range search. I read in the absolute
position of each tile and then assume that each tile is roughly near
where the average tile shift says it should be. Then I only need a tiny
local search to find the nearby tiles to draw.
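
The idea can be sketched roughly as follows. This is an illustration
only, not the OpenSlide code: tiles_covering, the positions dict, and
the one-cell search radius are all hypothetical choices.

```python
def tiles_covering(x, y, positions, avg_w, avg_h, tile_w, tile_h, radius=1):
    """Find tiles that cover pixel (x, y) using a small local search.

    positions maps (col, row) grid indices to the recorded absolute
    (px, py) of each tile. avg_w and avg_h are the average tile shifts.
    Only grid cells within `radius` of the expected cell are examined,
    instead of searching every tile.
    """
    # Cell the point would fall in if tiles sat exactly on the average grid.
    col = int(x // avg_w)
    row = int(y // avg_h)
    found = []
    for r in range(row - radius, row + radius + 1):
        for c in range(col - radius, col + radius + 1):
            pos = positions.get((c, r))
            if pos is None:
                continue
            px, py = pos
            # Check against the recorded position, not the grid guess.
            if px <= x < px + tile_w and py <= y < py + tile_h:
                found.append((c, r))
    return found
```

The search stays cheap because each lookup touches at most
(2 * radius + 1)^2 cells, at the cost of missing a tile whose real
position is further from the grid than the radius allows.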

> I think in order to continue, I need to try to understand the
> Index.dat file since I assume this is the file you mentioned earlier
> where the tile positions are stored. That may take some time. ;)
> 

You probably don't need to understand Index.dat, but you could look at
the tile position file. Just look for the file that is

 ((IMAGENUMBER_X / CameraImageDivisionsPerSide) *
  (IMAGENUMBER_Y / CameraImageDivisionsPerSide) * 9) +
  strlen(SLIDE_VERSION) + strlen(SLIDE_ID)

bytes in size. For CMU-1, it's Data0019.dat. Or you could parse through
Index.dat to figure out which file corresponds to
VIMSLIDE_POSITION_BUFFER, which is what OpenSlide actually does.
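
A minimal sketch of the size-matching approach, with the values read
from Slidedat.ini passed in by the caller. The function names are
hypothetical, integer division is an assumption, and so is the
"Data*.dat" glob pattern (based only on the file name mentioned above).

```python
from pathlib import Path


def expected_position_file_size(images_x, images_y, divisions,
                                slide_version, slide_id):
    """Apply the size formula above.

    Assumption: the divisions are integer divisions, and the two
    strings are plain ASCII so len() matches strlen().
    """
    cameras_x = images_x // divisions
    cameras_y = images_y // divisions
    return cameras_x * cameras_y * 9 + len(slide_version) + len(slide_id)


def find_position_files(directory, size):
    """Return names of Data*.dat files with exactly the given size."""
    return sorted(p.name for p in Path(directory).glob("Data*.dat")
                  if p.stat().st_size == size)
```

More than one file could match by coincidence, which is one reason
parsing Index.dat is the more reliable route.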



> So far, thanks for your answers!
> 

Good luck!

Anyway, to solve the dots problem, we really need to figure out how much
of each image we need to read when it is this heavily downsampled. The
problem is the blended pixels on the edges, which I try to avoid
reading, but there may not be a better way at that size.

The other thing we could do is simply not report that we can produce
downsampled images at that point... I think that we will not lose too
much performance anyway. And I suspect that the real MIRAX viewer reads
the higher resolution images that it downsamples in the background since
the quality of the pre-downsampled images on disk isn't as good as it
could be.


> Regards,
> Hauke
> 

Adam
