This seadragon is indeed magical

January 7, 2008

by Mark Ollig

“You need to go to ted-dot-com and watch the video demonstration,” my oldest son enthusiastically told me during lunch last Tuesday.

We were discussing two newer technologies being applied to visual images.

During our conversation, my son convinced me the two programs, Seadragon and Photosynth, were worth researching.

Blaise Aguera y Arcas, an engineer at Microsoft Live Labs, gave the online video demonstration at www.ted.com. Blaise is also the designer of Seadragon and the co-creator of Photosynth.

The Seadragon program creates a software environment that allows a person to interact with large amounts of visual data.

Photosynth is a software program that assembles many still photographs together to present images users can smoothly zoom in on with incredible detail. It also allows these images to be navigated in a panoramic, or “bird’s-eye,” fashion.

The presentation began with the audience in attendance viewing images displayed on a large projection screen connected to Blaise’s laptop computer.

He was scanning effortlessly through a collection of thousands of very small “thumbnail” images, representing many gigabytes’ worth of data.

One of the images shown in the collection was a map of the US scanned from the Library of Congress. This image was in the 300-megapixel range. Blaise smoothly zoomed in on the smallest parts of the image with absolute clarity and detail.

Another image was the complete text from “Bleak House” by Charles Dickens. In this image, all I could initially see were small rows of thin vertical lines.

Using the software, which captures the “real representation” of the text, he zoomed in on each thin line. What looked like lines were the actual paragraphs and sentences of all 67 chapters of the book.

The entire text of the book was stored on this small thumbnail image.

The text was easy to read and amazingly clear. In fact, Blaise then focused in on one letter, which filled the entire computer screen with incredible clarity and sharpness.

Seeing this caused the audience to first gasp, and then applaud.

Blaise explained how this technology would do away with the limits of the “screen real-estate” we currently have when viewing images.

He then showed an online newspaper with an advertisement in the lower right-hand corner.

The advertisement is a picture image. Normally, if we zoomed in on it too far, fuzziness and distortion would occur.

The ability to zoom in on a single image with clarity normally depends on the number of pixels the image itself contains.

Blaise, however, demonstrated that this new technology allows the same-size advertising image to be zoomed in on endlessly, focusing on the smallest text or picture inside the advertisement with amazing clarity, sharpness, and detail.
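Seadragon’s actual code is not public in this talk, but deep-zoom viewers of this kind typically store each image as a pyramid of small tiles at many resolutions, so the viewer only ever loads the handful of tiles covering what is on screen. Here is a minimal sketch of that tile-pyramid arithmetic in Python; the function names, the 256-pixel tile size, and the 20,000 x 15,000 example dimensions are illustrative assumptions, not Seadragon’s real values.

```python
import math

def pyramid_levels(width, height, tile_size=256):
    """Count pyramid levels: halve the image until it fits one tile."""
    levels = 1
    while width > tile_size or height > tile_size:
        width = math.ceil(width / 2)
        height = math.ceil(height / 2)
        levels += 1
    return levels

def tiles_at_level(width, height, level, max_level, tile_size=256):
    """Tile grid (columns, rows) at a given level; max_level is full resolution."""
    scale = 2 ** (max_level - level)
    w = math.ceil(width / scale)
    h = math.ceil(height / scale)
    return math.ceil(w / tile_size), math.ceil(h / tile_size)

# A hypothetical 300-megapixel scan, roughly 20000 x 15000 pixels:
w, h = 20000, 15000
top = pyramid_levels(w, h) - 1
print(pyramid_levels(w, h))            # 8 levels in this sketch
print(tiles_at_level(w, h, top, top))  # (79, 59) tiles at full resolution
print(tiles_at_level(w, h, 0, top))    # (1, 1) tile when fully zoomed out
```

The point of the pyramid is that zooming never touches the whole multi-gigabyte image: each zoom step swaps in a few small tiles from the next level, which is why the demo feels effortless.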

Blaise then demonstrated the Photosynth software technology, which builds upon the Seadragon software.

Photosynth is a technology which acquires a large collection of images and analyzes them for similarities.

All images analyzed are “meta-tagged” and sorted so they can be linked with other similar images.

Using algorithms in the software, the combined images are presented as a single interactive three-dimensional object.

This software is, in a sense, creating “hyper-links” between images based on the content inside the image.
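Photosynth’s internal matching is far more sophisticated (it matches local visual features between photographs), but the linking idea can be sketched with a toy example: describe each photo by a set of features and link any pair whose feature sets overlap enough. Everything below is an illustrative assumption, including the photo names, the made-up feature labels, and the 0.3 threshold.

```python
def similarity(features_a, features_b):
    """Jaccard overlap between two photos' feature sets."""
    union = len(features_a | features_b)
    return len(features_a & features_b) / union if union else 0.0

def link_photos(photos, threshold=0.3):
    """Return pairs of photo names whose features overlap enough to link."""
    names = sorted(photos)
    links = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if similarity(photos[a], photos[b]) >= threshold:
                links.append((a, b))
    return links

# Hypothetical photos described by toy feature labels:
photos = {
    "tower_front.jpg": {"arch", "lattice", "spire", "sky"},
    "tower_side.jpg":  {"lattice", "spire", "river", "sky"},
    "street_cafe.jpg": {"awning", "chairs", "menu"},
}
print(link_photos(photos))  # the two tower photos get linked; the cafe does not
```

The two tower photos share enough features to be linked, while the unrelated cafe photo is left alone; scale that idea to millions of photos and shared visual content becomes the “hyperlink” between them.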

Say we do a Google picture search for “Eiffel Tower.” We are presented with thousands of photographs and images taken by individual cameras, cell phones, webcams, etc.

Photosynth technology gives each of these images an individual “tag” or marker.

All these visual representations of the Eiffel Tower and the embedded information from them are sorted and become “linked” together.

Think of millions of images of the Eiffel Tower being collected and individually sorted and tagged.

Let’s say I did go to France and took a photograph of the Eiffel Tower, which I uploaded publicly to the Internet.

The Photosynth software would seek out and collect this image.

My image then becomes “meta-tagged,” and is automatically “enriched” with the image information contained in all the other meta-tagged photos of the Eiffel Tower.

We could then use this enhanced image of the Eiffel Tower in a new way.

With enough meta-tagged linked images, we would be able to view a moving panoramic vision of the Eiffel Tower from any angle we wanted to see it from.

Blaise explains this panoramic vision as a cross-user “meta-verse” experience, built on the information contained in all the other photographs and images.

Meta-tagged NASA global images of the planet, for example, would result in an immensely detailed view of the Earth.

Soon, we will be able to view text and images in a very new way . . . based upon the embedded information each image contains.

We will be viewing images enriched by the collective-memory content of thousands or millions of similarly tagged images.

To see this amazing demonstration, visit http://www.ted.com/talks/view/id/129