Monday, November 19, 2007

camera combinators

Lazy evaluation and higher-order functions are the basis of a very useful programming abstraction. Using standard Haskell list functions we can easily define combinators which work on the infinite sequence of images captured by a camera.

A camera is just an action which returns an image:
type Camera = IO Image
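To make the idea concrete without any hardware, here is a minimal sketch of a camera built from a fixed list of frames. The `Image` type is only a stand-in (a string label), and `listCamera` and `demo` are hypothetical helpers invented for this illustration; they are not part of the library.

```haskell
import Data.IORef (newIORef, atomicModifyIORef')

-- Stand-in for a real image type: a frame is just a label here.
type Image = String

type Camera = IO Image

-- Hypothetical helper: a camera that replays the given frames forever.
-- The list of frames is assumed to be non-empty.
listCamera :: [Image] -> IO Camera
listCamera frames = do
    ref <- newIORef (cycle frames)
    return (atomicModifyIORef' ref next)
  where
    -- Hand out the first pending frame and keep the rest for later calls.
    next (f : rest) = (rest, f)
    next []         = error "listCamera: no frames"

-- Each run of the camera action yields the next frame.
demo :: IO [Image]
demo = do
    cam <- listCamera ["f0", "f1", "f2"]
    a <- cam
    b <- cam
    c <- cam
    d <- cam
    return [a, b, c, d]
```

The point of the sketch is that the consumer only ever sees an `IO Image`: it cannot tell whether the frames come from a device, a file, or another combinator.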
A camera combinator is a function which typically takes one or more cameras and other arguments and produces a new camera. For instance, we can fuse the images obtained by two cameras into a "panoramic" view:
panoramic :: Camera -> Camera -> IO Camera
Then any program using this camera automagically receives frames like this:


Pedro and Antonio working hard in our lab

(Currently the required synthetic rotations must be adjusted manually, but we are working on an automatic method. And, of course, the cameras must be very close to each other for this to work.)

We can now create a panoramic view by joining an arbitrary number of cameras by something like this:
pano <- foldM panoramic (head cams) (tail cams) 
(I will try to prepare a demo with 3 or 4 cameras...)
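The fold can be tried out with a toy stand-in for the real combinator. In the sketch below an image is just a list of text rows and `panoramic` simply places two frames side by side, with none of the synthetic rotations the real one needs. `panoramicAll` and `demo` are hypothetical names used only here; `panoramicAll` also avoids calling `head` and `tail` on an empty list.

```haskell
import Control.Monad (foldM)

-- Stand-in image: a list of pixel rows.
type Image = [String]

type Camera = IO Image

-- Toy "fusion": grab one frame from each camera and join them row by row.
-- The real combinator also rotates the images; this one does not.
panoramic :: Camera -> Camera -> IO Camera
panoramic left right = return $ do
    a <- left
    b <- right
    return (zipWith (++) a b)

-- Join any non-empty list of cameras from left to right.
panoramicAll :: [Camera] -> IO Camera
panoramicAll []           = ioError (userError "panoramicAll: no cameras")
panoramicAll (c : others) = foldM panoramic c others

-- Three constant cameras fused into a single wide frame.
demo :: IO Image
demo = do
    let cams :: [Camera]
        cams = [return ["ab", "cd"], return ["ef", "gh"], return ["ij", "kl"]]
    pano <- panoramicAll cams
    pano
```

With these toy cameras, `demo` returns the rows `["abefij", "cdghkl"]`: the fold nests the cameras pairwise, so one grab from `pano` triggers one grab from each of the three sources.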

The real power of camera combinators in Haskell is that we can work with the infinite lazy sequence of images captured by the camera. We have defined a "virtualCamera" function which adapts ordinary list functions to work with the IO list of images. Using it we can define, for example, a camera which "intercalates" averaged frames, inserting the mean of each pair of consecutive frames between them:
interpolate = virtualCamera (return . inter)
    where inter (a:b:rest) = a : x : inter (b:rest)
              where x = 0.5 .* a |+| 0.5 .* b
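The definition of "virtualCamera" is not shown here, so the following is only a sketch of one way such a function could be written, assuming the type `([a] -> IO [b]) -> IO a -> IO (IO b)`. It uses `unsafeInterleaveIO` to build the lazy list of frames and an `IORef` to hand the processed list back one element per call. The helpers `grabAll` and `listToCamera` are hypothetical names, and plain `Double` values stand in for frames so that ordinary arithmetic replaces the image operators.

```haskell
import Data.IORef (newIORef, readIORef, writeIORef)
import System.IO.Unsafe (unsafeInterleaveIO)

-- Lazily collect every frame the camera will ever produce.
-- The first frame is grabbed immediately; unsafeInterleaveIO defers
-- each later grab until the corresponding list cell is demanded.
grabAll :: IO a -> IO [a]
grabAll grab = do
    frame <- grab
    rest  <- unsafeInterleaveIO (grabAll grab)
    return (frame : rest)

-- Turn a lazy list back into an action that returns one element per call.
listToCamera :: [b] -> IO (IO b)
listToCamera frames = do
    ref <- newIORef frames
    return $ do
        fs <- readIORef ref
        case fs of
            []       -> ioError (userError "virtual camera: stream ended")
            (f : ft) -> writeIORef ref ft >> return f

-- Adapt a list function so that it works on a camera.
virtualCamera :: ([a] -> IO [b]) -> IO a -> IO (IO b)
virtualCamera process grab = grabAll grab >>= process >>= listToCamera

-- Same shape as the combinator in the text, with numbers instead of images.
interpolate :: IO Double -> IO (IO Double)
interpolate = virtualCamera (return . inter)
  where
    inter (a : b : rest) = a : x : inter (b : rest)
      where x = 0.5 * a + 0.5 * b
    inter short          = short   -- only reached for finite streams

-- A source "camera" producing 0, 2, 4, ... and five interpolated frames.
demo :: IO [Double]
demo = do
    counter <- newIORef 0
    let source = do
            n <- readIORef counter
            writeIORef counter (n + 2)
            return n
    cam <- interpolate source
    sequence (replicate 5 cam)
```

In this sketch `demo` yields `[0.0, 1.0, 2.0, 3.0, 4.0]`: the odd values are the inserted averages. Frames are pulled from the source only when the consumer asks for output that depends on them. The sketch is single-threaded; sharing such a camera between threads would need extra care.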
We can also compute a running weighted average of the sequence:
drift alpha = virtualCamera (return . drifter)
    where drifter (a:b:rest) = a : drifter (x:rest)
              where x = alpha .* a |+| (1-alpha) .* b
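The recurrence is easier to see on plain numbers. The sketch below is a pure list version of "drifter", with `Double` standing in for a frame and ordinary arithmetic replacing the image operators; the extra clause for short lists and the `demo` value are additions for this illustration. Each output is `alpha` times the previous output plus `(1 - alpha)` times the new input, so values of `alpha` close to 1 make the result follow the input slowly.

```haskell
-- Pure list version of the recurrence, with Double standing in for a frame.
drifter :: Double -> [Double] -> [Double]
drifter alpha (a : b : rest) = a : drifter alpha (x : rest)
  where
    -- Blend the running average a with the incoming value b.
    x = alpha * a + (1 - alpha) * b
drifter _ short = short   -- only reached for finite lists

-- Response to a sudden jump from 0 to 8 with alpha = 0.5.
demo :: [Double]
demo = take 4 (drifter 0.5 (0 : repeat 8))
```

Here `demo` evaluates to `[0.0, 4.0, 6.0, 7.0]`: the output approaches the new level by halving the remaining gap at each frame, which on images shows up as moving objects leaving a fading trail.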

For instance, the following effect:


can be obtained by the following composition of combinators:
cam <- getCam 0 size
       >>= monitorizeIn "original" (Size 150 200) id
       >>= asFloat
       >>= drift alpha
       >>= interpolate
The "worker" function receives just a simple IO ImageFloat:
worker cam win = do
    inWindow win $ do
        cam >>= drawImage
But each call automatically performs all the computations defined above.

We have defined camera combinators that keep only frames which differ markedly from the previous ones (movement detectors), detectors of "static" frames, feature extractors, etc. Using this technique, typical acquisition and preprocessing tasks can be easily decoupled from the "consumer" applications.
