Dmitry Gorodnichy has probably spent more time on the
phone than in the lab in the last week.
The researcher with the computational video group of
the National Research Council's Institute for
Information Technology -- henceforth referred to as
"CVG," "NRC" and "IIT," respectively, so I
can actually file this
editorial on time -- has had a deluge of media, industry
and user inquiries since New Scientist published
a story on the nose-operated mouse -- or nouse -- he's
developed.
Before you conjure up images of someone face down on
his mouse pad, chicken-pecking away -- admit it, some of
you did -- it doesn't work that way. A garden-variety
USB Web cam mounted on top of the computer monitor
supplies a signal to the software, which tracks the
movement of the user's nose and translates it into
cursor movement.
The nose is the easiest feature to track. However,
the word "feature" means something different to
Gorodnichy and the CVG than to you and me. To us, a
feature is, "that thing thar in the middle of your
face." What the nouse is actually tracking is -- and
please correct me if I'm wrong, Dr. Gorodnichy -- the
extremum of the convex shape of the nose tip. This is
the point on the surface of your face that is closest to
the camera. Gorodnichy points out that this point
changes as you turn your head. It's a sort of virtual
nose tip.
Not only does the nouse track this, it does so on a
sub-pixel level. Consider the resolution of the
aforementioned garden-variety Web cam -- let's say, for
the sake of argument, 640 by 480 pixels. Then consider
the number of pixels on your run-of-the-mill 17-inch
monitor, which we can accurately count as "lots more
than that." It's no good to describe the extremum as
moving from Pixel 5 to Pixel 6. It has to be described
as going from Pixel 5.55 to Pixel 5.56.
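For the arithmetically inclined, here is a rough sketch of why
the fractions matter. It is not Gorodnichy's code. The linear
mapping, the function name and every number in it (a nose tip
that travels 40 camera pixels, a screen 1,280 pixels wide) are
assumptions made up for illustration.

```python
def camera_to_screen(x_cam, cam_min, cam_max, screen_width):
    """Map a nose-tip x position, in (possibly fractional) camera
    pixels, onto a screen column using a simple linear scale.

    cam_min and cam_max are the assumed left and right limits of
    the nose tip's travel in the camera image.
    """
    fraction = (x_cam - cam_min) / (cam_max - cam_min)
    fraction = min(max(fraction, 0.0), 1.0)  # clamp to the screen
    return fraction * (screen_width - 1)


# Assumed numbers: the nose tip travels between camera columns
# 300 and 340, and the screen is 1,280 pixels wide.
whole_step = (camera_to_screen(306, 300, 340, 1280)
              - camera_to_screen(305, 300, 340, 1280))
fine_step = (camera_to_screen(305.56, 300, 340, 1280)
             - camera_to_screen(305.55, 300, 340, 1280))
print(f"whole-pixel step: {whole_step:.2f} screen pixels")
print(f"hundredth-of-a-pixel step: {fine_step:.2f} screen pixels")
```

Under those assumed numbers, a whole camera pixel moves the
cursor about 32 screen pixels at a jump, while a hundredth of a
camera pixel moves it about a third of one.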
The nouse is one element of a suite of perceptual
visual technologies the CVG is developing. Rehab
hospitals have been contacting Gorodnichy about another.
"It's more blink-detection that they're looking for,"
says Gorodnichy -- a nouse-following mouse wouldn't be
much use to a patient with a restricted range of
movement. The system can distinguish between intentional
and unintentional blinks, and use the former to execute
commands. "We can build a blink-based lexicon," says
Gorodnichy -- two blinks switch between windows, three
blinks call up a dialogue box. Think of it as a mouse
click.
Or rather, don't. This is where some media reports
have morphed the technology to correspond to the
conventional mouse more closely than it actually does. A
Reuters report suggested that left winks could
correspond to left clicks and right winks to right
clicks, which would make a user look uncannily like
Herbert Lom playing the tic-ridden Inspector Dreyfus,
Clouseau's boss in the Pink Panther movies. (Apropos of
nothing, in one of those movies, Dreyfus accidentally
shoots his own nose off, which is close enough to irony
for the purposes of this article.) This led one analyst
to smirk in the report at "the high silliness factor of
the nouse ... People balk at doing things that make them
look silly, and there is ample room for looking silly
here."
That simply isn't the case -- the software detects
series of blinks only, and is customized to execute a
task on the basis of that detection -- which might or
might not correspond to a mouse click. Don't take the
nouse too literally.
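A blink-based lexicon of the kind Gorodnichy describes might be
sketched as a simple lookup from the number of blinks in a
series to a task. This is a guess at the general shape, not the
CVG's software. The command names, the 0.6-second gap that ends
a series and the decision to ignore a lone blink as
unintentional are all assumptions for illustration.

```python
# Hypothetical lexicon: blinks in a series -> task to execute.
LEXICON = {
    2: "switch_window",
    3: "open_dialog",
}


def group_blinks(timestamps, max_gap=0.6):
    """Group blink times (in seconds) into series and return the
    number of blinks in each. A pause longer than max_gap, an
    assumed threshold, ends the current series."""
    series = []
    count = 0
    last = None
    for t in sorted(timestamps):
        if last is not None and t - last > max_gap:
            series.append(count)
            count = 0
        count += 1
        last = t
    if count:
        series.append(count)
    return series


def commands_for(timestamps, lexicon=LEXICON):
    """Look up each blink series in the lexicon. Series with no
    entry, such as a single blink, are treated as unintentional
    and skipped."""
    counts = group_blinks(timestamps)
    return [lexicon[n] for n in counts if n in lexicon]


# Two quick blinks, a pause, three quick blinks, then a stray one.
print(commands_for([0.0, 0.4, 3.0, 3.3, 3.6, 8.0]))
```

Nothing in the table mentions a mouse click. Each entry is
whatever task the system has been customized to run.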
"Speech recognition is not meant to replace a
keyboard. Visual technology is not meant to replace a
mouse," says Gorodnichy. In fact, it's easy to imagine
how the technology, in conjunction with the usual
keyboard-and-mouse package, might speed text-intensive
applications or add an element of finesse to graphics
packages, once you're used to it. Your nose knows no
boundaries to the applications. Sorry, couldn't
resist.
dwebb@itbusiness.ca
Dave Webb reminds you that the plural of
"extremum" is "extrema."