Advogato: Recognition versus Recollection

Posted 5 Jun 2000 at 05:15 UTC by nelsonrn

Jim Gettys gave a keynote speech at Linux Expo, wherein he spoke of the need to support disabled users. He's quite right. We need to support them -- not out of any bleeding-heart concern, but because the interfaces that enable them to use computers *at all* enable the rest of us to make better use of computers too. A good interface is a good interface whether you can't see or can't hear or can't type or can do all of these. In particular, people who can't see have trouble with a GUI, and people who can't type have trouble with a command line.

You may have noticed some tension between users of the command line and users of the GUI. There is a very simple explanation: the command line is based on recollection, and the GUI is based on recognition. This paper explains why each has its place and explores ways in which they can be combined. We start by explaining how a "pure" version of each works, then look at how they are combined in real life, and finally use this perspective to find new avenues for exploration.

Contrasting the two

With a recollection interface, you must recall the correct next step (for example, to get a directory listing you must remember "ls"). With recognition, you have to recognize that what you are looking at is a listing of files. With recollection, you must supply the "-l" parameter to ls to get detailed information on files. With recognition, you have to recognize the "View" menu, and the "Details" entry underneath it.

Recognition is more approachable because it does not require much training. Recollection is more efficient because it wastes no time on the recognition search. As a consequence, class distinctions have been brought into the mix. Typically, new users prefer recognition because they can get more done. Experienced users prefer recollection because they can get more done.

These class distinctions are not firm, however, because new users are only new for a short period of time. Usage *is* training, and the more they use a recognition interface, the more they wish they had a recollection interface. Experienced users are not evenly trained, and in any case have to learn new software from time to time. They often wish they had a recognition interface, knowing all the while that they'll soon wish they weren't forced to use it.

Recollection Interfaces

Exactly how does a recollection interface work? Actually, initially by recognition. The first step to recollection is to understand the current context. Most often in a recollection interface, only a small set of the possibilities is valid. In the Palm Pilot stroke interface, only some of the possible gestures are valid. Three of the gestures are prefix gestures that change the context.

Or consider the Unix command line. It typically starts with a verb followed by modifiers followed by nouns. The verb is distinguished from the nouns because it is the first word entered. In order for a user to accurately enter a verb, they must first recognize that the command line is empty.

So, in order to reduce the number of choices one must recollect, a recollection interface retains a context. A typical recollection interface will have a zero point -- a home position, along with a way to clear all context. The control-U character is often used in Unix to clear all characters currently being typed.

A recollection interface will always have a way to remove part of the input, or to clear back to a certain context. Input mistakes are an inevitable part of the human-machine interface. The command-line interface uses Backspace, the Palm Pilot uses a left-moving stroke, and a pie menu uses a "click in the middle of the pie".
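
The two kinds of "undo" described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the key codes and the `apply_keys` function are made up for the example, and a real terminal driver may use different characters.

```python
# Hypothetical control codes; real terminals may send different bytes.
BACKSPACE = "\x7f"   # erase the previous character
KILL_LINE = "\x15"   # Control-U: return to the empty "home" context


def apply_keys(keys):
    """Replay a sequence of keystrokes and return the resulting input line."""
    line = []
    for key in keys:
        if key == KILL_LINE:
            line.clear()          # clear all context in one step
        elif key == BACKSPACE:
            if line:
                line.pop()        # undo only the most recent character
        else:
            line.append(key)
    return "".join(line)


print(apply_keys("lz" + BACKSPACE + "s -l"))      # ls -l
print(apply_keys("rm -rf /" + KILL_LINE + "ls"))  # ls
```

Backspace steps back one piece of context at a time, while Control-U jumps straight to the zero point.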

Recognition Interfaces

Recognition interfaces are easier to learn because the possibilities are tabulated and categorized. The canonical recognition interface uses a hierarchical menu. Another form is a button bar. The advantage of a recognition interface is clear: the user need only recognize which command they desire.

One cost of a recognition interface is the need to scan through a large number of commands to locate the desired one. Commands are usually grouped together by function to help reduce this cost.

Another cost of a recognition interface is the large amount of screen real estate occupied by the interface. This problem may be addressed in several ways. The items may be represented by small pictures of the function of the command. Since these pictures do not always invoke the right concept, they are sometimes accompanied by words which pop up when the mouse cursor hovers over the picture.

Entire groups of commands may be removed from the screen, only to reappear when the mouse cursor is clicked or hovers over a special area of the screen. A new set of commands is added, which exposes the groups of commands. This is how a menu bar works. Sometimes the entire menu bar itself disappears.

But this creates another cost -- locating the hidden command. If a method is used to collapse some of the commands, then a command must be located in its collapsed location. This combines the difficulty of a recollection interface with that of a recognition interface.
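
A rough way to see this cost is to count how many labels a user has to read before reaching a command. The sketch below is only a toy model: the menu layout and the three cost functions are assumptions invented for the example, not measurements of any real GUI.

```python
# Hypothetical menu layout, loosely modelled on a typical application menu bar.
MENU = {
    "File": ["New", "Open", "Save", "Print", "Quit"],
    "Edit": ["Undo", "Cut", "Copy", "Paste"],
    "View": ["Icons", "List", "Details"],
}


def flat_scan_cost(menu, command):
    """Items read when every command is visible in one long list."""
    flat = [item for items in menu.values() for item in items]
    return flat.index(command) + 1


def guided_scan_cost(menu, group, command):
    """Items read when the user already recalls which group hides the command."""
    groups = list(menu)
    return groups.index(group) + 1 + menu[group].index(command) + 1


def blind_scan_cost(menu, command):
    """Items read when the user must open each group in turn to find it."""
    cost = 0
    for group, items in menu.items():
        cost += 1                      # read the group label
        if command in items:
            return cost + items.index(command) + 1
        cost += len(items)             # scanned the whole group in vain
    raise KeyError(command)


print(flat_scan_cost(MENU, "Details"))            # 12
print(guided_scan_cost(MENU, "View", "Details"))  # 6
print(blind_scan_cost(MENU, "Details"))           # 15
```

In this model, collapsing helps only a user who remembers where the command lives. A user who does not remember ends up reading more labels than with the flat list.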

Combining both interfaces

Each technique can and should borrow from the other. The solution to the shortcomings of each is to combine both approaches. It's necessary, though, to preserve the full virtues of both, and not produce a compromise.

GUIs often have a primitive, bastard-stepchild recollection interface. One may often press the two-key sequence "Alt-F O" to Open a File. The Alt modifier introduces the beginning of a command. Essentially it starts a very short command-line interface. However, it's only ever used to introduce the verb of the command. Little attention is paid to the nouns. For example, an advanced recollection system would allow one to type Alt-F O /etc/passwd <Enter>.

A command line interface could do something similar. The FO command would open a file, prompting the user for the name of the file. The F? command would list all the File commands, and simply ? would list all the types of commands.
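
As a minimal sketch of the FO / F? / ? idea, assuming a made-up command table (the letters and command names below are hypothetical, and the filename prompt is left out):

```python
# Hypothetical command table: group letter -> (group name, {command letter: name}).
COMMANDS = {
    "F": ("File", {"O": "Open", "S": "Save", "Q": "Quit"}),
    "E": ("Edit", {"U": "Undo", "C": "Copy", "P": "Paste"}),
}


def interpret(text):
    """Return a description of what the typed text means."""
    text = text.strip().upper()
    if text == "?":
        # Recognition assist: list the types of commands.
        return [f"{key} {name}" for key, (name, _) in COMMANDS.items()]
    if len(text) != 2 or text[0] not in COMMANDS:
        return "unknown; type ? for help"
    group_name, members = COMMANDS[text[0]]
    if text[1] == "?":
        # Recognition assist: list the commands in this group.
        return [f"{text[0]}{key} {name}" for key, name in members.items()]
    if text[1] in members:
        return f"{group_name} > {members[text[1]]}"
    return f"unknown; type {text[0]}? for help"


print(interpret("FO"))  # File > Open
print(interpret("F?"))  # ['FO Open', 'FS Save', 'FQ Quit']
print(interpret("?"))   # ['F File', 'E Edit']
```

The same table serves both styles: someone who remembers "FO" types it directly, and someone who does not can ask at whatever level they got stuck.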

The TOPS-10/Kermit/Cisco interface

An example of a very well-done recollection interface with recognition assist is the command interface first used in TOPS-10, later used in some implementations of the Kermit data transfer protocol, and currently available on Cisco's IOS. This interface has two magic characters which may be typed at any time: HT (tab) and question mark. The tab character completes the item currently being entered, according to the current context. If a command is being entered, the list of commands is consulted. If a filename is being entered, the list of files is consulted. If a question mark is entered, the list of possibilities is printed.
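
The two magic characters can be modelled roughly as follows. This is a sketch, not the actual TOPS-10, Kermit, or IOS parser: the word lists and the `possibilities` and `complete` functions are assumptions made up for the example.

```python
import os.path

# Hypothetical word lists; a real system would ask the parser and the file system.
CONTEXTS = {
    "command": ["send", "set", "show", "receive"],
    "filename": ["notes.txt", "notes.bak", "readme"],
}


def possibilities(context, prefix):
    """What '?' would print: every valid item that starts with the prefix."""
    return sorted(word for word in CONTEXTS[context] if word.startswith(prefix))


def complete(context, prefix):
    """What Tab would do: extend the prefix as far as it is unambiguous."""
    matches = possibilities(context, prefix)
    if not matches:
        return prefix                      # nothing valid; leave input alone
    return os.path.commonprefix(matches)   # longest prefix shared by all matches


print(complete("command", "r"))        # receive
print(complete("filename", "n"))       # notes.
print(possibilities("command", "se"))  # ['send', 'set']
```

The important design point is that both functions take the context as an argument, so the same two keys work whether a command or a filename is expected.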

Summary

Rarely do GUIs implement a recollection interface well. Rarely do command-line interfaces implement a recognition interface well. We can do better. We should do better.

Autocad, posted 5 Jun 2000 at 06:21 UTC by mblevin

There is also the method that AutoCAD used to move from a command-line interface to a menu interface. The menus were simply tacked on as an afterthought, and the command line was still available at the bottom of the screen. Selecting a menu item would print the corresponding command into the text-input area at the bottom, allowing the user to learn the recollection interface while exploring via recognition. This allowed newbies to learn it while still keeping the old-timers happy.
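
A toy version of that echo behaviour might look like this. The menu paths and the `CommandArea` class are hypothetical and do not reflect AutoCAD's actual internals:

```python
# Hypothetical menu entries mapped to the command text they stand for.
MENU_TO_COMMAND = {
    "Draw > Line": "LINE",
    "Draw > Circle": "CIRCLE",
    "Modify > Erase": "ERASE",
}


class CommandArea:
    """Text-input area that receives both typed and menu-selected commands."""

    def __init__(self):
        self.history = []

    def type_command(self, text):
        self.history.append(text)
        return text

    def select_menu(self, entry):
        # Echo the equivalent command so the user sees what to type next time.
        return self.type_command(MENU_TO_COMMAND[entry])


area = CommandArea()
print(area.select_menu("Draw > Circle"))  # CIRCLE
print(area.type_command("ERASE"))         # ERASE
print(area.history)                       # ['CIRCLE', 'ERASE']
```

Because the menu goes through the same path as typed input, every menu pick doubles as a lesson in the command language.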

flogging the dead horse, posted 5 Jun 2000 at 14:52 UTC by graydon

C-x C-f /etc/passwd

file -> open -> use file dialogue

it's been done. people use it. a bunch of people don't use it, and they're welcome to not use it, but there's a very powerful mixed-mode interface, with online help and complete access to the internals at runtime, and it's called emacs. it's also one of the most "talkative" systems for visually impaired users, using emacspeak. does this matter to non-emacs users? not a whit, since they are deathly allergic to it.

/me scratches head and returns to doing something useful

enlightenment and the new "desktop shell", posted 5 Jun 2000 at 17:33 UTC by mazeone

The enlightenment window manager is actually moving towards the idea of a "desktop shell", which Mandrake describes as "like bash for your desktop". The next version of Enlightenment will have an integrated file manager with some interesting features that cross over the Recognition vs Recollection gap. Each window in the file manager provides a typebuffer that allows the user to type pretty much any valid shell command in the window and it will execute from the directory that the window is viewing. If necessary, it will pop up a term that shows output of the command (for things like make or cvs -z3 update -PAD). You can open other windows from the keyboard by typing in the pathname, or you can navigate the file system using the mouse by clicking on icons. While all this stuff is in heavy development, it bridges the gap between the graphical/mouse-based and keyboard-based usage methods, the gap between newbies and experienced users, and the gap between "recognition and recollection" as well as any other piece of software I've used.

Already have these things., posted 5 Jun 2000 at 21:40 UTC by apply

Bash already completes on tab, and pushing tab twice will print all the options. This works for both commands and file names. I believe this is just part of the standard GNU readline library.

Emacs goes far beyond even this, with static and dynamic completion, apropos to search for functions, the ability to bind functions to keys, menus, and button presses, and built-in documentation for most functions and variables.

GUI interfaces are just annoying after a point. A quick <tab> <tab> on a blank command line shows that I have 2430 executables in my path. Menus may work well for a couple dozen of these, but throwing 2430 little icons on my desktop is unworkable. Even putting them in menus by grouping is very cumbersome to look through after a time. Emacs is even worse, a blank apropos shows 7364 items. Trying to fit a gui on that would be like trying to write this article by looking up every individual word in a GUI dictionary. Even the cute little task bars are annoying to me, as I'm currently running around 20 applications and I don't want the bottom third of my screen filled with little pictures of them all. If anything, I want more space on my desktop for important stuff, not less.

Well, that's my rant for the day :)

Bash is a shadow of what I want, posted 6 Jun 2000 at 04:24 UTC by nelsonrn

I'm not sure that I explained myself well enough. Bash's recognition function is very poor. All you can complete with Bash are filenames. You can't ask Bash what can follow a -x switch in ls. It has no clue. You MUST recollect. This is not good.

And GUIs typically have a poor recollection interface because they do not support all functions on keystrokes. Most of them use linear menus instead of pie menus.

Al Koscielny pointed out in private email that a user interface should have a continuous gradient between pure recollection and pure recognition. This lets the user decide how much help they need.

Replace the hand-eye interface, posted 6 Jun 2000 at 08:15 UTC by scottyo

Random musings:
The hand-eye interface is the real problem. A human's preferred form of immediate communication does not involve the hands and need not involve the eyes.

The mouse is really a very strange device. Not really a communication device at all. (A keyboard is much closer.) A mouse is just a "push-button activator".

A GUI is dependent on "recollection" as is a command-line interface. It's just that the recollection is made a little easier with visual hints and organization.

A natural-language interface would be ideal (not considering the cacophony that might be produced in typical work environs). Voice-ear-eye interfaces to our machines should be the norm. Current interfaces will be retained primarily for impaired users. Note: Keyboard-driven command-line interfaces should also become natural-written-language cognisant however.

What do we use our computers for? I use mine to communicate and learn. Communication and learning don't ordinarily depend on recollection in the same way that interfacing with a computer currently does. (The exception being for those who are students of computers - but *most* people aren't and don't want to be.) Computers are still "too hard" to use.

re: Replacing the hand-eye interface, posted 6 Jun 2000 at 10:03 UTC by fatjim

I'd have to disagree with you about natural language interfaces. The problem is that they are too liberal. Computers are very picky things and nothing is going to change that - their pickiness is their major strength. With natural language, it would take far too long to explain a command in enough detail for the computer to fulfil the task. A good domain-specific, small, formal language is much better for this task.

Other reasons for not wanting a voice interface are the lack of privacy and a major lack of speed -- typing is much faster than talking, especially for word processing, which is what most people use their computers for, along with web browsing. (Could you imagine trying to use a browser with your voice?)

Certainly I agree that there are much better interfaces waiting to be discovered. But natural language is a step backwards in this area.

(I apologize if any of this doesn't make sense, etc. I'm tired right now)

Emacspeak, posted 6 Jun 2000 at 14:34 UTC by jpick

Check it out.

It's got tonnes of functionality that I'd like to see in mainstream use. At the first LWCE last year, I spent over an hour talking with the author, Raman, who is 100% blind. The nice thing about free software is that users that need accessibility features can build totally customized versions themselves. It was neat seeing how it could verbally read through C code, taking syntax into account (thanks to emacs) - I was surprised at how easy it was to follow.

The new versions can be used without a hardware speech synthesizer, so I want to try it out sometime when I get more time.

The current crop of Gnome/KDE style applications aren't terribly accessible. But they should be. There would be huge benefits to even normal users if they could be completely controlled from a command line.

Re: Bash is a shadow of what I want,, posted 13 Jun 2000 at 06:49 UTC by joey

zsh is significantly better here, especially if you use a recent version and enable all the included completion code. A few examples of what it lets me do:

I'm root and I want to change the owner of a file. I can't remember exactly what that username is though. Does he spell it Smith, Smythe, or what? "chown smit<tab>" and it completes from the users on the system.
When killing a process, tab completion operates on pids. If you hit tab twice, you get a ps listing.
Looking for a man page for perl, but there are so many. "man perl<tab>" lists them all.
I've just booted up single user and I need to remount the root filesystem read-write. I can never remember the mount options for that. I type "mount -<tab>", and get a list of its options with 1 line summaries. Type "o<tab><tab>", and here's a list of mount options. "re<tab>", and the command line looks like "mount -o remount,", and it is giving me a short list of ro and rw, with descriptions.
I'm using cvs, and I want to add a new file I've just made to the repository. "cvs add <tab>"
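
What these examples have in common is that the completer picks its candidate list from the command already typed. A rough sketch of that idea follows; it is not how zsh is implemented, and the user names, pids, and `complete_argument` function are all made up for the example.

```python
# Hypothetical data sources; a real shell would query the system instead.
USERS = ["smith", "smythe", "root"]
PIDS = ["101", "1042", "2207"]

# Which list to consult for the argument of each command.
ARGUMENT_SOURCES = {
    "chown": USERS,
    "kill": PIDS,
}


def complete_argument(command_line):
    """List candidates for the word being typed, based on the command before it."""
    words = command_line.split(" ")
    command, partial = words[0], words[-1]
    if len(words) < 2 or command not in ARGUMENT_SOURCES:
        return []
    return [c for c in ARGUMENT_SOURCES[command] if c.startswith(partial)]


print(complete_argument("chown sm"))  # ['smith', 'smythe']
print(complete_argument("kill 10"))   # ['101', '1042']
print(complete_argument("kill "))     # ['101', '1042', '2207']
```

Adding support for another command is then a matter of adding one more entry to the table, which is roughly why this kind of completion can grow to cover so many tools.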

I could go on. I discovered several of the examples above only while writing this; it's a pretty complete system.

The description of E's file manager sounds very interesting, does it only work with E?

Autocad, posted 5 Jun 2000 at 06:21 UTC by mblevin (Journeyer)

flogging the dead horse, posted 5 Jun 2000 at 14:52 UTC by graydon (Master)

enlightenment and the new "desktop shell", posted 5 Jun 2000 at 17:33 UTC by mazeone (Journeyer)

Already have these things., posted 5 Jun 2000 at 21:40 UTC by apply (Apprentice)

Bash is a shadow of what I want, posted 6 Jun 2000 at 04:24 UTC by nelsonrn (Master)

Replace the hand-eye interface, posted 6 Jun 2000 at 08:15 UTC by scottyo (Apprentice)

re: Replacing the hand-eye interface, posted 6 Jun 2000 at 10:03 UTC by fatjim (Journeyer)

Emacspeak, posted 6 Jun 2000 at 14:34 UTC by jpick (Master)

Re: Bash is a shadow of what I want,, posted 13 Jun 2000 at 06:49 UTC by joey (Master)