Copyright © 1991 Andrew Oram
User documentation, long considered an unwelcome responsibility for software project teams, can actually be produced with the same processes of specification, review, and measurement as the other deliverables in a computer system. This paper describes a practical, inexpensive method that some commercial computer vendors have used to create and review their manuals. It employs a simple form of constructive specification to determine the valid operations that users can perform. The method leads to a set of usage models and a series of examples that can be integrated into automatic regression tests. Benefits include better documentation of environmental needs such as prerequisites and restrictions, clear links between user tasks and product features, and regular automatic checks on the document’s accuracy.
A French translation of this paper is available.
The software field has long held an ambivalent attitude toward user documentation. Programmers and Quality Assurance staff definitely appreciate when a good manual helps them learn about their own projects. And in the software engineering literature, one would have difficulty finding a text that fails to list user documentation as a deliverable. But on the other hand, engineers do not feel comfortable with specifications and evaluation in the area of user documentation. Thus, many relegate it to the fringes of their projects, sometimes tacking it on at the last minute. Ironically, the trend in software engineering literature [such as Boehm, 1981; ANSI/IEEE, 1986] is toward the other extreme—to treat user documentation as part of software requirements, and thus to insist unrealistically that it be largely finished before software design even begins.
This paper tries to bring the little-researched area of user documentation within the software engineering fold. I will describe a practical, inexpensive method that some commercial computer vendors are using to review and monitor their manuals. Software project managers, designers, and Quality Assurance staff can use the method to extract the formal elements of documentation and work them into specifications, test plans, and schedules.
The stages in this method resemble the informal techniques that many people use when they have to document software—roughly:
The contribution of this paper is to give these techniques a firm grounding in software engineering. This makes the difference between an unstructured play activity and a discipline that supports goal setting and resource allocation (without losing any of the fun).
In pursuit of reliability, this paper defines exactly what a “feature” is, and offers a complete list of questions that have to be answered in order to document each feature. I also show how to determine that the applications discussed are truly of value, and how to associate the models offered to readers with the actual steps they must follow to use the software. Every stage of the method includes rules for bounding the activity, recording progress, and reviewing results.
Before I launch into the theoretical underpinnings, let me describe some incidents that give the flavor of what it is like to work with this method in a commercial environment.
User documentation is the culmination of a long process of discussion and experimentation throughout a software project. Therefore, while this paper’s main impetus is to foster better publications and on-line documentation for users, some of its recommendations will affect a project’s internal documentation and staff training. Thus, the paper should interest people concerned with improving education and communication among their programming staff, and particularly with ways to disseminate the insights of project designers and senior members to other people on the team.
The next three sections—Goals, Theory, and Roles—show the method’s general fitness for software documentation. The bulk of the paper is devoted to a history of practical applications: a stage-by-stage description in Method, and a discussion of implementation details in Mechanics. I end with a summary of the method’s current status in Benefits.
By way of contrast, here are some traits that cannot be checked formally, but depend on the individual skill and subjective judgement of the document’s producers—and therefore, lie outside our discussion.
The first set of traits is essentially linked to features of the product and its use, while the second covers the psychological aspects of the document and its translation into a medium of distribution.
Desirable results are not enough to define a useful working method. The implementation must also be feasible in a commercial environment. Thus, a method to produce reliable documentation should meet the following procedural requirements.
The formally reviewed traits of the documentation can be checked through regression tests at regular points in the software’s development cycle.
The method adds a relatively small burden to the existing responsibilities, schedule, and computer resources of a commercial project team.
The same essential techniques can benefit small projects (such as one-person MIS projects directed toward a few in-house users) as well as large ones (commercial software for end-users, where the documentation is the critical entry point to the product).
The techniques ring familiar to well-trained engineers and Quality Assurance staff, and can be adapted to whatever standards they are using for other software maintenance efforts.
The method can be used to produce documentation for software that has already been released, and even software whose original designer has left or whose project team has disbanded.
The critical issues determining the quality of software documentation lie in the structure of the software itself, not in stylistic choices made by the writer.
This paper will show that one can produce a complete description of a system’s use by tracing data transformations from one function to the next. The supporting theory for our endeavor is constructive specification. It may seem a surprising choice, since the theory is best known as a somewhat academic, labor-intensive method for constructing formal proofs [Jones, 1980] and as a way of deriving classes in object-oriented programming [Stoy, 1982]. But in this paper, constructive specification proves to be a simple and powerful way to link software’s use with its logical structure.
The basic idea behind constructive specification is to describe every data object in terms of the operations that the program will allow. For instance, you can write a specification for a stack by describing three operations: initializing, pushing, and popping. For the purposes of documentation, we can set both a direction and a boundary to our efforts through the following rule:
The specification of a user document is complete when it includes every operation that is valid on every data object that affects system state, within a sample application that causes a change from one user-recognizable system state to another.
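To make the rule concrete, here is a minimal sketch in C of the stack specification mentioned above. The names and the bounded-array representation are invented for illustration; the point is that the three operations, including their failure cases, bound exactly what the documentation must cover.

```c
#include <stddef.h>

#define STACK_MAX 16

/* The three operations that constitute the stack's specification:
   initializing, pushing, and popping.  A complete user document must
   show each of them acting on the data object, including the cases
   where the operation is invalid (full or empty stack). */
typedef struct {
    int items[STACK_MAX];
    size_t depth;
} Stack;

void stack_init(Stack *s)
{
    s->depth = 0;
}

int stack_push(Stack *s, int value)   /* 0 on success, -1 when full */
{
    if (s->depth == STACK_MAX)
        return -1;
    s->items[s->depth++] = value;
    return 0;
}

int stack_pop(Stack *s, int *value)   /* 0 on success, -1 when empty */
{
    if (s->depth == 0)
        return -1;
    *value = s->items[--s->depth];
    return 0;
}
```

A sample application that exercises all three operations, deliberately including the failure cases, then doubles as a regression test for the document.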
Let us now decipher the key phrases “every data object that affects system state” and “user-recognizable system state.” We can then join the theory to a broader view of mental models and links.
Sample applications can emerge through both top-down and bottom-up approaches. The top-down approach, which is the more familiar one, consists of collecting benchmarks and customer applications that the product is meant to support, and breaking them down into small pieces that can be independently reviewed. But this cannot ensure full coverage of all operations on all data items. Thus, it must be accompanied by the bottom-up approach, which is to trace data transformations using the method in this paper.
Here is a simple example of bottom-up design. The basic operations on a file identifier include assignment (through an open statement), reference (through read, write, and close statements), and ancillary operations (like FORTRAN’s INQUIRE or the C language’s stat). Thus, one can create a simple portal-to-portal example by opening, writing, and closing a file. Verification could consist of comparing the resulting file to a canned version, or of reading the data back into the program and checking it for consistency.
Simple as such an example is, the lessons it embodies are by no means trivial. It can be the template for sophisticated applications like imposing structures on raw binary data, and opening a pipe with non-blocking (asynchronous) access. Using the method in this paper, one can build a complete description of file handling through a series of progressively more complex examples.
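The portal-to-portal file example can be sketched in C as follows; the function name and details are my own, and the verification uses the second strategy mentioned above, reading the data back into the program and checking it for consistency.

```c
#include <stdio.h>
#include <string.h>

/* Exercise the basic operations on a file identifier:
   assignment (fopen), reference (fwrite, fclose), and then a
   read-back pass that verifies the data for consistency. */
int write_and_verify(const char *path, const char *data)
{
    char buf[256];
    size_t len = strlen(data);
    FILE *fp = fopen(path, "w");       /* assignment of the identifier */

    if (fp == NULL)
        return -1;
    fwrite(data, 1, len, fp);          /* reference: write */
    fclose(fp);                        /* reference: close */

    fp = fopen(path, "r");             /* verification: read it back */
    if (fp == NULL)
        return -1;
    if (fread(buf, 1, sizeof buf, fp) != len ||
        memcmp(buf, data, len) != 0) {
        fclose(fp);
        return -1;                     /* data inconsistent */
    }
    fclose(fp);
    return 0;
}
```

Comparing the resulting file to a canned version, the first verification strategy, would work equally well and is often easier to automate with standard tools like diff.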
This paper is the first, to my knowledge, to suggest a disciplined method using examples to assure full product coverage. I have found only one other discussion of user examples in the software engineering literature [Probert, 1984] but it considers them a source for tests rather than a training tool.
For instance, in a relational database, users might define their task initially as retrieving the entries that match certain criteria. But to begin using a typical query system, they have to redefine this task as “building a view.” This task in turn depends on a lower layer of tasks like choosing the keys to search for, creating Boolean search expressions, and sorting the entries. The documentation discussed in this paper helps users develop the necessary thought processes for figuring out how to use the software—that is, for decomposing their tasks until they reach the atoms represented by the product’s features.
Cognitive scientists and educators have focused on the concept of mental models to explain how people assimilate information and apply it in new situations. The more sophisticated research [for instance, Brown, 1986; Norman, 1986; Frese, 1988] bolsters the strategy used in this paper: that of matching the models of product use to the logical structure of the software.
Reliable documentation builds models from the structure of the product itself, which offers both richness and accuracy. The models are simply the uppermost layer of user tasks, such as “searching” in the example of a database. The user who consults the documentation in order to perform a search finds a progressive break-down into lower levels of tasks, ending perhaps in the arguments of a WHERE clause in SQL.
In this paper’s method, models map directly onto the designer’s construction of the software. For example, a real-time programming manual could divide applications into cyclic and interrupt-driven. Cyclic applications could then be broken down further into those running several independent threads, and those running several functions repeatedly in one thread. The manual can then describe the environments in which each model would be most advantageous, and implement each model through procedures and examples.
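As an illustrative sketch only (the task functions here are invented stubs, not part of any real product), the two models might be presented to readers as skeletons like these:

```c
/* Invented task functions, stubbed so the sketch is self-contained. */
static int state = 0;
static void poll_sensors(void)    { /* read inputs */ }
static void update_state(void)    { state++; }
static void refresh_display(void) { /* redraw outputs */ }

/* Cyclic model: several functions run repeatedly in one thread. */
static void cyclic_executive(int cycles)
{
    int i;
    for (i = 0; i < cycles; i++) {
        poll_sensors();
        update_state();
        refresh_display();
    }
}

/* Interrupt-driven model: the same work runs only when an event
   arrives, for example from a hardware interrupt or callback. */
static void on_sensor_event(void)
{
    update_state();
    refresh_display();
}
```

The manual's job is then to explain when each skeleton is advantageous, and to flesh each one out through procedures and full examples.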
No one uses every feature in a product. But one can be sure that particular sets of users need particular combinations of features. Thus, I use the metaphor of terraces to describe the structure of a computer document. Each terrace consists of an example with its accompanying explanation. A document can have many “hills,” each consisting of a set of terraces that increases gradually in complexity.
Thus, in a database product, one hill could offer more and more complicated examples of retrieving keys, thus showing the reader various ways to build a view. Another hill could solve the problems of physically storing large databases. Users can climb the hills that they need for their particular applications, and ignore other hills entirely. If new features or new applications are added during product development, the writer can find places for them near the top of the terrace hierarchy. But the disciplined creation of sample applications ensures that users can associate tasks with product features.
How do software designers convey their insights to the less experienced team members during product implementation, and ultimately, in manuals and training courses, to the end-users of the product?
Like any planning strategy, the method in this paper is least expensive and most beneficial when it is employed from the earliest phases of a project. The managers who initiate the project can, with fairly little effort, preserve some of the user applications driving the project in the concepts and requirements documentation.
Software designers definitely have usage models in mind as they find common sub-tasks, create modules, and define system-wide data structures. The models should be explicitly documented in the software design descriptions. If the designers do not have time to create full examples, they can delegate the work to other team members—in either case, the intellectual process of creating examples helps to define the product and describe it to the team.
The method in this paper is equally valuable in the unhappy—but all too common—situation where a product has been in the field for a long time without adequate documentation, and the project team hires a writer to redress the situation. Now the method provides guidance for reconstructing the lost information on use. Categorizing and tracing the data helps to establish essential information, like what each command option is for, and what distinguishes similar commands. Where features cannot be understood, and further research on user applications is needed, the method helps the writer identify missing information and pose the right questions.
For instance, on a project with tight deadlines, some stages overlap in a pipeline. A partially-completed data analysis can be used to start developing examples, and early sets of examples can be placed in a tentative order so that the writer can start creating the text.
Changes in design or marketing strategy also complicate the method by requiring the team to reiterate completed stages. If a new feature is added, each of the documents produced in each stage must be adjusted to include the feature. One of the method’s strengths is that writers can quickly determine the ripple effect of any change on the entire user viewpoint, and make incremental changes to documentation where necessary.
Category | Programming level | Command level | Menu level
---|---|---|---
Flag | A Boolean variable, one that is permanently restricted to having two possible settings. | An option that is either present or absent, with no accompanying value. | An option that the user chooses without entering any value, choosing from a list, etc.
Counter | An integer (generally unsigned) that is incremented or decremented during the course of the application, and is checked for reaching a threshold (often zero). | |
Identifier | A value (usually integer or string) that is assigned at most once, is never changed thereafter, and is referred to in order to locate the object. This category of data covers file descriptors, channel identifiers, and other objects offered by the operating system. | |
Table | An integer used as an offset or array subscript (assuming that the array is a set of mappings or pointers, rather than a vector with arbitrary contents). | An option accepting a fixed set of values, usually represented as character strings. | A menu of options, from which the user chooses exactly one.
Application data | Any item other than a counter that can take a range of values, with no fixed, predefined set of possible values. Examples include file names and the data entered into a spreadsheet. | |
Some people might want to use these categorizations as primitives from which to derive more specific categories. For instance, a file identifier, a process identifier, and a channel identifier all support different data transformations. Thus, if your product has many data items that fall into such sub-categories, you might find it efficient to create separate lists of questions for each sub-category.
Similarly, some data items cover more than one category. For instance, a communications protocol might define several possible encodings for a single data item, where the settings of certain bits determine which encoding applies. Many UNIX and X Window System applications use C-language unions for similar purposes. Such complexities merely mean that you have to ask the appropriate questions for each possible use of the data item.
Since the focus here is on the data’s purpose, we do not need to be concerned with its type, scope, or range. (These considerations do of course appear eventually in the documentation, to describe restrictions and error conditions.) Nor are derived data types or levels of indirection important; we are concerned only with the kinds of transformations allowed.
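A multi-category item of the kind just described can be sketched with a hypothetical C union (all names here are invented): the tag is itself a table-style item, and its setting determines whether the payload should be questioned as a counter or as application data.

```c
/* Hypothetical protocol item: the tag determines which encoding --
   and therefore which data category -- applies to the payload. */
typedef enum { ENC_COUNTER, ENC_NAME } Encoding;   /* a "table" item */

typedef struct {
    Encoding tag;
    union {
        int         count;   /* counter: decremented, checked for zero */
        const char *name;    /* application data: arbitrary values */
    } u;
} ProtocolItem;

/* Each branch answers the documentation questions for its own
   category; this predicate applies only to the counter encoding. */
int item_is_exhausted(const ProtocolItem *p)
{
    return p->tag == ENC_COUNTER && p->u.count <= 0;
}
```

The documentation must then answer the counter questions for one branch and the application-data questions for the other, as well as the table questions for the tag itself.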
Once a category is found for each data item, we know the information that the documentation must provide for that item. The list of questions for each category appears below.
The concrete result of Stage 1 is a list like Table 2. This table is excerpted from an actual internal document developed during the design of a programming product for window graphics. The left column is simply a list of functions, while the next column shows each function’s argument list. If the product included state data, these could be listed in the left column as well.
Call | Argument/ Data item | Categorization | Expected range | Examples
---|---|---|---|---
Init() | return value | table | True (success), False (failure), BadAccess |
 | display | identifier | retrieved from server |
ChangeScheduler() | display | identifier | retrieved from server |
 | client | identifier | from ThisClient() |
 | when | table | Immediate, Sequential |
 | params: | | |
 |   type | table | RoundRobin, PriorityBased |
 |   slicevalue | counter | > 0 |
 |   slicetype | table | TimeBased, RequestBased |
 |   decaylevel | counter | MinPriority to MaxPriority |
 |   decayfreq | counter | > 0 |
 |   decayunits | table | TimeBased, RequestBased |
 |   priority | counter | MinPriority to MaxPriority |
 |   priomode | table | Absolute, Relative |
. . . | | | |
The argument to one call is a complicated structure containing several distinct data items. Table 2 reflects this by indenting the data items under the name of the structure, params.
Some data items now require more than one row, because multiple settings must be documented. In particular:
The Categorization column reflects the criteria from Table 1. The Expected range column resembles the domain and range information collected in standard Quality Assurance practices. In general, the third and fourth columns embody the strategy for exploring the product and answering the questions described earlier.
The rightmost column is currently empty, but will be filled in during Stage 2 with a list of examples that illustrate each particular data item at each specified value.
As another example of Stage 1 output for more familiar software, Table 3 shows the data categorization for the signal call on UNIX systems (as standardized in ANSI C).
Call | Argument/ Data item | Categorization | Expected range | Examples
---|---|---|---|---
signal | signo | identifier | mnemonic for signal | —
 | sa_handler | table entry | SIG_DFL, SIG_IGN, function | —
 | return value | table entry | SIG_DFL, SIG_IGN, function, SIG_ERR | —
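The categorization in Table 3 can be exercised directly. The following ANSI C sketch (the wrapper function is my own invention) installs a function as the handler, which is one of the table entries alongside SIG_DFL and SIG_IGN, and checks the return value against SIG_ERR:

```c
#include <signal.h>

/* Communicate between the handler and the rest of the program
   through a volatile, atomic flag, as the standard requires. */
static volatile sig_atomic_t caught = 0;

static void handler(int signo)
{
    (void) signo;          /* signo: an identifier naming the signal */
    caught = 1;
}

/* Install a function as the handler, then deliver the signal. */
int exercise_signal(void)
{
    void (*previous)(int) = signal(SIGINT, handler);

    if (previous == SIG_ERR)   /* return value: SIG_ERR on failure */
        return -1;
    raise(SIGINT);             /* deliver the signal to this process */
    return caught == 1 ? 0 : -1;
}
```

A complete set of examples would also cover the SIG_DFL and SIG_IGN settings, and at least one deliberately invalid call that provokes SIG_ERR.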
Stage 2 builds the changes for each data item into small but realistic applications. This stage contains the alchemy that transforms system states into user tasks and programming models.
The achievement of Stage 2 is to link product features to progressively higher layers of tasks. The links collectively form a cross-reference system that the writer can turn into an index for a manual, or a set of links in on-line documentation.
Example-building is a bounded activity, because the previous stages have already defined what data items must be documented, and what questions the documentation must answer for each one. The success of the documentation effort is now quantifiable. If time does not permit the full exploration of every data item, the engineering team can choose to focus on critical items and ignore obscure ones.
In the window project discussed earlier, the search for examples radically altered the document’s focus. No meaningful application could be developed that stayed within the scope of the product. Instead, the team agreed to pull in numerous tasks that lay outside the software they were building, but which were an inseparable part of the application base for the software:
The initial document created for review during Stage 2 looked like the pseudo-code shown in Table 4. The final document was a set of actual programs.
External event | Program action
---|---
 | Init(display)
 | . . .
 | client_id[0] = ThisClient(display)
 | params.type = PriorityBased
 | params_raise.type = PriorityBased
 | params.u.p.slicevalue = long_draw + 10
 | params_raise.u.p.slicevalue = long_draw + 10
 | . . .
 | ChangeScheduler(display, client_id[0], &params, Sequential)
Button press | for (i=0; i<num_clients; i++)
 |   ChangeScheduler(display, client_id[i], &params_raise, Immediate)
 | . . .
A subset of Table 4 is represented by the C code below, which is an excerpt from an actual program testing features from the product. As with software testing, a complete set of user examples should include some that are supposed to fail, by deliberately causing errors or breaking the documented rules.
#include "plot_xlib.h"
#include "root_defs.h"

void Xserver_priority_initialize(top)
DISPLAY_INFO *top;
{
    rtxParms set_rtxp;

    /* initialize with privileges to change global parameters */
    (void) PrivilegeInit(top->display);

    /* client-specific parameters will affect just this client --
       actually, this fragment affects only global parameters */
    top->client = ThisClient(top->display);

    /* this will tell library to check the set_rtxp.u.p parameters */
    set_rtxp.type = PriorityBased;

    /* mask makes scheduler change slice and decay,
       but leave priority alone */
    set_rtxp.mask = Slice | DecayLevel | DecayAmount | DecayFrequency;

    /* set the parameters of the rtxParms structure */
    set_rtxp.u.p.slicevalue = NEW_SLICE;
    set_rtxp.u.p.dlevel = CEILING_FOR_DECAY;
    set_rtxp.u.p.damount = 1;
    set_rtxp.u.p.decayfreq = DECAY_TIME;
    set_rtxp.u.p.slicetype = TimeBased;
    set_rtxp.u.p.decayunits = TimeBased;

    /* Everything before was preparation -- this call makes the change */
    ChangeScheduler(top->display, top->client, &set_rtxp, Sequential);
}
Some additional optional criteria can help to improve the quality of the final documentation or the maintenance effort for examples. These criteria require intuitive judgement and a sense of the user environment.
The hierarchical aspects of organization come from the models and tasks discussed under Stage 2. These help to group together the examples that users need at a particular time.
At the end of this stage, the engineering team has formally defined both the topics and the structure of the product’s documentation. The richness and authority of the information that this resource offers to technical writers cannot be matched by any other method. Writers can now prepare background information and narrative text that explains the models, tasks, and techniques. The cross-referencing system can be used to build the index.
As a brief example of a document structure designed through data analysis, here is the outline for an on-line, fully task-oriented description of ANSI C and POSIX signals [ANSI, 1989; IEEE, 1988].
The document moves from simple issues to more complex ones, freely breaking up the discussion of a single call where task-orientation calls for it. For instance, the section on “Sending” focuses on communication with a single process (the most common case), but also offers a brief discussion of the more complicated issue of process groups. Although blocking is a critical issue, it comes late in the document because it requires a good understanding of the earlier issues. Many issues not directly related to the calls also appear in the document, such as the need to pass information between the handler and the main program using volatile, atomic data.
After this stage—when the materials are in the hands of the writers—the focus moves to reviewing the text in relation to the examples. Document review becomes much easier and more rewarding, because it can focus on small areas of the document and ask questions whose answers are fairly easy to determine: for instance, whether the written procedures accurately summarize the examples, and whether the text warns users about potential sources of error.
A simple example is furnished by a book on the UNIX system’s make utility. The data analysis included all command options, in particular an -n option that causes the utility to print the commands it would run, without executing them.
An interesting test for the -n option is a set of nested or recursive make commands. First, create a file named makefile with specifications for make. The following is a simplified but still realistic example.
all :
	${MAKE} enter testex
enter : parse.o find_token.o global.o
	${CC} -o $@ parse.o find_token.o global.o
testex : parse.o find_token.o global.o interact.o
	${CC} -o $@ parse.o find_token.o global.o interact.o
Interactively, one can test the -n option by entering the command:
make -n all
which should produce output like:
make enter testex
cc -O -c parse.c
cc -O -c find_token.c
cc -O -c global.c
cc -o enter parse.o find_token.o global.o
cc -O -c interact.c
cc -o testex parse.o find_token.o global.o interact.o
While some output lines vary from system to system, others are reasonably predictable. Thus, one could begin automating the test by redirecting the output to a file, and then checking to see whether one of the lines is correct:
make -n all > /tmp/make_n_option$$
grep 'cc -o enter parse.o find_token.o global.o' /tmp/make_n_option$$
Finally, we can put the whole sequence into a regression test by running it as a shell script. The final result appears below. For readers who are unfamiliar with shell scripts, I will simply say that the following one is driven by an invisible exit status returned by each command.
rm -f *.o enter testex
if make -n all > /tmp/make_n_option$$
then
    if grep 'cc -o enter parse.o find_token.o global.o' /tmp/make_n_option$$
    then
        exitstat=0
    else
        echo 'make -n did not correctly echo commands from recursive make'
        exitstat=1
    fi
else
    echo 'make -n exited with error: check accuracy of this test'
    exitstat=2
fi
rm -f /tmp/make_n_option$$
exit $exitstat
An interesting sidelight from this example is that it reveals incompatibilities among UNIX systems. While the test uses entirely standard, documented features, some variants of make have not implemented them.
Below you can see another style of test automation, through a short example from a section of a FORTRAN manual on parallel processing. The manual includes marginal comments explaining the procedure, which fills an array in parallel through a loop.
      REAL A, B, ARRAY(100), TMPA, TMPB
C
      PRINT*, 'Input two reals:'
      READ (*,*) A, B
CPAR$ PARALLEL PDO NEW(TMPA, TMPB)
CPAR$ INITIAL SECTION
      TMPA = A
      TMPB = B
CPAR$ END INITIAL SECTION
      DO 20 I = 1, 100
          ARRAY(I) = ARRAY(I)/(SIN(TMPA)*TMPB + COS(TMPB)*TMPA)
   20 CONTINUE
The code below shows the example augmented by Quality Assurance staff to be self-testing. The example does not include the long header comments contained in the actual test, which describe the purpose of the example and include a simple shell script for running and verifying it. The ARRAYVFY array has been added to store comparison data, and the EXITSTAT variable to indicate whether errors have been found. A verification section at the end of the program simply performs the same operation sequentially that the example performed in parallel, and checks the results. Thus, this programming example is completely self-contained. However, the more familiar technique of comparing output against a pre-existing file of correct answers is equally good, and was used by Quality Assurance for some other examples in the same test suite.
      REAL A, B, ARRAY(100), TMPA, TMPB
      REAL ARRAYVFY(100)
      INTEGER EXITSTAT /0/
      DO 10 I = 1, 100
          ARRAY(I) = I
          ARRAYVFY(I) = I
   10 CONTINUE
      PRINT*, 'Input two reals:'
      READ (*,*) A, B
CPAR$ PARALLEL PDO NEW(TMPA, TMPB)
CPAR$ INITIAL SECTION
      TMPA = A
      TMPB = B
CPAR$ END INITIAL SECTION
      DO 20 I = 1, 100
          ARRAY(I) = ARRAY(I)/(SIN(TMPA)*TMPB + COS(TMPB)*TMPA)
   20 CONTINUE
C
C ------ VERIFY -----------------------------------------------
C
      DO 100 I = 1, 100
          ARRAYVFY(I) = ARRAYVFY(I)/(SIN(A)*B + COS(B)*A)
  100 CONTINUE
      DO 200 I = 1, 100
          IF (ARRAY(I) .NE. ARRAYVFY(I)) THEN
              PRINT *, ' Error in array on element I', I,
     &            ARRAY(I), ' <> ', ARRAYVFY(I)
              EXITSTAT = 1
          ENDIF
  200 CONTINUE
      CALL EXIT (EXITSTAT)
      END
To integrate the test into our regression suites, a staff member simply added the source code, and used existing test procedures to compile it, run the program, and check the exit status. I have deliberately shown the primitiveness of our procedures—relying simply on shell scripts and other standard UNIX system tools—to show how low the overhead of test development can be. While the first few tests for each project took a while to create (about one person-hour per documentation example), we soon became familiar with the procedure, and got to the point where we could turn an example into a self-verifying test in about 10 minutes.
Naturally, test development would be easy with more advanced tools. Some of the areas for further research include:
The project on signals was an on-line document, while the rest were hard-copy manuals. Most of the projects involved complex programming tools, which might skew the method. But small experiments producing end-user documentation, as well as the considerations discussed in the Theory section of this paper, suggest that the method can be successful for any audience and any computer product.
Where reliable documentation has replaced an earlier manual for the same product, comparisons are revealing. The new documents have been generally agreed to display the following benefits:
The general method for producing reliable documentation has now reached a fairly stable state, and is well-enough defined to be transferable. As use of the method spreads, I hope to create a community that can develop increasingly sophisticated tools to implement the stages of development, and more research data by which the method can be evaluated. Meanwhile, our practical successes to date, as well as the clear theoretical advance that this method represents over other documentation methods, should make it attractive to software development teams.
This work is licensed under a Creative Commons Attribution 4.0 International License.