Copyright © 1991 Andrew Oram
User documentation, long considered an unwelcome responsibility for software project teams, can actually be produced with the same processes of specification, review, and measurement as the other deliverables in a computer system. This paper describes a practical, inexpensive method that some commercial computer vendors have used to create and review their manuals. It employs a simple form of constructive specification to determine the valid operations that users can perform. The method leads to a set of usage models and a series of examples that can be integrated into automatic regression tests. Benefits include better documentation of environmental needs such as prerequisites and restrictions, clear links between user tasks and product features, and regular automatic checks on the document’s accuracy.
A French translation of this paper is available.
The software field has long held an ambivalent attitude toward user documentation. Programmers and Quality Assurance staff definitely appreciate when a good manual helps them learn about their own projects. And in the software engineering literature, one would have difficulty finding a text that fails to list user documentation as a deliverable. But on the other hand, engineers do not feel comfortable with specifications and evaluation in the area of user documentation. Thus, many relegate it to the fringes of their projects, sometimes tacking it on at the last minute. Ironically, the trend in software engineering literature [such as Boehm, 1981; ANSI/IEEE, 1986] is toward the other extreme—to treat user documentation as part of software requirements, and thus to insist unrealistically that it be largely finished before software design even begins.
This paper tries to bring the little-researched area of user documentation within the software engineering fold. I will describe a practical, inexpensive method that some commercial computer vendors are using to review and monitor their manuals. Software project managers, designers, and Quality Assurance staff can use the method to extract the formal elements of documentation and work them into specifications, test plans, and schedules.
The stages in this method resemble the informal techniques that many people use when they have to document software—roughly:
The contribution of this paper is to give these techniques a firm grounding in software engineering. This makes the difference between an unstructured play activity and a discipline that supports goal setting and resource allocation (without losing any of the fun).
In pursuit of reliability, this paper defines exactly what a “feature” is, and offers a complete list of questions that have to be answered in order to document each feature. I also show how to determine that the applications discussed are truly of value, and how to associate the models offered to readers with the actual steps they must follow to use the software. Every stage of the method includes rules for bounding the activity, recording progress, and reviewing results.
Before I launch into the theoretical underpinnings, let me describe some incidents that give the flavor of what it is like to work with this method in a commercial environment.
User documentation is the culmination of a long process of discussion and experimentation throughout a software project. Therefore, while this paper’s main impetus is to foster better publications and on-line documentation for users, some of its recommendations will affect a project’s internal documentation and staff training. Thus, the paper should interest people concerned with improving education and communication among their programming staff, and particularly with ways to disseminate the insights of project designers and senior members to other people on the team.
The next three sections—Goals, Theory, and Roles—show the method’s general fitness for software documentation. The bulk of the paper is devoted to a history of practical applications: a stage-by-stage description in Method, and a discussion of implementation details in Mechanics. I end with a summary of the method’s current status in Benefits.
By way of contrast, here are some traits that cannot be checked formally, but depend on the individual skill and subjective judgement of the document’s producers—and therefore, lie outside our discussion.
The first set of traits is essentially linked to features of the product and its use, while the second covers the psychological aspects of the document and its translation into a medium of distribution.
Desirable results are not enough to define a useful working method. The implementation must also be feasible in a commercial environment. Thus, a method to produce reliable documentation should meet the following procedural requirements.
The formally reviewed traits of the documentation can be checked through regression tests at regular points in the software’s development cycle.
The method adds a relatively small burden to the existing responsibilities, schedule, and computer resources of a commercial project team.
The same essential techniques can benefit small projects (such as one-person MIS projects directed toward a few in-house users) as well as large ones (commercial software for end-users, where the documentation is the critical entry point to the product).
The techniques ring familiar to well-trained engineers and Quality Assurance staff, and can be adapted to whatever standards they are using for other software maintenance efforts.
The method can be used to produce documentation for software that has already been released, and even software whose original designer has left or whose project team has disbanded.
The critical issues determining the quality of software documentation lie in the structure of the software itself, not in stylistic choices made by the writer.
This paper will show that one can produce a complete description of a system’s use by tracing data transformations from one function to the next. The supporting theory for our endeavor is constructive specification. It may seem a surprising choice, since the theory is best known as a somewhat academic, labor-intensive method for constructing formal proofs [Jones, 1980] and as a way of deriving classes in object-oriented programming [Stoy, 1982]. But in this paper, constructive specification proves to be a simple and powerful way to link software’s use with its logical structure.
The basic idea behind constructive specification is to describe every data object in terms of the operations that the program will allow. For instance, you can write a specification for a stack by describing three operations: initializing, pushing, and popping. For the purposes of documentation, we can set both a direction and a boundary to our efforts through the following rule:
The specification of a user document is complete when it includes every operation that is valid on every data object that affects system state, within a sample application that causes a change from one user-recognizable system state to another.
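To make the rule concrete, here is a minimal sketch in C of the stack specification mentioned above. The names and the bounded-array representation are invented for illustration; the point is that the three operations, including their failure cases, bound exactly what the documentation must cover.

```c
#include <stddef.h>

#define STACK_MAX 16

/* The three operations that constitute the stack's specification:
   initializing, pushing, and popping.  A complete user document must
   show each of them acting on the data object, including the cases
   where the operation is invalid (full or empty stack). */
typedef struct {
    int items[STACK_MAX];
    size_t depth;
} Stack;

void stack_init(Stack *s)
{
    s->depth = 0;
}

int stack_push(Stack *s, int value)   /* 0 on success, -1 when full */
{
    if (s->depth == STACK_MAX)
        return -1;
    s->items[s->depth++] = value;
    return 0;
}

int stack_pop(Stack *s, int *value)   /* 0 on success, -1 when empty */
{
    if (s->depth == 0)
        return -1;
    *value = s->items[--s->depth];
    return 0;
}
```

A sample application that exercises all three operations, deliberately including the failure cases, then doubles as a regression test for the document.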
Let us now decipher the key phrases “every data object that affects system state” and “user-recognizable system state.” We can then join the theory to a broader view of mental models and links.
Sample applications can emerge through both top-down and bottom-up approaches. The top-down approach, which is the more familiar one, consists of collecting benchmarks and customer applications that the product is meant to support, and breaking them down into small pieces that can be independently reviewed. But this cannot ensure full coverage of all operations on all data items. Thus, it must be accompanied by the bottom-up approach, which is to trace data transformations using the method in this paper.
Here is a simple example of bottom-up design. The basic operations on a file identifier include assignment (through an open statement), reference (through read, write, and close statements), and ancillary operations (like FORTRAN’s INQUIRE or the C language’s stat). Thus, one can create a simple portal-to-portal example by opening, writing, and closing a file. Verification could consist of comparing the resulting file to a canned version, or of reading the data back into the program and checking it for consistency.
Simple as such an example is, the lessons it embodies are by no means trivial. It can be the template for sophisticated applications like imposing structures on raw binary data, and opening a pipe with non-blocking (asynchronous) access. Using the method in this paper, one can build a complete description of file handling through a series of progressively more complex examples.
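The portal-to-portal file example can be sketched in C as follows; the function name and details are my own, and the verification uses the second strategy mentioned above, reading the data back into the program and checking it for consistency.

```c
#include <stdio.h>
#include <string.h>

/* Exercise the basic operations on a file identifier:
   assignment (fopen), reference (fwrite, fclose), and then a
   read-back pass that verifies the data for consistency. */
int write_and_verify(const char *path, const char *data)
{
    char buf[256];
    size_t len = strlen(data);
    FILE *fp = fopen(path, "w");       /* assignment of the identifier */

    if (fp == NULL)
        return -1;
    fwrite(data, 1, len, fp);          /* reference: write */
    fclose(fp);                        /* reference: close */

    fp = fopen(path, "r");             /* verification: read it back */
    if (fp == NULL)
        return -1;
    if (fread(buf, 1, sizeof buf, fp) != len ||
        memcmp(buf, data, len) != 0) {
        fclose(fp);
        return -1;                     /* data inconsistent */
    }
    fclose(fp);
    return 0;
}
```

Comparing the resulting file to a canned version, the first verification strategy, would work equally well and is often easier to automate with standard tools like diff.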
This paper is the first, to my knowledge, to suggest a disciplined method using examples to assure full product coverage. I have found only one other discussion of user examples in the software engineering literature [Probert, 1984] but it considers them a source for tests rather than a training tool.
For instance, in a relational database, users might define their task initially as retrieving the entries that match certain criteria. But to begin using a typical query system, they have to redefine this task as “building a view.” This task in turn depends on a lower layer of tasks like choosing the keys to search for, creating Boolean search expressions, and sorting the entries. The documentation discussed in this paper helps users develop the necessary thought processes for figuring out how to use the software—that is, for decomposing their tasks until they reach the atoms represented by the product’s features.
Cognitive scientists and educators have focused on the concept of mental models to explain how people assimilate information and apply it in new situations. The more sophisticated research [for instance, Brown, 1986; Norman, 1986; Frese, 1988] bolsters the strategy used in this paper: that of matching the models of product use to the logical structure of the software.
Reliable documentation builds models from the structure of the product itself, which offers both richness and accuracy. The models are simply the uppermost layer of user tasks, such as “searching” in the example of a database. The user who consults the documentation in order to perform a search finds a progressive break-down into lower levels of tasks, ending perhaps in the arguments of a WHERE clause in SQL.
In this paper’s method, models map directly onto the designer’s construction of the software. For example, a real-time programming manual could divide applications into cyclic and interrupt-driven. Cyclic applications could then be broken down further into those running several independent threads, and those running several functions repeatedly in one thread. The manual can then describe the environments in which each model would be most advantageous, and implement each model through procedures and examples.
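As an illustrative sketch only (the task functions here are invented stubs, not part of any real product), the two models might be presented to readers as skeletons like these:

```c
/* Invented task functions, stubbed so the sketch is self-contained. */
static int state = 0;
static void poll_sensors(void)    { /* read inputs */ }
static void update_state(void)    { state++; }
static void refresh_display(void) { /* redraw outputs */ }

/* Cyclic model: several functions run repeatedly in one thread. */
static void cyclic_executive(int cycles)
{
    int i;
    for (i = 0; i < cycles; i++) {
        poll_sensors();
        update_state();
        refresh_display();
    }
}

/* Interrupt-driven model: the same work runs only when an event
   arrives, for example from a hardware interrupt or callback. */
static void on_sensor_event(void)
{
    update_state();
    refresh_display();
}
```

The manual's job is then to explain when each skeleton is advantageous, and to flesh each one out through procedures and full examples.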
No one uses every feature in a product. But one can be sure that particular sets of users need particular combinations of features. Thus, I use the metaphor of terraces to describe the structure of a computer document. Each terrace consists of an example with its accompanying explanation. A document can have many “hills,” each consisting of a set of terraces that increases gradually in complexity.
Thus, in a database product, one hill could offer more and more complicated examples of retrieving keys, thus showing the reader various ways to build a view. Another hill could solve the problems of physically storing large databases. Users can climb the hills that they need for their particular applications, and ignore other hills entirely. If new features or new applications are added during product development, the writer can find places for them near the top of the terrace hierarchy. But the disciplined creation of sample applications ensures that users can associate tasks with product features.
How do software designers convey their insights to the less experienced team members during product implementation, and ultimately, in manuals and training courses, to the end-users of the product?
Like any planning strategy, the method in this paper is least expensive and most beneficial when it is employed from the earliest phases of a project. The managers who initiate the project can, with fairly little effort, preserve some of the user applications driving the project in the concepts and requirements documentation.
Software designers definitely have usage models in mind as they find common sub-tasks, create modules, and define system-wide data structures. The models should be explicitly documented in the software design descriptions. If the designers do not have time to create full examples, they can delegate the work to other team members—in either case, the intellectual process of creating examples helps to define the product and describe it to the team.
The method in this paper is equally valuable in the unhappy—but all too common—situation where a product has been in the field for a long time without adequate documentation, and the project team hires a writer to redress the situation. Now the method provides guidance for reconstructing the lost information on use. Categorizing and tracing the data helps to establish essential information, like what each command option is for, and what distinguishes similar commands. Where features cannot be understood, and further research on user applications is needed, the method helps the writer identify missing information and pose the right questions.
For instance, on a project with tight deadlines, some stages overlap in a pipeline. A partially-completed data analysis can be used to start developing examples, and early sets of examples can be placed in a tentative order so that the writer can start creating the text.
Changes in design or marketing strategy also complicate the method by requiring the team to reiterate completed stages. If a new feature is added, each of the documents produced in each stage must be adjusted to include the feature. One of the method’s strengths is that writers can quickly determine the ripple effect of any change on the entire user viewpoint, and make incremental changes to documentation where necessary.
Category | Programming level | Command level | Menu level
---|---|---|---
Flag | A Boolean variable, one that is permanently restricted to having two possible settings. | An option that is either present or absent, with no accompanying value. | An option that the user chooses without entering any value, choosing from a list, etc.
Counter | An integer (generally unsigned) that is incremented or decremented during the course of the application, and is checked for reaching a threshold (often zero). | |
Identifier | A value (usually integer or string) that is assigned at most once, is never changed thereafter, and is referred to in order to locate the object. This category of data covers file descriptors, channel identifiers, and other objects offered by the operating system. | |
Table | An integer used as an offset or array subscript (assuming that the array is a set of mappings or pointers, rather than a vector with arbitrary contents). | An option accepting a fixed set of values, usually represented as character strings. | A menu of options, from which the user chooses exactly one.
Application data | Any item other than a counter that can take a range of values, with no fixed, predefined set of possible values. Examples include file names and the data entered into a spreadsheet. | |
Some people might want to use these categorizations as primitives from which to derive more specific categories. For instance, a file identifier, a process identifier, and a channel identifier all support different data transformations. Thus, if your product has many data items that fall into such sub-categories, you might find it efficient to create separate lists of questions for each sub-category.
Similarly, some data items cover more than one category. For instance, a communications protocol might define several possible encodings for a single data item, where the settings of certain bits determine which encoding applies. Many UNIX and X Window System applications use C-language unions for similar purposes. Such complexities merely mean that you have to ask the appropriate questions for each possible use of the data item.
Since the focus here is on the data’s purpose, we do not need to be concerned with its type, scope, or range. (These considerations do of course appear eventually in the documentation, to describe restrictions and error conditions.) Nor are derived data types or levels of indirection important; we are concerned only with the kinds of transformations allowed.
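A multi-category item of the kind just described can be sketched with a hypothetical C union (all names here are invented): the tag is itself a table-style item, and its setting determines whether the payload should be questioned as a counter or as application data.

```c
/* Hypothetical protocol item: the tag determines which encoding --
   and therefore which data category -- applies to the payload. */
typedef enum { ENC_COUNTER, ENC_NAME } Encoding;   /* a "table" item */

typedef struct {
    Encoding tag;
    union {
        int         count;   /* counter: decremented, checked for zero */
        const char *name;    /* application data: arbitrary values */
    } u;
} ProtocolItem;

/* Each branch answers the documentation questions for its own
   category; this predicate applies only to the counter encoding. */
int item_is_exhausted(const ProtocolItem *p)
{
    return p->tag == ENC_COUNTER && p->u.count <= 0;
}
```

The documentation must then answer the counter questions for one branch and the application-data questions for the other, as well as the table questions for the tag itself.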
Once a category is found for each data item, we know the information that the documentation must provide for that item. The list of questions for each category appears below.
The concrete result of Stage 1 is a list like Table 2. This table is excerpted from an actual internal document developed during the design of a programming product for window graphics. The left column is simply a list of functions, while the next column shows each function’s argument list. If the product included state data, these could be listed in the left column as well.
Call | Argument/ Data item | Categorization | Expected range | Examples
---|---|---|---|---
Init() | return value | table | True (success), False (failure), BadAccess |
 | display | identifier | retrieved from server |
ChangeScheduler() | display | identifier | retrieved from server |
 | client | identifier | from ThisClient() |
 | when | table | Immediate, Sequential |
 | params: | | |
 |   type | table | RoundRobin, PriorityBased |
 |   slicevalue | counter | > 0 |
 |   slicetype | table | TimeBased, RequestBased |
 |   decaylevel | counter | MinPriority to MaxPriority |
 |   decayfreq | counter | > 0 |
 |   decayunits | table | TimeBased, RequestBased |
 |   priority | counter | MinPriority to MaxPriority |
 |   priomode | table | Absolute, Relative |
. . . | | | |
The argument to one call is a complicated structure containing several distinct data items. Table 2 reflects this by indenting the data items under the name of the structure, params.
Some data items now require more than one row, because multiple settings must be documented. In particular:
The Categorization column reflects the criteria from Table 1. The Expected range column resembles the domain and range information collected in standard Quality Assurance practices. In general, the third and fourth columns embody the strategy for exploring the product and answering the questions described earlier.
The rightmost column is currently empty, but will be filled in during Stage 2 with a list of examples that illustrate each particular data item at each specified value.
As another example of Stage 1 output for more familiar software, Table 3 shows the data categorization for the signal call on UNIX systems (as standardized in ANSI C).
Call | Argument/ Data item | Categorization | Expected range | Examples
---|---|---|---|---
signal | signo | identifier | mnemonic for signal | —
 | sa_handler | table entry | SIG_DFL, SIG_IGN, function | —
 | return value | table entry | SIG_DFL, SIG_IGN, function, SIG_ERR | —
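The categorization in Table 3 can be exercised directly. The following ANSI C sketch (the wrapper function is my own invention) installs a function as the handler, which is one of the table entries alongside SIG_DFL and SIG_IGN, and checks the return value against SIG_ERR:

```c
#include <signal.h>

/* Communicate between the handler and the rest of the program
   through a volatile, atomic flag, as the standard requires. */
static volatile sig_atomic_t caught = 0;

static void handler(int signo)
{
    (void) signo;          /* signo: an identifier naming the signal */
    caught = 1;
}

/* Install a function as the handler, then deliver the signal. */
int exercise_signal(void)
{
    void (*previous)(int) = signal(SIGINT, handler);

    if (previous == SIG_ERR)   /* return value: SIG_ERR on failure */
        return -1;
    raise(SIGINT);             /* deliver the signal to this process */
    return caught == 1 ? 0 : -1;
}
```

A complete set of examples would also cover the SIG_DFL and SIG_IGN settings, and at least one deliberately invalid call that provokes SIG_ERR.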
Stage 2 builds the changes for each data item into small but realistic applications. This stage contains the alchemy that transforms system states into user tasks and programming models.
The achievement of Stage 2 is to link product features to progressively higher layers of tasks. The links collectively form a cross-reference system that the writer can turn into an index for a manual, or a set of links in on-line documentation.
Example-building is a bounded activity, because the previous stages have already defined what data items must be documented, and what questions the documentation must answer for each one. The success of the documentation effort is now quantifiable. If time does not permit the full exploration of every data item, the engineering team can choose to focus on critical items and ignore obscure ones.
In the window project discussed earlier, the search for examples radically altered the document’s focus. No meaningful application could be developed that stayed within the scope of the product. Instead, the team agreed to pull in numerous tasks that lay outside the software they were building, but which were an inseparable part of the application base for the software:
The initial document created for review during Stage 2 looked like the pseudo-code shown in Table 4. The final document was a set of actual programs.
External event | Program action
---|---
 | Init(display)
 | . . .
 | client_id[0] = ThisClient(display)
 | params.type = PriorityBased
 | params_raise.type = PriorityBased
 | params.u.p.slicevalue = long_draw + 10
 | params_raise.u.p.slicevalue = long_draw + 10
 | . . .
 | ChangeScheduler(display, client_id[0], &params, Sequential)
Button press | for (i=0; i<num_clients; i++)
 |   ChangeScheduler(display, client_id[i], &params_raise, Immediate)
 | . . .
A subset of Table 4 is represented by the C code below, which is an excerpt from an actual program testing features from the product. As with software testing, a complete set of user examples should include some that are supposed to fail, by deliberately causing errors or breaking the documented rules.
#include "plot_xlib.h"
#include "root_defs.h"

void Xserver_priority_initialize(top)
DISPLAY_INFO *top;
{
    rtxParms set_rtxp;

    /* initialize with privileges to change global parameters */
    (void) PrivilegeInit(top->display);

    /* client-specific parameters will affect just this client --
       actually, this fragment affects only global parameters */
    top->client = ThisClient(top->display);

    /* this will tell library to check the set_rtxp.u.p parameters */
    set_rtxp.type = PriorityBased;

    /* mask makes scheduler change slice and decay,
       but leave priority alone */
    set_rtxp.mask = Slice | DecayLevel | DecayAmount | DecayFrequency;

    /* set the parameters of the rtxParms structure */
    set_rtxp.u.p.slicevalue = NEW_SLICE;
    set_rtxp.u.p.dlevel = CEILING_FOR_DECAY;
    set_rtxp.u.p.damount = 1;
    set_rtxp.u.p.decayfreq = DECAY_TIME;
    set_rtxp.u.p.slicetype = TimeBased;
    set_rtxp.u.p.decayunits = TimeBased;

    /* Everything before was preparation -- this call makes the change */
    ChangeScheduler(top->display, top->client, &set_rtxp, Sequential);
}
Some additional optional criteria can help to improve the quality of the final documentation or the maintenance effort for examples. These criteria require intuitive judgement and a sense of the user environment.
The hierarchical aspects of organization come from the models and tasks discussed under Stage 2. These help to group together the examples that users need at a particular time.
At the end of this stage, the engineering team has formally defined both the topics and the structure of the product’s documentation. The richness and authority of the information that this resource offers to technical writers cannot be matched by any other method. Writers can now prepare background information and narrative text that explains the models, tasks, and techniques. The cross-referencing system can be used to build the index.
As a brief example of a document structure designed through data analysis, here is the outline for an on-line, fully task-oriented description of ANSI C and POSIX signals [ANSI, 1989; IEEE, 1988].
The document moves from simple issues to more complex ones, freely breaking up the discussion of a single call where task-orientation calls for it. For instance, the section on “Sending” focuses on communication with a single process (the most common case), but also offers a brief discussion of the more complicated issue of process groups. Although blocking is a critical issue, it comes late in the document because it requires a good understanding of the earlier issues. Many issues not directly related to the calls also appear in the document, such as the need to pass information between the handler and the main program using volatile, atomic data.
After this stage—when the materials are in the hands of the writers—the focus moves to reviewing the text in relation to the examples. Document review becomes much easier and more rewarding, because it can focus on small areas of the document and ask questions whose answers are fairly easy to determine: for instance, whether the written procedures accurately summarize the examples, and whether the text warns users about potential sources of error.
A simple example is furnished by a book on the UNIX system’s make utility. The data analysis included all command options, in particular an -n option that causes the utility to print the commands it would run, without executing them.
An interesting test for the -n option is a set of nested or recursive make commands. First, create a file named makefile with specifications for make. The following is a simplified but still realistic example.
all :
	${MAKE} enter testex
enter : parse.o find_token.o global.o
	${CC} -o $@ parse.o find_token.o global.o
testex : parse.o find_token.o global.o interact.o
	${CC} -o $@ parse.o find_token.o global.o interact.o
Interactively, one can test the -n option by entering the command:
make -n all
which should produce output like:
make enter testex
cc -O -c parse.c
cc -O -c find_token.c
cc -O -c global.c
cc -o enter parse.o find_token.o global.o
cc -O -c interact.c
cc -o testex parse.o find_token.o global.o interact.o
While some output lines vary from system to system, others are reasonably predictable. Thus, one could begin automating the test by redirecting the output to a file, and then checking to see whether one of the lines is correct:
make -n all > /tmp/make_n_option$$
grep 'cc -o enter parse.o find_token.o global.o' /tmp/make_n_option$$
Finally, we can put the whole sequence into a regression test by running it as a shell script. The final result appears below. For readers who are unfamiliar with shell scripts, I will simply say that the following one is driven by an invisible exit status returned by each command.
rm -f *.o enter testex
if make -n all > /tmp/make_n_option$$
then
    if grep 'cc -o enter parse.o find_token.o global.o' /tmp/make_n_option$$
    then
        exitstat=0
    else
        echo 'make -n did not correctly echo commands from recursive make'
        exitstat=1
    fi
else
    echo 'make -n exited with error: check accuracy of this test'
    exitstat=2
fi
rm -f /tmp/make_n_option$$
exit $exitstat
An interesting sidelight from this example is that it reveals incompatibilities among UNIX systems. While the test uses entirely standard, documented features, some variants of make have not implemented them.
Below you can see another style of test automation, through a short example from a section of a FORTRAN manual on parallel processing. The manual includes marginal comments explaining the procedure, which fills an array in parallel through a loop.
      REAL A, B, ARRAY(100), TMPA, TMPB
C
      PRINT*, 'Input two reals:'
      READ (*,*) A, B
CPAR$ PARALLEL PDO NEW(TMPA, TMPB)
CPAR$ INITIAL SECTION
      TMPA = A
      TMPB = B
CPAR$ END INITIAL SECTION
      DO 20 I = 1, 100
          ARRAY(I) = ARRAY(I)/(SIN(TMPA)*TMPB + COS(TMPB)*TMPA)
   20 CONTINUE
The code below shows the example augmented by Quality Assurance staff to be self-testing. The example does not include the long header comments contained in the actual test, which describe the purpose of the example and include a simple shell script for running and verifying it. The ARRAYVFY array has been added to store comparison data, and the EXITSTAT variable to indicate whether errors have been found. A verification section at the end of the program simply performs the same operation sequentially that the example performed in parallel, and checks the results. Thus, this programming example is completely self-contained. However, the more familiar technique of comparing output against a pre-existing file of correct answers is equally good, and was used by Quality Assurance for some other examples in the same test suite.
      REAL A, B, ARRAY(100), TMPA, TMPB
      REAL ARRAYVFY(100)
      INTEGER EXITSTAT /0/
      DO 10 I = 1, 100
          ARRAY(I) = I
          ARRAYVFY(I) = I
   10 CONTINUE
      PRINT*, 'Input two reals:'
      READ (*,*) A, B
CPAR$ PARALLEL PDO NEW(TMPA, TMPB)
CPAR$ INITIAL SECTION
      TMPA = A
      TMPB = B
CPAR$ END INITIAL SECTION
      DO 20 I = 1, 100
          ARRAY(I) = ARRAY(I)/(SIN(TMPA)*TMPB + COS(TMPB)*TMPA)
   20 CONTINUE
C
C ------ VERIFY -----------------------------------------------
C
      DO 100 I = 1, 100
          ARRAYVFY(I) = ARRAYVFY(I)/(SIN(A)*B + COS(B)*A)
  100 CONTINUE
      DO 200 I = 1, 100
          IF (ARRAY(I) .NE. ARRAYVFY(I)) THEN
              PRINT *, ' Error in array on element I', I,
     &            ARRAY(I), ' <> ', ARRAYVFY(I)
              EXITSTAT = 1
          ENDIF
  200 CONTINUE
      CALL EXIT (EXITSTAT)
      END
To integrate the test into our regression suites, a staff member simply added the source code, and used existing test procedures to compile it, run the program, and check the exit status. I have deliberately shown the primitiveness of our procedures—relying simply on shell scripts and other standard UNIX system tools—to show how low the overhead of test development can be. While the first few tests for each project took a while to create (about one person-hour per documentation example), we soon became familiar with the procedure, and got to the point where we could turn an example into a self-verifying test in about 10 minutes.
Naturally, test development would be easy with more advanced tools. Some of the areas for further research include:
The project on signals was an on-line document, while the rest were hard-copy manuals. Most of the projects involved complex programming tools, which might skew the method. But small experiments producing end-user documentation, as well as the considerations discussed in the Theory section of this paper, suggest that the method can be successful for any audience and any computer product.
Where reliable documentation has replaced an earlier manual for the same product, comparisons are revealing. The new documents have been generally agreed to display the following benefits:
The general method for producing reliable documentation has now reached a fairly stable state, and is well-enough defined to be transferable. As use of the method spreads, I hope to create a community that can develop increasingly sophisticated tools to implement the stages of development, and more research data by which the method can be evaluated. Meanwhile, our practical successes to date, as well as the clear theoretical advance that this method represents over other documentation methods, should make it attractive to software development teams.
This work is licensed under a Creative Commons Attribution 4.0 International License.