Man Pages (Part 1)
Introduction
When Unix was created in the 1970s, it was well-described in technical papers and reports – not surprising, considering the industry influence Bell Labs held as a research institution. Most notably, the earliest presentation of Unix external to the Bell System was the paper The UNIX Time-Sharing System, which was delivered at the Fourth ACM Symposium on Operating Systems Principles (1973). Over 50 years later, what is presented in that paper is still recognizable to anybody familiar with Unix’s modern descendant, Linux.
What might be somewhat more surprising, considering its research origins, is that Unix had, almost from the very beginning, a comprehensive set of online reference documentation for all its commands, system calls, file formats, etc. These are the manual pages, or man-pages. On Unix systems used interactively, the man-pages have historically been installed, space permitting.
Each new edition of the manual effectively became a snapshot of the state of Unix. These early versions of Unix are usually referred to as “editions” because that is what the (printed) manuals were called:
UNIX PROGRAMMER’S MANUAL
Sixth Edition
K. Thompson
D. M. Ritchie
May, 1975
Especially in the early days of Unix, it was not uncommon to use either the “edition” name or a version: Sixth Edition may have been called v6 as well by its users and developers.
Both the manual pages themselves and the way they are used have changed over the decades. This set of posts is intended to give people unfamiliar with them an overview, as well as offer a review to seasoned users.
Text Processing Origins
After Bell Labs’ departure from MIT’s Multics project in 1969, the scientists at the Computing Science Research Center were lacking some direction. There was great interest in the group in acquiring their own computer for research, but the large systems they desired, such as a PDP-10, were too expensive. After Ken Thompson wrote some experimental code on another department’s unused PDP-7, which showed some promising ideas for an operating system, fellow team member Joe Ossanna proposed that the team develop a text processing system if it could purchase the then-new PDP-11/20 minicomputer (1970).
The ingredients for a text processing system were an editor and a text formatting system. Both of those came to the nascent Unix system via MIT’s earlier CTSS time-sharing project, which predated Multics. The QED editor, originally written at Berkeley for the SDS 940 system, was implemented on CTSS by Ken Thompson himself. A text formatting system called runoff also ran on CTSS. Both were ported to other systems at Bell Labs first and eventually as ed and roff to the much smaller PDP-11 under Unix. Unix system development moved quickly and by mid-1971, three typists regularly used the PDP-11/20 to write and format patent applications for AT&T.
Doug McIlroy, the department head who later contributed the seminal concept of “pipelines” to Unix, insisted on high quality documentation of the system. This resulted in a virtuous cycle: as new functionality was documented, changes to the implementation were made to make it easier to talk about it and in the process improve both the system and the documentation. Unix tools were used to document Unix, improving the documentation tools themselves as well. This work predated the widespread use of display terminals – the typical terminal was a Teletype Model 33 or 37, so to “display” a manual page was to actually get a print-out of one.
By Unix Second Edition (1972), roff had been rewritten as nroff (for new roff) to make it more flexible. Macros were supported in nroff and a macro package for manual pages eventually became available as the man package by Sixth Edition (1975). In Fourth Edition (1973) troff was created to support the Graphic Systems C/A/T phototypesetter, which provided high quality typesetting with different fonts. The program also replaced the previous version of nroff and continued supporting output to fixed-width devices like terminals, teletypes, or other printers.
Despite its importance to the Unix system, troff has never been standardized as part of the POSIX standard. A recent (2024) request to do so was rejected by the POSIX maintenance group, because no known group “would be willing to spend the time.”
The troff program and related utilities were the core typesetting tools in proprietary Unix distributions. A version of troff called GNU troff (commonly groff) was developed on SunOS in 1989 for the GNU Project and remains the primary way to format manual pages on Linux. Just like troff, groff is a general purpose typesetting tool that can create output in a number of formats like HTML, PDF, or the device-independent DVI format.
In response to some of the HTML output limitations and slow performance of groff, an effort to redesign man-page processing began in 2008 for BSD-based Unix variants. While originally just an HTML output formatter, the mandoc language and a specialized replacement for groff by the same name were implemented. This tool is available on numerous BSD, Linux, and Unix systems and has been the
default man-page formatter on macOS since Sonoma 14.1 (2023).
Manual Page Format
The basic structure of the manual pages was outlined in First Edition (1971):
- Name
- Synopsis
- Description
- Files
- See Also
- Diagnostics
- Bugs
- Owner
Dennis Ritchie wrote the earliest page, almost certainly for the cat command. The specific format wasn’t intentionally designed, and earlier time-sharing systems included similar information in their reference documentation. The inclusion of Bugs and Owner is particularly interesting and speaks to a degree of accountability for the system in the early days, when it was used by very few people.
As the user base for Unix increased, the Owner section appeared less frequently. It was renamed Author by Fourth Edition and only used in section six (User-maintained programs) until completely disappearing by Seventh Edition (1979). The Berkeley distributions of Unix (BSD) kept using an Authors section as do the GNU tools commonly used in Linux distributions. This is indicative not only of the communal spirit of development in those Unix variants, but also the proliferation of email since the 1980s, which makes it easier to contact the authors despite vastly distributed development efforts.
The POSIX standard sets forth various section titles for its standards, which differ between, e.g., utilities and header files. The overall structure is still borrowed from the manual pages, no doubt because POSIX originally used the Unix System V Interface Definition and the BSD Reference Manuals as its starting point.
The specific sections in each Unix version or distribution can be quite different, but almost always at least include Name, Synopsis, Description and See Also, if applicable. A section on diagnostics, errors, return or exit codes is also usually included.
Many additional sections can be found in manual pages: security, standards conformance, examples, environment, attributes, history, and many more. While the initial intent for man-pages was to be terse and informal, more extensive manuals have been created. Manual pages for image processing applications or compilers would run to many hundreds of pages if printed out.
Creating manual pages in the nroff language is supported by the macro package mentioned above. For example, each section is introduced with the .SH macro to format it properly, and .B is used to bold a command or argument.
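To make the macros concrete, here is a minimal sketch that assembles the source of a tiny manual page. The helper `render_man_page` and the sample `hello` command are hypothetical, invented for illustration. The `.TH` title macro is not covered in the text above; it is included on the assumption that a page formatted with the man package opens with it.

```python
def render_man_page(name, section, summary, synopsis_args, description):
    """Return roff source for a minimal manual page (illustrative only)."""
    lines = [
        # .TH is an addition not described in the post: the title heading
        # that conventionally opens a page written with the man macros.
        f".TH {name.upper()} {section}",
        # .SH introduces each section, as described above.
        ".SH NAME",
        f"{name} \\- {summary}",
        ".SH SYNOPSIS",
        # .B sets the command name in bold.
        f".B {name}",
        synopsis_args,
        ".SH DESCRIPTION",
        description,
    ]
    return "\n".join(lines) + "\n"


# "hello" is a made-up command used only for this example.
page = render_man_page(
    "hello", 1, "print a greeting", "[name]",
    "Prints a greeting to standard output.",
)
print(page, end="")
```

On a system with a formatter installed, saving the output to a file and viewing it with something like `man ./hello.1` should render it, though the exact invocation varies between systems.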
Manual Sections
The manual has traditionally been divided into numbered sections with alphabetical ordering. Each section is a collection of a type of information (e.g., commands vs. system calls) and is intentionally not grouped based on functionality. While the manual sections were numbered with Roman numerals prior to Seventh Edition, the more common Arabic numerals will be used here.
The numbering of the sections allows for information with the same name to appear in different parts of the manual: the man command and the man macro package are in sections 1 and 7, respectively. This is why it is common to refer to specific information by its section number like man(1) and man(7) (or some typographic variation thereof).
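The name(section) convention is regular enough to take apart mechanically. The following is a rough sketch; `parse_reference` and its pattern are illustrative assumptions, not part of any standard tool. It accepts a digit optionally followed by letters, since some systems append a letter suffix to the section number.

```python
import re

# A page name, then a section in parentheses: one digit plus optional letters.
# This pattern is an assumption for illustration, not a formal grammar.
REFERENCE = re.compile(r"^(?P<name>[^\s()]+)\((?P<section>\d[A-Za-z]*)\)$")


def parse_reference(text):
    """Split a reference such as 'man(1)' into (name, section)."""
    match = REFERENCE.match(text.strip())
    if match is None:
        raise ValueError(f"not a manual reference: {text!r}")
    return match.group("name"), match.group("section")


print(parse_reference("man(1)"))
print(parse_reference("man(7)"))
```

The section is kept as a string rather than an integer so that a suffixed section survives the round trip unchanged.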
In the print versions of the manuals for the early versions of Unix, an introduction outlines the structure of the manual sections and introduces Unix itself. This is informally “section 0” of the man-pages. Since Second Edition (1972), a permuted index follows, which is created by the ptx command and makes finding the right manual page much easier than a simple alphabetical index. Additionally, most sections include their own intro page, modeled after the “section 0” intro, a practice which seems to have started with section 2 (System Calls) in Fourth Edition.
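The idea behind a permuted index is that every significant word of a page's one-line description becomes an index entry, so a page can be found under any of its keywords. The sketch below illustrates only that idea. It does not reproduce the output format or options of ptx, and the function name, stop-word list, and sample descriptions are all made up for the example.

```python
# Words too common to be useful as index keys (an illustrative list).
STOP_WORDS = {"a", "an", "and", "of", "the", "to"}


def permuted_index(entries):
    """Return (keyword, before, after, page) rows sorted by keyword."""
    rows = []
    for page, description in entries:
        words = description.split()
        for i, word in enumerate(words):
            if word.lower() in STOP_WORDS:
                continue
            before = " ".join(words[:i])   # context left of the keyword
            after = " ".join(words[i:])    # keyword and what follows it
            rows.append((word.lower(), before, after, page))
    return sorted(rows)


# Sample descriptions invented for this example.
entries = [
    ("cat(1)", "concatenate and print"),
    ("ed(1)", "text editor"),
    ("roff(1)", "format text"),
]

for _, before, after, page in permuted_index(entries):
    # Right-align the left context so the keywords line up in one column.
    print(f"{before:>20}  {after:<24}{page}")
```

Because the keywords line up in a single column, a reader looking for anything to do with "text" finds both ed(1) and roff(1) next to each other, which an index sorted by page name alone would not offer.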
The typical section ordering used by AT&T’s Research Unix up to and including the commercial AT&T System III (1980) was also used by BSD and commercial Unixes based on either of them. Linux adopted the same structure and is still using it to date.
- (1) Commands
- (2) System Calls
- (3) Subroutines
- (4) Special files
- (5) File Formats
- (6) User-maintained programs (Games since 7th Edition and in BSD)
- (7) Miscellaneous
- (8) Maintenance (since 3rd Edition)
Section 7 was a catch-all section and had many names over the years:
- Miscellaneous (1st through 5th Editions)
- User maintained subroutines (6th Edition)
- Macro packages and language conventions (since 3BSD)
- Conventions (7th Edition)
- Miscellaneous Facilities (System III)
- Data bases and language conventions (8th and 9th Edition)
- Information sources (10th Edition)
Section 8 started out in Third Edition (1973) as a collection of system administration tasks and commands. Interestingly, the kill and ps commands were initially documented in section 8 before moving to 1 as discussed in the ps options post.
In Seventh Edition, some of the system maintenance commands previously in section 8 moved into a newly named sub-section 1M:
The maintenance section 8 discusses procedures not intended for use by the ordinary user. These procedures often involve use of commands of section 1, where an attempt has been made to single out peculiarly maintenance-flavored commands by marking them 1M.
By System III, section 8 became System Maintenance Procedures. In later commercial Unix variants like Solaris (see below), there was no more section 8 but instead a System Administration Guide similar to the BSD SMM, the System Manager’s Manual.
The later editions of Research Unix (8th, 9th, and 10th) were primarily used internally at Bell Labs and continued to evolve the manual section structure as seen with section 7 above. The Bell Labs Unix follow-on Plan 9 (1992) continued the concept of manual pages and reworked the structure of the manual sections to be in line with how Plan 9 worked: section 4 describes the file services and section 5 the Plan 9 file protocol to access them, for example.
Xenix Man-Page Oddity
Microsoft’s (and later SCO’s) Xenix was originally a 7th Edition Unix, which was ported to a number of architectures. By Xenix 3.0 (1983), it was based on System III and curiously abandoned the numbered manual sections in favor of letters:
- (C) Commands
- (CT) Text Processing Commands
- (CP) Programming Commands
- (S) System Services
- (F) File Formats
- (M) Miscellaneous
Sub-Sections
Starting with Seventh Edition, sub-sections grouped functionality in each manual section, counter to the original intent of how the sections were supposed to be structured.
By System V, the following sub-sections were used in section 1:
- (1C) Communications Commands
- (1G) Graphics Commands
- (1M) System Maintenance Commands
System V Release 4 (1989) was a major release that became the basis for several commercial Unix offerings. Because it merged AT&T Unix, BSD, SunOS, and Xenix, often several versions of the same command were installed that appeared in different manual sub-sections. In Solaris for example, in addition to section 1:
- (1B) SunOS/BSD Compatibility Package Commands
- (1S) SunOS Specific Commands
and in Solaris 11.2 (2014):
- (1G) Alternative open source implementations of Solaris commands (GNU)
The man-pages for libraries in System V had the following sub-sections for section 3:
- (3C) C Programming Language Library Routines
- (3S) Standard I/O Library Routines
- (3E) Executable and Linking Format Library Routines
- (3G) General Purpose Library Routines
- (3M) Math Library Routines
- (3X) Specialized Library Routines
Commercial Unixes can have dozens of sub-sections for different commands or libraries. Programming languages can install their own documentation in separate sub-sections; Perl installs library documentation in section (3pm). In 4.2BSD (1983), (3F) was the Fortran library and (3N) the network library documentation.
System V Sections Reordering
While the first three sections of the reference manuals have remained more or less the same throughout the history of Unix, the remaining sections have changed. In particular, the release of AT&T System V (1983) moved the traditional sections around, and commercial Unix systems like Sun’s Solaris (1992) and HP’s HP-UX followed the System V ordering.
Section Name | System III → System V |
---|---|
Special Files | 4 → 7 |
File Formats | 5 → 4 |
Miscellaneous Facilities | 7 → 5 |
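The renumbering in the table can be expressed as a small lookup, which is handy when following a cross-reference written for one family of systems on the other. This is a sketch only. The function names are hypothetical, and it assumes that sections not listed in the table keep their numbers.

```python
# System III section number -> System V section number, from the table.
SYSTEM_III_TO_V = {4: 7, 5: 4, 7: 5}

# The reverse direction, derived from the same table.
SYSTEM_V_TO_III = {new: old for old, new in SYSTEM_III_TO_V.items()}


def to_system_v(section):
    """Map a System III section number to System V numbering.

    Assumption: sections absent from the table are unchanged.
    """
    return SYSTEM_III_TO_V.get(section, section)


def to_system_iii(section):
    """Map a System V section number back to System III numbering."""
    return SYSTEM_V_TO_III.get(section, section)


for number in range(1, 9):
    print(f"System III section {number} -> System V section {to_system_v(number)}")
```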
Sun’s earlier BSD-based SunOS (1982) and HP’s System III-based HP-UX (prior to 1985) used the Research Unix/BSD ordering. DEC’s initial Unix offering Ultrix (1984) was 4.2BSD-based and used its man-page section ordering. DEC’s later Unix OSF/1 (1992) (later renamed Digital Unix and then Tru64) used the System V section ordering. IBM’s AIX, despite being System V-based, utilized the traditional order of manual page sections, but also incorporated proprietary extensions to the references like railroad syntax diagrams.
Higher-Number Sections
Finally, section numbers higher than 8 are not common across different Unix variants, but may be used for system-specific documentation. Some examples are:
- (9)
  - Teletype 5620-Related Software (8th through 10th Edition)
  - Kernel Routines (Linux)
  - Device Driver Interface and the Driver-Kernel Interface (DDI/DKI) (Solaris)
  - Raster Image Software (Plan 9 1st and 2nd Edition)
- (10)
  - Circuit Design Tools (10th Edition, Plan 9 1st Edition)
Coming Up…
This post provided a general overview of the origin and purpose of the Unix manual pages. It focused on the history of the organization of the manual pages and the typesetting tools behind them. The second installment will explain the man command and additional and alternative documentation sources in Unix, Linux, and related systems.