Formatting information: A beginner's introduction to typesetting with LATEX
Chapter 3: Basic document structures
Peter Flynn, Silmaril Consultants
This edition of Formatting Information was prompted by the generous help I have received from TEX users too numerous to mention individually. Shortly after TUGboat published the November 2003 edition, I was reminded by a spate of email of the fragility of documentation for a system like LATEX which is constantly under development. There have been revisions to packages; issues of new distributions, new tools, and new interfaces; new books and other new documents; corrections to my own errors; suggestions for rewording; and in one or two cases mild abuse for having omitted package X which the author felt to be indispensable to users.
I am grateful as always to the people who sent me corrections and suggestions for improvement. Please keep them coming: only this way can this book reflect what people want to learn. The same limitation still applies, however: no mathematics, as there are already a dozen or more excellent books on the market (as well as other online documents) dealing with mathematical typesetting in TEX and LATEX in finer and better detail than I am capable of.
The structure remains the same, but I have revised and rephrased a lot of material, especially in the earlier chapters where a new user cannot be expected yet to have acquired any depth of knowledge. Many of the screenshots have been updated, and most of the examples and code fragments have been retested.
As I was finishing this edition, I was asked to review an article for The PracTEX Journal, which grew out of the Practical TEX Conference in 2004. The author specifically took the writers of documentation to task for failing to explain things more clearly, and as I read more, I found myself agreeing, and resolving to clear up some specific problem areas as far as possible. It is very difficult for people who write technical documentation to remember how they struggled to learn what has now become a familiar system. So much of what we do is second nature, and a lot of it actually has nothing to do with the software, but more with the way in which we view and approach information, and the general level of knowledge of computing. If I have obscured something by making unreasonable assumptions about your knowledge, please let me know so that I can correct it.
Peter Flynn is author of The HTML Handbook and Understanding SGML and XML Tools, and editor of The XML FAQ.
This document is Copyright © 1999–2005 by Silmaril Consultants under the terms of what is now the GNU Free Documentation License (copyleft). Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled The GNU Free Documentation License. You are allowed to distribute, reproduce, and modify it without fee or further requirement for consent subject to the conditions in section D.5. The author has asserted his right to be identified as the author of this document. If you make useful modifications you are asked to inform the author so that the master copy can be updated. See the full text of the License in Appendix D.
Chapter 3. Basic document structures
LATEX's approach to formatting is to aim for consistency. This means that as long as you identify each element of your document correctly, it will be typeset in the same way as all the other elements like it, so that you achieve a consistent finish with minimum effort. Consistency helps make documents easier to read and understand.
Elements are the component parts of a document, all the pieces which make up the whole. Almost everyone who reads books, newspapers, magazines, reports, articles, and other classes of documents will be familiar with the popular structure of chapters, sections, subsections, subsubsections, paragraphs, lists, tables, figures, and so on, even if they don't consciously think about it.
Consistency is also what publishers look for. They have a house style, and often a reputation to keep, so they rightly insist that if you do something a certain way once, you should do it the same way each time.
To help achieve this consistency, every LATEX document starts by declaring what document class it belongs to.
To tell LATEX what class of document you are going to create, you type a special first line into your file which identifies it. To start a report, for example, you would type the \documentclass command like this as your first line:
\documentclass{report}
There are four built-in classes provided (report, article, book, and letter), and many others that you can download (some may already be installed for you).
The article class in particular can be used (some would say ‘abused’) for almost any short piece of typesetting by simply omitting the titling and layout (see below).
The built-in classes are intended as starting-points, especially for drafts and for compatibility when exchanging documents with other LATEX users, as they come with every copy of LATEX and are therefore guaranteed to format identically everywhere. They are not intended as final-format publication-quality layouts. For most other purposes, especially for publication, you use add-in packages to extend these classes to do what you need:
- The memoir package and the KOMA-Script bundle contain more sophisticated replacements for all the built-in classes;
- Many academic and scientific publishers provide their own special class files for articles and books (often on their Web sites for download);
- Conference organisers may also provide class files for authors to write papers for presentation;
- Many universities provide their own thesis document class files in order to ensure exact fulfillment of their formatting requirements;
- Businesses and other organizations can provide their users with corporate classes on a central server and configure LATEX installations to look there first for packages, fonts, etc.
Books and journals are not usually printed on office-size paper. Although LATEX's layouts are designed to fit on standard A4 or Letter stationery for draft purposes, printing them at that size makes them look odd: the margins are too wide, or the positioning is unusual, or the font size is too small, because the finished job will normally be trimmed to a different size entirely. Try trimming the margins of the PDF version of this book to 185mm by 235mm (the same as The LATEX Companion series) and you'll be amazed at how it changes the appearance!
The default layouts are designed to fit as drafts on US Letter size paper. To create documents with the correct proportions for standard A4 paper, you need to specify the paper size in an optional argument in square brackets before the document class name, e.g.
\documentclass[a4paper]{report}
The two most common options are a4paper and letterpaper. However, many European distributions of TEX now come preset for A4, not Letter, and this is also true of all distributions of pdfLATEX.
The other default settings are for: a) 10pt type (all document classes); b) two-sided printing (books) or one-sided (articles, reports, and letters); and c) separate title page (books and reports only). These can be modified with the following document class options which you can add in the same set of square brackets, separated by commas:
- 11pt
to specify 11pt type (headings, footnotes, etc. get scaled up or down in proportion);
- 12pt
to specify 12pt type (again, headings scale);
- oneside
to format books for one-sided printing;
- twoside
to format articles and reports for two-sided printing;
- titlepage
to force articles to have a separate title page;
- draft
makes LATEX indicate hyphenation and justification problems with a small square in the right-hand margin of the problem line so they can be located quickly by a human.
If you were using pdfLATEX for a report to be in 12pt type on Letter paper, but printed one-sided in draft mode, you would use:
\documentclass[12pt,letterpaper,oneside,draft]{report}
There are extra preset options for other type sizes which can be downloaded separately, but 10pt, 11pt, and 12pt between them cover probably 99% of all document typesetting. In addition there are the hundreds of add-in packages which can automate other layout and formatting variants without you having to program anything by hand or even change your text.
Exercise 1. Create a new document
Use your editor to create a new document.
Type in a Document Class Declaration as shown above.
Add a font size option if you wish.
In North America, omit the a4paper option or change it to letterpaper.
Save the file (make up a name), ensuring the name ends with .tex
After the Document Class Declaration, the text of your document is enclosed between two commands which identify the beginning and end of the actual document:
\documentclass[11pt,a4paper,oneside]{report}

\begin{document}
...
\end{document}
(You would put your text where the dots are.) The reason for marking off the beginning of your text is that LATEX allows you to insert extra setup specifications before it (where the blank line is in the example above: we'll be using this soon). The reason for marking off the end of your text is to provide a place for LATEX to be programmed to do extra stuff automatically at the end of the document, like making an index.
A useful side-effect of marking the end of the document text is that you can store comments or temporary text underneath the \end{document} in the knowledge that LATEX will never try to typeset them.
...
\end{document}

Don't forget to get the extra chapter from Jim!
This \begin...\end pair of commands is an example of a common LATEX structure called an environment. Environments enclose text which is to be handled in a particular way. All environments start with \begin{...} and end with \end{...} (putting the name of the environment in the curly braces).
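As a minimal sketch of the pattern just described, the file below nests a second environment inside the document environment. The quote environment is a standard LATEX environment chosen here only for illustration; the sample sentences are placeholders.

```latex
\documentclass[11pt,a4paper,oneside]{report}

\begin{document}

Ordinary text sits directly inside the document environment.

% The quote environment is one of the standard LaTeX environments;
% it sets its contents as an indented block.
\begin{quote}
This text is handled differently because it is enclosed in
a quote environment.
\end{quote}

Ordinary text resumes once the environment has ended.

\end{document}
```

Each \begin must be matched by an \end with the same name, and environments must be closed in the reverse order to that in which they were opened.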
Exercise 2. Adding the document environment
Add the document environment to your file.
Leave a blank line between the Document Class Declaration and the \begin{document} (you'll see why later).
Save the file.
The first thing you put in the document environment is almost always the document title, the author's name, and the date (except in letters, which have a special set of commands for addressing which we'll look at later). The title, author, and date are all examples of metadata or metainformation (information about information).
\documentclass[11pt,a4paper,oneside]{report}

\begin{document}

\title{Practical Typesetting}
\author{Peter Flynn\\Silmaril Consultants}
\date{December 2004}
\maketitle

\end{document}
The \title, \author, and \date commands are self-explanatory. You put the title, author name, and date in curly braces after the relevant command. The title and author are usually compulsory; if you omit the \date command, LATEX uses today's date by default.
You always finish the metadata with the \maketitle command, which tells LATEX that it's complete and it can typeset the titling information at this point. If you omit \maketitle, the titling will never be typeset. This command is reprogrammable so you can alter the appearance of titles (like I did for the printed version of this document).
The double backslash (\\) is the LATEX command for forced linebreak. LATEX normally decides by itself where to break lines, and it's usually right, but sometimes you need to cut a line short, like here, and start a new one. I could have left it out and just used a comma, so my name and my company would all appear on the one line, but I just decided that I wanted my company name on a separate line. In some publishers' document classes, they provide a special \affiliation command to put your company or institution name in instead.
When this file is typeset, you get something like this (I've cheated and done it in colour for fun — yours will be in black and white for the moment):
Exercise 3. Adding the metadata
Add the \title, \author, \date, and \maketitle commands to your file.
Use your own name, make up a title, and give a date.
The order of the first three commands is not important, but the \maketitle command must come last.
The document isn't really ready for printing like this, but if you're really impatient, look at Chapter 4 to see how to typeset and display it.
In reports and articles it is normal for the author to provide a Summary or Abstract, in which you describe briefly what you have written about and explain its importance. Abstracts in articles are usually only a few paragraphs long. Summaries in reports can run to several pages, depending on the length and complexity of the report and the readership it's aimed at.
In both cases (reports and articles) the Abstract or Summary is optional (that is, LATEX doesn't force you to have one), but it's rare to omit it because readers want and expect it. In practice, of course, you go back and type the Abstract or Summary after having written the rest of the document, but for the sake of the example we'll jump the gun and type it now.
\documentclass[11pt,a4paper,oneside]{report}
\usepackage[latin1]{inputenc}
\renewcommand{\abstractname}{Summary}

\begin{document}

\title{Practical Typesetting}
\author{Peter Flynn\\Silmaril Consultants}
\date{December 2004}
\maketitle

\begin{abstract}
This document presents the basic concepts of typesetting in a
form usable by non-specialists. It is aimed at those who find
themselves (willingly or unwillingly) asked to undertake work
previously sent out to a professional printer, and who are
concerned that the quality of work (and thus their corporate
æsthetic) does not suffer unduly.
\end{abstract}

\end{document}
After the \maketitle you use the abstract environment, in which you simply type your Abstract or Summary, leaving a blank line between paragraphs if there's more than one (see section 3.6 for this convention).
In business and technical documents, the Abstract is often called a Management Summary, or Executive Summary, or Business Preview, or some similar phrase. LATEX lets you change the name associated with the abstract environment to any kind of title you want, using the \renewcommand command to give the command \abstractname a new value:
\renewcommand{\abstractname}{Executive Summary}
Exercise 4. Using an Abstract or Summary
Add the \renewcommand as shown above to your Preamble.
The Preamble is at the start of the document, in that gap after the \documentclass line but before the \begin{document} (remember I said we'd see what we left it blank for: see the panel ‘The Preamble’ in section 3.4).
Add an abstract environment after the \maketitle and type in a paragraph or two of text.
Save the file (no, I'm not paranoid, just careful).
Notice how the name of the command you are renewing (here, \abstractname) goes in the first set of curly braces, and the new value you want it to have goes in the second set of curly braces (this is an example of a command with two arguments). The environment you use is still called abstract (that is, you still type \begin{abstract}...\end{abstract}).
Redefining \abstractname changes the name that gets displayed and printed, not the name of the environment you store the text in.
If you look carefully at the example document, you'll see I sneakily added an extra command to the Preamble. We'll see later what this means (Brownie points for guessing it, though, if you read section 2.7).
In the body of your document, LATEX provides seven levels of division or sectioning for you to use in structuring your text. They are all optional: it is perfectly possible to write a document consisting solely of paragraphs of unstructured text. But even novels are normally divided into chapters, although short stories are often made up solely of paragraphs.
Chapters are only available in the book and report document classes, because they don't have any meaning in articles and letters. Parts are also undefined in letters.
Depth | Division | Command | Notes |
---|---|---|---|
−1 | Part | \part | Not in letters |
0 | Chapter | \chapter | Books and reports |
1 | Section | \section | Not in letters |
2 | Subsection | \subsection | Not in letters |
3 | Subsubsection | \subsubsection | Not in letters |
4 | Titled paragraph | \paragraph | Not in letters |
5 | Titled subparagraph | \subparagraph | Not in letters |
In each case the title of the part, chapter, section, etc. goes in curly braces after the command. LATEX automatically calculates the correct numbering and prints the title in bold. You can turn section numbering off at a specific depth: details in section 3.5.1.
\section{New recruitment policies}
...
\subsection{Effect on staff turnover}
...
\chapter{Business plan 2005--2007}
There are packages to let you control the typeface, style, spacing, and appearance of section headings: it's much easier to use them than to try and reprogram the headings manually. Two of the most popular are the ssection and sectsty packages.
Headings also get put automatically into the Table of Contents, if you specify one (it's optional). But if you make manual styling changes to your heading, for example a very long title, or some special line-breaks or unusual font-play, this would appear in the Table of Contents as well, which you almost certainly don't want. LATEX allows you to give an optional extra version of the heading text which only gets used in the Table of Contents and any running heads, if they are in effect. This optional alternative heading goes in [square brackets] before the curly braces:
\section[Effect on staff turnover]{An analysis of the effect of the revised recruitment policies on staff turnover at divisional headquarters}
Exercise 5. Start your document text
Add a \chapter command after your Abstract or Summary, giving the title of your first chapter.
If you're planning ahead, add a few more \chapter commands for subsequent chapters. Leave a few blank lines between them to make it easier to add paragraphs of text later.
By now I shouldn't need to tell you what to do after making significant changes to your document file.
All document divisions get numbered automatically. Parts get Roman numerals (Part I, Part II, etc.); chapters and sections get decimal numbering like this document, and Appendixes (which are just a special case of chapters, and share the same structure) are lettered (A, B, C, etc.).
You can change the depth to which section numbering occurs, so you can turn it off selectively. In this document it is set to 3. If you only want parts, chapters, and sections numbered, not subsections or subsubsections etc., you can change the value of the secnumdepth counter using the \setcounter command, giving the depth value from the table in section 3.5:
\setcounter{secnumdepth}{1}
A related counter is tocdepth, which specifies what depth to take the Table of Contents to. It can be reset in exactly the same way as secnumdepth. The current setting for this document is 2.
\setcounter{tocdepth}{3}
To get an unnumbered section heading which does not go into the Table of Contents, follow the command name with an asterisk before the opening curly brace:
\subsection*{Shopping List}
All the divisional commands from \part* to \subparagraph* have this ‘starred’ version which can be used on special occasions for an unnumbered heading when the setting of secnumdepth would normally mean it would be numbered.
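The two counters and the starred form can be seen working together in a small test file. This is only a sketch: the counter values and heading titles are chosen for illustration, and the \tableofcontents command is included simply so that the effect of tocdepth is visible (process the file twice to see the entries).

```latex
\documentclass[11pt,a4paper,oneside]{report}

% Number parts, chapters, and sections only (depth 1 in the table).
\setcounter{secnumdepth}{1}
% List headings down to subsection level (depth 2) in the Table of Contents.
\setcounter{tocdepth}{2}

\begin{document}

\tableofcontents

\chapter{Business plan}               % numbered: 1
\section{New recruitment policies}    % numbered: 1.1
\subsection{Effect on staff turnover} % no number: depth 2 exceeds secnumdepth
\section*{Shopping List}              % starred: no number, no ToC entry

Some text.

\end{document}
```

Note the difference between the last two headings: the subsection loses its number because of secnumdepth but still appears in the Table of Contents, whereas the starred section is left out of it altogether.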
After section headings comes your text. Just type it and leave a blank line between paragraphs. That's all LATEX needs.
The blank line means ‘start a new paragraph here’: it does not (repeat: not) mean you get a blank line in the typeset output. Now read this paragraph again and again until that sinks in.
The spacing between paragraphs is a separately definable quantity, a dimension or length called \parskip. This is normally zero (no space between paragraphs, because that's how books are normally typeset), but you can easily set it to any size you want with the \setlength command in the Preamble:
\setlength{\parskip}{1cm}
This will set the space between paragraphs to 1cm. See section 2.8.1 for details of the various size units LATEX can use. Leaving multiple blank lines between paragraphs in your source document achieves nothing: all extra blank lines get ignored by LATEX because the space between paragraphs is controlled only by the value of \parskip.
White-space in LATEX can also be made flexible (what Lamport calls ‘rubber’ lengths). This means that values such as \parskip can have a default dimension plus an amount of expansion minus an amount of contraction. This is useful on pages in complex documents where not every page may be an exact number of fixed-height lines long, so some give-and-take in vertical space is useful. You specify this in a \setlength command like this:
\setlength{\parskip}{1cm plus4mm minus3mm}
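To make the arithmetic concrete: the setting above asks for 1cm between paragraphs, which LATEX may stretch towards 1.4cm (1cm plus 4mm) or shrink to no less than 0.7cm (1cm minus 3mm) when it needs to fill a page evenly. The sketch below restates that value with spaces added for readability, then shows a second, gentler setting whose numbers are purely illustrative and not taken from this book.

```latex
% Natural size 1cm, stretchable by 4mm, shrinkable by 3mm.
% Spaces after "plus" and "minus" are optional, so this is the
% same value as the form written without them.
\setlength{\parskip}{1cm plus 4mm minus 3mm}

% An alternative with illustrative values: half a line of space
% that may grow slightly but has no shrink component at all.
\setlength{\parskip}{0.5\baselineskip plus 2pt}
```

Only one of these would be used in a real Preamble; if both are present, the later one takes effect.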
Paragraph indentation can also be set with the \setlength command, although you would always make it a fixed size, never a flexible one, otherwise you would have very ragged-looking paragraphs.
\setlength{\parindent}{6mm}
By default, the first paragraph after a heading follows the standard Anglo-American publishers' practice of no indentation. Subsequent paragraphs are indented by the value of \parindent (15pt by default in the built-in classes at the default 10pt size). You can change this in the same way as any other length.
In the printed copy of this document, the paragraph indentation is set to 12pt and the space between paragraphs is set to 0pt. These values do not apply in the Web (HTML) version because not all browsers are capable of that fine a level of control, and because users can apply their own stylesheets regardless of what this document proposes.
Exercise 6. Start typing!
Type some paragraphs of text. Leave a blank line between each. Don't bother about line-wrapping or formatting — LATEX will take care of all that.
If you're feeling adventurous, add a \section command with the title of a section within your first chapter, and continue typing paragraphs of text below that.
Add one or more \setlength commands to your Preamble if you want to experiment with changing paragraph spacing and indentation.
To turn off indentation completely, set it to zero (but you still have to provide units: it's still a measure!).
\setlength{\parindent}{0in}
If you do this, though, and leave \parskip set to zero, your readers won't be able to tell easily where each paragraph begins! If you want to use the style of having no indentation with a space between paragraphs, use the parskip package, which does it for you (and makes adjustments to the spacing of lists and other structures which use paragraph spacing, so they don't get too far apart).
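A minimal sketch of the parskip approach follows. The package is loaded in the Preamble with no options; the two sample paragraphs are placeholders.

```latex
\documentclass[11pt,a4paper,oneside]{report}

% The parskip package sets \parindent to zero and \parskip to a
% non-zero value, and adjusts list spacing to match.
\usepackage{parskip}

\begin{document}

First paragraph, set flush left with no indentation.

Second paragraph, separated from the first by vertical space
rather than by an indent.

\end{document}
```

With the package loaded there is no need for separate \setlength commands for \parindent and \parskip.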
All auto-numbered headings get entered in the Table of Contents (ToC) automatically. You don't have to print a ToC, but if you want to, just add the command \tableofcontents at the point where you want it printed (usually after the Abstract or Summary).
Entries for the ToC are recorded each time you process your document, and reproduced the next time you process it, so you need to re-run LATEX one extra time to ensure that all ToC page-number references are correctly calculated.
We've already seen in section 3.5 how to use the optional argument to the sectioning commands to add text to the ToC which is slightly different from the one printed in the body of the document. It is also possible to add extra lines to the ToC, to force extra or unnumbered section headings to be included.
Exercise 7. Inserting the table of contents
Go back and add a \tableofcontents command after the \end{abstract} command in your document.
You guessed.
The commands \listoffigures and \listoftables work in exactly the same way as \tableofcontents to automatically list all your tables and figures. If you use them, they normally go after the \tableofcontents command.
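As a sketch of how the three listing commands sit together, the file below has one figure and one table, each with a caption. The figure and table environments and the \caption command are standard LATEX; the boxed placeholder, the caption wording, and the numbers in the table are all invented for the example.

```latex
\documentclass[11pt,a4paper,oneside]{report}

\begin{document}

\tableofcontents
\listoffigures
\listoftables

\chapter{Results}

\begin{figure}
  \centering
  \fbox{Placeholder for a picture}  % stands in for a real image
  \caption{Staff turnover by year}  % this text goes in the List of Figures
\end{figure}

\begin{table}
  \centering
  \begin{tabular}{lr}
    Year & Leavers \\
    2003 & 12 \\                    % made-up numbers
    2004 & 9 \\
  \end{tabular}
  \caption{Leavers per year}        % this text goes in the List of Tables
\end{table}

\end{document}
```

As with the Table of Contents, the lists are built from information recorded on the previous run, so the file needs to be processed twice before the entries appear.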
The \tableofcontents command normally shows only numbered section headings, and only down to the level defined by the tocdepth counter (see section 3.5.1), but you can add extra entries with the \addcontentsline command. For example if you use an unnumbered section heading command to start a preliminary piece of text like a Foreword or Preface, you can write:
\subsection*{Preface}
\addcontentsline{toc}{subsection}{Preface}
This will format an unnumbered ToC entry for ‘Preface’ in the ‘subsection’ style. You can use the same mechanism to add lines to the List of Figures or List of Tables by substituting lof or lot for toc.
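The following sketch shows the mechanism used for both the ToC and the List of Figures in one file. It uses a chapter-level Preface rather than the subsection above, and a figure deliberately left without a caption so that nothing is recorded for it automatically; the entry texts and the boxed placeholder are invented for the example.

```latex
\documentclass[11pt,a4paper,oneside]{report}

\begin{document}

\tableofcontents
\listoffigures

% An unnumbered chapter-level Preface, added to the ToC by hand.
\chapter*{Preface}
\addcontentsline{toc}{chapter}{Preface}

Some introductory text.

\begin{figure}
  \centering
  \fbox{Placeholder for a picture}
  % No \caption here, so nothing is recorded automatically;
  % this line adds an entry to the List of Figures by hand.
  \addcontentsline{lof}{figure}{Organisation chart (uncaptioned)}
\end{figure}

\end{document}
```

The second argument names the style of entry to imitate (chapter, subsection, figure, and so on), so it should normally match the kind of heading or object the entry refers to.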