The University of Edinburgh -
Division of Informatics
Artificial Intelligence


Creating HTML Documents


What are HTML documents?

HTML, or HyperText Markup Language, is used as the markup language for documents that are made available on the World Wide Web. The markup language is used to specify the logical organisation of the document and has hypertext extensions that allow links to other documents, located either locally or on WWW servers elsewhere on the Internet.

Before you create your own HTML documents, you should read an introductory description of the HTML language and of URLs (Uniform Resource Locators), which are used to name the documents that hypertext links point to.


Translating to HTML from an existing document

Plain Text Files

There are at least three options for making a plain text file available via the Web:

  1. Do no conversion at all. Either include a hypertext link to the plain text file from an existing accessible HTML file, or place the plain text file (or a symbolic link to it) in a directory that is already accessible and that contains no index.html file.

  2. Create a minimal HTML file and include the text of the plain text file between the start and end tags of a Preformatted Fixed Pitch Text block (between <PRE> and </PRE>).

  3. Convert the contents of the plain text file into a fully fledged HTML document (see Creating and editing HTML documents).
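
Option 2 above can be sketched as a small shell function. This is only an illustration under stated assumptions: the function names escape_html and text_to_html, and the file names in the usage line, are made up for this example. The sketch also escapes the characters &, < and >, which would otherwise be read as markup even inside a <PRE> block.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of any installed tool.

# Replace the three characters that HTML treats specially.
# "&" is handled first so that the ampersands introduced by the
# other two substitutions are not escaped a second time.
escape_html() {
    sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Usage: text_to_html FILE TITLE
# Writes a minimal HTML document to standard output with the
# contents of FILE placed between <PRE> and </PRE>.
text_to_html() {
    input=$1
    title=$(printf '%s' "$2" | escape_html)
    printf '<HTML>\n<HEAD><TITLE>%s</TITLE></HEAD>\n<BODY>\n<PRE>\n' "$title"
    escape_html < "$input"
    printf '</PRE>\n</BODY>\n</HTML>\n'
}
```

For example, text_to_html notes.txt "My notes" > notes.html would write the wrapped file to notes.html, leaving notes.txt untouched.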

Converting from LaTeX to HTML

Use the latex2html program to create an HTML version of the LaTeX document; see the latex2html(1) manual page or the latex2html manual.

Converting from Texinfo format to HTML

Use the texi2html program to create an HTML version of the Texinfo document; see the texi2html(1) manual page.


Creating and editing HTML documents

SoftQuad HoTMetaL 2.0

If you are creating HTML documents from scratch, you are strongly advised to use SoftQuad's HoTMetaL 2.0 editor. It is a special-purpose graphical editor, run under the X Window System, for creating valid HTML. Its context-sensitive menus and toolbars make creating valid HTML markup much easier: you don't need to remember the names of the markup tags or the syntax of the markup language. It will guarantee that any HTML files you write are strictly HTML conformant, so they should not need to be re-edited to be viewable in future releases of Web browsers that may not support older non-conformant syntax.

Templates for some standard official Web pages are available to users of HoTMetaL 2.0 and can be used as a starting point for a new HTML document; see the description of the Open Template menu item in the document below.

Two documents of interest are:

N.B. There are circumstances under which you can't use SoftQuad HoTMetaL to edit an existing HTML file. If the HTML file does not follow the correct syntax for HTML version 1 or version 2, the editor will probably refuse to load the file (it will, however, give the character offset of the syntax error within the file). As a consequence, it is not possible to load some files that are syntactically a mess or that contain new syntax proposed for HTML version 3. Under these circumstances you must use another editor to edit the file, and you should seriously consider making the file HTML version 2 compliant while you are doing it.

AOLpress 2.01

AOLpress may be more to the liking of those who want a WYSIWYG HTML editor. One irritating bug seems to be that on some Sun workstations it screws up the screen fonts, so you might have to log out after using it. To use it, add /usr/local/AOLpress to your path, i.e. add the following line to the end of your .bashrc file:

export PATH=$PATH:/usr/local/AOLpress

The executable is called aolpress.
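
If your .bashrc is read more than once in a session, the export line above appends the same directory to PATH each time. A minimal sketch of one way to avoid that is shown here; path_append is a hypothetical helper name introduced only for this example, and the check assumes PATH entries are separated by colons, as usual.

```shell
# path_append is a hypothetical helper: it adds a directory to PATH
# only if that directory is not already listed.
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                  # already present, nothing to do
        *) PATH="$PATH:$1" ;;         # otherwise append to the end
    esac
}

path_append /usr/local/AOLpress
export PATH
```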

Other HTML Editors

There are other HTML editors available, such as ASHE (A Simple HTML Editor), which can be used via the commands ashe or xhtml, and GNU Emacs's html-mode. ASHE also uses Emacs-like editing and movement commands, so either might seem an obvious choice of editor for users of Emacs. However, neither of these enforces the syntax of HTML, and both require you to have a much greater knowledge of the language.

There is also Netscape Navigator Gold, available as the command netscape3g, but only on colour machines. Note that the delete key deletes forward, not backward, which seems to be a bug.

Checking HTML Files for Errors

If you have not used an HTML-compliant editor to edit an HTML file, it may help to check the file using the htmlchek program; see the htmlchek(1) manual page.

An Example HTML File

You may wish to experiment by copying and editing an example HTML file. This file follows the conventions that have been adopted for the official HTML pages in the Department of Artificial Intelligence. It includes advice on:


Tidying HTML documents

Unfortunately, the various methods used to produce HTML documents do not always produce strictly valid HTML. This usually isn't a problem, as most browsers can cope with the common mistakes. However, in doing so each browser may choose to display the page differently, because the generated HTML is ambiguous. Documents that are strictly compliant HTML should always render the same way, no matter which browser is used.

Fortunately there is a tool called 'tidy' that attempts to fix common HTML mistakes and produce fully compliant HTML. There is a man page available for tidy, and the official web page is available at http://www.w3.org/People/Raggett/tidy/

The quick use guide is:

  tidy myfile.html > mycompliantfile.html

Or, to modify the file in place:

  tidy -m myfile.html
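
Because tidy -m overwrites the file, it can be worth keeping a copy of the original first. The following is a rough sketch only: tidy_with_backup and the TIDY variable are hypothetical names introduced for this example, the -q (quiet) flag and the exit codes described in the comments are assumed to match your installed release of tidy, and the .orig suffix is an arbitrary choice. Check the tidy man page before relying on it.

```shell
# Command used to run tidy; assumed to be on PATH unless TIDY is set.
TIDY=${TIDY:-tidy}

# Usage: tidy_with_backup FILE
# Hypothetical helper: copies FILE to FILE.orig, then asks tidy to
# rewrite FILE in place. Returns tidy's exit status.
tidy_with_backup() {
    file=$1
    cp -p "$file" "$file.orig" || return 1
    "$TIDY" -q -m "$file"
    status=$?
    # Assumption: tidy exits with 0 when the file was clean, 1 when it
    # issued warnings, and 2 when it found errors it could not fix.
    if [ "$status" -ge 2 ]; then
        # Put the original back if tidy could not fix the file.
        cp -p "$file.orig" "$file"
    fi
    return "$status"
}
```

For example, tidy_with_backup myfile.html leaves the untouched original in myfile.html.orig.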


Further references on the World Wide Web

Finally, here are some references to more general information on the World Wide Web:

