
Choosing File and Directory Names

Digital Unix, the operating system in use on Monash web servers, lets users name files anything they like, as long as the name is no longer than 255 characters. This gives a site maintainer enough flexibility in naming schemes to make life much easier or much more difficult.

Here's some information to help you do either.

The Index File

When the web server receives a request for a directory, it looks for one of:


index.html default.html default.htm index.shtml

If it doesn't find any of them, it returns a listing of all the files in the directory.

The index file allows a webmaster to refer to their site with just a directory name. This means that the URL is shorter and more meaningful. It's also independent of the file type of the index page: you can move from normal HTML files to server-parsed .shtml files without changing the URL you use. Use directories and index files: they make life much neater.
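The lookup described above can be sketched in a few lines of Python. This is only an illustration, not the server's actual code: the function name resolve_directory is made up, and the order in which the names are tried is an assumption taken from the order they are listed in.

```python
import os

# Candidate index names, tried in the order listed above.
# The real server's order may differ; this ordering is an assumption.
INDEX_NAMES = ["index.html", "default.html", "default.htm", "index.shtml"]


def resolve_directory(path):
    """Return ("index", file) if an index file exists, else ("listing", names)."""
    for name in INDEX_NAMES:
        candidate = os.path.join(path, name)
        if os.path.isfile(candidate):
            return ("index", candidate)
    # No index file found: fall back to listing everything in the directory.
    return ("listing", sorted(os.listdir(path)))
```

Because the caller only ever names the directory, swapping index.html for index.shtml changes which file is found but not the URL.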

Suffixes

Suffixes are known to the MS-DOS community as 'file name extensions'. In a Unix context, a suffix is the section at the end of a filename, beginning with a dot, that suggests what sort of file the name refers to.

foo.html
A HyperText Markup Language document
bar.gif
A Graphics Interchange Format image
baz.au
A Sun/NeXT audio file

The Web server looks at the suffix when serving a file, and uses it to tell the client what sort of file it's sending. For this reason, it is important that all your Web files have an appropriate suffix.
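As a rough sketch of what the server does with a suffix, the Python below maps the three example suffixes to content types. The table, the function name content_type_for and the fallback type are all assumptions for illustration; the real server reads a much longer list from its own configuration.

```python
import os

# Hypothetical suffix table covering only the three examples above.
CONTENT_TYPES = {
    ".html": "text/html",
    ".gif": "image/gif",
    ".au": "audio/basic",
}


def content_type_for(filename, default="application/octet-stream"):
    """Pick a content type from the filename's suffix alone."""
    _, suffix = os.path.splitext(filename)
    # Unknown or missing suffixes fall back to a generic binary type
    # (the fallback value is an assumption, not the server's setting).
    return CONTENT_TYPES.get(suffix.lower(), default)
```

A file with no suffix, or an unrecognised one, ends up with the fallback type, which is why a browser may offer to download it instead of displaying it.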

Allowed characters

Unix itself allows any character except NUL and the slash (/) in its file (and directory) names, and the FTP server will allow you to exploit this to a large degree. Users can create file names with some very strange characters in them: beeps, tabs, mysterious glyphs and other strangenesses. Such strangenesses are difficult to work with, and while they may be supported by the FTP server and the Web server, they can cause problems with CD burning, with Web server tools like indexers and validators, and for Unix shell users.

Compliance with ISO-9660 file name requirements is optional. Use of shell-friendly file names is actively encouraged. Shell-unfriendly file names are actively discouraged.

CD-ROMs - ISO-9660

The ISO-9660 standard allows these characters in CD file and directory names:

 
A-Z a-z 0-9 _

and a single . separating the file's name and extension.

Basic ISO-9660 filenames are up to eight characters long, with an extension of up to three characters. Subsequent enhancements to ISO-9660 (such as Rock Ridge, Joliet, and Romeo) allow longer file names and more characters, but they are beyond the scope of this document, and are not necessarily supported across platforms.
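If you want to check names before burning a CD, something like the following Python sketch will do. It encodes only the rules given in this section (the listed characters, eight-character name, optional three-character extension, a single dot); the function name is_basic_iso9660 is made up, and the extensions mentioned above are not handled.

```python
import re

# One to eight allowed characters, then optionally a single dot
# followed by one to three allowed characters.
BASIC_ISO9660 = re.compile(r"[A-Za-z0-9_]{1,8}(\.[A-Za-z0-9_]{1,3})?")


def is_basic_iso9660(name):
    """True if the whole name fits the basic eight-plus-three pattern."""
    return BASIC_ISO9660.fullmatch(name) is not None
```

Under these rules index.htm passes, while index.html fails because its extension is four characters long.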

Shell-friendly file names

Definition

For non-ISO-9660-compliant web trees we recommend the use of these characters:


A-Z a-z 0-9 _ .

Names may be up to 255 characters long, but anything longer than about twenty is considered unwieldy.
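A minimal Python sketch of a checker for this recommendation follows. The function name check_name and the wording of the complaints are invented for illustration, and the cut-off of twenty characters simply takes the "about twenty" above at face value.

```python
import re

# Only the recommended characters: letters, digits, underscore, full stop.
SHELL_FRIENDLY = re.compile(r"[A-Za-z0-9_.]+")
UNWIELDY_LENGTH = 20  # "about twenty", treated here as a hard cut-off


def check_name(name):
    """Return a list of complaints about a name; an empty list means it is fine."""
    problems = []
    if not SHELL_FRIENDLY.fullmatch(name):
        problems.append("contains characters outside A-Z a-z 0-9 _ .")
    if len(name) > UNWIELDY_LENGTH:
        problems.append("longer than %d characters" % UNWIELDY_LENGTH)
    return problems
```

Running this over a directory listing before uploading is a cheap way to catch stray spaces and punctuation.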

Explanation

There are certain characters that cause great difficulty for a Unix shell, or for other Unix programs expecting shell-compliant filenames, like RCS (a revision control system, which allows recovery from deletions and mis-editings on the central web servers). Many useful programs are actually shell scripts and many more use a shell in some way.

Nowadays, all Unix programs support file names of up to 255 characters in length. They're quite happy with the ISO-9660 characters, and also happily tolerate multiple full stops. However, use of the following characters may cause problems:

- (hyphen, minus sign)
This character is fine as long as you don't start a name with one. Unix programs use the hyphen to start command-line options: programs encountering a file name starting with a hyphen may misinterpret it.
: + ^ ,
These are a grey area. They're little-used, which means that programs that would normally work fine may not have been tested using them. Avoid them.
~ * [ ] $
The shell uses these for its filename shorthand tricks. Do not use them in your own filenames.
space tab ; ( ) < > & | null
These are separators and terminators. If your file name contains these, the shell may get as far as the character and then be forced to stop. It will therefore have the name of your file wrong. Do not use them in filenames.
` ' "
Quoting characters. Single and double quotes (' ") allow users of the shell to get around some of the above restrictions in filenames. Backticks (`) are used to execute subcommands. Do not put them in file names! If they're in pairs, they could be stripped out by a shell process, leaving the file unfindable. If they're on their own, they could potentially cause the rest of the command line to be considered part of a file name. This can be very bad indeed.
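To see a few of these problems without risking any real files, the Python sketch below uses the standard shlex module, which only approximates how a POSIX shell splits a command line into words (it does not, for instance, treat ; or | as operators). The helper names words_seen_by_shell and safe_argument are made up for this example, and prefixing ./ to a name that starts with a hyphen is one common workaround, not the only one.

```python
import shlex


def words_seen_by_shell(command_line):
    """Split a command line into words, roughly as a POSIX shell would."""
    return shlex.split(command_line)


def safe_argument(filename):
    """Quote a filename for the shell, guarding against a leading hyphen."""
    if filename.startswith("-"):
        # ./-name refers to the same file but no longer looks like an option.
        filename = "./" + filename
    return shlex.quote(filename)
```

With these helpers, words_seen_by_shell("rm my file.html") returns three words, so the shell would be looking for two files called my and file.html. A lone quote, as in "rm it's.html", makes shlex.split raise ValueError because the quotation never closes. Passing names through safe_argument first avoids both problems, but it is far easier never to create such names at all.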

More information on file names

  • File types the server doesn't recognise? Here's how to tell it about them.