In a project tree, some of the files and directories are "part of
the source" -- they are of interest to ArX. Other files and
directories may be scratch files, editor back-up files, and temporary
or intermediate files generated by programs. Those other files should
be ignored by most ArX commands.
This chapter discusses how ArX recognizes which files to pay
attention to, and which to ignore.
ArX has flexible facilities for keeping track of all of the files
and directories in your project: for taking "inventories" of your
project tree. It has these facilities for three reasons:
Distinguishing Source
ArX uses a project inventory to
distinguish files and directories which are part of your project from
other files and directories which are temporary files, scratch files,
editor backup-files, and so forth.
Additionally, ArX permits you to overlay projects: store more than
one project at a single root. When you do that, ArX uses
inventories to sort out which files and directories belong to each
project. (The topic of overlays, however, is deferred until a later
chapter.)
Recognizing Renames
Every file or directory in an ArX inventory
has two names. One name is simply the location (path) of the file
relative to the root of the project tree. The other name is a
"logical name" for the file: a name that remains the same regardless
of where in the project tree the file is located. When ArX
compares two versions of a project tree, it uses logical names to
discover when files or directories have been moved, renamed, deleted,
or added.
For each project tree, you have a choice to make regarding how project inventories work. The options are described briefly here, then in more detail in the sections that follow.
Explicit Inventories
The most familiar (and default) option
is to use an explicit inventory.
Whenever you add, delete or rename a file, you must inform
ArX of that fact explicitly. For example, after adding the
file foo.c, you have to tell ArX:
% arx add foo.c |
If you rename foo.c to bar.c, then you must use ArX to do so:
% arx move foo.c bar.c |
Finally, if you delete foo.c, you must do so with ArX:
% arx delete foo.c |
Naming Conventions
Another option is to simply
use naming conventions. ArX will search your tree for files
matching certain naming patterns, and consider all of those files to
be source files.
When you use only naming conventions to take an inventory, the logical
name of a file and its location name are exactly the same. For that
reason, if you rename a file, ArX will think you deleted a file
with the old name, and added a file with the new name. If you delete
a file, then add a file with the same name, ArX will think that the
new file is a modified form of the old file. None of these
limitations are fatal. ArX will still work, but they do limit the
effectiveness of ArX at branching and merging. ("Branching" and
"merging" are topics of a later chapter.)
Implicit Inventories
A third option combines some of the advantages
of using naming conventions with some of the advantages of explicit
inventories: implicit inventories. When you use an implicit
inventory, every file that passes the naming conventions is considered
source. You may explicitly add, delete, and rename files --
allowing ArX to precisely track renames for those files and
directories. You also may store a file tag (the "logical name"
of a file) in any file. If you don't explicitly tag a file, and use
an implicit inventory, ArX will search for those embedded tags and
use them to precisely detect new files, deleted files, and renamed
files.
Each of the three options is called a tagging method.
There is some advice at the end of this chapter about how to choose among the three tagging methods.
If you never explicitly specify a tagging method, ArX will use
explicit inventories. You can also select that method explicitly with
this command, issued in a project tree:
% arx tagging-method explicit |
Similarly, to use either naming conventions or implicit inventories, use one of the commands:
% arx tagging-method names
% arx tagging-method implicit |
To find out what method a given project tree uses, use the same command with no argument:
% arx tagging-method
names |
The command arx inventory is used to print a list of source files.
It has many options, including options to print other kinds of file
lists (such as a list of all editor backup files, or a list of all
files which are not source):
% cd source-tree |
% arx inventory --source
hello.c
hello.h
library
library/buffer.c
library/buffer.h
... |
contrasted with:
% cd source-tree |
% ls
hello.c  hello.c.~1~  hello.h  library |
(Notice that hello.c.~1~ is not included in the inventory of
source files.)
The naming conventions used by ArX are as follows:
Control Files
A control file is part of the source, but control
files are not included in the output of arx inventory unless the
--all flag is used. Control file and directory names match any of
these patterns:
.arch-project-tree .arch-ids .owned.* .common {arch} |
Junk Files
A junk file is not part of the source. A junk file or directory name matches the pattern:
,* |
or contains any of the characters:
<space> <tab> <newline> [ ] |
? \ |
Note that if a directory name matches that pattern, then none of the contents of the directory are part of the source, regardless of their names.
Junk files are listed by the command:
% arx inventory --junk |
ArX sometimes creates junk files and directories of its own. When it does, those files and directories have names that match the pattern:
,,* |
You should avoid creating files and directories with names that match
that pattern. ArX will freely delete files and directories with
names that match ,,* whenever it needs to re-use such a name.
Usually, ArX will delete any junk file it creates before the
command that created the junk file terminates. Sometimes, though,
when a command fails, ArX will leave behind junk files or
directories matching ,,*. This is a debugging feature, likely to be
removed in a future release. For now, whenever you find such a file
(and are confident it isn't being used by a currently running
command), you are free to delete it.
Backup Files
If a file is not a junk file, it may be a backup file. Backup files are not part of the source. They match any of the patterns:
*~ *.bak *.modified *.orig *.original *.rej *.rejects |
Backup files are listed by the command:
% arx inventory --backups |
Precious Files
If a file is not a control file, junk file, or
backup file, it might be a precious file. Precious files are not part
of the source. For all intents and purposes, ArX treats
precious, backup, and unrecognized files the same.
Precious files and directories match one of these patterns:
+* .gdbinit =build* =install* CVS RCS TAGS |
Of course, precious files can be listed by the command:
% arx inventory --precious |
Sometimes ArX will create its own precious files -- usually to save
some information that you might not want to lose. When it does, it
creates a file or directory matching the pattern:
++* |
You should avoid creating such filenames yourself. ArX won't ever
delete such a file -- but if one happens to get in the way of an
ArX command, that command will fail with an error.
Source Files
If a file is not a control or junk file, it
might be an ordinary source file. Source
files are, of course, the files that ArX stores in an archive
(along with control files).
Source files must match the pattern:
[=a-zA-Z0-9]* |
but must not match any of the patterns:
*.o *.core core |
Ordinary source files are listed by:
% arx inventory --source |
Some files which are ArX control files are counted as source even
though they don't match the patterns above. However, these files are
not listed by default. All source files (ordinary source plus control
files) are listed by:
% arx inventory --source --all |
Unrecognized Files
Any file that doesn't fall into the above categories is an unrecognized file. Unrecognized files can be listed by the command:
% arx inventory --unrecognized |
WARNING: The basic pattern for source files is:
[=a-zA-Z0-9]* |
however, you should restrict yourself to file names that do not
contain spaces. Filenames containing spaces are likely to trigger
bugs in the current release of ArX.
Explicit inventories are the default, but if you want to select them anyway, use this command:
% arx tagging-method explicit |
Note that you must use that command from within a working directory tree that has already been initialized.
When using explicit designation, it is (ordinarily) necessary to add every file and directory in the source to the explicit list using the command:
% arx add FILE |
If FILE is a directory, that will create FILE/.arch-ids/=id. If
it is a regular file or symbolic link, it will create (in the same
directory) .arch-ids/FILE.id. In either case, the file created will
contain an obscure string known as an "inventory tag" (inventory
tags are explained in more detail below).
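The placement rule can be sketched in a few lines of Python. This is only an illustration of the paths described above, not ArX's own code: the function name id_file_for is hypothetical, and the control directory name .arch-ids is taken from the control-file patterns listed earlier.

```python
from pathlib import PurePosixPath

# Assumed name of the per-directory control directory (see the control-file patterns).
ID_DIR = ".arch-ids"


def id_file_for(path: str, is_directory: bool) -> str:
    """Return the tree-relative file that would hold the inventory tag for `path`."""
    p = PurePosixPath(path)
    if is_directory:
        # A directory keeps its own tag inside itself: DIR/.arch-ids/=id
        return str(p / ID_DIR / "=id")
    # A regular file or symbolic link is tagged beside itself: PARENT/.arch-ids/NAME.id
    return str(p.parent / ID_DIR / (p.name + ".id"))


print(id_file_for("library", True))
print(id_file_for("library/buffer.c", False))
print(id_file_for("hello.c", False))
```

Because a directory's tag lives inside the directory, it travels with the directory when the directory is renamed.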
If you remove a regular file or symbolic link, you must use the command:
% arx delete FILE |
It will remove FILE and its inventory tag.
In order to remove a directory, you must yourself remove the
.arch-ids subdirectory. That will also implicitly remove the
inventory tags of any files that ArX thinks are stored in that
directory.
If you rename a regular file or symbolic link, you can use the command:
% arx move OLD-NAME NEW-NAME |
to move the file and its inventory tag.
If you rename a directory, its inventory tag (and the tags for all
files and subdirectories it contains) move with it automatically
(because the .arch-ids subdirectory has moved).
When you run arx inventory in a working directory using explicit
designation, only explicitly designated source files are listed.
If you would rather see a list of all files passing the naming
conventions for source files, use:
% arx inventory --source --names |
If you are importing a project into ArX, it may be convenient to add everything that matches the naming conventions. An idiom for doing this is
$ arx inventory --names --source --both | xargs arx add |
Then you can clean up by explicitly add'ing or delete'ing files.
You should also read about tree-lint later in this chapter.
To use implicit tagging, use the following command in your working directory:
% arx tagging-method implicit |
When implicit tagging is used, every file that passes the naming
conventions is treated as source. If a file or directory has an
explicit tag (created with add), ArX will use that explicit
tag to recognize when a file has moved. If a file (but not a
directory or symbolic link) lacks an explicit tag, ArX will look
for a tag in the file itself.
A tag within a file has one of two forms. It may be either:
<punct><basename><spaces>-<spaces><tag> |
where <punct> is an arbitrary string of punctuation and spaces,
<basename> is the basename of the file, and <tag> is an inventory tag
for the file. Or:
<punct>tag:<spaces><tag> |
In either case, <tag> should be unique among the files within a
directory. A tag within a file must occur within the first 1024
bytes of the file.
One convention for source files is to add a comment to the top of every file, briefly stating the purpose of the file:
/* hello.c - `main' for the hello world program ... |
or:
/* tag: `main' for the hello world program ... |
This may cause problems if you rename the file. You'll probably want to change the name and/or the description, but ArX will think that the old file was deleted and a new one created.
Another possible convention is to use a string identifying the author and the time the file was first created (or first tagged):
/* tag: joe.hacker@gnu.org Thu Nov 29 17:25:15 PST 2001 ... |
If you use the basename form of an implicit tag, and actually rename
a file (rather than simply move it between directories), you do need
to remember to update the tag line to reflect the new basename.
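As a rough illustration of the two tag forms, the following Python sketch scans the first 1024 bytes of a file for either one. It is not ArX's implementation. The function name embedded_tag is hypothetical, and several details are assumptions: <punct> is read as a possibly empty run of ASCII punctuation, spaces and tabs at the start of a line, the tag is taken to run to the end of that line, and the first matching line wins.

```python
import re
import string
from typing import Optional

# ASCII punctuation, escaped so it can sit safely inside a character class.
PUNCT = re.escape(string.punctuation)


def embedded_tag(head: bytes, basename: str) -> Optional[str]:
    """Look for '<punct>tag: <tag>' or '<punct><basename> - <tag>' in the first 1024 bytes."""
    text = head[:1024].decode("latin-1")  # latin-1 never fails to decode
    lead = rf"^[{PUNCT} \t]*"
    forms = (
        lead + r"tag:[ \t]*(.+)$",
        lead + re.escape(basename) + r"[ \t]*-[ \t]*(.+)$",
    )
    for line in text.splitlines():
        for form in forms:
            match = re.match(form, line)
            if match:
                # Surrounding spaces are not part of the tag.
                return match.group(1).strip()
    return None


print(embedded_tag(b"/* hello.c - `main' for the hello world program\n", "hello.c"))
print(embedded_tag(b"/* tag: joe.hacker@gnu.org Thu Nov 29 17:25:15 PST 2001\n", "hello.c"))
```

Note that the basename form stops matching as soon as the file is renamed without editing the tag line, which is why the tag line has to be updated by hand in that case.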
When you use implicit tagging, it is ok if a file lacks any tag at
all, either explicit or implicit. In that case, if you rename the
file, ArX will think you've deleted the old file and added a new
one -- but aside from that, everything will work normally.
CAUTION: Leading and trailing spaces around an inventory tag are
not considered part of the tag. Within a tag, every non-graphical
character is replaced by _. For example, if you write the tag:
`main' for the hello    world program |
the actual inventory tag is:
`main'_for_the_hello____world_program |
It is possible that a future release of ArX will slightly change
the rule -- so that multiple spaces and tabs are replaced by a single
_.
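The current rule can be sketched as follows. This is a minimal illustration, not ArX's code: the function name normalize_tag is hypothetical, and "graphical" is assumed to mean printable ASCII other than space (0x21 through 0x7E).

```python
def normalize_tag(raw: str) -> str:
    """Trim surrounding whitespace, then replace every non-graphical character with '_'."""
    trimmed = raw.strip()
    # Assumption: "graphical" means printable ASCII excluding space.
    return "".join(c if "!" <= c <= "~" else "_" for c in trimmed)


# A run of four spaces becomes a run of four underscores under the current rule.
print(normalize_tag("  `main' for the hello    world program  "))
```

Under this rule, two tag lines that differ only in the amount of internal whitespace yield different inventory tags, so reformatting a tag line changes the identity of the file.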
If you are using naming conventions only to recognize source files,
then if you rename a directory or file, ArX will conclude that you
have deleted the old file, and created a new file.
If you are using an explicit source inventory, ArX will always
recognize when a directory is renamed (presuming that the .arch-ids
subdirectory is preserved), and it will recognize when a file is
renamed if you use move (rather than delete and add).
Of course, ArX can be fooled if you swap two files without swapping
their inventory tags.
If you are using an implicit inventory, ArX will never recognize
when an untagged file is renamed (it will think "delete" and
"add"). If a file is tagged explicitly, ArX will recognize when
the file is added, deleted, or renamed -- just as when using an
explicit inventory. If a file is not tagged explicitly, but has an
embedded tag, ArX will recognize when the file is added, deleted or
moved.
The command:
% arx tree-lint |
is useful for keeping things neat and tidy.
If you use explicit tagging, it will tell you of any tags for which the corresponding file does not exist. It will tell you of any files that pass the naming conventions, but for which no explicit tag exists.
If you use implicit tagging, it will tell you of any files for which no tag can be found -- either explicit or implicit. It will tell you of any explicit tags for which the corresponding file does not exist.
In either case, or if you are using naming conventions only,
tree-lint will tell you of any files that don't fit the naming
conventions at all.
Finally, if you use explicit or implicit tagging, tree-lint will
check for cases where multiple files use the same tag. If any two
files do have the same tag, you must correct that, either by
editing the tag (if it is in the file itself) or by using delete
and add to replace a duplicated explicit tag.
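The kinds of checks described above can be sketched in Python. This is an illustration of the idea only and does not reproduce the actual output or behavior of arx tree-lint. The function name lint and its two inputs (a mapping from path to tag, and the set of files that pass the naming conventions) are hypothetical.

```python
from collections import defaultdict


def lint(tags_by_path, source_files):
    """Report dangling tags, untagged files, and tags shared by several files.

    tags_by_path: dict mapping a tree-relative path to its tag (hypothetical input).
    source_files: set of paths that exist and pass the naming conventions.
    """
    paths_by_tag = defaultdict(list)
    for path, tag in tags_by_path.items():
        paths_by_tag[tag].append(path)
    return {
        "tags_without_file": sorted(p for p in tags_by_path if p not in source_files),
        "files_without_tag": sorted(p for p in source_files if p not in tags_by_path),
        "duplicate_tags": {
            tag: sorted(paths) for tag, paths in paths_by_tag.items() if len(paths) > 1
        },
    }


report = lint(
    {"hello.c": "id-1", "copy-of-hello.c": "id-1", "gone.c": "id-2"},
    {"hello.c", "copy-of-hello.c", "new.c"},
)
print(report)
```

Duplicate tags typically arise when a file containing an embedded tag is copied; the copy needs a tag of its own.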
When ArX considers the files and directories in a working directory
it builds a one-to-one index mapping path names (relative to the root
of the working directory tree) to inventory tags.
The inventory tag of a file is its "logical identity". The path is the position of that identity within the particular working directory.
You can see the inventory tag for each source file with the command:
% arx inventory --source --tags |
When ArX compares two project trees, it bases the comparison on
logical identities. If both trees have a file with a particular
inventory tag, but the files are in different positions, then ArX
considers the file to have been moved or renamed. Similarly, if an
inventory tag is present in one tree, but missing in the other, then
ArX considers the file to have been added or deleted.
If you use naming conventions only, the inventory tag of each file is
the same as its path. Thus, when using the names tagging method,
ArX never recognizes that a file has been moved or renamed.
When you use the explicit tagging method, inventory tags are stored
in the .arch-ids directories. There is a file in .arch-ids for
each tagged file (and one file for the directory containing
.arch-ids), and those files contain the tags.
When you use the implicit tagging method, tags in .arch-ids
directories take precedence (if they exist). If a file is not
explicitly tagged, ArX searches for the inventory tag in the file
itself (as described earlier in the chapter). Finally, if a file is
not tagged at all, then its path is used as the inventory tag.
Be cautious when changing tagging methods for directories already
checked-in to an ArX revision control archive.
For example, if you change from the tagging method names to
explicit, then the inventory tag for every file will change. ArX
will think that you've deleted all of the files in the old tree, and
added all of the files in the new tree.
In some situations, it isn't convenient to explicitly tag every file or to add an implicit tag to every file.
You can supply a default tag for every file that doesn't have an explicit tag with the command:
% arx explicit-default TAG-PREFIX |
After that, every file in that directory which lacks an explicit tag will have the tag:
TAG-PREFIX__BASENAME |
where BASENAME is the basename of the file. Default tags created in
this way take precedence over implicit tags embedded in files. You
can find out the default tag for a directory with:
% arx explicit-default
TAG-PREFIX |
and remove the default with:
% arx explicit-default --delete |
You can also specify a default tag which has lower precedence than implicit tags:
% arx explicit-default --weak TAG-PREFIX |
and view that default:
% arx explicit-default --weak |
or delete it:
% arx explicit-default --weak --delete |
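Taken together, the precedence rules described in this chapter can be sketched as a single lookup chain. The Python below is illustrative only. The function name effective_tag and its parameters are hypothetical, and two points are assumptions rather than documented behavior: that a weak default produces a tag of the same TAG-PREFIX__BASENAME form as a strong default, and that the path is used as the last resort in every case.

```python
from typing import Optional


def effective_tag(
    path: str,
    explicit: Optional[str] = None,
    default_prefix: Optional[str] = None,
    embedded: Optional[str] = None,
    weak_prefix: Optional[str] = None,
) -> str:
    """Pick the inventory tag for `path`, highest-precedence source first."""
    basename = path.rsplit("/", 1)[-1]
    if explicit is not None:
        return explicit                        # explicit tag always wins
    if default_prefix is not None:
        return f"{default_prefix}__{basename}"  # default outranks embedded tags
    if embedded is not None:
        return embedded                        # tag found inside the file
    if weak_prefix is not None:
        return f"{weak_prefix}__{basename}"     # weak default: assumed same format
    return path                                # untagged: fall back to the path


print(effective_tag("lib/buffer.c", embedded="buffer-impl", default_prefix="libbuf"))
print(effective_tag("lib/buffer.c", embedded="buffer-impl", weak_prefix="libbuf"))
```

The two calls differ only in the strength of the default, and show which of the embedded tag and the default is used in each case.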
When using implicit tags, you may sometimes have a directory with many
files that have no tag (either explicit or implicit), but not want
those files to appear in a report of untagged files generated by
tree-lint. There are two ways to tell tree-lint to shut up
about such files:
One is to provide a default explicit tag or weak default explicit tag
using arx explicit-default, as described above.
The second method is to label the directory as a "don't care"
directory -- which means that tree-lint shouldn't complain about
untagged files. You can do that with:
% arx explicit-default --dont-care set |
or remove the "don't care" flag with:
% arx explicit-default --delete --dont-care |
You can find out whether the "don't care" flag is set in a given directory with:
% arx explicit-default --dont-care |
Given the choice of the names, explicit, and implicit tagging
conventions, which one should you choose?
The explicit method is the default. It requires manually
informing ArX that a particular file should be under version control.
Both names and implicit try to guess what kind of files
should be archived, and which shouldn't. Unless you are very careful,
and, for example, don't include any generated files in your source
directory, names and implicit will accidentally add
unwanted files to your archive.
The names method is best for project trees that you don't
control, and for which the maintainer does not include file tags
(either explicit or implicit). For such trees, the names
method will always work, but if you want to use the explicit or
implicit method, you'll have to add file tags yourself. It
also works reasonably well for scripts (such as perl, python, or
shell), because there are no object files that can be accidentally
included in the archive.
The implicit method is, for some, the most convenient. You
just get in the habit of adding a tag: line to the top of
each new file and doing a single arx add for each directory.
After those steps, you can rename files and directories freely --
without having to remember to tell ArX in a separate command.
On the other hand, the implicit method has two limitations.
One limitation is that you must accept the possibility of accidentally
adding new files to the inventory. Any file you create that passes
the naming conventions counts as source. The other, closely related,
limitation is that if you use implicit inventories, you will
never want to compile a program in its own source
directory. When you compile a program, that creates intermediate
files and executables. Many of those files will almost certainly pass
the naming conventions for source -- so ArX will wrongly
include them in a source inventory. You might want to include a
safeguard in your configure scripts that causes them to refuse
to compile programs in the source tree.
The file {arch}/=tagging-method defines the naming conventions used
for a particular project tree. By editing that file, you can
establish naming conventions that are different from the defaults,
which are described above.
That file can contain blank lines and comments (lines beginning with #) and directives, one per line. The permissible directives are:
implicit
explicit
names
        specify the tagging method to use for this tree |
exclude RE
junk RE
backup RE
precious RE
unrecognized RE
source RE
        specify a regular expression to use for the indicated category of files. |
Regular expressions are specified in Posix ERE syntax (the same syntax used by egrep, grep -E, and awk) and have default values which implement the naming conventions described above.
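To make the file format concrete, here is a small Python sketch of a reader for it. It is a hypothetical helper, not part of ArX: the function name parse_tagging_method is invented, and the handling of malformed lines (raising an error) and of separators (any run of whitespace between the directive and its RE) are assumptions.

```python
METHODS = {"implicit", "explicit", "names"}
CATEGORIES = {"exclude", "junk", "backup", "precious", "unrecognized", "source"}


def parse_tagging_method(text):
    """Parse the contents of a =tagging-method file into (method, patterns)."""
    method = None
    patterns = {}
    for lineno, line in enumerate(text.splitlines(), 1):
        if not line.strip() or line.startswith("#"):
            continue  # blank line or comment
        parts = line.strip().split(None, 1)
        word = parts[0]
        rest = parts[1].strip() if len(parts) > 1 else ""
        if word in METHODS and not rest:
            method = word
        elif word in CATEGORIES and rest:
            patterns[word] = rest
        else:
            # Assumption: anything else is treated as an error in this sketch.
            raise ValueError(f"line {lineno}: unrecognized directive: {line!r}")
    return method, patterns


sample = "# local conventions\n\nexplicit\njunk ^(,.*)$\nsource ^([_=a-zA-Z0-9].*)$\n"
print(parse_tagging_method(sample))
```

A category that is absent from the parsed result would keep its default regular expression.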
The exclude pattern should match a subset of files matched by the
source pattern. Files which match exclude are printed by:
% arx inventory --source --control |
but not printed by:
% arx inventory --source |
Although you can define your own naming conventions, there are some minor limitations:
The file names . and .. are always ignored by inventory.
File names which contain non-printing characters, spaces, or any of
the globbing characters (*, [, ], \, ?) are always placed
in the category unrecognized. This is so that tools which operate
on project trees can safely presume that no source file has a name
that includes these characters.
File names which begin with ,, are always placed in the category
junk. This is so that tools which operate on a project tree can
safely destroy or create files beginning with ,,.
The default naming conventions are given by:
exclude ^(.arch-ids|\{arch\})$
junk ^(,.*)$
backup ^.*(~|\.~[0-9]+~|\.bak|\.orig|\.rej|\.original|\.modified|\.reject)$
precious ^(\+.*|\.gdbinit|=build\.*|=install\.*|CVS|CVS\.adm|RCS|RCSLOG|SCCS|TAGS)$
unrecognized ^(.*\.(o|a|so|core)|core)$
source ^([_=a-zA-Z0-9].*|\.arch-ids|\{arch\}|\.arch-project-tree)$ |
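As an illustration of how these defaults partition file names, the following Python sketch applies them to a single basename. It is not ArX's implementation. The function name classify is hypothetical, and the order in which categories are tested (fixed rules first, then junk, backup, precious, unrecognized, and finally source with exclude as a sub-case) is an assumption inferred from the prose of this chapter. Python's re module accepts these particular expressions unchanged, although it is not a POSIX ERE engine in general.

```python
import re

# Default patterns, copied from the listing above.
DEFAULTS = {
    "exclude": r"^(.arch-ids|\{arch\})$",
    "junk": r"^(,.*)$",
    "backup": r"^.*(~|\.~[0-9]+~|\.bak|\.orig|\.rej|\.original|\.modified|\.reject)$",
    "precious": r"^(\+.*|\.gdbinit|=build\.*|=install\.*|CVS|CVS\.adm|RCS|RCSLOG|SCCS|TAGS)$",
    "unrecognized": r"^(.*\.(o|a|so|core)|core)$",
    "source": r"^([_=a-zA-Z0-9].*|\.arch-ids|\{arch\}|\.arch-project-tree)$",
}

# Space and globbing characters that force a name into "unrecognized".
FORBIDDEN = set(" *[]\\?")


def classify(name: str) -> str:
    """Return the category of one file name (a basename, not a path)."""
    if name in (".", ".."):
        return "ignored"
    if name.startswith(",,"):
        return "junk"  # fixed rule; checked first in this sketch (assumed order)
    if any(c in FORBIDDEN or not c.isprintable() for c in name):
        return "unrecognized"  # fixed rule
    for category in ("junk", "backup", "precious", "unrecognized"):
        if re.search(DEFAULTS[category], name):
            return category
    if re.search(DEFAULTS["source"], name):
        # Names matching exclude are source, but hidden from the plain source listing.
        return "exclude" if re.search(DEFAULTS["exclude"], name) else "source"
    return "unrecognized"


for name in ("hello.c", "hello.c.~1~", ",scratch", "+notes", "hello.o", "{arch}", "my file.c"):
    print(name, "->", classify(name))
```

Swapping in the patterns from a customized =tagging-method file would only require replacing entries in the DEFAULTS dictionary.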