We use correspondence analysis (CA) to arrange all of the elements relative to axes in multidimensional space according to their similarity to each other. Most of the variation is captured by the first axis, with later axes accounting for progressively less. The advantages of CA are that it provides the same scaling of sample (locality) and character (taxa) plots, enabling direct comparison, and that it can accommodate 'incomplete' data matrices where some information is missing (Hill 1979; Gauch 1982), as always occurs with the fossil record (e.g. Rees and Ziegler 1999; Rees et al. 2000).
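To make the idea concrete, here is a minimal sketch of reciprocal averaging, one classical way of computing the first CA axis. It is only an illustration: the function name and the toy table are made up for this example, it assumes every locality and every taxon has at least one occurrence, and it does not reproduce Canoco's own algorithm or its scaling options, so the scores will not match Canoco output number for number.

```python
def reciprocal_averaging(table, iterations=100):
    """First CA axis of a presence/abundance table.

    table: list of rows (localities), each a list of values per taxon.
    Assumes every row and every column has a non-zero total.
    Returns (locality_scores, taxon_scores).
    """
    n_rows, n_cols = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(n_rows)) for j in range(n_cols)]
    grand = sum(row_tot)

    taxa = [float(j) for j in range(n_cols)]  # arbitrary starting scores
    for _ in range(iterations):
        # Locality score = weighted mean of the scores of its taxa.
        sites = [sum(table[i][j] * taxa[j] for j in range(n_cols)) / row_tot[i]
                 for i in range(n_rows)]
        # Centre (weighted by row totals) to remove the trivial constant axis.
        mean = sum(w * s for w, s in zip(row_tot, sites)) / grand
        sites = [s - mean for s in sites]
        # Rescale to unit weighted variance so the scores do not shrink away.
        sd = (sum(w * s * s for w, s in zip(row_tot, sites)) / grand) ** 0.5
        sites = [s / sd for s in sites]
        # Taxon score = weighted mean of the scores of its localities.
        taxa = [sum(table[i][j] * sites[i] for i in range(n_rows)) / col_tot[j]
                for j in range(n_cols)]
    return sites, taxa


# Toy presence matrix (hypothetical): 4 localities x 4 taxa along one gradient.
toy = [
    [1, 1, 0, 0],
    [1, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
]
locality_scores, taxon_scores = reciprocal_averaging(toy)
print(locality_scores)
print(taxon_scores)
```

Localities that share taxa end up with similar scores, and the taxa are scored on the same axis, which is what allows the locality and taxa plots to be compared directly.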
Step by step, these are the things to keep in mind as you're setting up a matrix:
1. Canoco is a Fortran program that has a very strict format that must be followed or it won't know how to read the files.
2. The horizontal numbers are the species identifiers. They do not need to be in increasing order (but it's not a bad habit). Only the species present at a locality should be included, because Canoco does not accept a "0.0" occurrence. The species names are then given in the first list after the matrix, which has 10 identifiers of 8 characters each per line, followed by a [Return]. For example, species 20 = Betul.
3. The vertical numbers (1,2,3,4,5,8,9,13) are the locality numbers. They need to be in increasing order but not in immediately consecutive order. Their identifiers are in the second list, and are also 10 per line of 8 characters each.
4. The overall format is the following:
1st line: You define what the data matrix is, any text is fine but keep it under 80 characters.
2nd line: This defines the layout of each line of the data sheet. In this example, "I2" refers to the 2 spaces allotted to the vertical (locality) numbers (e.g. .1). "7" refers to the number of presence-absence couplets per line (the sequence "6 1.0" is a couplet). "I5" is the number of characters per species number (e.g. ..149 is correct, as is ...20). "F5.0" defines the number of characters per presence identifier (..1.0 is the correct 5 characters). Remember to close all parentheses.
3rd line: State again the number of couplets per line.
4th-19th lines (or more of course): This is your data matrix. Again, include only the species present at a locality. It is fine to extend into additional lines if you have more than 7 (in this example) species for a locality, but you need to start each continuation line with the locality identifier (the "I2" field in this example; see localities 8, 9 and 13). Remember to end each line with a [Return]. I recommend using a fixed-width font such as Courier, in which the columns line up perfectly; it is then easier to see if you have made spacing mistakes.
20th line: Always end your data matrix with a 0 (zero) then a [Return].
lines 21-40: This is where you define your species--count across to find out what each number refers to. You can have extra definitions in, but make sure your highest species number agrees with the highest position in the matrix.
lines 41-42: This is where you define your localities; in this example, locality 8 is actually climate station 551. For both definition matrices remember to use 8 characters and only 10 identifiers per line (ending each line with a [Return]).
5. Remember: Canoco is picky! If you are one space off, it will try to run the analysis with completely wrong numbers and sometimes succeed, giving you an ordination of something completely bizarre. Take the time to do this part correctly, because it's not so easy to see the mistakes after they've been made.
6. Save the file as "Text".
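Because hand-typed spacing is where most mistakes creep in, it can help to generate the file with a short script. The sketch below follows the layout described in the steps above, but several details are my assumptions rather than anything confirmed by the Canoco manual: the exact format string "(I2,7(I5,F5.0))", writing the couplet count on line 3 as a bare number, right-justifying the closing 0 in the locality field, and padding unused name positions with blanks. The function names are made up for the example. Check the result against a file that you know Canoco reads before trusting it.

```python
def name_block(names, highest):
    """Name lines: 8 characters per name, 10 names per line.

    names: {number: name}; positions without a name are left blank (assumption).
    """
    cells = [f"{names.get(k, ''):<8.8}" for k in range(1, highest + 1)]
    return ["".join(cells[i:i + 10]) for i in range(0, highest, 10)]


def build_canoco_lines(title, occurrences, species_names, locality_names,
                       couplets=7, loc_width=2):
    """Build the lines of a presence-only data file.

    occurrences: {locality_number: [species_number, ...]} listing only the
    species present at each locality.
    """
    lines = [
        title[:79],                                  # line 1: under 80 characters
        f"(I{loc_width},{couplets}(I5,F5.0))",       # line 2: assumed format string
        str(couplets),                               # line 3: couplets per line
    ]
    for loc in sorted(occurrences):                  # localities in increasing order
        present = occurrences[loc]
        # Split long localities over several lines, each restarting
        # with the locality number.
        for start in range(0, len(present), couplets):
            chunk = present[start:start + couplets]
            couplet_text = "".join(f"{sp:5d}{1.0:5.1f}" for sp in chunk)
            lines.append(f"{loc:{loc_width}d}" + couplet_text)
    lines.append(f"{0:{loc_width}d}")                # a 0 closes the data matrix
    highest_species = max(sp for present in occurrences.values() for sp in present)
    lines += name_block(species_names, highest_species)
    lines += name_block(locality_names, max(occurrences))
    return lines


# Hypothetical two-locality example using identifiers mentioned in the text.
example = build_canoco_lines(
    "Example presence matrix",
    {1: [6, 20], 8: [20, 149]},
    {20: "Betul"},
    {8: "551"},
)
print("\n".join(example))
# To save it as plain text:
# with open("CANOCO.SPE", "w") as handle:
#     handle.write("\n".join(example) + "\n")
```

Printing the lines in a fixed-width font before saving makes it easy to check that every couplet occupies exactly 10 characters.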
II. RUNNING CANOCO
Canoco has a dialogue format, where it asks various questions and you give the answers. For the most part, it's fairly straightforward, but I have provided below the best answers to give for a correspondence analysis. Each question is followed on the next line by the answer to type, with an occasional note after it.
Type 0 for input from screen
RETURN
Type 1 for long dialogue
1
Type 1 for changing maximum data size
RETURN
Type name of file with species data
CANOCO.SPE
hit RETURN which pulls up a menu, then open your file as you normally do in a Mac program
Type name of file with covariables
RETURN
Type name of file with environmental data
RETURN
Type name of print file
CANOCO.OUT
type in "your choice of names".out
Type name of solution file
CANOCO.SOL
type in "your choice of names".sol
Type of analysis:
4
Scaling of ordination axes
-1
Species and sample diagnostics
0
Enter number of samples to be omitted
RETURN
Transformation of species data
RETURN
Type weight to be given to species
RETURN
Type weight to be given to samples
RETURN
Weighting of species required?
RETURN
it then prints out lists of numbers to your files and then says...
Press RETURN for more, S to skip...
RETURN
Output option for
spec-scor samp-scor
2 2
RETURN
Type 0=stop
RETURN
it should say "successful completion of run"
so hit RETURN to get out of the program.
If you have any problems during the run, "Apple key" "." gets you out.
III. USING THE OUTPUT
The ??.out file gives you a log of the dialogue you've just had and it tells you the variance per axis; it's good to keep on record.
The ??.sol file gives you the first 4 axes' scores for both species and localities; I've mostly worked with the locality plots but the species scores may be very useful in plots of fossil data.
Open both of these files in Microsoft Word and use a small font so that the columns line up and make sense. Stay away from the Canoplot program, because it's really unwieldy.
What I tend to do at this point is to take the axis scores into a spreadsheet like Excel and manipulate them from there. We have various ways of graphing the axes, or you can find your own method.
- CANOCO for Macintosh, PC or mainframe systems may be obtained from Richard E. Furnas, Microcomputer Power, 111 Clover Lane, Ithaca, N.Y. 14850. Tel: 607-272-2188. Fax: 607-272-0782.
References:
Gauch, Jr., H.G. 1982. Multivariate analysis in community ecology. In: Beck, E., Birks, H.J.B. & Connor, E.F. (eds) Cambridge studies in ecology. Cambridge University Press, New York, 298 p.
Hill, M.O. 1979. Correspondence analysis: a neglected multivariate method. Applied Statistics, 23, 340-354.
Shi, G.R. 1993. Multivariate data analysis in paleoecology and paleobiogeography - a review. Palaeogeography, Palaeoclimatology, Palaeoecology, 105, 199-234.
Ter Braak, C.J.F. 1992. CANOCO - a FORTRAN program for canonical community ordination. Ithaca, N.Y. Microcomputer Power. 95pp (plus software version 3.11, Nov. 1990).