ABSTRACT: The high-resolution estimates of temporal
mixing in shell beds: the evils and virtues of time-averaging.
Abstract. ---- This study explores time-averaging (temporal mixing) at very high sampling resolution: that of adjacent shells collected from the same stratum. Nine samples of the bivalve Chione fluctifraga were collected from four Holocene cheniers (beach ridges) on the Colorado Delta (Gulf of California) and 165 shells were dated using radiocarbon-calibrated amino-acid racemization (D-alloisoleucine/L-isoleucine). The age range of shells within samples averages 661 years and, in seven out of nine samples, exceeds 500 years. The sample standard deviation ranges from 73 to 294 years and averages 203 years. Thus, even within-sample estimates of time-averaging indicate extensive temporal mixing in bioclastic deposits. No matter how carefully collected, data from shell beds may not be suitable for studying processes on time-scales shorter than hundreds to thousands of years. Comparison of our data with the estimates obtained from other cheniers at coarser sampling resolutions indicates that pooling of samples drastically increases time-averaging in paleontological data. Time-averaging is homogeneous among strata within cheniers, but varies among cheniers. Thus, deposits of seemingly identical origin may vary in their temporal resolution -- apparently comparable shell beds may differ in paleontological patterns (e.g., species diversity) due to cryptic variation in time-averaging. Age-distributions of dated shells indicate that, at a 50-year resolution, the samples provide a continuous and uniform record for the entire interval. The incompleteness observed in the samples can easily be simulated by sampling a 100%-complete, uniform record. The mean sample completeness of the actual samples (63.6%) is very close to that predicted by the simulations (67.3%).
Shell beds can record the optimal type of time-averaging, where paleobiological data are a time-weighted average of the faunal composition from the spectrum of environments that existed during the entire interval of time. Coordinated stasis may reflect a long-term averaging of taxa from a similar spectrum of environments, and not necessarily ecological locking. Also, within the range of radiocarbon dating, shell beds can provide a 100%-complete, high resolution record.
Michal Kowalewski. Institute of Paleobiology, Polish Academy of
Sciences, Twarda 51/55, 00-818, Warszawa, Poland. E-mail:
michael.kowalewski@uni-tuebingen.de
Glenn A. Goodfriend. Geophysical Laboratory, Carnegie Institution of
Washington, 5251 Broad Branch Rd., NW., Washington DC 20015
Karl W. Flessa. Department of Geosciences, University of Arizona, Tucson
AZ 85721
Introduction
Shell beds and shell-rich deposits are one of the primary sources
of paleontological data in the Phanerozoic fossil record. These deposits,
however, typically undergo extensive temporal mixing (time-averaging)
during their formation (e.g., Walker and Bambach 1971; Peterson 1977;
Staff et al. 1986; Wilson 1988; Fürsich and Aberhan 1990; Kidwell and
Bosence 1991; Kidwell and Behrensmeyer 1993; Kidwell and Flessa 1995;
Kowalewski 1996a). In the last decade, high-resolution dating methods have
been used to quantitatively estimate the extent of time-averaging. These
intensive studies have shown that temporal mixing on the scale of hundreds
to tens of thousands of years is the rule rather than the exception in
marine, lacustrine, and terrestrial deposits (Behrensmeyer 1982;
Goodfriend 1987; Cohen 1989; Powell and Davis 1990; Flessa et al. 1993;
Goodfriend and Mitterer 1993; Flessa and Kowalewski 1994; Wehmiller et al.
1995; Goodfriend and Gould 1996; Martin et al. 1996; Anderson et al. 1997;
Meldahl et al. 1997).
However, due to the high cost of dating, most studies have been
based on a small number of dates, while the few larger datasets had low
sampling resolution, being either literature compilations (e.g., Flessa
and Kowalewski 1994) or based on data collected over larger areas (e.g.,
Meldahl et al. 1997). Thus, our understanding of time-averaging is itself
biased by averaging caused by pooling dates from different samples, sites,
and environments ('analytical time-averaging' [see Fürsich and Aberhan
1990; Behrensmeyer and Hook 1992]). Consequently, we still lack
information about time-averaging at the highest and most fundamental
resolution: that of a collection of fossils from a single sample of a
minimal stratigraphic span (i.e., confined to the smallest indivisible
stratigraphic unit that can be distinguished in the outcrop). How much
time-averaging is there likely to be within the bag or slab of fossils the
paleontologist brings back from the field and uses as the finest sample
unit in subsequent analyses?
We used extensive radiocarbon-calibrated amino-acid racemization
dating of mollusk shells from Holocene shelly deposits to assemble a
large dataset that offers statistically meaningful insights into the
scale, variation, and internal structure of time-averaging at the highest
sampling resolution -- that of adjacent shells from within the same
stratum.
Material and Dating Technique
Study area and sampling. ---- We studied temporal resolution
in the bioclastic beach ridges, or "cheniers", from the tidal flats of the
Colorado River Delta (Fig. 1). These cheniers are lag-concentrations
(sensu Kidwell 1991) formed through the reworking of the intertidal
mudflat during episodes of low sediment input from the Colorado River
(Thompson 1968; Kowalewski et al. 1994). The most recent episode -- caused
by diversion of the river for irrigation, power, and flood control --
began 100 years ago (Fradkin 1984). As a result, cheniers have been
forming in the upper intertidal zone. Older cheniers, situated landward in
the supratidal flats, correspond to previous episodes of mudflat reworking
caused by natural diversions of the river to the Salton trough (Thompson
1968; Kowalewski et al. 1994; Goodfriend et al. 1995).
This study focuses on a single species, the venerid bivalve
Chione fluctifraga. This is because we have already developed a
reliable, efficient, and inexpensive dating technique specifically for
dating the shells of this species (see below for details) and because this
is one of the two most common mollusks found in the cheniers (Kowalewski
et al. 1994).
Data were obtained from four cheniers situated in the central part
of the lower delta (Fig. 1B): Chenier 1 is situated in the upper
intertidal, and Cheniers 2, 3, and 4 are increasingly older ridges, partly
buried within the supratidal muds (Fig. 1C). Nine samples were collected
at various depths from five trenches excavated in the four cheniers (Fig.
1C) and 165 complete valves of C. fluctifraga were dated using
amino acid ratios. Except for sample 3-150 with 7 individuals, all samples
included from 18 to 21 dated valves (Table 1).
Each sample was collected from well-exposed trench walls by
hand-picking C. fluctifraga valves (articulated bivalve shells are
very rarely found in the cheniers [Kowalewski et al. 1994]). We collected
directly adjacent valves laterally, that is parallel to the sedimentary
layering. In those cases when neither stratification was visible nor
depositional dip indicated, we collected specimens from the same depth in
the trench. This collection method minimizes the stratigraphic and spatial
span of a sample. Our sampling technique is likely to be the
finest-resolution stratigraphic sampling available to macroinvertebrate
paleontologists.
Dating technique. ---- Each valve was analyzed for
its A/I (alloisoleucine/isoleucine) ratio and that value was used to
estimate its age using a calibration equation based on radiocarbon ages
(Fig. 2) (see Goodfriend 1989 for a discussion of this method). A/I values
were determined by HPLC (high-performance liquid chromatography), using
peak area ratios calibrated against a standard A/I mixture (see Goodfriend
et al. 1997 for details of procedures). Samples were taken from the hinge
area of each shell to avoid intrashell variation in racemization
(Goodfriend et al. 1997). Shells collected from the surface were excluded
from this analysis because of possible differences in racemization rate
due to surface heating. No differences in rates were found between
samples buried at various depths from 20 to 150 cm (Goodfriend et al.
1995).
The age of a shell (in years) is estimated from the equation: Age =
(A/I - 0.008) × 10741, where 0.008 is the mean A/I value measured in
live-collected, pre-bomb Chione shells (based on eight analyses),
and 10741 is the 14C-calibrated racemization rate (Fig. 2). This rate is
the slope derived from simple linear regression of 14C ages against A/I
values for 15 individual Chione shells from Chenier 3, plus a point
representing living Chione (based on their mean apparent 14C age
[Goodfriend and Flessa submitted] and their mean A/I value). The
measurement error for A/I values is ≤4% and thus can be ignored as a
significant source of age variation among shells (i.e., the apparent
time-averaging caused by measurement error is a few tens of years, at
most). Of 165 dated shells, 98.8% came from the last 1,500 years, and
only two were significantly older (3,416 and 7,379 years). The age
estimates for these two outliers are uncertain because the radiocarbon
calibration was based on extrapolation of the racemization rate determined
from much younger shells (Fig. 2). The two shells were excluded from the
analysis; this makes our estimates of time-averaging more conservative.
The raw data including A/I values and the corresponding age estimates are
listed in the Appendix.
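The calibration equation above can be sketched in a few lines of Python. This is only an illustration: the constants are those quoted in the text, whereas the function name and the example A/I values are hypothetical and are not taken from the Appendix.

```python
# Constants quoted in the text; names are illustrative only.
A_I_MODERN = 0.008   # mean A/I in live-collected, pre-bomb Chione shells
RATE = 10741         # 14C-calibrated racemization rate (years per unit of A/I)


def age_from_ai(ai_ratio: float) -> float:
    """Estimated shell age (years) from its A/I ratio, using the linear calibration."""
    return (ai_ratio - A_I_MODERN) * RATE


if __name__ == "__main__":
    # Hypothetical A/I values, chosen only to show the arithmetic.
    for ai in (0.008, 0.050, 0.100):
        print(f"A/I = {ai:.3f} -> {age_from_ai(ai):.0f} years")
```

Because the calibration is linear, a given absolute error in A/I maps to the same error in years regardless of shell age within the calibrated interval.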
Analytical Methods and Results
Scale of time-averaging. ---- Statistically,
time-averaging is the dispersion of an age-distribution, and thus, can be
best estimated using dispersion measures. Previous workers used the age
range between the youngest and oldest shell to estimate time-averaging
(e.g., Flessa et al. 1993, Flessa and Kowalewski 1994; Meldahl et al.
1997). However, the range is an estimate that is very sensitive to sample
size, is based on extreme outliers, and is difficult to handle statistically.
Time-averaging has also been estimated using shell half-life: the amount
of time needed to remove 50% of shells present initially (Cummins et al.
1986; Meldahl et al. 1997). However, this measure, based on a best-fit
exponential curve for the age-frequency distribution, assumes a continuous
input of shells through time (Meldahl et al. 1997) and, more importantly,
is sensitive to the resolution (binning) at which the data are
analyzed.
We report the age range, to enable comparison with previous
studies, but focus our analysis on the standard deviation (SD), a measure
of dispersion which largely avoids the problems of the range or half-life
approach and can be interpreted literally as the average departure of a
shell's age from the mean shell age. Note that it is appropriate to use
SD, and not the coefficient of variation (CV), because the dispersion
(time-averaging) is independent from the mean (average shell age):
time-averaging is not a function of the age of the deposit (except,
perhaps, at the macroevolutionary time-scale [Kidwell and Brenchley 1996;
Kowalewski 1996a]). Thus, an increase in the stratigraphic age should be
viewed as an additive transformation which shifts a distribution toward
higher values but does not affect its dispersion parameters (as do the
shifts of the mean caused by multiplicative transformations, for
example).
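The argument for SD over CV can be checked numerically. The sketch below uses hypothetical shell ages (not data from this study) and shifts them by a constant to mimic an older deposit with the same internal age structure: the SD is unchanged, whereas the CV shrinks simply because the mean grows.

```python
import statistics


def sd_and_cv(ages):
    """Return the sample standard deviation and coefficient of variation."""
    sd = statistics.stdev(ages)
    cv = sd / statistics.mean(ages)
    return sd, cv


# Hypothetical shell ages in years (illustrative only).
ages = [120, 250, 310, 480, 700]
# The same age structure in a deposit that is 2000 years older (additive shift).
older = [a + 2000 for a in ages]

sd_young, cv_young = sd_and_cv(ages)
sd_old, cv_old = sd_and_cv(older)
print(f"SD: {sd_young:.1f} vs {sd_old:.1f}")   # identical dispersion
print(f"CV: {cv_young:.3f} vs {cv_old:.3f}")   # CV drops as the mean increases
```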
The confidence intervals around the SD were estimated using a
balanced bootstrap (see Hall 1992; Kowalewski 1996b: fig. 2). We used
bootstrapping because it avoids the assumptions of parametric tests (e.g.,
the form of the sample distribution), often offers more power than other
non-parametric tests, and allows the researcher to customize statistical
parameters and tests according to specific needs (see Diaconis and Efron
1983; Manly 1991). Each original sample was resampled with replacement
5,000 times (the pilot bootstrap runs showed that the estimates of SD
stabilized around 4,000 iterations). The SD was calculated for each
bootstrap sample, and the 0.5, 2.5, 97.5, and 99.5 percentiles of the resulting
sampling distribution were used to estimate the 95% and 99% confidence intervals
around each standard deviation ('naive bootstrap' [Efron 1981]). The
bootstrap estimates showed a small bias (around 10 years) toward lower
SD-values (i.e., the means of the bootstrap distributions were slightly
smaller than the actual estimates of SD). The bias -- a common problem in
bootstrapping non-normally distributed parameters (Manly 1991) -- was
corrected by standardizing the mean standard deviation of the bootstrap
distribution to the standard deviation of the original sample. Note that
the bias correction could be further improved by complex,
computer-intensive methods such as accelerated bias correction (e.g.,
DiCiccio and Romano 1988). This seemed superfluous here, however, given
that the bias is so small that it would not have had any effect on our
interpretation even if no correction were applied.
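The resampling procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: the balanced bootstrap is implemented by pooling 5,000 copies of the sample, shuffling, and cutting the pool into blocks; the bias correction is read here as an additive shift that moves the mean of the bootstrap SDs onto the observed SD (other readings are possible); the percentile rule is a simple nearest-rank rule; and the example ages are hypothetical.

```python
import random
import statistics


def balanced_bootstrap_sd(ages, n_boot=5000, seed=0):
    """Balanced bootstrap of the sample SD with percentile confidence intervals."""
    rng = random.Random(seed)
    n = len(ages)
    observed_sd = statistics.stdev(ages)

    # Balanced resampling: each observation appears exactly n_boot times
    # in the pool, which is shuffled and cut into n_boot samples of size n.
    pool = list(ages) * n_boot
    rng.shuffle(pool)
    boot_sds = [statistics.stdev(pool[i * n:(i + 1) * n]) for i in range(n_boot)]

    # ASSUMPTION: bias correction as an additive shift so that the mean of the
    # bootstrap distribution equals the SD of the original sample.
    shift = observed_sd - statistics.mean(boot_sds)
    corrected = sorted(sd + shift for sd in boot_sds)

    def pct(p):
        # Nearest-rank percentile of the sorted, corrected distribution.
        return corrected[round(p / 100 * (n_boot - 1))]

    return {
        "sd": observed_sd,
        "bias": -shift,  # mean bootstrap SD minus observed SD
        "ci95": (pct(2.5), pct(97.5)),
        "ci99": (pct(0.5), pct(99.5)),
    }


# Hypothetical ages (years) standing in for one sample of dated shells.
example_ages = [35, 60, 110, 150, 180, 220, 260, 300, 340, 400,
                450, 520, 600, 680, 750, 830, 900, 1010]
result = balanced_bootstrap_sd(example_ages)
print(result)
```

A lower confidence bound well above zero corresponds to the statement that the SD is significantly larger than zero.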
When expressed as a range, time-averaging varies in our samples
from 190 to 1060 years with a mean sample range of 661 years (Table 1).
The standard deviation varies among samples from 50 to 294 years with a
mean of 203 years. Thus, the average shell from a chenier sample differs
by ≈200 years from the mean sample age. The confidence intervals around
the SD (Table 1) indicate that the SD is significantly larger than zero in
all samples.
Scale of Analytical Time-Averaging. ---- Flessa and
Kowalewski (1994) compiled radiocarbon dates from the literature to
estimate time-averaging in nearshore and shelf environments. One of their
datasets included estimates for 49 cheniers from all over the world (for
data summary and literature sources see Flessa and Kowalewski 1994). Those
estimates were all affected by pooling of samples: i.e., shells used to
calculate each estimate were not all from a single sample but came from
different sites or strata. Nevertheless, the pooling was limited, because
the shells were typically collected from a single chenier or a single
chenier series. Because our data are unaffected by sample pooling, we can
compare them with those of Flessa and Kowalewski (1994: table 3) to test
the hypothesis that pooling of samples increases levels of time-averaging
in the data.
The result confirms the expectations (Fig. 3). Mean age range
based on 49 localities (3,289 years) is almost five times higher than the
value of 661 years obtained for our nine samples (Figs. 3A, 3B). Note that
the difference may reflect unequal sample sizes: given the right-skewness
of the distribution (Fig. 3A), the arithmetic mean will tend to decrease
for small samples because such samples are less likely to include
observations from the tail (e.g., Fig. 3B). Nevertheless, the one-tailed,
two-sample bootstrap test indicates that the observed difference is
statistically significant even when this sampling effect is accounted for
(p = 0.0046) (Fig. 3C). Note here that we compared a literature dataset
based on data for cheniers from all over the world with nine samples from
one study area. It is, thus, possible that the observed difference
reflects the fact that the Colorado cheniers are exceptionally little
affected by time-averaging relative to average cheniers. However, the
Colorado beach ridges are a classic example of cheniers (Kowalewski et al.
1994), and thus, analytical time-averaging seems a much more parsimonious
explanation (we do admit: it is also much more exciting) than some unknown
differences between the Colorado and all other cheniers.
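One way to account for the sample-size effect mentioned above is to ask how often a random draw of nine values from the larger dataset yields a mean as low as the one observed. The sketch below shows this simplified, one-tailed resampling scheme. It is an assumption about the general approach rather than a reproduction of the test used here, and both lists of age ranges are made-up placeholders, not the literature data or the values in Table 1.

```python
import random
import statistics


def subsample_mean_test(large, small, n_iter=10_000, seed=0):
    """One-tailed resampling test.

    Repeatedly draws len(small) values (with replacement) from `large` and
    returns the proportion of draws whose mean is at or below mean(small).
    """
    rng = random.Random(seed)
    k = len(small)
    observed = statistics.mean(small)
    hits = 0
    for _ in range(n_iter):
        draw = [rng.choice(large) for _ in range(k)]
        if statistics.mean(draw) <= observed:
            hits += 1
    return hits / n_iter


# Placeholder age ranges in years (NOT the real data): a right-skewed
# "pooled" dataset and a smaller "within-sample" dataset.
pooled_ranges = [300, 600, 900, 1200, 1800, 2500, 3200, 4000,
                 5500, 7000, 9000, 12000]
within_sample_ranges = [200, 400, 500, 600, 700, 800, 900, 1000, 1100]

p_value = subsample_mean_test(pooled_ranges, within_sample_ranges)
print(f"one-tailed p = {p_value:.4f}")
```

Drawing subsamples of the same size as the smaller dataset builds the right-skewness effect into the null distribution, so a small p-value cannot be attributed to unequal sample sizes alone.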
The Age-Structure and Completeness of Time-Averaged Samples.
---- The age-frequency distributions offer an insight into the
internal temporal structure of time-averaged samples (Figs. 4A-4I). In
this study, all frequency distribution analyses have been done at a
resolution of 50 years. This is the highest realistic resolution given the
accuracy and precision of amino-acid dating and the size of our samples.
For all samples, the age-distributions are right-skewed (skewness > 0,
Table 1), i.e., older shells are increasingly less frequent. Nevertheless,
all distributions appear continuous: most, or even all, age-classes
between the oldest and youngest shell contain at least one observation
(Fig. 4).
The age-distributions offer insight into the completeness of the
record encompassed within a time-averaged sample and can be analyzed in
a fashion analogous to estimating paleontological or stratigraphic
completeness (e.g., Sadler 1981; Allmon 1989). The temporal completeness
of a sample can be estimated as the proportion of the time-intervals
containing shells to all the time-intervals included between the oldest
and youngest shell in the sample (see also Kowalewski 1996a: fig. 1). This
definition is analogous to that for temporal paleontological completeness
(Allmon 1989; Kowalewski 1996a). The completeness of samples varies from
41 to 100% with a mean of 63.6% (Table 1). This is remarkable
completeness considering that, at an average sample size of ≈18 and at a
resolution set to 50 years (completeness is a scale-dependent phenomenon
[Allmon 1989; McKinney 1991; Kowalewski 1996a]), gaps due to sampling are
inevitable.
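The completeness measure defined above can be computed as in the following sketch. It assumes that the 50-year age classes are anchored at zero (the text does not specify where class boundaries fall), and the example ages are hypothetical.

```python
import math


def completeness(ages, bin_width=50):
    """Proportion of age classes between the youngest and oldest shell
    (inclusive) that contain at least one shell."""
    # ASSUMPTION: age classes are [0, 50), [50, 100), ... counted from zero.
    first = math.floor(min(ages) / bin_width)
    last = math.floor(max(ages) / bin_width)
    occupied = {math.floor(a / bin_width) for a in ages}
    return len(occupied) / (last - first + 1)


# Hypothetical ages (years): shells fall in 4 of the 6 classes spanned.
example_ages = [20, 30, 70, 160, 180, 290]
print(f"completeness = {completeness(example_ages):.1%}")
```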
To explore sampling effects rigorously, we simulated
incompleteness by random sampling of a uniform distribution (i.e., the
distribution that simulates a 100%-complete and uniformly distributed
record, and thus, provides the most conservative incompleteness
estimates). We performed nine independent simulations. For each of the
original nine samples, with sample size k and observed age-range r, we
drew k observations from the uniform distribution with the range r. For
each simulation, 10^4 random samples were generated and their completeness
was calculated at a resolution of 50 years. The mean for the 10^4 random
samples estimates the sample completeness expected for a 100%-complete
uniform record, whereas the proportion of random samples less complete
than the original sample estimates the probability of 100%
completeness.
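The simulation can be sketched as below, under several assumptions that the text leaves open: age classes are anchored at zero, completeness of each random sample is measured between its own youngest and oldest value, ties with the observed completeness are counted in the reported proportion, and the inputs in the usage example (k = 18, r = 650 years, observed completeness of 0.60) are hypothetical rather than taken from Table 1.

```python
import math
import random
import statistics


def completeness(ages, bin_width=50):
    """Proportion of age classes between youngest and oldest value that are occupied."""
    first = math.floor(min(ages) / bin_width)
    last = math.floor(max(ages) / bin_width)
    occupied = {math.floor(a / bin_width) for a in ages}
    return len(occupied) / (last - first + 1)


def simulate_uniform_record(k, age_range, observed, n_iter=10_000,
                            bin_width=50, seed=0):
    """Sample k ages from a uniform (100%-complete) record spanning age_range.

    Returns (expected completeness, proportion of random samples whose
    completeness is at or below the observed value).
    """
    rng = random.Random(seed)
    sims = [
        completeness([rng.uniform(0, age_range) for _ in range(k)], bin_width)
        for _ in range(n_iter)
    ]
    expected = statistics.mean(sims)
    # ASSUMPTION: ties are counted (<=), a slightly conservative choice.
    p_value = sum(s <= observed for s in sims) / n_iter
    return expected, p_value


# Hypothetical inputs for one sample (not values from Table 1).
expected, p_value = simulate_uniform_record(k=18, age_range=650, observed=0.60)
print(f"expected completeness = {expected:.1%}, p = {p_value:.3f}")
```

A large proportion means that gaps as extensive as those observed arise routinely from sampling alone, so the sample is consistent with a complete underlying record.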
For eight out of nine samples, the observed incompleteness is
statistically indistinguishable from that expected for a 100%-complete
uniform record (Table 1) and the mean expected completeness for a
100%-complete record (67.3%), sampled to the same degree as our chenier
samples, is very close to the observed mean (63.6%) (Table 1).
Variation in Time-Averaging Among Samples. ---- Variation
in time-averaging among samples can be analyzed at two different levels:
among cheniers and within cheniers. Samples vary among cheniers, as is
clear both from a visual comparison of the age-distributions (Fig. 4) and
from a more rigorous analysis of the confidence intervals around
the SD (Fig. 5, Table 1). Time-averaging is lower in samples from Chenier
1 than in six out of the seven samples from Cheniers 2-4. One sample from
Chenier 4 (4-40) shows an intermediate level of time-averaging.
Samples are very similar within cheniers. For seven out of eight
possible pairwise comparisons (1 for Chenier 1, 6 for Chenier 3, and 1 for
Chenier 4), age-distributions appear very similar visually and are
indistinguishable statistically. With the exception of two samples from
Chenier 4, the confidence intervals around the SD overlap strongly.
Implications
Even within single samples, and even when those samples were
collected to minimize their stratigraphic and lateral span, substantial
time-averaging does occur. In cheniers, mollusk shells are so extensively
mixed temporally that even directly adjacent shells collected from the
same sedimentary layer vary, on average, in age by ≈200 years.
Moreover, even at a small sample size of ≈18, the age range between the
oldest and youngest shell within a sample exceeds, on average, 600 years.
This result is consistent with previous quantitative estimates of
time-averaging, done in a variety of settings at coarser sampling
resolution (e.g., Flessa et al. 1993; Flessa and Kowalewski 1994; Martin
et al. 1996; Meldahl et al. 1997), and suggests that the paleontological
and geochronological limitations caused by time-averaging cannot be
removed by careful sampling. No matter how carefully collected, data from
shelly deposits may not be suitable for studying processes that happen on
time-scales shorter than hundreds to thousands of years (e.g., Fürsich and
Aberhan 1990; Kidwell and Behrensmeyer 1993; Flessa et al. 1993;
Kowalewski 1996a). Furthermore, single radiocarbon-dated shells, unless
they are found in life position (such shells estimate a deposit's minimum
age), should not be used to estimate the age of a deposit (Goodfriend
1989).
Fürsich and Aberhan (1990) and Behrensmeyer and Hook (1992)
pointed out that pooling of data from various samples, outcrops,
localities or regions, can result in 'analytical time-averaging'. Cheniers
offer an empirical example which shows that even a very limited pooling of
samples (i.e., confined stratigraphically and spatially to single cheniers
or chenier series), can significantly increase time-averaging.
Meldahl et al. (1997) recently showed that time-averaging can vary
substantially among different environments and subsidence settings. Here
we show that time-averaging may vary even among shelly deposits that
formed through essentially identical processes in the same setting. Such
variation most likely reflects changes through time in the
time-averaging-structure of the dead shells in the source area from which
the shelly accumulations are being generated (the intertidal mudflat in
the case of our cheniers). Because many paleontological patterns such as
diversity, morphometric variability, or size-variation can be distorted by
time-averaging (see Fürsich and Aberhan 1990; Kidwell and Bosence 1991;
Kowalewski 1996a), cryptic variation in time-averaging among seemingly
identical shell beds may cause variation, or even trends, that may be
difficult to identify as artifacts of temporal mixing.
As has been shown previously (Flessa et al. 1993; Flessa and
Kowalewski 1994; Meldahl et al. 1997), the age-distributions of samples
are right-skewed, with older shells being increasingly scarce. This
reflects the cumulative destruction of shells with time (Flessa and
Kowalewski 1994; Kidwell and Flessa 1995; Meldahl et al. 1997).
Nevertheless, at a resolution as fine as 50 years, chenier samples are
characterized by 'uniform time-averaging', with all time-averaged
time-intervals equally represented. This extreme case of 'continuous
time-averaging' (sensu Fürsich and Aberhan 1990) has three interesting
implications. First, as pointed out repeatedly (Walker and Bambach 1971;
Staff et al. 1986; Fürsich and Aberhan 1990; Kidwell and Flessa 1995;
Kowalewski 1996a), time-averaging may be advantageous to paleontologists
because it can eliminate the noise introduced by short-term fluctuations.
This study suggests that some shell beds undergo the best type of
averaging we could ever have hoped for: uniform and continuous, with
shelly mollusks from all time-averaged time-intervals equally represented
in the samples. Thus, the samples can provide information about relative
abundance of shelly taxa weighted by the duration of their presence in the
benthic ecosystems. Second, a uniform, continuous time-averaging may
generate a pattern similar to that of coordinated stasis (sensu Brett et
al. 1996). This is because similar spectra of species generated by
fluctuating environments are being repeatedly time-averaged into
consecutive shell beds (see also Bambach and Bennington 1996). Indeed,
two adjacent generations of cheniers have very similar taxonomic
composition (Kowalewski et al. 1994). This reflects a long-term
time-averaging of benthic associations from a similar range of
environments and not some ecological phenomenon (e.g., ecological locking
[Brett et al. 1996]). Third, there is also an interesting corollary for
Quaternary studies here. When high-resolution dating is employed, some
bioclastic accumulations can provide a 100%-complete paleontological record
at a resolution of 50 years, and by this, permit exceptional insights into
the rapid environmental and climatic changes in the late Pleistocene and
Holocene (Flessa et al. 1997).
Final Remarks
We would not argue that our results and their implications are
valid for the entire Phanerozoic and for all types of shell beds. Many
parameters controlling the formation of bioclastic deposits, and even
bioclasts themselves, have changed dramatically throughout the Phanerozoic
(Kidwell and Brenchley 1994, 1996; Kowalewski 1996a). Moreover, even
temporally coeval shell beds vary in time-averaging depending on a
variety of factors (see Kidwell and Bosence 1991; Kowalewski 1996a, 1997;
Meldahl et al. 1997), and spectacular examples of shell beds little
affected by time-averaging exist (e.g., Boyajian and Thayer 1995).
Nevertheless, we do believe that our results have implications reaching
far beyond Holocene macrotidal lag deposits and that the time-averaging
patterns identified here are valid for many, or even most of, the
mollusk-dominated shell beds, especially for the Cenozoic fossil record.
Four arguments defend the general validity of our results.
First, our estimates are consistent with previous studies done in
other settings, for other types of deposits, and for other bioclast
producers. Second, time-averaging is a function of the availability of old
shells in the depositional system, and thus, any shell bed, regardless of
its mode of formation, will be time-averaged when old shells are common in
the area. In other words, what matters is not so much how a given deposit
is formed but rather what it is generated from. If a major storm hit the
Colorado delta, the resulting deposit, even though formed in several hours
rather than several decades, would be made of the same bioclasts that make
up the cheniers. Because old shells are ubiquitous in modern depositional
systems (Flessa and Kowalewski 1994), we can generally expect similar
levels of time-averaging in most of the currently forming bioclastic
deposits. The consistent estimates of time-averaging yielded by studies
done in a variety of settings are, therefore, not so surprising. Third,
our estimates are conservative because we excluded outliers, confined
the study to one species, and used conservative analytical methods. In
addition, and perhaps most importantly, the cheniers will likely undergo
further reworking and smearing before getting incorporated into the fossil
record. This means further temporal mixing. Thus, we can expect that the
uniform and continuous nature of time-averaging will be even further
enhanced. Finally, some of the results, especially those on analytical
time-averaging and cryptic variation in time-averaging, illustrate some
important phenomena that may be encountered when studying fossil shell
beds, regardless of their age and mode of origin.
The most important conclusion of our study is a paleoecological
one. Our results suggest that even a single sample carefully collected
from one level in a single shell bed is still affected by long-term
time-averaging. Even at the highest spatial resolution, paleoecological
patterns entombed in shelly deposits reflect a long-term record of the
shelly fauna averaged from the spectrum of environments that existed
during some interval of time. The reasoning and models stemming from a
strictly ecological-neontological approach may rarely be justifiable when
studying shell beds.
Acknowledgments
Supported by NSF grants EAR-9405311 to K. W. Flessa and EAR-9405412 to G. A. Goodfriend. M. Kowalewski thanks the Alexander von Humboldt Foundation for financial support and W. Oschmann and J. Nebelsick from the University of Tübingen for hospitality. We thank J. Nebelsick for useful comments on the manuscript. We are indebted to P.E. Hare for the use of laboratory facilities for racemization analysis. This is publication 26 of the Centro de Estudios de Almejas Muertas (C.E.A.M.).
Literature cited