
TFQA: Tools for Quantitative Archaeology
     kintigh@tfqa.com   +1 (505) 395-7979

TWOWAY: Analysis of 2-Dimensional Contingency Tables

      TWOWAY provides tests of independence and measures of association and prints tables that have been standardized with a number of techniques. Standard Chi² and G tests of independence are provided. Using Monte Carlo methods, Chi² and G tests can be performed on tables with very small expected counts. Chi² and G goodness of fit tests (with externally determined expected values) can also be calculated. Measures of association include Yule's Q, Phi, Cramer's V, and the proportional reduction of error measures Tau and Lambda. Table standardization methods include median polish and Mosteller (multiplicative) standardization as well as Haberman's z-score standardization for independent variables and Allison's binomial probability-based z-score standardization. The program will also print row, column, and cell percents, Chi² cell contributions, and Chi² expected values. TWOWAY is generally intended for exploratory analysis of relatively small two-way tables. The G and Chi² statistics for very large tables (df>50) have not been verified, and on large tables Mosteller standardization may not be possible. Check the row and column marginals of the standardized output for uniformity to see whether a Mosteller standardized table has been obtained. Program limits are 180 rows and 25 columns.

      The TWOWAYCSV program, not separately documented, takes CSV input and will process even larger problems.

 

SEQUENCE OF PROGRAM PROMPTS

 Input File Name (CON for Keyboard) {.ADF} ?
  Reading x Rows and y Columns

Read Row Label File {N} ?
  Row Label File Name {TWOWAY.ARL} ?

Read Column Label File {N} ?
  Column Label File Name {TWOWAY.ACL} ?

Program Output (CON for Screen, PRN, or <filename>) {TWOWAY.LST} ?

       Requests the name of the input data set in Antana format and asks whether there are row or column label files that can be used in labeling the printout. It then requests the name of the file to which program output is to be written.

Print Tables of [R]ow, [C]olumn, C[E]ll, [A]ll Percents or [N]one {N} ?

      The program will print labeled tables of row percents, column percents, cell percents, all three, or none. A separate table is produced for each.

Monte Carlo Estimate Probability of G & ChiSq >= Observed {N} ?

  Number of Trials {100} ? 500
  Fix [N]either, [R]ow, [C]olumn, [B]oth {N} ?
 xx/yyy xx/yyy

      It is well known that the χ² test does not provide a reliable test for tables that contain low expected values; a common rule of thumb warns against expected values less than 5. Similar problems apply to the use of the G (sometimes called G²) statistic. However, as Sokal and Rohlf (1981: 698-699) point out, it is possible to use Monte Carlo procedures to produce an estimate of the probability of the observed distribution relative to any of four assumptions concerning the marginal totals of the data.

      Perhaps most often in archaeology [N]either the row nor column marginals are fixed by the method of data collection. Thus, if one looks at all of the sherds in a level by form and temper type (as observed under a hand lens), the marginal totals are simply whatever they turn out to be. This is Sokal and Rohlf's (1981: 735) Model I. Reply "N" in this case.

      It is sometimes the case that either the [R]ow or [C]olumn marginals are fixed by the data collection. Thus, one may choose 100 bowl and 50 jar sherds for petrographic analysis of the temper. This case is Sokal and Rohlf's Model II. Reply R or C depending on which marginal is fixed.

      Data collection is sometimes designed so that [B]oth row and column marginal totals are fixed (Sokal and Rohlf's Model III; this is hard to imagine in archaeology). Monte Carlo estimates are not needed in the 2x2 case since Fisher's Exact test will provide exact probabilities, whatever the expected cell values are. (You can verify this by running the program.) If you have a case like this, choose B.

      For tables with large total counts, some Monte Carlo analyses, especially those executed under Model III, may be very time consuming. A warning to this effect will be printed. The progress of the Monte Carlo analysis is reported by the xx/yyy values. For each pair, the first number is the number of Monte Carlo trials with a χ² or Williams-corrected G value, respectively, greater than or equal to the value observed in the actual data, and the second is the number of trials completed.
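      For readers who want to see the logic of the Monte Carlo test, the following is a minimal sketch in Python of the Model I ([N]either margin fixed) case: each trial draws a table of the same total size from a multinomial distribution whose cell probabilities are the expected proportions under independence, and the χ² and G statistics of the simulated tables are compared with the observed values. This is my own illustration of the general approach, not TWOWAY's code; in particular it uses the uncorrected G rather than the Williams-corrected G that TWOWAY reports, and the function names are hypothetical.

import numpy as np

def chi_sq(observed, expected):
    m = expected > 0
    return ((observed[m] - expected[m]) ** 2 / expected[m]).sum()

def g_stat(observed, expected):
    m = observed > 0
    return 2.0 * (observed[m] * np.log(observed[m] / expected[m])).sum()

def monte_carlo_model1(table, trials=500, seed=None):
    """Estimate P(ChiSq >= observed) and P(G >= observed) when neither
    margin is fixed (Model I): each trial samples n observations from a
    multinomial with cell probabilities rowsum*colsum/n^2."""
    rng = np.random.default_rng(seed)
    t = np.asarray(table, dtype=float)
    n = t.sum()
    expected = np.outer(t.sum(axis=1), t.sum(axis=0)) / n
    probs = (expected / n).ravel()
    probs = probs / probs.sum()          # guard against floating point drift
    obs_chi, obs_g = chi_sq(t, expected), g_stat(t, expected)
    hits_chi = hits_g = 0
    for _ in range(trials):
        sim = rng.multinomial(int(n), probs).reshape(t.shape).astype(float)
        # expected values for each simulated table come from its own margins
        sim_exp = np.outer(sim.sum(axis=1), sim.sum(axis=0)) / n
        hits_chi += chi_sq(sim, sim_exp) >= obs_chi
        hits_g += g_stat(sim, sim_exp) >= obs_g
    return hits_chi / trials, hits_g / trials

# the faunal recovery table from the sample input below
print(monte_carlo_model1([[5, 4, 12, 8, 3], [5, 7, 33, 4, 6]], trials=500))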

Compute Chi Square Based on External Expected Values {N} ?
File With Expected Frequencies (CON for Keyboard) {TWOWAY.ACL} ?
  Reading 2 Rows and 5 Columns

      To do a goodness of fit test it is necessary to compare an actual table with expected frequencies derived from some external hypothesis (e.g., genetically based) about the nature of the relationship. In this case, one would be testing for a significant difference between the expected values, based on this extrinsic hypothesis, and the observed values. (The expected values, obtained in the ordinary way based on the marginals, are not the only possible expected values.) If such a test is required, specify the file containing a table of expected counts in Antana format. Of course, it must have the same number of rows and columns as the observed table. Significance is evaluated with χ² and G tests.
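      The calculation itself is straightforward. The sketch below, again in Python and again only an illustration rather than TWOWAY's code, computes the χ² and G goodness of fit statistics for an observed table against externally supplied expected values; whether the expected values should be rescaled to the observed total, and the degrees of freedom used, are assumptions on my part.

import numpy as np

def goodness_of_fit(observed, expected):
    """ChiSq and G for an observed table against externally supplied
    expected counts of the same shape."""
    o = np.asarray(observed, dtype=float)
    e = np.asarray(expected, dtype=float)
    e = e * o.sum() / e.sum()        # assumption: put expecteds on the observed scale
    chi = ((o - e) ** 2 / e).sum()
    m = o > 0
    g = 2.0 * (o[m] * np.log(o[m] / e[m])).sum()
    df = o.size - 1                  # assumption: no parameters estimated from the data
    return chi, g, df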

Sample Size Equalization Can Be Done for Margin Sensitive Procedures
  Adjust Sample Size [R]ow-wise, [C]olumn-wise, [B]oth or [N]either {N} ?

      Some measures of association are margin-sensitive; that is, their values will vary depending solely on the marginal totals, making comparisons between tables difficult. In some cases, it may make sense to equalize the marginal totals row-wise (or column-wise) by multiplying each row (or column) by the number that would make its marginal total equal 100. Thus each row (or column) is standardized to a sample size of 100. Requesting "Both" does the two standardizations in sequence. Note that this is usually not a sensible thing to do.
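      A sketch of the equalization, as I understand the description above (each row or column multiplied by the constant that makes it sum to 100, with "Both" applying the two steps in sequence); the function name is mine:

import numpy as np

def equalize(table, how="R"):
    """Rescale each row ("R") or column ("C") to sum to 100;
    "B" applies the row-wise step and then the column-wise step."""
    t = np.asarray(table, dtype=float)
    if how in ("R", "B"):
        t = t * (100.0 / t.sum(axis=1, keepdims=True))
    if how in ("C", "B"):
        t = t * (100.0 / t.sum(axis=0, keepdims=True))
    return t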

Mosteller Standardization of Table
  Σ[R]ows, Σ[C]ols, Σ[T]able, [E]xpected Cell= 1.0, [S]kip or [Q]uit {E} ?

      What I am calling Mosteller-standardized tables (Mosteller 1968) can be displayed in a number of completely equivalent ways. The [R]ows or [C]olumns may be standardized to sum to 1, the standardized [T]able cells can sum to 1, or the [E]xpected cell values can be set to 1. I prefer the last option. [S]kip skips the Mosteller standardization; [Q]uit exits the program immediately.

[A]nalyze Another Table or [Q]uit {Q} ?

      One may analyze more than one table in a single run. Reply A to continue.

 

SAMPLE INPUT

#Faunal Recovery by Screen Size#
2 #Sites (Rows) 1=some 1/8" 2=all 1/4"#
5 #Classes (Columns 1=Rabbit 2=Artiodactyl 3=Lmammal 4=SMammal 5=Other#
5 4 12 8 3
5 7 33 4 6

SAMPLE PROGRAM OUTPUT

Input Table
     COLUMN...
 ROW      Rabbit Artiodactyl     Lmammal     Smammal       Other         SUM
1/8"           5           4          12           8           3          32
1/4"           5           7          33           4           6          55
 SUM          10          11          45          12           9          87

Cell Percents, Marginal Percent Totals
     COLUMN...
 ROW      Rabbit Artiodactyl     Lmammal     Smammal       Other         SUM
1/8"         5.7         4.6        13.8         9.2         3.4        36.8
1/4"         5.7         8.0        37.9         4.6         6.9        63.2
 SUM        11.5        12.6        51.7        13.8        10.3       100.0

ChiSquare Expected Frequencies
     COLUMN...
 ROW      Rabbit Artiodactyl     Lmammal     Smammal       Other         SUM
1/8"         3.7         4.0        16.6         4.4         3.3        32.0
1/4"         6.3         7.0        28.4         7.6         5.7        55.0
 SUM        10.0        11.0        45.0        12.0         9.0        87.0

ChiSquare Cell Contributions
     COLUMN...
 ROW      Rabbit Artiodactyl     Lmammal     Smammal       Other         SUM
1/8"         0.5         0.0         1.3         2.9         0.0         4.7
1/4"         0.3         0.0         0.7         1.7         0.0         2.7
 SUM         0.8         0.0         2.0         4.6         0.0         7.4

ChiSq      =     7.39  ChiSq Prob. =  0.117  D.F. =   4
PhiSq      =    0.085  Phi         =  0.291
Cramer's V =    0.291
G          =     7.24  ChiSq Prob. =  0.124
G(Williams)=     6.52  ChiSq Prob. =  0.143

      Here various tests of independence and measures of association are reported. ChiSq (χ²) and G are tests of independence. Sokal and Rohlf argue persuasively that G is superior to the χ² test. G(Williams) is G corrected using Williams' correction (Sokal and Rohlf 1981: 744-745). The adjustment to G is small when the sample size is large. PhiSq (φ²) and Cramer's V are measures of strength of association.
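      As a cross-check on the definitions, the uncorrected statistics above can be reproduced from the sample table with a few lines of Python. This sketch uses the textbook formulas (χ² from observed and expected counts, G = 2ΣO·ln(O/E), φ² = χ²/n, Cramer's V = √(χ²/(n·(min(r,c)-1)))); it does not attempt the Williams correction or the probabilities, and it is an illustration rather than the program's code.

import numpy as np

table = np.array([[5, 4, 12, 8, 3],
                  [5, 7, 33, 4, 6]], dtype=float)   # the sample input table

n = table.sum()
expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / n

chi_sq = ((table - expected) ** 2 / expected).sum()           # 7.39
g = 2.0 * (table * np.log(table / expected)).sum()            # 7.24 (no zero cells here)
phi_sq = chi_sq / n                                           # 0.085
cramers_v = np.sqrt(chi_sq / (n * (min(table.shape) - 1)))    # 0.291
df = (table.shape[0] - 1) * (table.shape[1] - 1)              # 4

print(chi_sq, g, phi_sq, cramers_v, df)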

  Est. Prob of ChiSq >= Obs =  0.146  (500 Trials)
  Est. Prob of G >= Obs     =  0.186  (500 Trials)  Time:   11.3 Sec.
  (Precise Proportions Used)

      Results of the Monte Carlo estimate of χ² and G significance. The reported values are the fractions of Monte Carlo trials with a χ² or Williams-corrected G value greater than or equal to the observed value. The precise proportions comment has to do with internal methods of calculation.

Lambda and Tau Coefficients
  Lambda|C   =  0.125  Error|Row % =  0.368  Error|Cell % =  0.322
  Tau|C      =  0.085  Error|Row % =  0.465  Error|Cell % =  0.426
  Lambda|R   =  0.000  Error|Col % =  0.483  Error|Cell % =  0.483
  Tau|R      =  0.030  Error|Col % =  0.674  Error|Cell % =  0.653

       Lambda (λ) and Tau (τ) are proportional reduction of error (PRE) measures of association (Reynolds 1977: 32-45). For these measures, row or column standardization may be helpful (but be sure you understand what you are doing). Basically, these measures are based on the general formula

 

             Probability of Error under Rule 1 - Probability of Error under Rule 2
PRE Measure= ----------------------------------------------------------------------
                                 Probability of Error under Rule 1

       Here Rule 1 is to guess the value of an arbitrary observation on the row (or column) variable with no information about the column, and Rule 2 is to guess knowing the value of the column variable. If the row variable is somewhat dependent on the column variable, then there is some association. If the variables are independent, then the value will be 0. Goodman and Kruskal's Lambda (λ) and Goodman and Kruskal's Tau (τ) both use this general concept but operationalize it somewhat differently. Assume that we have converted the table to a set of cell proportions (adding to 1) with row and column marginal proportions that, again, add to 1.

       For λ, Rule 1 is to guess the row with the largest marginal proportion p, so the proportion of errors with this guess is 1-p. Rule 2 is to guess, for the given column, the row with the largest proportion in that column; the total error is then 1 minus the sum over the columns of the largest proportion in each column. In the output, Lambda|C represents a .125 reduction of error in guessing the row given a knowledge of the column. The proportion of errors based simply on the row proportions, labeled Error|Row %, is 1-.632=.368 (the largest marginal row proportion--see the table of cell percents). Given a knowledge of the column, the proportion of errors, labeled Error|Cell %, is 1-.057-.080-.379-.092-.069=.322 (using unrounded proportions). The proportional reduction of error is (.368-.322)/.368=.125. To guess the columns, or the columns given the rows, rather than the rows, you simply substitute the rows for the columns and vice versa in the above description; the answer in general will be different. If the variables can be thought of as independent and dependent, you probably want the measure conditioned on the independent variable. That is, if the independent variable is the column variable, you are probably looking for Lambda|C or Tau|C, not Lambda|R or Tau|R.
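      The following Python sketch carries out the λ and τ calculations just described for the sample table, reproducing Lambda|C ≈ 0.125 and Tau|C ≈ 0.085 from the output above; Lambda|R and Tau|R follow from the transposed table. The function name is mine, and this illustrates the standard Goodman and Kruskal formulas rather than TWOWAY's code.

import numpy as np

def lambda_tau_given_columns(table):
    """Goodman and Kruskal's Lambda and Tau for predicting the row
    variable given the column variable (Lambda|C and Tau|C above)."""
    p = np.asarray(table, dtype=float)
    p = p / p.sum()                              # cell proportions
    row_marg = p.sum(axis=1)
    col_marg = p.sum(axis=0)

    # Lambda: guess the modal row, with and without knowing the column
    err1 = 1.0 - row_marg.max()                  # Error|Row %
    err2 = 1.0 - p.max(axis=0).sum()             # Error|Cell %
    lam = (err1 - err2) / err1

    # Tau: proportional prediction instead of modal guessing
    err1_tau = 1.0 - (row_marg ** 2).sum()
    err2_tau = 1.0 - ((p ** 2).sum(axis=0) / col_marg).sum()
    tau = (err1_tau - err2_tau) / err1_tau
    return lam, tau

table = [[5, 4, 12, 8, 3], [5, 7, 33, 4, 6]]
print(lambda_tau_given_columns(table))                  # about (0.125, 0.085)
print(lambda_tau_given_columns(np.transpose(table)))    # Lambda|R, Tau|R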

        For τ, the idea is similar but the rules are a bit more complex. They are described by Shennan (1997: 118-121) and, as above, in Reynolds (1977: 32-45).

Median Polished Table
     COLUMN...
 ROW      Rabbit Artiodactyl     Lmammal     Smammal       Other         SUM
1/8"         1.5         0.0        -9.0         3.5         0.0        -1.5
1/4"        -1.5         0.0         9.0        -3.5         0.0         1.5
 SUM        -0.5         0.0        17.0         0.5        -1.0         5.5

      Median polish of the data as discussed by Lewis (1986). This is an EDA-based technique using medians to standardize a two-way table. The values are read as deviations from expectations.

Mosteller Standardized Matrix
     COLUMN...
 ROW      Rabbit Artiodactyl     Lmammal     Smammal       Other         SUM
1/8"        1.16        0.88        0.67        1.47        0.82        5.00
1/4"        0.84        1.12        1.33        0.53        1.18        5.00
 SUM        2.00        2.00        2.00        2.00        2.00       10.00

      Mosteller standardization (Mosteller 1968) presents deviations from expectations based on a different model. Here the table is manipulated by sequentially multiplying the values in individual rows and columns by constants with the goal of equalizing the marginal totals. (This is more useful than it might sound.) The result is that values indicate multiplicative deviations from equalized expected values, in this case 1.0. Thus you get fewer rabbits and small mammals in the larger screen than you would expect.

More on Mosteller Standardization and Median Polish

      Basically, median polish is an additive way to standardize tables and Mosteller standardization is a multiplicative one. Let's take Mosteller standardization first. This is a technique championed by Albert Spaulding to his students. Assume that you have a table of counts where rows represent proveniences and columns pottery types. There is a sense in which the interaction in the table (that is, the relationship between the variables) isn't changed if you multiply all values in a row by a constant. This is equivalent to doubling the sample size from a provenience, which shouldn't change the relationship between the variables. It is perhaps less obvious, but for essentially the same reason, interaction is also preserved by multiplying all values in a column by a constant. What Mosteller standardization does is go through an iterative procedure consisting of multiplying all values in each row and column by (different) constants until all the expected cell values are 1 (this is the default option in my program). Thus, from an interaction standpoint, anything more than 1 is more than expected and anything less than 1 is less than expected. The resultant table thus reveals something about the multiplicative structure of the relationship between the variables. Thus, a value of 6 is seen as 3 times bigger than a value of 2. (As an aside, this can be done with multiway tables, that is, tables of 3 or more dimensions, though not by this program.)
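      As an illustration of this iterative procedure (not the program's own code), the following Python sketch rescales rows and columns in turn until every expected cell value is 1, i.e. each row sums to the number of columns and each column sums to the number of rows. For the sample table it converges to the Mosteller standardized matrix shown above; for tables with zero margins or certain patterns of zero cells it may fail to converge, which is the situation warned about in the program overview.

import numpy as np

def mosteller_standardize(table, tol=1e-6, max_iter=1000):
    """Iteratively rescale rows and columns until every cell's expected
    value is 1 (rows sum to the number of columns, columns to the number
    of rows)."""
    t = np.asarray(table, dtype=float)
    n_rows, n_cols = t.shape
    for _ in range(max_iter):
        t = t * (n_cols / t.sum(axis=1, keepdims=True))   # rows -> n_cols
        t = t * (n_rows / t.sum(axis=0, keepdims=True))   # cols -> n_rows
        if np.allclose(t.sum(axis=1), n_cols, atol=tol):
            break
    return t

table = [[5, 4, 12, 8, 3], [5, 7, 33, 4, 6]]
print(np.round(mosteller_standardize(table), 2))   # compare the matrix above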

      Median polish is explored in the Lewis article cited above (to get the idea, don't get bogged down in the mechanics of how this is done). Here, the idea is to decompose a two-way table into 4 components (called "effects"): table, row, column, and cell effects. This is done by adding (or subtracting) constant values, based on the row or column medians (hence the name), to all cells in a row or column. The net effect of this iterative procedure is the median polished table that contains all of the information in the original table re-allocated into these 4 effects. The table effect is displayed in the lower right of the table (where the table total n usually appears), the row and column effects are in the locations where the row and column totals usually appear, and the cell effects are in the individual cells. All the numbers are interpreted as counts and, amazingly enough, if you start with the median polished table you can reconstruct the original counts by adding to each cell effect the table effect and the row and column effects for the row and column in which the cell appears. (In Mosteller standardization, you have no way to go back to the original.) Clearly, some of these values are going to have to be negative to make all this add up. In a not very precise sense, you'd interpret the negative and positive values as less or more than a median-derived value. In this additive model of table composition, a value of 6 is to be interpreted as 4 more than (not 3 times) a value of 2.
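      A compact Python sketch of the decomposition (Tukey's median polish, iterating row and column median sweeps) is given below; for the sample table it reproduces the table, row, column, and cell effects shown above, and the last lines check the reconstruction property just described. As with the other sketches, this illustrates the technique rather than TWOWAY's internal code, and results for other tables can depend on the number of sweeps.

import numpy as np

def median_polish(table, max_iter=10):
    """Decompose a two-way table into overall (table), row, column,
    and cell (residual) effects by repeatedly sweeping out medians."""
    cell = np.asarray(table, dtype=float).copy()
    overall = 0.0
    row = np.zeros(cell.shape[0])
    col = np.zeros(cell.shape[1])
    for _ in range(max_iter):
        rmed = np.median(cell, axis=1)       # sweep row medians into the row effects
        cell -= rmed[:, None]
        row += rmed
        d = np.median(col)
        col -= d
        overall += d
        cmed = np.median(cell, axis=0)       # sweep column medians into the column effects
        cell -= cmed[None, :]
        col += cmed
        d = np.median(row)
        row -= d
        overall += d
    return overall, row, col, cell

table = np.array([[5, 4, 12, 8, 3], [5, 7, 33, 4, 6]], dtype=float)
overall, row, col, cell = median_polish(table)
# every original count = table effect + row effect + column effect + cell effect
recon = overall + row[:, None] + col[None, :] + cell
print(np.allclose(recon, table))             # True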

      In contrast, the ChiSquare cell contributions tell you where the values contributing to significance come from, but they are less closely related to the association or the nature of the relationship.

      You need to try these techniques on tables you can conceptualize to see what they really do. There is no really straightforward interpretation, because association and interaction in tables are inherently difficult to deal with; it is such a multidimensional problem. Unfortunately, in archaeology, these tables are things that we really need to deal with all the time, and neither measures of significance nor measures of association are adequate to the task because we are usually interested in more than an aggregate measure; we care about the details.

Allison's Binomial Cell Probability*100 (- for below expected)
     COLUMN...
 ROW      Rabbit Artiodactyl     Lmammal     SMammal       Other
1/8"        30.7        58.0       -13.2         7.4       -57.7
1/4"       -38.8        54.9        17.7       -11.5        50.7

Allison's Z-scores based on Binomial Cell Probability
     COLUMN...
 ROW      Rabbit Artiodactyl     Lmammal     SMammal       Other         SUM
1/8"        0.70       -0.02       -1.24        1.75       -0.17        1.02
1/4"       -0.55        0.02        1.04       -1.36        0.13       -0.72
 SUM        0.16       -0.01       -0.20        0.39       -0.04        0.30

      James Allison has suggested a means of looking at cell probabilities based on the binomial distribution. Results are provided in two formats, as a cell probability (expressed as a percent) or as a Z score. TWOWAY provides a table of binomial tail probabilities for each cell in tables with total counts of 10,000 or less. The binomial probabilities are computed based on the expected proportion, p, in a given cell (in a Chi² sense, rowsum*colsum/total²), the observed count, k, and the total, n, for the table. The binomial tail probability is the probability, in n trials, of getting a count as or more extreme than the observed, given the expected proportion. The model is that n observations are chosen at random, where the probability of success is p. Here, p is the overall table likelihood of having the characteristics represented by that cell. If k<expected, the binomial probability is the probability of, by chance, getting a count such that 0<=count<=k. If k>expected, then the binomial probability is the probability of getting a count such that k<=count<=n. While these probabilities are often interpretively useful, they are not independent of one another.

      Binomial probabilities are output as percents (probability * 100). Binomial probabilities associated with the left tail are preceded by a '-' (obviously, these are not negative probabilities); those associated with the right tail are shown as positive. When the expected is exactly the same as the observed, the probability is reported as 100. Thus, a value of -7 (%) means that a count as small or smaller than the observed is relatively unlikely given the expected proportion for that cell and the table n. A value of 51 means that given the binomial model, a count greater than or equal to the observed would occur about half the time by chance. (A probability of >.5 can occur because both left and right tails include the probability of the observed count.) These probabilities are probably more useful than the Z scores because for low expected counts the binomial distribution is quite asymmetric.

      An alternative formulation uses the familiar Z scores to indicate probability for the cells, i.e., standard deviations above or below the expected based on a binomial model of cell probability. The Z-score deviation from the expected is calculated as the observed count less the expected count, all divided by the standard deviation of the binomial distribution with parameter p, which is √(Np(1-p)).
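      A Python sketch of both forms of the Allison statistics is given below, using the binomial distribution from scipy. For the sample table it closely reproduces the two tables above; the sign convention (negative for the lower tail, percents rather than proportions) follows the description, but the handling of the case where the observed exactly equals the expected is my reading of the text, and the function name is mine.

import numpy as np
from scipy.stats import binom

def allison_cells(table):
    """Signed binomial tail probability (as a percent) and binomial
    Z-score for each cell, following the description above."""
    t = np.asarray(table, dtype=float)
    n = t.sum()
    expected = np.outer(t.sum(axis=1), t.sum(axis=0)) / n
    p = expected / n                       # expected cell proportion
    z = (t - expected) / np.sqrt(n * p * (1.0 - p))

    lower = binom.cdf(t, n, p)             # P(count <= observed)
    upper = binom.sf(t - 1, n, p)          # P(count >= observed)
    prob = np.where(t < expected, -100.0 * lower, 100.0 * upper)
    prob = np.where(t == expected, 100.0, prob)
    return prob, z

table = [[5, 4, 12, 8, 3], [5, 7, 33, 4, 6]]
prob, z = allison_cells(table)
print(np.round(prob, 1))                   # compare the cell probabilities above
print(np.round(z, 2))                      # compare the Z-scores above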

Haberman's z-scores for Independent Variables
     COLUMN...
 ROW      Rabbit Artiodactyl     Lmammal     SMammal       Other         SUM
1/8"        0.92       -0.03       -2.03        2.31       -0.23        0.95
1/4"       -0.92        0.03        2.03       -2.31        0.23       -0.95
 SUM       -0.00        0.00        0.00        0.00        0.00        0.00
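
      The documentation above does not spell out the formula behind this last table, but the values are consistent with Haberman's adjusted standardized residuals, (observed - expected) / √(expected·(1 - rowsum/n)·(1 - colsum/n)). A small Python sketch under that assumption, for comparison with the table above:

import numpy as np

def haberman_residuals(table):
    """Adjusted standardized (Haberman) residuals, assuming the formula
    (O - E) / sqrt(E * (1 - rowsum/n) * (1 - colsum/n))."""
    t = np.asarray(table, dtype=float)
    n = t.sum()
    rows = t.sum(axis=1, keepdims=True)
    cols = t.sum(axis=0, keepdims=True)
    expected = rows * cols / n
    return (t - expected) / np.sqrt(expected * (1 - rows / n) * (1 - cols / n))

table = [[5, 4, 12, 8, 3], [5, 7, 33, 4, 6]]
print(np.round(haberman_residuals(table), 2))   # compare the table above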
