TFQA: Tools for Quantitative Archaeology

kintigh@tfqa.com +1 (505) 395-7979


DIVM: Compute Diversity Measures

Program DIVM computes several different diversity measures for a set of data. The input to the program is a set of counts for some number of observations, for any number of variables (up to 1000). The program, for each observation, computes and lists the value of each diversity measure. The program also creates an output file of the computed diversity measures that can be used for further analysis. The Pascal statements that do the computation of the measures are included in the file DIVMEAS.PAS so that they may be examined and used in other programs.

The diversity measures are not described in detail here, but references are given for each. In addition, a "scaled" value is reported for each measure. The scaled value is the measure rescaled so that minimum diversity (all counts in 1 class) has the value 0.0 and maximum diversity (equal counts for each variable) has the value 1.0, for the number of classes specified. Many of the raw diversity measures do not have this range.

DIVERSITY MEASURES COMPUTED

The program computes the following diversity measures along with scaled values for each measure.

Richness. The number of classes present, often denoted by S.

Brillouin. The Brillouin index, H, is analogous to the Shannon index for small populations that are fully censused. (Pielou 1977: 299-301)

Shannon. The Shannon index, often denoted H', is intended for use with populations that are indefinitely large. The Shannon index is computed with common (base 10) logarithms. The scaled value is J'=H'/H'max. (Pielou 1975: 7-8; Pielou 1977: 293-301, 311)
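
As an illustration of the description above, the following Python sketch computes H' with base 10 logarithms and the scaled value J'=H'/H'max. It is not the DIVMEAS.PAS code. The function name shannon is hypothetical, and taking H'max as the log of the number of variables (rather than the number of classes actually present) is an assumption based on the phrase "for the number of classes specified."

```python
import math

def shannon(counts):
    """Return (H', J') for a list of counts, using base-10 logarithms.

    Assumption: H'max = log10(number of variables), including empty ones.
    """
    total = sum(counts)
    # Classes with a zero count contribute nothing to the sum.
    props = [c / total for c in counts if c > 0]
    h = -sum(p * math.log10(p) for p in props)
    h_max = math.log10(len(counts))
    scaled = h / h_max if h_max > 0 else 0.0
    return h, scaled

h, j = shannon([2, 4, 1, 5])
print(f"{h:.4f} {j:.4f}")  # 0.5371 0.8921
```

For the counts 2 4 1 5, this reproduces the Shannon line of the Row 1 example under PRINTED OUTPUT.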

Simpson. The Simpson index, C (or lambda), measures concentration, the converse of diversity. For n categories with proportions p_i, the formula is C = Σ p_i², summed over all n categories. To obtain a diversity value scaled from 0 to 1, C is first converted to a diversity measure by subtracting it from 1, and 1-C is then divided by its maximum value, 1-1/n. The scaled diversity measure C_s=(1-C)/(1-1/n) has the value 0 when p=1 for one category and p=0 for the others, and the value 1 when p_i=1/n for all categories. (Pielou 1975: 8-9; Pielou 1977: 309-311)
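
The Simpson computation can be sketched in a few lines of Python. This is an illustration only, not the DIVMEAS.PAS code; the function name simpson is hypothetical, and n is taken to be the number of variables in the row.

```python
def simpson(counts):
    """Return (C, C_s): Simpson concentration and scaled diversity."""
    total = sum(counts)
    n = len(counts)
    # C is the sum of squared proportions.
    c = sum((x / total) ** 2 for x in counts)
    # Convert concentration to diversity (1 - C), then divide by its
    # maximum, 1 - 1/n, reached when every proportion equals 1/n.
    scaled = (1 - c) / (1 - 1 / n) if n > 1 else 0.0
    return c, scaled

c, c_s = simpson([2, 4, 1, 5])
print(f"{c:.4f} {c_s:.4f}")  # 0.3194 0.9074
```

The printed values agree with the Simpson line of the Row 1 example under PRINTED OUTPUT.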

Simpson Est. The unbiased estimator of the Simpson index, C (or lambda), given by Pielou. This value indicates concentration, not diversity. The scaled diversity value uses C converted to a diversity measure, 1-C. (Pielou 1977: 309-311)
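
A Python sketch of this estimator follows. It assumes the usual unbiased form, the sum of n_i(n_i-1) divided by N(N-1), where n_i are the counts and N is their total. The scaling shown, dividing 1-C by the value it would take with N items spread evenly over all variables, is an assumption and not taken from DIVMEAS.PAS; it was chosen because it reproduces the sample output. The function name simpson_est is hypothetical.

```python
def simpson_est(counts):
    """Return (C, scaled) using the unbiased Simpson estimator."""
    total = sum(counts)
    n = len(counts)
    if total < 2 or n < 2:
        raise ValueError("need at least 2 items and 2 variables")
    # Unbiased estimate of concentration.
    c = sum(x * (x - 1) for x in counts) / (total * (total - 1))
    # Assumed maximum of 1 - C: the value for perfectly even counts,
    # which simplifies to N * (1 - 1/n) / (N - 1).
    max_div = total * (1 - 1 / n) / (total - 1)
    return c, (1 - c) / max_div

c_est, c_est_scaled = simpson_est([2, 4, 1, 5])
print(f"{c_est:.4f} {c_est_scaled:.4f}")  # 0.2576 0.9074
```

Under this assumption the scaled value is algebraically equal to the scaled value of the ordinary Simpson index, which is consistent with the two identical Scaled entries in the Row 1 example under PRINTED OUTPUT.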

Renyi. A general function with one parameter (alpha) that produces a continuum of diversity measures, including a function of richness, the Shannon index, and a close relative of the Simpson index. For the parameter 0, the function takes on the value log(richness); for the parameter 1, it is the Shannon index; for the parameter 2, it takes on the value -log C, where C is the Simpson index (taking the negative log also converts concentration into diversity). (Pielou 1977: 309-311)
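
The three special cases can be checked with a short Python sketch. It assumes the standard form of the Renyi function, log(Σ p_i^alpha)/(1-alpha), with base 10 logarithms and the Shannon index as the limiting value at alpha=1. Scaling by log10 of the number of variables is an assumption, and the function name renyi is hypothetical; this is not the DIVMEAS.PAS code.

```python
import math

def renyi(counts, alpha):
    """Return (Renyi value, scaled value) for one alpha, base-10 logs."""
    total = sum(counts)
    props = [c / total for c in counts if c > 0]  # skip empty classes
    if math.isclose(alpha, 1.0):
        # Limiting case: the Shannon index.
        value = -sum(p * math.log10(p) for p in props)
    else:
        value = math.log10(sum(p ** alpha for p in props)) / (1 - alpha)
    # Assumed maximum: log10 of the number of variables.
    max_value = math.log10(len(counts))
    return value, (value / max_value if max_value > 0 else 0.0)

for a in (0.0, 1.0, 2.0):
    value, scaled = renyi([2, 4, 1, 5], a)
    print(f"{a:.2f} {value:.4f} {scaled:.4f}")
# 0.00 0.6021 1.0000
# 1.00 0.5371 0.8921
# 2.00 0.4956 0.8232
```

These lines match the corresponding alpha entries of the Row 1 example under PRINTED OUTPUT.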

Delta. This general function of one parameter (beta) produces a continuum of diversity measures, including Richness (minus 1), the Shannon index, and (1 minus) the Simpson index. With the parameter -1, Delta is Richness (expressed as S-1); with the parameter 0, it is the Shannon index (computed with natural, i.e. base e, logarithms); and with the parameter 1, it is the Simpson index (expressed as 1-C). The larger the parameter, the less sensitive Delta is to rare classes, and thus the less sensitive it is to richness. (Dennis and Patil 1979)
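
A Python sketch of the Delta function follows. It assumes the form (1 - Σ p_i^(beta+1))/beta, with the natural-log Shannon index as the limiting value at beta=0, which yields the three special cases listed above. The scaling, dividing by the value Delta takes when all variables have equal proportions, is an assumption, and the function name delta is hypothetical; this is not the DIVMEAS.PAS code.

```python
import math

def delta(counts, beta):
    """Return (Delta value, scaled value) for one beta."""
    total = sum(counts)
    n = len(counts)
    props = [c / total for c in counts if c > 0]  # skip empty classes
    if math.isclose(beta, 0.0, abs_tol=1e-12):
        # Limiting case: Shannon index with natural logarithms.
        value = -sum(p * math.log(p) for p in props)
        max_value = math.log(n)
    else:
        value = (1 - sum(p ** (beta + 1) for p in props)) / beta
        # Assumed maximum: Delta for n equal proportions of 1/n.
        max_value = (1 - n ** (-beta)) / beta
    return value, (value / max_value if max_value > 0 else 0.0)

for b in (-1.0, 0.0, 1.0):
    value, scaled = delta([2, 4, 1, 5], b)
    print(f"{b:.2f} {value:.4f} {scaled:.4f}")
# -1.00 3.0000 1.0000
# 0.00 1.2367 0.8921
# 1.00 0.6806 0.9074
```

These lines match the corresponding beta entries of the Row 1 example under PRINTED OUTPUT.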

PROGRAM OPERATION

To run the program, have a copy of DIVM.EXE in the default directory (or the path) and type: DIVM<Enter>. You can interrupt the program at any time with <Ctrl><Break>. In general, the program prompts are self-explanatory, but they are described in detail below. Detailed information about the general form of the prompts and Antana format data files is provided in the section: "Program Conventions."

PROGRAM PROMPTS

Compute Renyi Values {Y} ?

Renyi - Starting Alpha {0.0} ?

Renyi - Ending Alpha {2.5} ?

Renyi - Alpha Increment {0.25} ?

This sequence of prompts defines the amount of output for the Renyi function. If the reply to the first prompt is N, then the next three prompts will not be issued. If the Renyi function is desired, the default responses will usually be adequate, so Enter can be pressed 4 times.

Compute Delta Values {Y} ?

Delta - Starting Beta {-1.0} ?

Delta - Ending Beta {1.5} ?

Delta - Beta Increment {0.25} ?

This sequence of prompts defines the amount of output for the Delta function. This sequence operates just like the Renyi prompts just described.

File or Device Name for Printed Results {.LST} ?

At this point the program is asking where you want the results listed. Any file (or path) name (without an extension) will put the output in a file with that name and the default extension .LST; for example, the reply MYFILE will put the output in MYFILE.LST. To list the results on the PC screen (a case at a time), reply CON (CON, for console, is the device name of the PC screen). If the specified file already exists, the program will ask you before it overwrites it. If you have your printer hooked up and turned on, replying PRN will send the output directly to the printer.

File or Device Name for Input Counts ?

Here the program is asking where to find the data. At this point, the reply CON will prompt for data from the keyboard; the reply of a file name will read that file. If you reply CON, the program will ask you:

Number of Rows (Observations) ?

Number of Columns (Variables) ?

To these prompts, type the number of observations and variables that you have. You may enter data 1 observation at a time, simply by entering 1 at the Number of Rows prompt. If you have told it that you will be reading 2 rows with 4 variables each, you will see:

Enter 8 Values Followed by <Enter>

At this point you can enter the four counts for the first case followed by the four counts for the second case. You can enter these on any number of lines, separated by blanks, commas, or tabs. Be sure to hit Enter or <Ctrl>Z after you finish typing the last value.

If you reply with a file (or path) name, the file must be in the free format with the conventions used by the Antana statistical package. The first two numbers in the file are the number of rows and columns of data, respectively. The next nrow*ncol numbers are the counts, read observation by observation. The numbers in the file can be separated by any number of blanks, commas, and tabs, and may be on any number of lines (see "Program Conventions" section).
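
To make the layout concrete, here is a small Python sketch that reads counts in the free format just described: two leading numbers (rows, columns) followed by nrow*ncol counts, separated by blanks, commas, or tabs on any number of lines. The function name read_counts and the sample values in the second row are hypothetical, and the sketch ignores any other Antana conventions covered in "Program Conventions."

```python
import re

def read_counts(text):
    """Parse free-format counts into a list of rows (lists of floats)."""
    # Split on any run of commas and whitespace (blanks, tabs, newlines).
    tokens = [t for t in re.split(r"[,\s]+", text) if t]
    nrow, ncol = int(tokens[0]), int(tokens[1])
    values = [float(t) for t in tokens[2:2 + nrow * ncol]]
    if len(values) != nrow * ncol:
        raise ValueError("expected %d counts, found %d"
                         % (nrow * ncol, len(values)))
    # Counts are stored observation by observation.
    return [values[r * ncol:(r + 1) * ncol] for r in range(nrow)]

# Hypothetical file contents: 2 observations of 4 variables each.
rows = read_counts("2 4\n2, 4\t1 5\n3 3\n3 3\n")
print(rows)  # [[2.0, 4.0, 1.0, 5.0], [3.0, 3.0, 3.0, 3.0]]
```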

File or Device for Output Matrix {.DIV} ?

The program will place output results in a data file that can be used by another program for further analysis. Answer this prompt with the name of the file (default extension .DIV) in which this output should be placed, or NUL if you do not want an output file created. The output file is in a free format following the conventions used by Antana (described elsewhere). It consists of a header that lists the number of rows and columns in the file, the name of the input file, and a description of the output columns (variables). For each observation (row) the output file includes the observation number and all requested diversity values.

Read Another Set of Counts {Y} ?

The program will do more diversity computations, if you wish. If you reply N, the program will stop. If the reply is Enter or Y, the program starts again at the "File or Device Name for Input Counts" prompt. The printed output will continue to go to the file originally specified, but new files for the input and output matrices are requested. Using this option and specifying the input and listing files as CON, you can easily experiment with the program interactively.

PRINTED OUTPUT

Diversity Measures for Distribution  -  Row 1  -  Total    12
     2     4     1     5

                Unscaled  Scaled                     Unscaled  Scaled
Richness          4.0000  1.0000     Simpson Est.      0.2576  0.9074
Brillouin         0.4155  0.8884     Simpson           0.3194  0.9074
Shannon           0.5371  0.8921     

Renyi alpha     Renyi  Scaled     Renyi alpha     Renyi  Scaled
       0.00    0.6021  1.0000            0.25    0.5836  0.9693     
       0.50    0.5665  0.9409            0.75    0.5510  0.9152     
       1.00    0.5371  0.8921            1.25    0.5247  0.8715     
       1.50    0.5138  0.8533            1.75    0.5041  0.8373     
       2.00    0.4956  0.8232            2.25    0.4881  0.8107     

Delta  beta     Delta  Scaled     Delta  beta     Delta  Scaled
      -1.00    3.0000  1.0000           -0.75    2.3193  0.9514     
      -0.50    1.8395  0.9198           -0.25    1.4930  0.9011     
       0.00    1.2367  0.8921            0.25    1.0428  0.8901     
       0.50    0.8930  0.8930            0.75    0.7750  0.8992     
       1.00    0.6806  0.9074            1.25    0.6037  0.9167

OUTPUT FILE

#Rows# 2  #Cols# 71  #Input File DIVM.TST#
#Description of Col: 1=Row 2,3=Richness 
#  4,5=Brillouin 6,7=Shannon 8,9=Simpson Est. 10,11=Simpson
#  12-41=alpha, Renyi, Scaled Renyi  - for each alpha
#  42-71=beta,  Delta, Scaled Delta  - for each beta
       1  4.0000  1.0000  
  0.4155  0.8884  0.5371  0.8921  0.2576  0.9074  0.3194  0.9074
  0.0000  0.6021  1.0000  0.2500  0.5836  0.9693  0.5000  0.5665  0.9409
  0.7500  0.5510  0.9152  1.0000  0.5371  0.8921  1.2500  0.5247  0.8715
  1.5000  0.5138  0.8533  1.7500  0.5041  0.8373  2.0000  0.4956  0.8232
  2.2500  0.4881  0.8107
 -1.0000  3.0000  1.0000 -0.7500  2.3193  0.9514 -0.5000  1.8395  0.9198
 -0.2500  1.4930  0.9011  0.0000  1.2367  0.8921  0.2500  1.0428  0.8901
  0.5000  0.8930  0.8930  0.7500  0.7750  0.8992  1.0000  0.6806  0.9074
  1.2500  0.6037  0.9167
       2  4.0000  1.0000  
  0.4307  0.8881  0.5364  0.8910  0.2667  0.9126  0.3156  0.9126
  0.0000  0.6021  1.0000  0.2500  0.5820  0.9666  0.5000  0.5644  0.9375
  0.7500  0.5493  0.9124  1.0000  0.5364  0.8910  1.2500  0.5254  0.8727
  1.5000  0.5160  0.8570  1.7500  0.5079  0.8436  2.0000  0.5009  0.8320
  2.2500  0.4949  0.8219
 -1.0000  3.0000  1.0000 -0.7500  2.3093  0.9473 -0.5000  1.8304  0.9152
 -0.2500  1.4877  0.8979  0.0000  1.2351  0.8910  0.2500  1.0440  0.8911
  0.5000  0.8958  0.8958  0.7500  0.7787  0.9034  1.0000  0.6844  0.9126
  1.2500  0.6075  0.9224
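
For further analysis outside Antana, a file like the one above could be read with a short Python sketch such as the following. It assumes, based only on the example shown, that the first line carries the #Rows# and #Cols# values, that the remaining header lines begin with #, and that the data values may wrap across lines. The function name read_div and the shortened 3-column sample are hypothetical.

```python
import re

def read_div(text):
    """Parse a .DIV-style output file into a list of rows of floats."""
    lines = text.splitlines()
    match = re.search(r"#Rows#\s*(\d+)\s*#Cols#\s*(\d+)", lines[0])
    if match is None:
        raise ValueError("missing #Rows# / #Cols# header")
    nrow, ncol = int(match.group(1)), int(match.group(2))
    tokens = []
    for line in lines[1:]:
        if line.lstrip().startswith("#"):
            continue  # remaining header (description) lines
        tokens.extend(line.replace(",", " ").split())
    values = [float(t) for t in tokens]
    if len(values) != nrow * ncol:
        raise ValueError("header and data sizes disagree")
    return [values[r * ncol:(r + 1) * ncol] for r in range(nrow)]

# Shortened, hypothetical sample with only the Row and Richness columns.
sample = """#Rows# 2  #Cols# 3  #Input File DIVM.TST#
#Description of Col: 1=Row 2,3=Richness
       1  4.0000  1.0000
       2  4.0000
  1.0000
"""
table = read_div(sample)
print(table)  # [[1.0, 4.0, 1.0], [2.0, 4.0, 1.0]]
```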

Page Last Updated: 21 June 2022