TFQA: Tools for Quantitative Archaeology - Statistical Analysis Software for Archaeology

TFQA: Tools for Quantitative Archaeology
kintigh@tfqa.com +1 (505) 395-7979

ADFUTIL: Antana Data File Utility

ADFUTIL generates random data sets and manipulates files in the data format used by the analysis programs. Random data sets may be of any size, and random variables may be uniformly or normally distributed, with user-specified ranges or means and standard deviations. ADFUTIL allows the deletion of columns (variables), selective deletion of rows (observations) based on values in a column, replacement of values in a column, randomization of columns for Monte Carlo analysis, the addition of new columns from another data set, and selection of a random sample of cases.

The program displays the possible actions on the screen. The user types the one-letter abbreviation to execute an option. Program prompts should be self-explanatory. Multiple manipulations of a single data file can be performed. For most actions, the entire data set must reside in memory, so the number of values (rows*columns) is limited by available memory (memavail/8 values). Random data sets of any size can be generated. It is possible to use ADFUTIL to do a number of operations at one time; however, you should check the output data set to make sure that the program has done what you intended (it will do what you tell it; what you intend may be different).

ADFUTIL OPERATION

The program is initiated by typing: ADFUTIL<Enter>

ADFUTIL Actions

    Arithmetic: Casewise arithmetic
    Concat: Concatenates 2 ADF files, side to side (note: files must have same number of rows)
    Delete: Deletes columns
    End: Modifications complete; output resulting file
    Gener: Generate uniform or normal random data
    Modify: Replaces a given value in a column with another
    Quit: Terminate program; do not output data file
    Random: Randomizes the order of the values in a column
    rOtate: Rotate two dimensions around joint mean
    Transform: Row % or Column +/-, Z score, ln, or arcsin
    transPose: Transpose Matrix
    Sample: Selects a random sample (w/o replacement) of the rows
    Value: Selects rows that should be included or excluded based on the values in a specified column

[A] [C] [D] [E] [G] [M] [Q] [R]an R[o]t [S] [T]ransform trans[P]ose [V] ?

Select the desired action by typing the letter highlighted in brackets, e.g. "G" to generate random data. Note that the highlighted letter is not always the first letter of the action (R[o]t, trans[P]ose). From this point the sequence of program prompts varies depending on the action. In many cases, one may perform a sequence of actions on the file. If no further action is desired, press "E" for end. A description of these actions follows:

Arithmetic: copy, add, subtract, multiply, or divide columns

[C]onstant, [T]otal, copy[=], or [+], [-], [*], [/] Columns

ADFUTIL allows you to do casewise arithmetic on ADF files. Results always end up in a new column (added on the right). Seven operations are possible. A constant can be inserted in a new column, a column can be copied to a new column, selected columns can be [T]otalled, or one column can be added to, subtracted from, multiplied by, or divided by another column. (The program will halt on division by 0.)

For example, assume that cases represent excavated levels and you have only 3 columns in the data set, where column 1 contains a sherd count for the level, column 2 a lithic count for the level, and column 3 a volume in cubic meters. To get a sherd density per liter (1000 liters=1 m³) you would choose the [A]rithmetic option from the main menu and the [/] option from the arithmetic menu. Then you would supply column 1 and column 3 as the two operands for division. Column 4 would then contain sherds/m³. To convert the density to sherds per liter, you would again choose [A]rithmetic from the main menu, then [C]onstant from the arithmetic menu, and supply 1000 as the constant value. This will create column 5 with 1000 in it for all rows. Finally, you would again choose [A]rithmetic and [/], dividing column 4 (sherds/m³) by column 5 (liters/m³) to get the result you desire in a new column 6. You can [E]nd at this point to write the file, or you can first choose [D]elete to eliminate columns 4 and 5 from the output data set before the final file is written.
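The arithmetic in this example can be checked outside ADFUTIL. The Python sketch below is not part of ADFUTIL; the function names and the sample counts are made up for illustration. Each step appends its result as a new column on the right, and the last step relies on the fact that a density per liter is the density per cubic meter divided by 1000.

```python
def append_quotient(rows, a, b):
    """Return rows with column a / column b appended on the right (1-based)."""
    return [row + [row[a - 1] / row[b - 1]] for row in rows]


def append_constant(rows, value):
    """Return rows with a constant appended as a new rightmost column."""
    return [row + [value] for row in rows]


# Made-up levels: sherd count, lithic count, volume in cubic meters
levels = [[120, 30, 0.5], [45, 10, 0.25]]

with_density = append_quotient(levels, 1, 3)       # column 4: sherds per m^3
with_factor = append_constant(with_density, 1000)  # column 5: liters per m^3
per_liter = append_quotient(with_factor, 4, 5)     # column 6: sherds per liter

for row in per_liter:
    print(row)
```

A zero volume in column 3 raises ZeroDivisionError here, which is loosely analogous to the program halting on division by 0.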

Similarly, the program can be used to calculate percentages by totaling selected columns, dividing the total by 100, and then dividing the total/100 into each of the original count columns. However, this will be a bit cumbersome if you do it for a large number of columns. Note that the [T]ransform option will total all rows and convert the counts to percentages in a single step.

Concat: Concatenates 2 ADF files, side to side

This option allows you to combine two data sets with the same set of cases (rows). It creates a new data set with all of the variables (columns) in both data sets. If variables are duplicated in the two data sets, they will be duplicated in the resulting file (since the program is unaware of the specific content of the variables), but can be eliminated with the delete option.
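As a rough illustration of side-to-side concatenation (a sketch only, not ADFUTIL's code; the function name is hypothetical), each row of the second data set is appended to the matching row of the first, and the operation is refused when the row counts differ:

```python
def concat_side_by_side(left, right):
    """Join two row-major tables column-wise; both must have the same rows."""
    if len(left) != len(right):
        raise ValueError("files must have the same number of rows")
    return [l + r for l, r in zip(left, right)]


ceramics = [[12, 3], [7, 9]]   # made-up counts for two cases
lithics = [[40], [15]]         # one more variable for the same two cases
combined = concat_side_by_side(ceramics, lithics)
print(combined)
```

Rows are matched purely by position, so both files need to list the cases in the same order.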

Delete: Deletes columns for all cases.

The program requests a list of columns to be deleted. Columns may be listed in any order, and the deletion is not done until no more columns are listed.
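One plausible reading of the deferred deletion is that every listed column number refers to the original column positions, so the order of the list does not matter. The following Python sketch works under that assumption; it is not ADFUTIL's code and the function name is hypothetical.

```python
def delete_columns(rows, columns):
    """Drop the listed 1-based columns from every row in a single pass."""
    drop = {c - 1 for c in columns}
    return [[v for i, v in enumerate(row) if i not in drop] for row in rows]


data = [[1, 2, 3, 4], [5, 6, 7, 8]]
print(delete_columns(data, [4, 2]))  # same result as listing [2, 4]
```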

End: Modifications complete; output resulting file

Gener: Generate uniform or normal random data

Integer uniform random data may be generated with any range. Real data may be generated either as uniform data with any range or as normal data with any mean and standard deviation.
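The three kinds of random column can be mimicked with Python's standard random module. This is only a sketch: the function name and the "kind" labels are invented here, and the random number generator is Python's, not the one ADFUTIL uses.

```python
import random


def generate_column(n, kind, a, b, rng):
    """Generate n random values.

    kind 'integer': uniform integers from a to b inclusive
    kind 'uniform': uniform reals between a and b
    kind 'normal':  normal reals with mean a and standard deviation b
    """
    if kind == "integer":
        return [rng.randint(a, b) for _ in range(n)]
    if kind == "uniform":
        return [rng.uniform(a, b) for _ in range(n)]
    if kind == "normal":
        return [rng.gauss(a, b) for _ in range(n)]
    raise ValueError(f"unknown kind: {kind}")


rng = random.Random(42)  # fixed seed so the sketch is repeatable
counts = generate_column(5, "integer", 0, 10, rng)
lengths = generate_column(5, "normal", 25.0, 3.0, rng)
print(counts)
print(lengths)
```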

Modify: For a column replaces one specific value with another
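In spirit, Modify is a find-and-replace restricted to one column. A minimal Python sketch (hypothetical function name, not ADFUTIL's code):

```python
def replace_value(rows, column, old, new):
    """Replace every occurrence of old with new in one 1-based column."""
    c = column - 1
    return [row[:c] + [new if row[c] == old else row[c]] + row[c + 1:]
            for row in rows]


# Made-up data in which 9 is used as a missing-value code in column 2
data = [[1, 9], [2, 5], [9, 9]]
print(replace_value(data, 2, 9, 0))
```

Note that the 9 in column 1 of the last row is left alone, because only the named column is searched.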

Quit: Terminate the program; do not output data file

Random: Randomizes the order of the values in a single column.

This is useful for testing for structure in data: the associations among variables in the original data are randomized, but the set of observed values in the column is maintained (see Kintigh and Ammerman 1982 for an application).
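The idea can be sketched in Python as a shuffle of one column while all other columns stay in place. The function name and data are hypothetical, and the random number generator is Python's rather than ADFUTIL's.

```python
import random


def randomize_column(rows, column, seed=None):
    """Shuffle the values of one 1-based column; other columns are untouched."""
    rng = random.Random(seed)
    c = column - 1
    values = [row[c] for row in rows]
    rng.shuffle(values)
    return [row[:c] + [v] + row[c + 1:] for row, v in zip(rows, values)]


data = [[1, 10], [2, 20], [3, 30], [4, 40]]
shuffled = randomize_column(data, 2, seed=1)
print(shuffled)
```

Repeating the shuffle many times and recomputing a statistic each time gives a Monte Carlo reference distribution against which the statistic from the original data can be compared.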

Rotate: Rotate two variables (dimensions) by a specified angle around their joint mean. This is mainly useful for spatial data, for rotating a distribution of two-dimensional points.
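Rotation about the joint mean amounts to subtracting the mean of each variable, applying a standard 2D rotation, and adding the means back. The Python sketch below assumes the angle is given in degrees and is measured counterclockwise; ADFUTIL's own conventions are not stated here, and the function name is hypothetical.

```python
import math


def rotate_about_mean(points, degrees):
    """Rotate (x, y) points counterclockwise about their joint mean."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(mx + (x - mx) * c - (y - my) * s,
             my + (x - mx) * s + (y - my) * c)
            for x, y in points]


# Two made-up points on an east-west line, rotated a quarter turn
rotated = rotate_about_mean([(0.0, 0.0), (2.0, 0.0)], 90)
print(rotated)
```

The joint mean itself does not move, so the rotated distribution stays centered where the original was.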

Sample: Selects a random sample (w/o replacement) of the rows for all columns.
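Sampling without replacement means that no row can be drawn twice. A short Python sketch follows; the function name is hypothetical, and keeping the sampled rows in their original order is an assumption made here, not something the program description states.

```python
import random


def sample_rows(rows, k, seed=None):
    """Pick k distinct rows at random, keeping their original order."""
    rng = random.Random(seed)
    keep = sorted(rng.sample(range(len(rows)), k))
    return [rows[i] for i in keep]


data = [[i, i * i] for i in range(10)]
subset = sample_rows(data, 4, seed=3)
print(subset)
```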

Transform: Convert all values to row % or transform the values in a column (presence/absence, Arcsin, ln, √, or Z score). In the descriptions that follow, the column X is composed of values x1, x2,...

+/-: Presence/Absence converts x to 0 if x<=min and to 1 if x>min. By default min is 0, so any value greater than 0 indicates presence; however, an alternate minimum can be set.
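The rule is a simple threshold test, sketched here in Python (hypothetical function name, made-up counts):

```python
def presence_absence(values, minimum=0):
    """Return 1 where a value exceeds the minimum, otherwise 0."""
    return [1 if x > minimum else 0 for x in values]


counts = [0, 3, 1]
print(presence_absence(counts))             # default minimum of 0
print(presence_absence(counts, minimum=1))  # a single item no longer counts
```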

Arcsin: y=Arcsin(√x) applies to proportions 0<=x<=1, not percents 0<=%<=100. This is useful for emphasizing differences in proportion symmetrically at either end of the scale.
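A Python sketch of the arcsine square-root transform is given below. The function name is hypothetical, and returning the result in radians is an assumption (the units ADFUTIL reports are not stated here). Values outside 0 to 1, such as percents, are rejected.

```python
import math


def arcsin_transform(proportions):
    """Apply y = arcsin(sqrt(p)) to proportions between 0 and 1 (radians)."""
    for p in proportions:
        if not 0 <= p <= 1:
            raise ValueError(f"not a proportion: {p}")
    return [math.asin(math.sqrt(p)) for p in proportions]


print(arcsin_transform([0.0, 0.05, 0.5, 0.95, 1.0]))
```

The step from 0.0 to 0.05 changes y by the same amount as the step from 0.95 to 1.0, and by more than an equal step near 0.5 would, which is the symmetric stretching of both ends of the scale.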

Ln: y=Ln(x+c) applies to x>0 (more generally, to x+c>0). For data sets with 0 values, there is an option to add a constant, c, to each value before transforming.
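The effect of the constant can be seen in a small Python sketch (hypothetical function name). Without the constant a 0 value cannot be transformed; with c=1 a count of 0 maps to 0.

```python
import math


def ln_transform(values, c=0.0):
    """Apply y = ln(x + c); raises ValueError if x + c is not positive."""
    return [math.log(x + c) for x in values]


counts = [0, 1, 9]
print(ln_transform(counts, c=1))  # ln(1), ln(2), ln(10)
```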

Row %: Replace each value in the data set (all variables, all cases) with the row percent, that is, the value as a percentage of its row total.
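A row percent is 100 times a value divided by the total of its row. A minimal Python sketch with a hypothetical function name and made-up counts:

```python
def row_percent(rows):
    """Replace each value with its percentage of the row total."""
    result = []
    for row in rows:
        total = sum(row)
        result.append([100 * v / total for v in row])
    return result


counts = [[1, 3], [5, 5]]
print(row_percent(counts))
```

A row that sums to 0 raises ZeroDivisionError in this sketch; how ADFUTIL treats such a row is not stated here.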

Sqrt: y=√x applies to positive numbers.

Z Score: y=(x-mean(X))/sd(X), where sd(X) is the standard deviation of X; applies to any set of values such t...
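A Z score is conventionally the value minus the column mean, divided by the column standard deviation. The Python sketch below uses the sample standard deviation (n-1 denominator); whether ADFUTIL uses the sample or the population form is an assumption here, and the function name is hypothetical.

```python
import statistics


def z_scores(values):
    """Standardize values to mean 0 and (sample) standard deviation 1."""
    m = statistics.mean(values)
    s = statistics.stdev(values)  # sample sd; an assumption, see above
    return [(x - m) / s for x in values]


print(z_scores([1, 2, 3]))
```

If every value in the column is identical, the standard deviation is 0 and the division fails, so such a column cannot be standardized.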

[+]/- [A]rcSin(√p), [L]n(x), [S]qrt(x), [Z] Score ?

transPose: Transpose a data matrix. When this command is selected, the program outputs the transpose of the input matrix (the first column becomes the first row, etc.). The effect is to make variables cases and cases variables.
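For a table held as a list of rows, the transpose can be sketched in one line of Python (hypothetical function name):

```python
def transpose(rows):
    """Swap rows and columns of a rectangular table."""
    return [list(col) for col in zip(*rows)]


data = [[1, 2, 3], [4, 5, 6]]   # 2 cases, 3 variables
print(transpose(data))          # 3 cases, 2 variables
```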

Value: Selects rows that should be included or excluded in the output data set based on the values in a specified column.
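The selection can be pictured as a row filter keyed on one column. The Python sketch below matches against a list of exact values and takes an include/exclude switch; both choices are assumptions made for illustration, since the program's actual prompts and matching rules are not described here. The function name is hypothetical.

```python
def select_rows(rows, column, values, include=True):
    """Keep (include=True) or drop (include=False) rows whose value in the
    given 1-based column is one of the listed values."""
    wanted = set(values)
    return [row for row in rows if (row[column - 1] in wanted) == include]


# Made-up data: column 1 is a site-type code, column 2 a count
data = [[1, 12], [2, 7], [1, 3], [3, 9]]
print(select_rows(data, 1, [1]))                 # only type 1
print(select_rows(data, 1, [1], include=False))  # everything except type 1
```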

Output File Name {.ADF} ?

Enter the name of the output data file.

Page Last Updated: 21 June 2022