TFQA: Tools for Quantitative Archaeology

kintigh@tfqa.com +1 (505) 395-7979


GRID: Convert Point Provenience to Grid Count Data

GRID reads information concerning point-provenienced objects and creates a file with counts of objects in arbitrary grid units. In its simplest form, GRID reads pairs of X-Y coordinates, each of which is assumed to locate a single object, and compiles counts of objects by grid unit. If, in addition to the coordinates, the objects are identified by a numeric type or class (e.g. tool type) between 0 and 255, counts are compiled for each type in each grid unit, and a total for the grid unit is also given. Whether or not the points are identified by type, a count can be associated with each X-Y pair. This count indicates the number of objects (of that type) that are found at that location, and this count is compiled for each grid unit (for each type). Both the origin of the grid system and the length and width of the grid units are specified by the user (with helpful suggestions by the program).

This program is generally used to aggregate point provenience data into counts by grid square so that grid-based methods of spatial analysis (such as Whallon's unconstrained clustering) can be used. Because the grid origin and grid unit size are specified by the user, different grid layouts can be used to identify patterning at different scales, or to ensure that results are not highly dependent on a specific arbitrary grid layout.

For reference, points that fall precisely on a grid unit boundary are counted in the unit above or to the right of the boundary in question. The only exception to this is for points that fall exactly on the top or right boundary of the entire grid system. These points are counted with the squares to the left and down in order to avoid creating new grid units for these boundary points.
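The boundary rule above can be sketched for a single axis as follows. This is only an illustration of the rule as described, not the program's own code; the function name cell_index and its parameters are hypothetical, and the grid extent is assumed to be a whole number of grid units.

```python
import math

def cell_index(value, lo, hi, unit):
    """Return the 0-based cell index along one axis, or None if outside [lo, hi].

    Hypothetical helper: lo and hi are the grid bounds on this axis and
    unit is the grid unit length in the same direction.
    """
    if value < lo or value > hi:
        return None  # the point lies outside the grid on this axis
    n_cells = math.ceil((hi - lo) / unit)
    # A point on an interior boundary lands in the higher cell (right or above).
    idx = math.floor((value - lo) / unit)
    # A point exactly on the far edge of the whole grid is pulled back
    # into the last cell instead of opening a new one.
    return min(idx, n_cells - 1)
```

With bounds 1 to 15 and a unit length of 2, a value of 3 falls in the second cell (index 1), while a value of 15 is kept in the last cell (index 6).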

The program can read data for up to 32767 points, representing any number of total objects, with any range of X and Y coordinates (e.g., coordinates could range from 7.5 to 7.6 or from -1000000 to +1000000). However, because the program compiles the counts in memory, the number of cells in the grid (defined by the product of the number of grid units in the X and Y directions) times the number of different types is limited by the PC's available memory and the constraint that no grid can have more than 32767 cells. To be a bit more concrete, a standard PC with 256K will handle a 150 x 200 cell grid and 6 types.

For this reason, two versions of the program are supplied, GRIDI and GRIDR. GRIDI runs faster than GRIDR and permits grids with about three times as many cells (for the same number of types) as will GRIDR. However, GRIDI will only deal with integer counts and will only handle grids in which the maximum count of items within any cell does not exceed 32767. In most cases in which objects are counted, these restrictions will not apply. However, GRIDR can be used to compile totals (by type) of data in which counts exceed 32767 or in which non-integer data, such as weights, are totaled. GRIDR has no restriction on the smallest or largest values it will handle in a cell.

For most purposes you can run GRIDI and not worry about it. If you run into any of the restrictions the program will tell you and let you know what to do. If you have specified a grid with too many cells, it will allow you to redefine the grid using larger grid unit sizes. If the maximum count on GRIDI is exceeded, the program will stop and tell you to use GRIDR. In no case will your input file be damaged. For information on using these programs with lots of types or very large grids, see below.

SEQUENCE OF PROGRAM PROMPTS

GRIDI and GRIDR are operated in an almost identical manner. The programs are started by typing the program name, e.g. GRIDI followed immediately by <Enter>. For more information about the conventions used in these prompts, see the section, "Program Conventions."

Input File Name {.ADF} ?

This requests the name of the input file. The first numbers read from the file are the number of rows or points (NROW) and the number of columns or variables (NCOL) in the file. After that, for each of the NROW points, the program reads NCOL numbers. If there are only two columns per point (i.e., NCOL=2), they must represent the X and Y coordinates, in that order. If there are four or more columns per point, then the first four numbers will be assumed to represent the X and Y coordinates, the type, and the count associated with that type at that point, in that order, and the remainder of the entries for that point are ignored. If there are three columns per point the program asks:

3rd Data Column Indicates [T]ype or [C]ount ?

Here a reply of T will indicate that each point has values for X, Y and type (with an assumed count of 1), and a value of C will indicate that each point has values for X, Y, and count (with no differentiation of types).
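The column rules above can be illustrated with a small reader. This is a sketch rather than the program's own logic: the name read_points and the third_column argument are hypothetical, and it assumes that text enclosed between a pair of # marks is a comment, as the sample input file in this document suggests.

```python
import re

def read_points(text, third_column="T"):
    """Parse GRID-style input text into (x, y, type, count) tuples.

    third_column mirrors the [T]ype / [C]ount reply and only matters
    when the file has exactly three columns per point.
    """
    # Assumption: anything between two '#' marks is a comment.
    tokens = re.sub(r"#[^#]*#", " ", text).split()
    nrow, ncol = int(tokens[0]), int(tokens[1])
    values = [float(t) for t in tokens[2:2 + nrow * ncol]]
    points = []
    for i in range(nrow):
        row = values[i * ncol:(i + 1) * ncol]
        x, y = row[0], row[1]
        if ncol == 2:
            kind, count = None, 1            # X and Y only
        elif ncol == 3 and third_column.upper() == "T":
            kind, count = int(row[2]), 1     # X, Y, type; count assumed 1
        elif ncol == 3:
            kind, count = None, row[2]       # X, Y, count; no types
        else:
            kind, count = int(row[2]), row[3]  # extra columns are ignored
        points.append((x, y, kind, count))
    return points
```

For example, read_points("1 3 5 6 4", third_column="C") treats the third value as a count and yields a single point at (5, 6) with a count of 4 and no type.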

Reading Data from infile.ext

Scan Input File to Define Bounds

Number of Points: ? Number of Objects: ?

Number of Types: ? Largest Class No.: ?

Minimum X in Data: ? Maximum X in Data: ?

Minimum Y in Data: ? Maximum Y in Data: ?

Next the program reads through the input file in order to see how many different type identification numbers are used and to find the minimum and maximum X and Y values (which are reported to you).

Define the Grid Boundaries

Xmin {?} ?

Xmax {?} ?

Ymin {?} ?

Ymax {?} ?

The program then asks you to define the grid origin (Xmin, Ymin) and the grid extreme in both directions (Xmax, Ymax). For each of the four values it will suggest the nearest integral value such that all points will be included in the grid. However, you may define the grid boundaries in such a way that not all input points are included.
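One plausible reading of "the nearest integral value such that all points will be included" is to round the minima down and the maxima up. The sketch below assumes that reading; the function name suggest_bounds is hypothetical and the program's actual suggestions may differ.

```python
import math

def suggest_bounds(xs, ys):
    """Return (xmin, xmax, ymin, ymax) as integers that enclose every point.

    Assumed interpretation: floor the minima and ceil the maxima.
    """
    return (
        math.floor(min(xs)),
        math.ceil(max(xs)),
        math.floor(min(ys)),
        math.ceil(max(ys)),
    )
```

Under this reading, X values between 7.52 and 7.58 would produce suggested bounds of 7 and 8.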

Define the Grid Unit Size (need not be square)

Maximum Number of Grid Cells xxxxxx

Grid Unit Length in X Direction {?} ?

Grid Unit Length in Y Direction {?} ?

Grid ? x ? Cells

Grid Definition OK ?

The program now tells you how many cells the grid can have and asks you to give it the size of each grid unit in the X and Y directions. If the grid you specify will not fit in memory, the program asks again to have the grid unit size defined. Once an acceptable size has been specified, the program asks, in the last prompt listed directly above, for permission to proceed.
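The arithmetic behind this check can be sketched as below. The function names are hypothetical, and only the fixed limit of 32767 cells stated earlier is checked; the memory-dependent limit the program reports, which also depends on the number of types, is not modeled here.

```python
import math

MAX_CELLS = 32767  # fixed per-grid cell limit stated earlier in this document

def grid_shape(xmin, xmax, ymin, ymax, dx, dy):
    """Return the number of grid units in the X and Y directions."""
    nx = math.ceil((xmax - xmin) / dx)
    ny = math.ceil((ymax - ymin) / dy)
    return nx, ny

def grid_fits(nx, ny, max_cells=MAX_CELLS):
    """Check the cell count against the fixed limit only."""
    return nx * ny <= max_cells
```

A 150 x 200 grid has 30000 cells and passes this check, whereas a 200 x 200 grid has 40000 cells and would have to be redefined with larger grid units.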

Rereading Input File to Compile Counts

Points Read: ?

Invalid Type or Count ?

Points Outside Grid ?

Minimum Count to Output Grid Unit {1} ?

In many cases you will not want to analyze cells with fewer than some minimum number of points. For example, it may not make sense to calculate percentages on the basis of very low counts (e.g. 4). If you want all cells, including the cells with no points, reply with 0 or <Enter>. Otherwise, reply with the minimum count a grid unit must have in order to be output. In general, for unconstrained clustering, you will want the minimum to be 1 or more.

Output Grid Unit [C]ounts or [P]ercents {C} ?

The program now asks if you want type counts per grid unit or type percentages per grid unit in the output data set.

Output Total in Addition to Counts or Percents {N} ?

If you supplied type identifiers on the input records, the program asks if you want a total count for the grid unit in addition to the counts or percentages you asked for. In general the Y reply will be fine. However, if you have a program that expects only the coordinates and the counts (or percentages) you should reply N. (For k-means analysis reply N.)
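The combined effect of the three output prompts on a single grid unit can be sketched as follows. The function name output_row and its arguments are hypothetical. The sketch assumes that the minimum count is compared with the unit's total across all types, and that a requested total is written as a raw count even when percentages are output; the documentation does not state either point.

```python
def output_row(counts, as_percent=False, with_total=False, min_count=1):
    """Return the output values for one grid unit, or None if it is skipped.

    counts holds one count per type for this grid unit.
    """
    total = sum(counts)
    if total < min_count:
        return None  # unit falls below the minimum count and is not output
    if as_percent and total > 0:
        values = [100.0 * c / total for c in counts]
    else:
        values = list(counts)
    if with_total:
        values.append(total)  # assumed to be the raw total, not a percentage
    return values
```

For instance, a unit with counts of 1 and 3 gives 25 and 75 when percentages are requested, and an empty unit is only output when the minimum count is 0.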

Scanning Matrix for Largest Count

The program now rereads the input file and accumulates counts for each type for each grid cell. It reports a count of the points read and excluded. It then looks for the largest count in the grid in order to make sure that GRIDI's limit is not exceeded and so that it can figure out how many spaces are required for the output counts.

Output File Name {infile.GRD} ?

Here you indicate the file name (or device and path name) where the grid should be output. The output file begins with an Antana header, which indicates the number of rows (grid cells) and columns written out and labels the output columns. (For use with other programs you will probably want to remove the Antana header with a text editor before further analysis.) For each cell, the data columns written are the X and Y coordinates of the center of the grid unit, the counts for each type specified (if any), and, if requested, a total count for the cell. The cells are written out from top to bottom and from left to right, as you would look at a grid (thus the row with the maximum Y values is written out first). All counts for each cell are written on a single line, even if the line exceeds 80 characters in length. (Some programs, like SYSTAT, may need to be told in advance that lines are longer than 80 characters.)

# of Digits After the Decimal in Output Counts {0} ?

This question is only asked by GRIDR. GRIDI always outputs integer counts, whereas GRIDR allows you to accumulate non-integer values, such as weights.

Writing Grid Unit Counts

Program End

These are program termination messages, indicating that the program has come to a successful end. If you do not get these messages but instead get a run error, the disk with the output file probably did not have enough room for the file, and the program failed. If this happens, you will need to start again, and either direct the output to a different drive that has more room or, prior to pressing <Enter> after supplying the output file name, replace the disk to which the output will be written with a blank formatted disk (or a disk with sufficient room).

WORKING WITH LARGE GRIDS OR MANY DIFFERENT TYPES

If you need to compile counts for a grid larger than the program can handle, this can be done by dividing the entire area into comprehensive but non-overlapping rectangular areas and running the program separately for each area. Because the program ignores points outside the user-specified bounds (other than to tell you how many points fell outside the grid), the resulting output files can be concatenated to get counts for the entire grid.
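One way to choose such sub-areas is to cut the X range into strips that are each a whole number of grid units wide, so that cell centers line up across runs. The sketch below assumes that approach; the function name split_x_range and the max_cols parameter are hypothetical.

```python
def split_x_range(xmin, xmax, dx, max_cols):
    """Return (lo, hi) strips covering xmin..xmax, each at most max_cols units wide.

    Strip edges fall on multiples of dx measured from xmin, so the cells
    of adjacent strips do not overlap.
    """
    strips = []
    lo = xmin
    while lo < xmax:
        hi = min(lo + max_cols * dx, xmax)
        strips.append((lo, hi))
        lo = hi
    return strips
```

For example, split_x_range(1, 15, 2, 4) gives the strips (1, 9) and (9, 15), each of which would be supplied as Xmin and Xmax in a separate run. If the boundary rule described earlier is applied within each run, a point lying exactly on a shared edge (X = 9 here) would appear to fall inside both strips, so it may be worth checking for such points before concatenating the output files. That caution is an inference from the boundary rule, not something this documentation states.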

SAMPLE INPUT FILE

Input File in which the third column is a type identifier:
 
16 #points# 3 #variables: x y type# 
 1 1 0 
 1 2 0 
 2 1 0 
 3 3 0 
 8 2 2 
 9 1 0 
 9 3 2 
10 2 0 
 6 10 10 
 6 11 11 
 5 12 12 
 7 12 13 
12 8 4 
13 7 2 
14 9 2 
15 7 4

SAMPLE OUTPUT FILE

Output File obtained using the responses of the Sample Session:
 
#Units# 11 #Vars# 9 #File: GRID.ADF # 
# Xcenter Ycenter 0 2 4 10 11 12 13 
      6.00 12.00 0 0 0 0 1 1 0 
      8.00 12.00 0 0 0 0 0 0 1 
      6.00 10.00 0 0 0 1 0 0 0 
     14.00 10.00 0 1 0 0 0 0 0 
     12.00  8.00 0 0 1 0 0 0 0 
     14.00  8.00 0 1 1 0 0 0 0 
      4.00  4.00 1 0 0 0 0 0 0 
     10.00  4.00 0 1 0 0 0 0 0 
      2.00  2.00 3 0 0 0 0 0 0 
      8.00  2.00 0 1 0 0 0 0 0 
     10.00  2.00 2 0 0 0 0 0 0
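The sample output can be reproduced from the sample input with a short script, which also illustrates the boundary rule and the output ordering. This is a sketch, not the program's own algorithm. The grid settings used (Xmin 1, Xmax 15, Ymin 1, Ymax 13, grid units of 2 x 2, minimum count 1, counts without totals) are assumptions inferred from the cell centers in the sample output, since the responses themselves are not listed here. The name compile_grid is hypothetical.

```python
import math
from collections import defaultdict

# The 16 (x, y, type) points from the sample input file
POINTS = [
    (1, 1, 0), (1, 2, 0), (2, 1, 0), (3, 3, 0),
    (8, 2, 2), (9, 1, 0), (9, 3, 2), (10, 2, 0),
    (6, 10, 10), (6, 11, 11), (5, 12, 12), (7, 12, 13),
    (12, 8, 4), (13, 7, 2), (14, 9, 2), (15, 7, 4),
]

def compile_grid(points, xmin, xmax, ymin, ymax, dx, dy):
    """Return (types, rows); each row is (xcenter, ycenter, count per type...)."""
    nx = math.ceil((xmax - xmin) / dx)
    ny = math.ceil((ymax - ymin) / dy)
    types = sorted({t for _, _, t in points})
    counts = defaultdict(lambda: [0] * len(types))
    for x, y, t in points:
        if not (xmin <= x <= xmax and ymin <= y <= ymax):
            continue  # points outside the grid are ignored
        # Interior boundaries go right/up; the outer top/right edge goes left/down.
        col = min(math.floor((x - xmin) / dx), nx - 1)
        row = min(math.floor((y - ymin) / dy), ny - 1)
        counts[(row, col)][types.index(t)] += 1
    rows = []
    for row in range(ny - 1, -1, -1):   # top to bottom
        for col in range(nx):           # left to right
            if (row, col) in counts:    # only units with at least one point
                xc = xmin + (col + 0.5) * dx
                yc = ymin + (row + 0.5) * dy
                rows.append((xc, yc, *counts[(row, col)]))
    return types, rows

types, rows = compile_grid(POINTS, 1, 15, 1, 13, 2, 2)
print("# Xcenter Ycenter", *types)
for xc, yc, *cell_counts in rows:
    print(f"{xc:10.2f} {yc:5.2f}", *cell_counts)
```

Note that the point at (15, 7) lies on the right edge of the whole grid and is therefore counted in the unit centered at (14, 8), while the point at (3, 3) lies on interior boundaries and is counted in the unit centered at (4, 4).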

Page Last Updated: 21 June 2022