Tools for Quantitative Archaeology


Programs Using CSV (Comma Separated Value) Input


There have been several requests to simplify the data interface with spreadsheets and databases. I have written beta versions of several of the most-used programs (kmeans, lden, twoway, and boone, with more to come) that accept comma separated value (CSV) input rather than the standard program input formats described elsewhere in the documentation. To use these new programs, run the .exe files whose names end in csv.exe (that is, kmeanscsv.exe, ldencsv.exe, twowaycsv.exe, and boonecsv.exe) rather than the standard versions (kmeans.exe, etc.). Except for the prompts for the input data, these programs have the same prompts and produce the same output as the non-CSV versions.

 

I have also written a program, adf2csv.exe, that converts a set of .adf, .arl, and .acl files to .csv format. I have not had time to rewrite the documentation, but if you prepare data files with Excel or another spreadsheet, this format is much easier to use. From Excel, or many other programs, you can use File>Save-As>CSV.

 

In addition, these new versions of the programs use dynamic storage allocation, which allows them to handle problems of almost unlimited size. At the same time, I changed them to use comma separated value files for input rather than separate files for data and for row and column labels. These are both steps on the road to a Windows interface, but these new programs retain the DOS-like interface even though they are true Windows programs. The paragraphs below describe the input format. Like the other programs, they are easiest to use if you copy the program to the directory with the data files and just double-click on the program.


All of the input to the programs with names ending in csv goes in as one file rather than as a data (.adf) file, a column label (.acl) file, and a row label (.arl) file. The programs read the input fairly flexibly as a comma separated value (CSV) file that can be exchanged directly (in and out) with Excel or most other spreadsheets. Each line consists of some number of distinct values separated by commas or spaces. Text that includes spaces can be enclosed in double quotes (e.g., "xx xx"). Values must be separated by at least one space or by one comma (with optional spaces). However, each comma separates a value, so A,,2,3 generates 4 values: A, an empty string, 2, and 3. As noted below, though, the analytical programs do not allow missing values.
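The splitting rules above can be illustrated with a short Python sketch. This is not the code the programs use. The function name split_values is hypothetical, and its treatment of a trailing comma (one more empty value) and of an unclosed quote (the text runs to the end of the line) are assumptions that the description above does not settle.

```python
def split_values(line):
    """Split one input line into values separated by spaces or commas."""
    values = []
    expect_value = True   # True at line start and right after a comma
    saw_comma = False
    i, n = 0, len(line)
    while i < n:
        ch = line[i]
        if ch in " \t\r\n":
            i += 1                      # spaces only separate; they never create values
        elif ch == ",":
            if expect_value:
                values.append("")       # nothing between two commas: empty value
            expect_value = True
            saw_comma = True
            i += 1
        else:
            if ch == '"':
                end = line.find('"', i + 1)
                if end == -1:           # assumption: unclosed quote runs to end of line
                    end = n
                values.append(line[i + 1:end])
                i = end + 1
            else:
                j = i
                while j < n and line[j] not in " \t\r\n,":
                    j += 1
                values.append(line[i:j])
                i = j
            expect_value = False
    if expect_value and saw_comma:
        values.append("")               # assumption: trailing comma adds an empty value
    return values


print(split_values("A,,2,3"))              # ['A', '', '2', '3']
print(split_values('site1 "xx xx" 3.5'))   # ['site1', 'xx xx', '3.5']
```

Note that this sketch drops the quotes, so it does not by itself distinguish a quoted "35" from the number 35.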


If the first line consists only of text values, the program assumes that they represent variable (column) labels. The labels cannot have embedded spaces. The easiest solution is to substitute underscores for spaces. If you want numbers for labels, enclose them in double quotes, e.g., "35". If the first line has any numeric values, the program treats it as a data line and the default variable labels are V1, V2, etc. All data lines must have the same number of values as the label line (if any).
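As a rough illustration of the label-line rule, the following Python sketch decides whether a first line holds labels or data. The names is_numeric and variable_labels are hypothetical. It assumes the tokens arrive with any double quotes still attached, so that "35" counts as text, and it assumes that "numeric" means anything Python's float() accepts, which may differ from what the programs accept.

```python
def is_numeric(token):
    """Assumption: a value is numeric if Python's float() can parse it."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def variable_labels(first_line_tokens):
    """Return (labels, first_line_is_data) for the first line of a file."""
    if any(is_numeric(t) for t in first_line_tokens):
        # Any numeric value makes this a data line; use default labels V1, V2, ...
        labels = [f"V{k}" for k in range(1, len(first_line_tokens) + 1)]
        return labels, True
    # All text: treat as labels, dropping the surrounding quotes
    return [t.strip('"') for t in first_line_tokens], False


print(variable_labels(["Site", "Rim_Diameter", "Height"]))
print(variable_labels(["A1", "3.5", "7"]))
print(variable_labels(['"35"', '"36"']))
```

The first and third calls report a label line; the second reports a data line with labels V1, V2, V3.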

 

The first data line is used to decide whether a given value is a character variable or a numeric variable and, if there is no label line, it is also used to determine the number of variables in a case. All the data for a single case must be on a single line, but the lines can be arbitrarily long. Missing numeric values are not allowed; you must provide a value for each numeric variable. The character variables are ignored except that you may choose one as a case label. (The case labels can be arbitrarily long, though they will be truncated by the programs for printing.)
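Before running one of the programs, it can help to check a file against these rules. The Python sketch below is one way to do that, not a description of what the programs do internally. The names is_numeric and check_cases are hypothetical, the rows are assumed to be already split into lists of strings with any label line removed, and "numeric" is again assumed to mean anything float() accepts.

```python
def is_numeric(token):
    """Assumption: a value is numeric if Python's float() can parse it."""
    try:
        float(token)
        return True
    except ValueError:
        return False


def check_cases(rows):
    """Classify columns from the first data line, then validate every case.

    Returns (numeric_columns, problems); columns are 0-based indexes and
    each problem is a (case_number, message) pair with 1-based case numbers.
    """
    first = rows[0]
    numeric_columns = [k for k, v in enumerate(first) if is_numeric(v)]
    problems = []
    for case_number, row in enumerate(rows, start=1):
        if len(row) != len(first):
            problems.append((case_number, "wrong number of values"))
            continue
        for k in numeric_columns:
            if not is_numeric(row[k]):
                problems.append(
                    (case_number, f"missing or non-numeric value in column {k + 1}")
                )
    return numeric_columns, problems


rows = [["s1", "1", "2"], ["s2", "", "4"], ["s3", "5"]]
numeric_columns, problems = check_cases(rows)
print(numeric_columns)  # [1, 2]
for problem in problems:
    print(problem)
```

Here the second case is flagged for an empty numeric value and the third for having too few values.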

 

These versions will take problems of basically unlimited size. Otherwise the programs work the same as the earlier, documented ones. They can, however, produce .plt files that are too big for the corresponding plot programs to process. A few colleagues and I have done only limited testing, but the programs do seem to work. There is also a program (adf2csv.exe) to convert .adf, .arl, and .acl files to the CSV format. Unless you see a special need, you can keep using the other programs, almost all of which have been revised to run very large problems.


All new purchases (and indeed all purchases since some time in 2004) have included these programs. Registered users with earlier purchase dates can obtain these revised programs by purchasing an update.



Page Last Updated - 30-Sep-2009