Tools for Quantitative Archaeology
BINOMIAL: Compute Binomial Probabilities
Binomial calculates binomial probabilities and finds an interval of population parameters around the sample proportion. All input is from the keyboard and output is to the screen.
PROGRAM EXAMPLE
[B]inomial Probabilities, [E]stimate, [Q]uit {B} ?
The program has two sections. In the first, straightforward [B]inomial probabilities are computed. Although these probabilities have wide application in archaeology, they are sometimes difficult to compute and are thus not used when they should be. In the second, a population parameter interval is [E]stimated around a sample proportion. Results of the first type of computation should require little explanation.
Binomial Probability: P=C(n,k)*p^k*(1-p)^(n-k)
Probability P of exactly k successes in n independent trials
where the probability of success in each trial is p

Number of trials, n {0} ? 25
Number of successes, k {0} ? 4
Probability of success, p {0.0000} ? .1

Binomial Probability (k=4; n=25; p=0.1000)
    k |     P(k)    P(<=k)    P(>=k)
---------------------------------------
    0 | 0.071790  0.071790  1.000000
    1 | 0.199416  0.271206  0.928210
    2 | 0.265888  0.537094  0.728794
    3 | 0.226497  0.763591  0.462906
>   4 | 0.138415  0.902006  0.236409
    5 | 0.064594  0.966600  0.097994
  ...
   24 | 0.000000  1.000000  0.000000
   25 | 0.000000  1.000000  0.000000

Binomial Summary (k=4; n=25; p=0.1000)
P(k)=0.138415  P(1-tailed)=0.236409  P(2-tailed)=0.308198
P(<=k)=P(0)+P(1)+...+P(k) = 0.902006
P(>=k)=P(k)+P(k+1)+...+P(n) = 0.236409
That is, the probability of getting exactly 4 successes out of 25, given a population proportion of .1, is .14. The probability of getting 4 or fewer in a sample of 25 is .90; the probability of getting 4 or more is .24. (The latter two do not sum to 1 because P(4) is included in each sum.)
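For readers who want to check these figures outside the program, the following is a minimal Python sketch of the same calculation. It is not the BINOMIAL program's code; the function names binom_pmf and binom_summary are made up for this illustration. The two-tailed value is computed here as the sum of the probabilities of all outcomes no more likely than the observed k. That rule is an assumption: it is consistent with the 0.308198 shown above, but the program's own definition is not stated.

```python
from math import comb

def binom_pmf(j, n, p):
    """P of exactly j successes in n independent trials, each with success probability p."""
    return comb(n, j) * p**j * (1 - p)**(n - j)

def binom_summary(k, n, p):
    """Return (P(k), P(<=k), P(>=k), two-tailed P) for the given k, n, p."""
    pmf = [binom_pmf(j, n, p) for j in range(n + 1)]
    p_k = pmf[k]
    left = sum(pmf[:k + 1])   # P(0) + P(1) + ... + P(k)
    right = sum(pmf[k:])      # P(k) + P(k+1) + ... + P(n)
    # ASSUMPTION: two-tailed = total probability of outcomes no more likely than k.
    # The small factor only guards against floating-point ties.
    two_tailed = sum(q for q in pmf if q <= p_k * (1 + 1e-12))
    return p_k, left, right, two_tailed

p_k, left, right, two_tailed = binom_summary(4, 25, 0.1)
print(f"P(k)={p_k:.6f}  P(<=k)={left:.6f}  P(>=k)={right:.6f}  P(2-tailed)={two_tailed:.6f}")
```

The left and right sums both include P(k), which is why they add up to more than 1.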
Compute Another Probability {Y} ?
[B]inomial Probabilities, [E]stimate, [Q]uit {B} ? E

Sample Size, n {0} ? 100
Number Observed In Sample, k {0} ? 4
Number of Population Proportions to Examine {100} ? 500
Display Tail Probs for Each Proportion Calculated {N} ? Y

   Prop  LTail  RTail...
|  0.000  1.000  0.000
|  0.002  1.000  0.000
|  0.004  1.000  0.001
|  0.006  1.000  0.003
|  0.008  0.999  0.009
|  0.010  0.997  0.018
|  0.012  0.993  0.033
|  0.014  0.986  0.052
|  0.016  0.977  0.077
|  0.018  0.965  0.107
|  0.020  0.949  0.141
|  0.022  0.930  0.179
...
|  0.104  0.018  0.994
|  0.106  0.015  0.995
|  0.108  0.013  0.996
|  0.110  0.011  0.997
|  0.112  0.010  0.997
|  0.114  0.008  0.998
|  0.116  0.007  0.998
|  0.118  0.006  0.998
|  0.120  0.005  0.999
|  0.122  0.004  0.999
|  0.124  0.004  0.999
|  0.126  0.003  0.999

Best Estimate of Population Proportion, p-hat = 4/100 = 0.0400

 Prob in    Min p s.t.   Prob in   Max p s.t.   Prob in
Each Tail  RTail<=Prob    RTail   LTail<=Prob    LTail
  0.001       0.004      (0.001)     0.142      (0.001)
  0.005       0.006      (0.003)     0.122      (0.004)
  0.010       0.008      (0.009)     0.112      (0.010)
  0.025       0.010      (0.018)     0.100      (0.024)
  0.050       0.012      (0.033)     0.090      (0.047)
  0.100       0.016      (0.077)     0.080      (0.090)
  0.200       0.022      (0.179)     0.068      (0.182)
  0.250       0.024      (0.220)     0.064      (0.226)
  0.500       0.036      (0.487)     0.048      (0.473)
  0.750       0.050      (0.742)     0.034      (0.746)
  0.800       0.054      (0.795)     0.032      (0.783)
  0.900       0.064      (0.889)     0.026      (0.880)
  0.950       0.074      (0.944)     0.020      (0.949)
  0.975       0.084      (0.973)     0.018      (0.965)
  0.990       0.096      (0.989)     0.014      (0.986)
  0.995       0.104      (0.994)     0.012      (0.993)
  0.999       0.124      (0.999)     0.008      (0.999)
p=p-hat=>     0.040      (0.571)     0.040      (0.629)

Perform Another Estimate {Y} ?
Here the results are a bit less obvious. The problem is this: given an observed sample proportion of 4 out of 100, or 4%, what can we say about the population from which it was drawn? This question is examined by calculating a confidence interval. (However, see the discussion below for what is generally a better approach.)
One way to do this is to calculate binomial probabilities for different population proportions at set intervals. For example, if 100 population proportions are examined, then population proportions of 0.0, .01, .02, ... .99, 1.0 are examined.
The first part of the output lists, for each proportion considered, the probability of getting k or fewer successes (left tail) and k or more successes (right tail) out of n trials. Let us first consider a confidence interval at the .98 level. Looking through the list, you can see that an actual proportion of .008 will produce 4 or more observed successes (the right tail) about 1% (.009) of the time. (Any higher proportion will produce 4 or more successes more than 1% of the time.) Looking further on, we can see that a proportion of .112 will produce 4 or fewer successes about 1% (.010) of the time. Thus, the range from about .008 to .112 seems a reasonable candidate for a 98% confidence interval (with 1% in each tail) around the observed p-hat of .04.
The remainder of the output tabulates information so that you can find a number of reasonable confidence intervals around p-hat with approximately equal probabilities in each tail (assuming 0<k<n). Thus, if we look for .05 in each tail (a 90% interval), the probability that an actual population with a .012 parameter would have a sample proportion as high as or higher than the observed .04 is .033, the value in the right tail. Similarly, the probability that an actual population with a .09 parameter would have a sample proportion as low as or lower than .04 is .047, the value in the left tail. Thus, only about 5% (3.3%) of the samples drawn from a population with a value of .012 have sample values as high as .04; about 5% (4.7%) of those drawn from a population with a value of .09 have sample values as low as .04. The difference between the probability level (here 0.05) and the tail probabilities arises because only a restricted set of values is tested. If greater precision is required, increase the number of population proportions examined to 1000 or more.
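The scan described above can be sketched in a few lines of Python. This is an illustration, not the program's source: the names tails and scan_interval are hypothetical, and the selection rule is inferred from the table. The lower bound is taken as the largest examined proportion whose right tail is at or below the chosen probability, and the upper bound as the smallest examined proportion whose left tail is at or below it.

```python
from math import comb

def tails(k, n, p):
    """Return (LTail, RTail) = (P(X <= k), P(X >= k)) for X ~ Binomial(n, p)."""
    pmf = [comb(n, j) * p**j * (1 - p)**(n - j) for j in range(n + 1)]
    return sum(pmf[:k + 1]), sum(pmf[k:])

def scan_interval(k, n, steps, tail_prob):
    """Scan proportions 0, 1/steps, ..., 1 and return (lower, upper) bounds.

    ASSUMPTION (inferred from the tabulated output, not from the program code):
    lower = largest p with RTail <= tail_prob,
    upper = smallest p with LTail <= tail_prob.
    Intended for 0 < k < n.
    """
    lower, upper = None, None
    for i in range(steps + 1):
        p = i / steps
        left, right = tails(k, n, p)
        if right <= tail_prob:
            lower = p   # RTail grows with p, so the last hit is the largest
        if upper is None and left <= tail_prob:
            upper = p   # LTail shrinks with p, so the first hit is the smallest
    return lower, upper

# k=4 observed in n=100, examining 500 proportions (step .002)
for tail_prob in (0.01, 0.05):
    print(tail_prob, scan_interval(4, 100, 500, tail_prob))
```

With .05 in each tail this picks .012 and .09, and with .01 in each tail it picks .008 and .112, in line with the corresponding rows of the table. Raising steps narrows the gap between the requested tail probability and the one actually achieved.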
Perform Another Estimate {Y} ?
The analysis can be repeated for other parameters; otherwise the program ends.
However, this is an indirect, and generally not the best, way to compute a confidence interval for a binomial distribution. The BAYES program using a flat prior will provide the appropriate confidence intervals for proportions. For the same problem (k=4, n=100, resolution=.002) the 90% confidence interval is .014-.084.
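How the BAYES program arrives at its interval is not described here. Purely as an illustration of the flat-prior idea, the sketch below evaluates the posterior on a grid of proportions and keeps the most probable grid points until they cover the requested mass. The function name flat_prior_interval and the highest-density rule are assumptions of this sketch, not a description of BAYES.

```python
def flat_prior_interval(k, n, steps, mass):
    """Grid approximation of a flat-prior posterior interval for a proportion.

    ASSUMPTION: the interval is the highest-density set of grid points covering
    at least `mass` of the posterior. The BAYES program may use a different rule.
    """
    grid = [i / steps for i in range(steps + 1)]
    # With a flat prior the posterior is proportional to the binomial likelihood.
    like = [p**k * (1 - p)**(n - k) for p in grid]
    total = sum(like)
    post = [v / total for v in like]
    # Take grid points from most to least probable until enough mass is covered.
    order = sorted(range(len(grid)), key=lambda i: post[i], reverse=True)
    chosen, covered = [], 0.0
    for i in order:
        chosen.append(i)
        covered += post[i]
        if covered >= mass:
            break
    return grid[min(chosen)], grid[max(chosen)]

# k=4, n=100, resolution .002 (500 steps), 90% interval
print(flat_prior_interval(4, 100, 500, 0.90))
```

Because the interval rule here is a guess, the bounds it prints need not match the .014-.084 quoted above exactly.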
Page Last Updated - 02-Jun-2007