IES 612/STA 4-573/STA 4-576
Spring 2005
Week 8 – IES612-week08-lecture.doc
Design Issues (Ch 14 text)
“Design of experiment” = “process of establishing a framework through which the comparison of treatments or groups can be made in terms of recorded response” (OL 14, p. 829)
Note: a balance must be maintained between control of conditions and depiction of reality (“ecological validity”). When can you use the lab, and when must you work in the field?
Study types:
1. Observational – factors are not manipulated; you sample from populations where the factors (trts) are already present and compare the populations with respect to some response (e.g. samples, polls, surveys, epidemiological studies)
2. Experimental – randomly assign subjects to treatment conditions and observe the response of interest
Examples
* Community study where elderly receive care using either a consumer-directed system or a traditional case manager system.
* Perch growth in the presence of goby and/or zebra mussels
* Species presence as a function of stream characteristics
* Web weight as a function of temperature and species of spider
* Contaminant level in effluent as a function of temperature and pressure
Research Plan ingredients (OL 14, p. 831)
1. Objectives
2. Study Factors
3. Extraneous Factors
4. Response
5. Randomization method
6. Protocol for recording responses
7. Replications needed
8. Resources
Principles to consider when designing an experiment
1. Randomization – create groups as similar as possible prior to an experiment
2. Control – comparison group (concurrently conducted with the study)
3. Replication – how many experimental units? Sensitivity to detect important differences.
How might you do a RANDOMIZATION?
Suppose we wanted to randomly assign 12 experimental units (here pieces of meat) to one of four packaging conditions with 3 units assigned to each condition.
Step 1: Assign a unique number (label) to each experimental unit – say “1” to “12” (=nT)
Step 2: Randomly permute the labels.
Step 3: Assign the units corresponding to the first n1=3 permuted labels to group 1, the next n2=3 permuted labels to group 2, etc.
Let’s make this concrete …
ods rtf;
proc plan;
  title "generate randomization/allocation scheme for 12 steaks";
  factors meat=12;
run;
ods rtf close;
Factor   Select   Levels   Order
meat         12       12   Random

11   9   12   1   7   8   3   6   4   5   10   2
So,
Condition 1 = Steaks 11, 9, 12; Condition 2 = Steaks 1, 7, 8
Condition 3 = Steaks 3, 6, 4; Condition 4 = Steaks 5, 10, 2
meat                   11   9  12  |   1   7   8  |   3   6   4  |   5  10   2
Packaging Condition        1       |       2      |       3      |       4
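Steps 1 to 3 can also be carried out directly in a DATA step. The following is a minimal sketch rather than the method used in these notes: the data set name alloc, the helper variable u, and the seed 612 are illustrative assumptions.

```sas
/* Sketch: randomly allocate 12 steaks to 4 packaging conditions, 3 per condition */
data alloc;
  call streaminit(612);          /* arbitrary seed (assumption) so the run is repeatable */
  do meat = 1 to 12;             /* Step 1: label the experimental units 1..12 */
    u = rand("uniform");         /* random sort key attached to each label */
    output;
  end;
run;

proc sort data=alloc;            /* Step 2: sorting by u randomly permutes the labels */
  by u;
run;

data alloc;
  set alloc;
  condition = ceil(_n_ / 3);     /* Step 3: first 3 rows -> 1, next 3 -> 2, and so on */
  drop u;
run;

proc print data=alloc noobs;
run;
```

Changing the divisor in ceil(_n_ / 3) changes the group size; unequal group sizes would need an explicit lookup instead.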
As an aside, you can also use this to generate a random sample (which is essentially the first “n” of “N” permuted labels).
ods rtf;
proc plan;
  title "generate random sample of 10 from 40 in sampling frame";
  factors n=10 of 40;
run;
ods rtf close;
The SAS log noted: “NOTE: At the start of processing, random number seed=581671001.” (in case you wanted to replicate this stream of random numbers).
n
29   18   11   37   2   5   19   39   26   33
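To regenerate a sample like this one, the seed reported in the log can be supplied through the SEED= option of PROC PLAN. This is a sketch under the assumption that the same SAS release and platform are used; it should then reproduce the same stream of random numbers, but that has not been checked here.

```sas
/* Sketch: fix the seed so the random sample of 10 from 40 can be regenerated */
proc plan seed=581671001;
  factors n=10 of 40;
run;
```

Recording the seed alongside the allocation is a simple way to make a randomization auditable.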
* You can also use random number tables or other devices to do a randomization.
Why have control groups? Does control group = untreated group?
Guaranteed treatment for the common cold – I call it “chicken soup” – You will be cured after 3 days or your money back! Justification for guarantee? I did a study where I gave soup to 25 people with colds and they all felt better when I asked them 3 days later. Reaction?
Lots of types of control groups:
1. Untreated
2. Placebo
3. Sham (often in surgery or neuroanatomy studies)
4. Standard treatment (may not be ethical to have an untreated group)
5. Vehicle (sometimes you have to give a treatment in some medium)
6. Historical (can be problematic)
What does “blinding” mean in an experiment?
Single-blind study (subject doesn’t know treatment)
Double-blind study (subject & physician/experimenter don’t know treatment)
Triple-blind study (subject, experimenter & analyst don’t know treatment)
[code is broken after completion of the study]
Treatment Structure
Factor = manipulation/population of interest (analogous to independent variable in regression)
Level = unique value of factor
Treatment = (single factor study) level of factor
Treatment = (multiple factor study) unique combination of factor levels
Factor = packaging condition
Levels = vacuum, mixed, CO2, plastic
Packaging Condition (Factor)   vacuum   mixed   CO2   plastic
Treatment                           1       2     3         4
Factor A = goby (levels = present/absent)
Factor B = Zebra mussel (levels = present/absent)
                     B (Zebra mussel present)   B (Zebra mussel absent)
A (goby present)                 1                          2
A (goby absent)                  3                          4
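With several factors, the treatments are simply all combinations of factor levels, which can be enumerated with nested loops. A minimal sketch; the data set name trts and the variable names goby, zebra and treatment are illustrative assumptions.

```sas
/* Sketch: enumerate the 2 x 2 = 4 treatments of the goby / zebra mussel study */
data trts;
  length goby zebra $ 7;
  do goby = "present", "absent";       /* levels of Factor A */
    do zebra = "present", "absent";    /* levels of Factor B */
      treatment + 1;                   /* sum statement: running treatment number 1..4 */
      output;
    end;
  end;
run;

proc print data=trts noobs;
run;
```

The loop order gives the same numbering as the table: treatment 1 is both present, treatment 4 is both absent.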
Experimental Units (EU) and Measurement Unit (MU)
Experimental Unit (EU) = entity to which a treatment is randomly assigned, or which is randomly sampled from one of the “treatment” populations [OL 14, p. 833]
Measurement Unit (MU) = entity on which a measurement is taken
Example: Meat packaging study: EU = MU = piece of meat
Example: Teratology study: EU=dam/litter; MU=pup
NOTE: In the ecology literature, MUs are sometimes called “pseudoreplicates” when the MU is not the same as the EU.
Experimental Error = variation among EUs assigned to the same treatment and observed under the “same” experimental conditions
Sources of Experimental Error?
1. natural differences in EUs
2. variation in devices that record the MUs
3. variation in the treatment conditions
4. effects of extraneous factors
Controlling Experimental Error? OL 14.4
1. Procedures for conducting study standardized (“local control” of Kuehl) – train data collectors/lab technicians, standard protocol for recording data and conducting experiment, etc.
2. Choice of EU/MU (e.g. same age/size class, same level of disability, etc.) – randomly select from the population and then randomly assign treatments; select EUs that are similar (although if they are too similar, generalizability may be questioned) – e.g. transgenic rodents (knockout mice)
3. Measurement procedure
4. Blocking (a “design” control – before conducting study) – EUs placed in groups
5. Covariates (an “analysis” control – possible after study conducted)
Blocking Designs
* A blocking design imposes a CONSTRAINT on the randomization of experimental units to treatments
* EUs are placed in groups (BLOCK) that are similar with respect to some important characteristic that may affect the response.
* EUs are randomly assigned to treatments WITHIN each group/block
* Some criteria for determining blocks (OL 14, p. 839)
i. physical characteristic (e.g. age, weight, size class, sex, health, education)
ii. related units (e.g. twins, animals from the same litter)
iii. spatial location (e.g. neighboring plots of land, oven, table, rack)
iv. time (e.g. day of week, time of day)
v. person conducting study (e.g. technician, operator, rater)
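Randomizing WITHIN each block can be sketched with PROC PLAN by nesting the treatment factor inside an ordered block factor. The numbers of blocks (3) and treatments (4), the factor names block and trt, and the seed are illustrative assumptions, not values from a study in these notes.

```sas
/* Sketch: randomized complete block design, 4 treatments in each of 3 blocks */
proc plan seed=20050308;           /* arbitrary seed (assumption) for repeatability */
  factors block=3 ordered          /* blocks taken in order 1, 2, 3 */
          trt=4;                   /* a fresh random order of treatments 1..4 within each block */
run;
```

Each output row is one block; the position within the row identifies the EU in that block and the value is the treatment it receives.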
Using a covariate
* A covariate is a variable related to the response variable (you might have blocked on this variable as an alternative to adjusting for it in the analysis)
* e.g. DEPTH could be used as a covariate in an analysis comparing two lakes with respect to dissolved oxygen.
* This is essentially a REGRESSION problem where the TREATMENT could be represented by one or more indicator variables and the COVARIATE is some numeric variable.
* It is common to assume that the relationship between the response and the covariate is the same in all treatment groups; however, this should be tested. Basically, the ANCOVA tests for whether intercepts differ between treatments (assuming equal slopes with the covariates).
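* Basic Model (3 level treatment and assuming a parallel response):
$$Y = \beta_0 + \beta_1 x + \beta_2 I_2 + \beta_3 I_3 + \varepsilon$$
where $x$ is the covariate and $I_2$, $I_3$ are indicator variables for treatments 2 and 3 (treatment 1 is the reference level).
To test group equality ADJUSTING for the covariate, test $H_0: \beta_2 = \beta_3 = 0$.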
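* Model to test the equal slope assumption:
$$Y = \beta_0 + \beta_1 x + \beta_2 I_2 + \beta_3 I_3 + \beta_4 I_2 x + \beta_5 I_3 x + \varepsilon$$
where $x$ is the covariate and $I_2$, $I_3$ are indicator variables for treatments 2 and 3.
To test the equal slope assumption, test $H_0: \beta_4 = \beta_5 = 0$.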
* Could you do an ANCOVA if there was a nonlinear relationship with the covariate?
Example: Does dissolved oxygen at the high human activity site “Tahoe Keys” differ from that at the low human activity site “
data lake;
  input obs depth do lakeid $;
  logDO = log10(do);   * analyze dissolved oxygen on the log10 scale;
  datalines;
1 0 10.40 E
2 1 7.50 E
3 2 6.60 E
4 3 6.10 E
5 4 5.70 E
6 5 5.40 E
7 6 5.10 E
8 11 2.90 E
9 16 2.00 E
10 21 1.20 E
11 26 1.00 E
12 0 9.26 T
13 1 7.63 T
14 2 5.05 T
15 3 2.52 T
16 4 1.95 T
17 5 1.47 T
;
ods html;
* plot log(DO) against depth with a separate symbol for each site;
proc gplot data=lake;
  plot logDO*depth=lakeid;
run;
* ANCOVA with separate slopes: depth*lakeid tests the equal slope assumption;
proc glm;
  class lakeid;
  model logDO = depth lakeid depth*lakeid;
run;
ods html close;

The SAS System
The GLM Procedure

Class Level Information
Class     Levels   Values
lakeid         2   E T

Number of Observations Read   17
Number of Observations Used   17

The GLM Procedure
Dependent Variable: logDO

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              3       1.63564934    0.54521645    212.92   <.0001
Error             13       0.03328897    0.00256069
Corrected Total   16       1.66893830

R-Square   Coeff Var   Root MSE   logDO Mean
0.980054    8.648742   0.050603     0.585094

Source         DF   Type III SS   Mean Square   F Value   Pr > F
depth           1    0.76620168    0.76620168    299.22   <.0001
lakeid          1    0.00897392    0.00897392      3.50   0.0839
depth*lakeid    1    0.31441717    0.31441717    122.79   <.0001
So, there is evidence (depth*lakeid: F = 122.79, p < .0001) that the relationship between log(DO) and depth differs between Tahoe Keys and
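Because the slopes appear to differ, it can help to look at the fitted intercept and slope for each site directly. One way to sketch this, assuming the data set lake defined above is still available, is to nest depth within lakeid and drop the common intercept; this reparameterizes the same separate-slopes model rather than fitting a new one.

```sas
/* Sketch: separate intercept and slope for each site (same fit as depth lakeid depth*lakeid) */
proc glm data=lake;
  class lakeid;
  model logDO = lakeid depth(lakeid) / noint solution;   /* lakeid: intercepts; depth(lakeid): slopes */
run;
```

With NOINT the overall F test and R-square are computed about zero rather than about the mean, so only the parameter estimates should be read from this output.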
An alternative analysis using PROC REG …
data lake;
  input obs depth do lakeid $ @@;
  iTK = (lakeid="T"); * defining the indicator variable;
  logDO = log10(do);
  iTK_depth = iTK*depth;
  datalines;
1 0 10.40 E   2 1 7.50 E   3 2 6.60 E
4 3 6.10 E   5 4 5.70 E   6 5 5.40 E
7 6 5.10 E   8 11 2.90 E   9 16 2.00 E
10 21 1.20 E   11 26 1.00 E   12 0 9.26 T
13 1 7.63 T   14 2 5.05 T   15 3 2.52 T
16 4 1.95 T   17 5 1.47 T
;
proc reg data=lake;
  model logDO = depth iTK iTK_depth;
  test iTK=0, iTK_depth=0;
run;
How many observations should be on test? Determining replicates
Warning: don’t confuse the determination of a replicate with the replication of an entire experiment. This is the cause of much confusion.
You could specify the margin of error for estimating a particular population mean … (OL 14.6)
Goal: A (specified) margin of error of “E” for estimating $\mu_i$
[assuming equal replication r = n1 = n2 = … = nt desired]
$$r = \frac{(z_{\alpha/2})^2\,\hat{\sigma}^2}{E^2}$$
would be the sample size required to be $100(1-\alpha)\%$ confident that the estimator is within “E” units of the treatment mean $\mu_i$
$\alpha$, E = desired accuracy (specified by experimenter)
Estimated standard deviation?
Sources: Pilot study; Similar past experiments (literature); range/4?
Example: Meat Packaging study (revisited)
Sample size required to estimate the mean log(bacterial count) with a margin of error = 0.20 and confidence level 95%. Use the MSE=0.12 from this study as an estimate of the variance.
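$$r = \frac{(z_{0.025})^2\,\hat{\sigma}^2}{E^2} = \frac{(1.96)^2 (0.12)}{(0.20)^2} \approx 11.5$$
so r = 12 pieces of meat per packaging condition would be needed.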
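The same arithmetic can be sketched in a DATA step, assuming the usual normal-approximation formula r = z² · variance / E² with z the upper α/2 standard normal quantile. The variable names are illustrative.

```sas
/* Sketch: replicates per treatment for a target margin of error */
data _null_;
  alpha  = 0.05;                      /* 95% confidence */
  E      = 0.20;                      /* desired margin of error */
  sigma2 = 0.12;                      /* MSE from the meat study, used as the variance estimate */
  z       = probit(1 - alpha/2);      /* standard normal quantile, about 1.96 */
  r_exact = z**2 * sigma2 / E**2;     /* about 11.5 */
  r       = ceil(r_exact);            /* round up to a whole number of replicates: 12 */
  put z= r_exact= r=;
run;
```

Rounding up rather than to the nearest integer keeps the achieved margin of error at or below the target.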
You could specify the desired power to detect a specified difference among the means (OL p. 846)
[assuming equal replication r = n1 = n2 = … = nt desired]
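H0: $\mu_1 = \mu_2 = \mu_3 = \dots = \mu_t$
Ha: $\mu_i \ne \mu_j$ for some $i \ne j$ [at least two population means differ]
Need to specify
1. $\alpha$, significance level
2. $D$ = pairwise difference that represents an important difference = $|\mu_i - \mu_j|$
or $\sum_i (\mu_i - \mu_{\cdot})^2$, which can be used to determine
$$\phi = \sqrt{\frac{r \sum_i (\mu_i - \mu_{\cdot})^2}{t\,\sigma^2}} \qquad \left(\text{or } \phi = \sqrt{\frac{r D^2}{2\,t\,\sigma^2}} \text{ when only } D \text{ is specified}\right)$$
3. Power = $1 - \beta$
4. Variance $\sigma^2$
Table 14 in the Appendix has power calculations indexed by $\phi$.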
Example: Meat Packaging study (revisited)
Sample size required to detect a “0.5” difference in log(bacterial counts) with a power of say 0.8. Use Type I error rate of 5% and the MSE=0.12 from this study as an estimate of the variance.
 r    $\lambda$   $\phi$   $\nu_2$ = 4(r-1) = nT-4   $\lambda$ *   $\phi$ *   Power (from App. 14)
 5     5.21       1.14            16                  7.81       1.40
 6     6.25       1.25            20                  9.38       1.53
 7     7.29       1.35            24                 10.94       1.65
 8     8.33       1.44            28                 12.50       1.77      ~0.81
 9     9.38       1.53            32                 14.06       1.88      ~0.84
10    10.42       1.61            36                 15.63       1.98
11    11.46       1.69            40                 17.19       2.07
12    12.50       1.77            44                 18.75       2.17
13    13.54       1.84            48                 20.31       2.25      ~0.96
Here $\lambda = r D^2 / (2\sigma^2)$ and $\phi = \sqrt{\lambda / t}$, with $D = 0.5$, $\sigma^2 = 0.12$ and $t = 4$.
* assuming $\mu_1 = \mu_2 = \mu_3 = 3.0$ and $\mu_4 = 3.5$, so that $\lambda = r \sum_i (\mu_i - \mu_{\cdot})^2 / \sigma^2$
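The table lookup can be cross-checked numerically with the noncentral F distribution. The sketch below assumes the scenario with means 3.0, 3.0, 3.0, 3.5 and variance 0.12, and takes power to be the probability that a noncentral F with noncentrality r·Σ(μi − μ.)²/σ² exceeds the α = 0.05 critical value; the data set and variable names are illustrative.

```sas
/* Sketch: power of the one-way ANOVA F test for r = 5..13 replicates per treatment */
data powercheck;
  t      = 4;                                        /* number of treatments */
  sigma2 = 0.12;                                     /* variance estimate (MSE) */
  alpha  = 0.05;
  mubar  = (3.0 + 3.0 + 3.0 + 3.5) / 4;              /* mean of the treatment means */
  ssmu   = 3*(3.0 - mubar)**2 + (3.5 - mubar)**2;    /* sum of squared deviations = 0.1875 */
  do r = 5 to 13;
    nu1    = t - 1;                                  /* numerator df */
    nu2    = t*(r - 1);                              /* denominator df */
    lambda = r*ssmu/sigma2;                          /* noncentrality parameter */
    phi    = sqrt(lambda/t);                         /* index used by the power table */
    fcrit  = finv(1 - alpha, nu1, nu2);              /* central F critical value */
    power  = 1 - probf(fcrit, nu1, nu2, lambda);     /* P(noncentral F > fcrit) */
    output;
  end;
  keep r nu2 lambda phi power;
run;

proc print data=powercheck noobs;
  format lambda phi 6.2 power 6.3;
run;
```

The printed power values should be close to the approximate table entries; small differences are expected because the table is read by eye.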
This calculation can also be done using software. For example, SAS has a web-based power and sample size application, as well as two sample size procedures: PROC POWER and PROC GLMPOWER.
Example: Using SAS PROC GLMPOWER to calculate required sample size to test for group equality among four packaging conditions.
data meats;
  input condition $ logbact CellWgt;
  datalines;
Plastic 3.5 1
Mixed 3 1
CO2 3 1
Vacuum 3 1
;
proc glmpower data=meats;
  class condition;
  model logbact = condition;
  weight CellWgt;
  power
    stddev = .3464
    alpha  = 0.05
    ntotal = .
    power  = 0.8;
run;
The GLMPOWER Procedure

Fixed Scenario Elements
Dependent Variable          logbact
Source                      condition
Weight Variable             CellWgt
Alpha                       0.05
Error Standard Deviation    0.3464
Nominal Power               0.8
Test Degrees of Freedom     3

Computed N Total
Error DF   Actual Power   N Total
      32          0.854        36
Example: Using SAS PROC POWER to calculate required sample size to test for group equality among four packaging conditions.
ods html;
proc power;
  onewayanova
    npergroup  = .
    power      = 0.80
    stddev     = 0.3464
    groupmeans = 3 | 3 | 3 | 3.5;
  plot x=power min=.5 max=.95;
run;
ods html close;
The POWER Procedure
Overall F Test for One-Way ANOVA

Fixed Scenario Elements
Method                Exact
Group Means           3 3 3 3.5
Standard Deviation    0.3464
Nominal Power         0.8
Alpha                 0.05

Computed N Per Group
Actual Power   N Per Group
       0.854             9
An alternative perspective where you look at power achieved by different sample sizes
ods html;
proc power;
  onewayanova
    npergroup  = 5 to 15 by 1
    power      = .
    stddev     = 0.3464
    groupmeans = 3 | 3 | 3 | 3.5;
  plot x=n min=5 max=15;
run;
ods html close;
The POWER Procedure
Overall F Test for One-Way ANOVA

Fixed Scenario Elements
Method                Exact
Group Means           3 3 3 3.5
Standard Deviation    0.3464
Alpha                 0.05

Computed Power
Index   N Per Group   Power
    1             5   0.528
    2             6   0.637
    3             7   0.727
    4             8   0.798
    5             9   0.854
    6            10   0.896
    7            11   0.927
    8            12   0.950
    9            13   0.966
   10            14   0.977
   11            15   0.984
