IES 612/STA 4-573/STA 4-576

Spring 2005

 

Week 8 – IES612-week08-lecture.doc

 

Design Issues (Ch 14 text)

 

“Design of experiment” = “process of establishing a framework through which the comparison of treatments or groups can be made in terms of recorded response” (OL 14, p. 829)

 

Note:  balance between the control of condition and depiction of reality must be maintained – “ecological validity” – when can you use the lab vs. when must you work in the field?

 

Study types:

1.  Observational – factors not manipulated, sampling from populations where factors (trts) already present and want to compare populations with respect to some response (e.g. samples, polls, surveys, epi. studies)

 

2. Experimental – randomly assign subjects to treatment conditions and observe the response of interest

 

Examples

 

* Community study where elderly receive care using either a consumer-directed system or a traditional case manager system.

 

*  Perch growth in the presence of goby and/or zebra mussels

 

*  Species presence as a function of stream characteristics

 

*  Web weight as a function of temperature and species of spider

 

* Contaminant level in effluent as a function of temperature and pressure

 

Research Plan ingredients (OL 14, p. 831)

 

1.  Objectives

2.  Study Factors

3.  Extraneous Factors

4.  Response

5.  Randomization method

6.  Protocol for recording responses

7.  Replications needed

8.  Resources

 

 

Principles to consider when designing an experiment

 

1.         Randomization – create groups as similar as possible prior to an experiment

2.         Control – comparison group (concurrently conducted with the study)

3.         Replication – how many experimental units?  Sensitivity to detect important differences.

 

How might you do a RANDOMIZATION?

 

Suppose we wanted to randomly assign 12 experimental units (here pieces of meat) to one of four packaging conditions with 3 units assigned to each condition.

 

Step 1:  Assign a unique number (label) to each experimental unit – say “1” to “12” (=nT)

 

Step 2:  Randomly permute the labels.

 

Step 3:  Assign the units corresponding to the first n1=3 permuted labels to group 1, the next n2=3 permuted labels to group 2, etc.

 

Let’s make this concrete …

 

ods rtf;

proc plan;

title "generate randomization/allocation scheme for 12 steaks";

  factors meat=12;

  run;

ods rtf close;

 

Factor    Select    Levels    Order
meat      12        12        Random

meat:   11   9   12   1   7   8   3   6   4   5   10   2

So,

Condition 1 = Steaks 11, 9, 12;  Condition 2 = Steaks 1, 7, 8

Condition 3 = Steaks 3, 6, 4;      Condition 4 = Steaks 5, 10, 2

 

meat                   11  9  12  |  1  7  8  |  3  6  4  |  5  10  2
Packaging Condition        1      |     2     |     3     |      4
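The three randomization steps can also be sketched outside of SAS. The following is a minimal Python sketch, not part of the original notes: the function names `chunk` and `assign_groups` and the seed value 2005 are illustrative assumptions. It permutes the labels 1 to 12 and cuts the permuted list into consecutive groups of 3; applying the same cut to the permutation reported by PROC PLAN recovers the allocation listed above.

```python
import random

def chunk(labels, group_size):
    """Split a list of labels into consecutive groups of equal size."""
    return [labels[i:i + group_size] for i in range(0, len(labels), group_size)]

def assign_groups(n_units, n_groups, seed=None):
    """Randomly permute labels 1..n_units and cut them into n_groups equal groups."""
    if n_units % n_groups != 0:
        raise ValueError("n_units must be a multiple of n_groups")
    labels = list(range(1, n_units + 1))       # Step 1: label the units
    rng = random.Random(seed)                  # optional seed, for reproducibility
    rng.shuffle(labels)                        # Step 2: random permutation
    return chunk(labels, n_units // n_groups)  # Step 3: consecutive runs of labels

# Permutation reported by PROC PLAN in the notes
sas_permutation = [11, 9, 12, 1, 7, 8, 3, 6, 4, 5, 10, 2]
sas_allocation = chunk(sas_permutation, 3)
for condition, steaks in enumerate(sas_allocation, start=1):
    print(f"Condition {condition}: steaks {steaks}")

# A fresh allocation; the seed is an arbitrary illustrative choice
new_allocation = assign_groups(12, 4, seed=2005)
```

A Python seed does not reproduce a SAS random number stream, so `new_allocation` will generally differ from the PROC PLAN allocation.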

 

As an aside, you can also use this to generate a random sample (which is essentially the first “n” of “N” permuted labels).


 

ods rtf;

proc plan;

title "generate random sample of 10 from 40 in sampling frame";

  factors n=10 of 40;

  run;

ods rtf close;

 

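The SAS log reported the seed that was used: "NOTE: At the start of processing, random number seed=581671001."  Record this value in case you want to replicate this stream of random numbers (for example, by supplying it through the SEED= option on the PROC PLAN statement).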

 

n:   29   18   11   37   2   5   19   39   26   33
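For comparison, the same kind of draw (a simple random sample of 10 labels from a sampling frame of 40, without replacement) can be sketched in Python. This is only an illustration: `draw_sample` is a hypothetical helper name, and reusing the seed value from the SAS log will not reproduce the SAS sample, because the two random number generators differ.

```python
import random

def draw_sample(frame_size, sample_size, seed=None):
    """Simple random sample (without replacement) of labels from 1..frame_size."""
    rng = random.Random(seed)
    return rng.sample(range(1, frame_size + 1), sample_size)

# Seed value borrowed from the SAS log purely as an example of recording a seed
sample = draw_sample(40, 10, seed=581671001)
print(sample)
```

Recording the seed serves the same purpose as in SAS: rerunning `draw_sample` with the same seed returns the same sample.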

 

* You can also use random number tables or other devices to carry out a randomization.

 

Why have control groups?  Does control group = untreated group?

 

Guaranteed treatment for the common cold  – I call it “chicken soup” – You will be cured after 3 days or your money back!  Justification for guarantee?  I did a study where I gave soup to  25 people with colds and they all felt better when I asked them 3 days later.  Reaction?

 

Lots of types of control groups:

1.         Untreated

2.         Placebo

3.         Sham (often in surgery or neuroanatomy studies)

4.         Standard treatment (may not be ethical to have an untreated group)

5.         Vehicle (sometimes you have to give a treatment in some medium)

6.         Historical (can be problematic)

 

What does “blinding” mean in an experiment?

Single-blind study (subject doesn’t know treatment)

Double-blind study (subject & physician/experimenter don’t know treatment)

Triple-blind study (subject, experimenter & analyst don’t know treatment)

[code is broken after completion of the study]

 

Treatment Structure

 

Factor = manipulation/population of interest (analogous to independent variable in regression)

 

Level = unique value of factor

 

Treatment = (single factor study) level of factor

 

Treatment = (multiple factor study) unique combination of factor levels

 

Single factor

 

Factor = packaging condition

Levels = vacuum, mixed, CO2, plastic

 

Packaging Condition (Factor):    vacuum    mixed    CO2    plastic
Treatment:                       1         2        3      4

 

Multiple factor

 

Factor A = goby  (levels = present/absent)

Factor B = zebra mussel  (levels = present/absent)

 

Treatment numbers for each combination of factor levels:

                      B (zebra mussel present)    B (zebra mussel absent)
A (goby present)                 1                           2
A (goby absent)                  3                           4
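In a multiple factor study the treatments are all combinations of the factor levels, which can be enumerated mechanically. The following is a minimal Python sketch of that idea for the two-factor example; the dictionary layout and variable names are illustrative assumptions, and the numbering follows the 2 x 2 table (rows = factor A, columns = factor B).

```python
from itertools import product

# Factor name -> list of levels, in the order used in the 2 x 2 table
factors = {
    "goby": ["present", "absent"],
    "zebra mussel": ["present", "absent"],
}

# Each treatment is one unique combination of factor levels
treatments = {
    number: dict(zip(factors, combo))
    for number, combo in enumerate(product(*factors.values()), start=1)
}

for number, levels in treatments.items():
    print(number, levels)
```

With a third factor added to `factors`, the same code would list every three-way combination, so the number of treatments is the product of the numbers of levels.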

 

 

Experimental Units (EU) and Measurement Unit (MU)

 

Experimental Unit (EU) = entity to which a treatment is randomly assigned, or which is randomly sampled from one of the “treatment” populations [OL 14, p. 833]

 

Measurement Unit (MU) = entity on which a measurement is taken

 

Example:  Meat packaging study:  EU = MU = piece of meat

 

Example:  Teratology study:  EU=dam/litter;  MU=pup

 

NOTE:  In the ecology literature, MUs are sometimes called “pseudoreplicates” when the MU is not the same as the EU.

 

Experimental Error = variation among EUs assigned to the same treatment and observed under the “same” experimental conditions

 

Sources of Experimental Error?

1.  natural differences in EUs

2.  variation in devices that record the MUs

3.  variation in the treatment conditions

4.  effects of extraneous factors

 

 

Controlling Experimental Error?  OL 14.4

 

1.   Procedures for conducting study standardized (“local control” of Kuehl) – train data collectors/lab technicians, standard protocol for recording data and conducting experiment, etc.

 

2.  Choice of EU/MU (e.g. same age/size class, same level of disability, etc.) – randomly select EUs from the population and then randomly assign treatments; select EUs that are similar (although if they are too similar, generalizability may be questioned) – e.g. transgenic rodents (knockout mice)

 

3.  Measurement procedure

 

4.  Blocking (a “design” control – before conducting study) – EUs placed in groups

 

5.  Covariates (an “analysis” control – possible after study conducted)

 

Blocking Designs

 

* A blocking design imposes a CONSTRAINT on the randomization of experimental units to treatments

 

*  EUs are placed in groups (BLOCK) that are similar with respect to some important characteristic that may affect the response. 

 

*  EUs are randomly assigned to treatments WITHIN each group/block

 

*  Some criteria for determining blocks (OL 14, p. 839)

i.          physical characteristic (e.g. age, weight, size class, sex, health, education)

ii.          related units (e.g. twins, animals from the same litter)

iii.         spatial location (e.g. neighboring plots of land, oven, table, rack)

iv.         time (e.g. day of week, time of day)

v.         person conducting study (e.g. technician, operator, rater)
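The constraint that blocking places on the randomization can be sketched in a few lines of Python: instead of one permutation of all EUs, a separate permutation is drawn inside every block. This is a minimal sketch under stated assumptions: it assumes a complete block design (each block has exactly one EU per treatment), and the litter and animal labels, the treatment names and the seed are hypothetical.

```python
import random

def randomize_within_blocks(blocks, treatments, seed=None):
    """Randomly assign one EU to each treatment, separately within each block.

    blocks: dict mapping block label -> list of EU labels (one EU per treatment).
    Returns a dict mapping block label -> {EU label: treatment}.
    """
    rng = random.Random(seed)
    plan = {}
    for block, units in blocks.items():
        if len(units) != len(treatments):
            raise ValueError(f"block {block!r} needs {len(treatments)} units")
        shuffled = list(treatments)
        rng.shuffle(shuffled)                  # a new permutation for every block
        plan[block] = dict(zip(units, shuffled))
    return plan

# Hypothetical example: 3 litters (blocks) of 4 animals each, 4 treatments
blocks = {
    "litter 1": ["a1", "a2", "a3", "a4"],
    "litter 2": ["b1", "b2", "b3", "b4"],
    "litter 3": ["c1", "c2", "c3", "c4"],
}
treatments = ["T1", "T2", "T3", "T4"]

plan = randomize_within_blocks(blocks, treatments, seed=612)
for block, assignment in plan.items():
    print(block, assignment)
```

Every treatment appears exactly once in every block, so block-to-block differences cannot be confused with treatment differences.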

 

Using a covariate

 

* A covariate is a variable related to the response variable (you might have blocked on this variable as an alternative to adjusting for it as a covariate)

 

* e.g. DEPTH could be used as a covariate in an analysis comparing two lakes with respect to dissolved oxygen.

 

* This is essentially a REGRESSION problem where the TREATMENT could be represented by one or more indicator variables and the COVARIATE is some numeric variable.

 

* It is common to assume that the relationship between the response and the covariate is the same in all treatment groups; however, this should be tested.  Basically, the ANCOVA tests for whether intercepts differ between treatments (assuming equal slopes with the covariates).

 

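* Basic Model (3-level treatment, assuming a parallel response):

  $y = \beta_0 + \beta_1 x + \beta_2 I_2 + \beta_3 I_3 + \varepsilon$

  where $x$ is the covariate and $I_2$, $I_3$ are indicator variables for treatments 2 and 3 (treatment 1 is the reference level).

To test group equality ADJUSTING for the covariate, test H0:  $\beta_2 = \beta_3 = 0$.

 

* Model to test the equal slope assumption:

  $y = \beta_0 + \beta_1 x + \beta_2 I_2 + \beta_3 I_3 + \beta_4 (x I_2) + \beta_5 (x I_3) + \varepsilon$

To test the equal slope assumption, test H0:  $\beta_4 = \beta_5 = 0$.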

 

*  Could you do an ANCOVA if there was a nonlinear relationship with the covariate?

 

Example:  Does dissolved oxygen at “Tahoe Keys” (high human activity site) differ from dissolved oxygen at “Eagle Lake” (low human activity site)?

 

data lake;

  input obs depth do lakeid $;

  logDO = log10(do);

  datalines;

 1     0 10.40      E

 2     1  7.50      E

 3     2  6.60      E

 4     3  6.10      E

 5     4  5.70      E

 6     5  5.40      E

 7     6  5.10      E

 8    11  2.90      E

 9    16  2.00      E

10    21  1.20      E

11    26  1.00      E

12     0  9.26      T

13     1  7.63      T

14     2  5.05      T

15     3  2.52      T

16     4  1.95      T

17     5  1.47      T

;

 

ods html;

proc gplot data=lake;

  plot logDO*depth=lakeid;

  run;

 

proc glm;

  class lakeid;

  model logDo = depth lakeid depth*lakeid;

  run;

ods html close;

 

 

 

The SAS System

The GLM Procedure

Class Level Information

Class     Levels    Values
lakeid    2         E T

Number of Observations Read    17
Number of Observations Used    17

 



 

The GLM Procedure

Dependent Variable: logDO

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               3        1.63564934     0.54521645     212.92    <.0001
Error              13        0.03328897     0.00256069
Corrected Total    16        1.66893830

R-Square    Coeff Var    Root MSE    logDO Mean
0.980054    8.648742     0.050603    0.585094

 

Source          DF    Type III SS    Mean Square    F Value    Pr > F
depth            1     0.76620168     0.76620168     299.22    <.0001
lakeid           1     0.00897392     0.00897392       3.50    0.0839
depth*lakeid     1     0.31441717     0.31441717     122.79    <.0001

 

So, there is evidence (depth*lakeid: F = 122.79, p < .0001) that the relationship between log(DO) and depth differs between Tahoe Keys and Eagle Lake.  Because the slopes differ, additional analyses would likely focus on slope differences rather than intercept differences.
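To see the slope difference directly, one can fit a separate least-squares line of log10(DO) on depth for each lake; with a two-level class variable, the interaction model gives the same two fitted lines. The following is a minimal Python sketch of that calculation using the data from the DATA step above. The helper `fit_line` is an illustrative name, and the sketch gives estimates only, with no standard errors or tests.

```python
import math

# (depth, DO) pairs copied from the DATA step in the notes
eagle = [(0, 10.40), (1, 7.50), (2, 6.60), (3, 6.10), (4, 5.70), (5, 5.40),
         (6, 5.10), (11, 2.90), (16, 2.00), (21, 1.20), (26, 1.00)]
tahoe = [(0, 9.26), (1, 7.63), (2, 5.05), (3, 2.52), (4, 1.95), (5, 1.47)]

def fit_line(points):
    """Least-squares (intercept, slope) for log10(DO) regressed on depth."""
    xs = [depth for depth, _ in points]
    ys = [math.log10(do) for _, do in points]
    x_bar = sum(xs) / len(xs)
    y_bar = sum(ys) / len(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

fits = {"E": fit_line(eagle), "T": fit_line(tahoe)}
for lake, (intercept, slope) in fits.items():
    print(f"lake {lake}: intercept = {intercept:.3f}, slope = {slope:.4f}")
```

Both slopes are negative, and the Tahoe Keys slope is the steeper of the two, which is the pattern the significant depth*lakeid term is picking up.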

 

An alternative analysis using PROC REG …

data lake;

  input obs depth do lakeid $ @@;

  iTK = (lakeid="T");   * defining the indicator variable (straight quotes required);

  logDO = log10(do);

  iTK_depth = iTK*depth;

  datalines;

 1     0 10.40      E  2     1  7.50      E  3     2  6.60      E

 4     3  6.10      E  5     4  5.70      E  6     5  5.40      E

 7     6  5.10      E  8    11  2.90      E  9    16  2.00      E

10    21  1.20      E 11    26  1.00      E 12     0  9.26      T

13     1  7.63      T 14     2  5.05      T 15     3  2.52      T

16     4  1.95      T 17     5  1.47      T

;

proc reg data=lake;

  model logDO = depth iTK iTK_depth;

  test iTK=0, iTK_depth=0;

  run;

 

How many observations should be on test?  Determining replicates

 

Warning:  don’t confuse the determination of a replicate with the replication of an entire experiment.  This is the cause of much confusion.

 

You could specify the margin of error for estimating a particular population mean …(OL 14.6)

 

Goal:  A (specified) margin of error of “E” for estimating $\mu_i$

 

[assuming equal replication r = n1 = n2 = … = nt desired]

 

  $r = \dfrac{(z_{\alpha/2})^2 \, \hat{\sigma}^2}{E^2}$

would be the sample size required to be 100(1-$\alpha$)% confident that the estimator is within “E” units of the treatment mean $\mu_i$

 

$\alpha$, E = desired accuracy (specified by experimenter)

 

Estimated standard deviation? 

Sources:  Pilot study; Similar past experiments (literature);  range/4?

 

Example:  Meat Packaging study (revisited)

 

Sample size required to estimate the mean log(bacterial count) with a margin of error = 0.20 and confidence level 95%.  Use the MSE=0.12 from this study as an estimate of the variance.
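The arithmetic for this example can be checked with a short Python sketch. It assumes the usual normal-approximation formula r = (z_{α/2})² σ̂² / E², rounded up to the next whole replicate; the function name `replicates_for_margin` is an illustrative assumption.

```python
import math
from statistics import NormalDist

def replicates_for_margin(variance, margin, confidence=0.95):
    """Replicates per treatment so the margin of error is at most `margin`.

    Uses r = z^2 * variance / margin^2 with z the upper alpha/2 normal quantile,
    rounded up to a whole number of replicates.
    """
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96 for 95% confidence
    return math.ceil(z ** 2 * variance / margin ** 2)

# Meat packaging study: MSE = 0.12 as the variance estimate, E = 0.20
r = replicates_for_margin(variance=0.12, margin=0.20, confidence=0.95)
print(r)
```

Here (1.96)² (0.12) / (0.20)² is about 11.5, so r = 12 replicates per packaging condition would be needed.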

 

 

You could specify the desired power to detect a specified difference among the means (OL p. 846)

[assuming equal replication r = n1 = n2 = … = nt desired]

 

H0: $\mu_1 = \mu_2 = \mu_3 = \dots = \mu_t$

Ha: $\mu_i \neq \mu_j$ for some $i \neq j$ [at least two population means differ]

 

Need to specify

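1.         $\alpha$, significance level

2.         D = pairwise difference that represents an important difference = $|\mu_i - \mu_j|$

            or $\sum (\mu_i - \bar{\mu}_{\cdot})^2$, either of which can be used to determine

            $\phi = \sqrt{\dfrac{r D^2}{2 t \sigma^2}}$   or   $\phi = \sqrt{\dfrac{r \sum (\mu_i - \bar{\mu}_{\cdot})^2}{t \sigma^2}}$

3.         Power = 1-$\beta$

4.         Variance $\sigma^2$

Table 14 in the Appendix has power calculations indexed by $\nu_1 = t-1$, $\nu_2 = t(r-1)$, $\alpha$ and $\phi$.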

 

Example:  Meat Packaging study (revisited)

 

Sample size required to detect a “0.5” difference in log(bacterial counts) with a power of say 0.8.  Use Type I error rate of 5% and the MSE=0.12 from this study as an estimate of the variance.

 

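Column headings that were lost in extraction are reconstructed from the tabled values, with D = 0.5, $\sigma^2$ = 0.12, t = 4:

        using D = 0.5                         using the means in the footnote (*)
 r    r D^2/(2 sigma^2)   phi     nu2 = 4(r-1)    r Sum(mu_i - mu.)^2/sigma^2 *   phi *    Power (from App. 14)
                                      = nT-4
 5         5.21           1.14        16                  7.81                    1.40
 6         6.25           1.25        20                  9.38                    1.53
 7         7.29           1.35        24                 10.94                    1.65
 8         8.33           1.44        28                 12.50                    1.77     ~0.81
 9         9.38           1.53        32                 14.06                    1.88     ~0.84
10        10.42           1.61        36                 15.63                    1.98
11        11.46           1.69        40                 17.19                    2.07
12        12.50           1.77        44                 18.75                    2.17
13        13.54           1.84        48                 20.31                    2.25     ~0.96

* assuming $\mu_1 = \mu_2 = \mu_3 = 3.0$ and $\mu_4 = 3.5$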
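The numeric columns of this table can be reproduced with a short Python sketch. It assumes the quantities r D²/(2σ²) and r Σ(μi − μ.)²/σ² (called `lam_d` and `lam_m` below, names chosen here for illustration), with φ equal to the square root of each quantity divided by t; these definitions match the tabled values but are stated as assumptions. The power column itself is read from the Appendix table (or computed from a noncentral F distribution) and is not computed here.

```python
import math

t = 4                          # number of packaging conditions
sigma2 = 0.12                  # MSE from the meat packaging study
D = 0.5                        # important pairwise difference
means = [3.0, 3.0, 3.0, 3.5]   # assumed means from the table footnote

grand_mean = sum(means) / t
ss_means = sum((m - grand_mean) ** 2 for m in means)   # sum of squared deviations

def table_row(r):
    """Return (r, lam_d, phi_d, nu2, lam_m, phi_m) for r replicates per group."""
    lam_d = r * D ** 2 / (2 * sigma2)      # based on the pairwise difference D
    phi_d = math.sqrt(lam_d / t)
    nu2 = t * (r - 1)                      # error degrees of freedom
    lam_m = r * ss_means / sigma2          # based on the specified means
    phi_m = math.sqrt(lam_m / t)
    return r, lam_d, phi_d, nu2, lam_m, phi_m

rows = [table_row(r) for r in range(5, 14)]
for r, lam_d, phi_d, nu2, lam_m, phi_m in rows:
    print(f"{r:2d} {lam_d:6.2f} {phi_d:5.2f} {nu2:3d} {lam_m:6.2f} {phi_m:5.2f}")
```

Changing `D`, `sigma2` or `means` shows how quickly φ, and therefore the power, responds to the assumed effect size and variance.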

 

This calculation can also be done using software.  For example, SAS now has a web-based power and sample size application, and it also has sample size procedures: PROC POWER and PROC GLMPOWER.

 

Example:  Using SAS PROC GLMPOWER to calculate required sample size to test for group equality among four packaging conditions.

 

 

data meats;

      input condition $ logbact CellWgt;

      datalines;

                  Plastic   3.5   1

                  Mixed     3     1

                  CO2       3     1

                  Vacuum    3     1

                  ;

proc glmpower data=meats;

      class condition;

      model logbact = condition;

      weight CellWgt;

      power

         stddev = .3464

         alpha  = 0.05

         ntotal = .

         power  = 0.8;

   run;

 

The GLMPOWER Procedure

Fixed Scenario Elements

Dependent Variable          logbact
Source                      condition
Weight Variable             CellWgt
Alpha                       0.05
Error Standard Deviation    0.3464
Nominal Power               0.8
Test Degrees of Freedom     3

Computed N Total

Error DF    Actual Power    N Total
32          0.854           36

 

 

 

Example:  Using SAS PROC POWER to calculate required sample size to test for group equality among four packaging conditions.

 

ods html;

proc power;

  onewayanova

  npergroup = .

  power = 0.80

  stddev = 0.3464

  groupmeans =  3 | 3 | 3 | 3.5;

  plot x=power min=.5 max=.95;

  run;

ods html close;

 

The POWER Procedure

Overall F Test for One-Way ANOVA

Fixed Scenario Elements

Method                Exact
Group Means           3 3 3 3.5
Standard Deviation    0.3464
Nominal Power         0.8
Alpha                 0.05

Computed N Per Group

Actual Power    N Per Group
0.854           9

 

An alternative perspective where you look at power achieved by different sample sizes

 

ods html;

proc power;

  onewayanova

  npergroup = 5 to 15 by 1

  power = .

  stddev = 0.3464

  groupmeans =  3 | 3 | 3 | 3.5;

  plot x=n min=5 max=15;

  run;

ods html close;

 

The POWER Procedure

Overall F Test for One-Way ANOVA

Fixed Scenario Elements

Method                Exact
Group Means           3 3 3 3.5
Standard Deviation    0.3464
Alpha                 0.05

Computed Power

Index    N Per Group    Power
 1        5             0.528
 2        6             0.637
 3        7             0.727
 4        8             0.798
 5        9             0.854
 6       10             0.896
 7       11             0.927
 8       12             0.950
 9       13             0.966
10       14             0.977
11       15             0.984