IES 612/STA 4-573/STA 4-576

Spring 2005

 

Week 10 – IES612-week10-lecture.doc

 

Topics:

i.    Fractional Factorial

ii.    Response Surface Methods

iii.   Nesting/Crossing and Split-Plots

iv.   Fixed/Random/Mixed Effects Models

 

i. Fractional Factorials

-Notes from Dr. R. Schaefer (10 March 2005)


 

GOAL:      To determine which factors, from among a very large number, are the most important to investigate further, more thoroughly and completely, in a full factorial analysis.

 

EXAMPLE

Suppose we want to optimize the operation of a Waste Water Treatment Plant (WWTP).  A WWTP is made up of three operations:  Physical Unit Operation, the Chemical Unit Operation, and the Biological Unit Operation.  Each of these units is made up of five to seven processes.  The Physical Unit Operation is comprised of:  screening, comminution, flow equalization, sedimentation, flotation, and granular-medium filtration.  The Chemical Unit Operation depends on:  chemical precipitation, adsorption, disinfection, dechlorination, and other chemical applications, such as grease removal.  Finally, the Biological Unit Operation consists of:  activated sludge process, aerated lagoons, trickling filters, rotating biological contactors, pond stabilization, anaerobic digestion, and biological nutrient removal.

 

Suppose the manager of such a facility wants to optimize the facility.  Above we have identified eighteen potential variables that could be investigated in an experimental study.  To further simplify, let’s assume that each variable will be investigated at only two levels.  The dilemma is that a complete investigation of these 18 factors would require a response measurement for every combination of factor levels.  This would entail 2^18 = 262,144 observations!  Suppose further that the WWTP needs to be run for a day at each experimental combination for the system to stabilize.  Completing the experiment would then require about 37,450 weeks, or roughly 720 years!  Clearly, this is not an effective use of time or money.

 

The solution is to eliminate many, if not most, of the factors in a preliminary screening process that identifies the variables with the largest impact on the WWTP.

 

The method that is used is known as Fractional Factorial designs.  Such designs attempt to accomplish two goals:

 

1.   Reduce the number of experimental observations by observing ONLY a fraction (say, ½, ¼, etc.) of the entire 2^18 design.

 

2.   Rather than focusing on all possible effects normally investigated in a factorial design (the main effects and all interactions), fractional factorial designs allow the investigator to focus on the main effects and selected “lower order” interaction effects, usually up to order three.  Recall that in a design with 18 factors, one COULD investigate main effects, two-way interactions, three-way interactions, four-way interactions, all the way up to the eighteen-way interaction.
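To get a feel for the numbers behind these two goals, the counts can be checked with a few lines of Python.  This is only an illustrative sketch; the variable names are ours and the fractions listed are arbitrary examples.

```python
from math import comb

k = 18  # number of two-level factors in the WWTP example

full_runs = 2 ** k  # runs in the complete 2^18 factorial
print(f"full factorial: {full_runs:,} runs")

# Run counts for a few example fractions of the full design
for p in (1, 2, 8, 12):
    print(f"1/2^{p} fraction: {2 ** (k - p):,} runs")

# Number of effects of each order: C(k, j) effects involve exactly j factors
effects_by_order = {j: comb(k, j) for j in range(1, k + 1)}
print("main effects:", effects_by_order[1])
print("two-way interactions:", effects_by_order[2])
print("three-way interactions:", effects_by_order[3])

# Together the effects use every degree of freedom except the overall mean
total_effects = sum(effects_by_order.values())
print("all effects:", total_effects)
```

Only 18 + 153 + 816 = 987 of the 262,143 possible effects are main effects or interactions of order three or lower, which is why a small fraction of the runs can be enough for screening.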

 

BACKGROUND

 

We need several concepts and terms to be recalled and/or defined.

 

Two-Way ANOVA Model

 

Recall that in a Two-Way Model, there are two factors, say A and B.  If we observe every combination of the levels of A and B (that is, we have an observation of the response for each combination of A and B) AND we have at least one replication, then we can investigate the effects of A and B (the MAIN EFFECTS) as well as the A*B effect (the INTERACTION EFFECT).

 

Multi-Way ANOVA Model

 

If we take the Two-Way ANOVA and extend it to a Multi-Way ANOVA, a Factorial ANOVA with many factors (say there are “p” factors), we could then investigate the MAIN EFFECTS and INTERACTIONS.  In this Multi-Way case, there are likewise MANY INTERACTIONS.  There are two-way interactions in which the interaction between two of the factors is investigated.  There are three-way interactions in which the interaction between three factors is investigated, all the way to the p-way interaction or interactive effect of all p factors.

 

Now, while we may be able to investigate such interactions, the interpretation of higher-order interactions (beyond two- or three-way interactions) is difficult.  Hence, higher-order interactions are typically not included in Multi-Way Factorial ANOVAs; in many cases their effects are also much smaller than the main and lower-order interaction effects.

 

2^k ANOVA Model

 

In this special case of a Multi-Way Factorial Model, we have k factors, each with exactly 2 levels.  The levels are usually denoted High and Low, although they can be any two values; there must be exactly two.

 

For such designs there are 2^k design points.  A design point is a combination of the factors at given levels.  For example, in a 2^3 design, with factors labeled A, B, and C, there are 8 different design points.  Further assume that the two levels of each factor are High and Low.  The eight different design points are designated in the following table.  The second-to-last column gives a convention for concisely labeling each point.

 

Design Point   Level of   Level of   Level of
Number         Factor A   Factor B   Factor C   Label   Y
1              L          L          L          (1)     Y(1)
2              H          L          L          a       Ya
3              L          H          L          b       Yb
4              H          H          L          ab      Yab
5              L          L          H          c       Yc
6              H          L          H          ac      Yac
7              L          H          H          bc      Ybc
8              H          H          H          abc     Yabc
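The labeling convention in the table can be generated mechanically.  The following Python sketch (the function name is ours, chosen for illustration) lists the design points of a 2^k design in the same order as the table, with the first factor changing fastest.

```python
from itertools import product

def two_level_design(factors):
    """Return (label, levels) pairs for a full 2^k design.

    The first factor changes fastest, matching the table above.
    """
    rows = []
    # product() varies its last position fastest, so reverse each tuple
    # to make the first factor the fastest-changing one.
    for rev_levels in product("LH", repeat=len(factors)):
        levels = rev_levels[::-1]
        # The label lists, in lowercase, the factors that are at High
        label = "".join(f.lower() for f, lev in zip(factors, levels) if lev == "H")
        rows.append((label or "(1)", levels))
    return rows

design = two_level_design("ABC")
for number, (label, levels) in enumerate(design, start=1):
    print(number, *levels, label)
```

The point with every factor at Low has an empty label, which by convention is written (1).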

 

Note!  DO NOT CONFUSE THE DESIGN POINT LABEL WITH AN EFFECT.  For example, design point “a” is NOT the point in the design that measures the effect of factor A.  It is simply a labeling convention.

 

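Now to estimate the effect of a factor, we compare the average of the Y’s at the High level of the factor to the average of the Y’s at the Low level of the factor.  For example, the “effect” of factor A would be

$$\text{Effect}_A = \frac{Y_a + Y_{ab} + Y_{ac} + Y_{abc}}{4} - \frac{Y_{(1)} + Y_b + Y_c + Y_{bc}}{4}$$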
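The comparison of averages can be sketched in Python as follows.  The response values here are hypothetical, invented only to exercise the function; the function name is ours.

```python
def main_effect(responses, factor):
    """Average response at High minus average response at Low for one factor.

    `responses` maps design-point labels such as "ab" to observed Y values.
    A point is at the High level of factor A exactly when "a" is in its label.
    """
    letter = factor.lower()
    high = [y for label, y in responses.items() if letter in label]
    low = [y for label, y in responses.items() if letter not in label]
    return sum(high) / len(high) - sum(low) / len(low)

# Hypothetical responses for the eight points of a 2^3 design
y = {"(1)": 10.0, "a": 14.0, "b": 11.0, "ab": 15.0,
     "c": 12.0, "ac": 16.0, "bc": 13.0, "abc": 17.0}

effect_a = main_effect(y, "A")
effect_b = main_effect(y, "B")
effect_c = main_effect(y, "C")
print(effect_a, effect_b, effect_c)
```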

 

Confounding in a Design

 

Two effects are CONFOUNDED in an analysis if the effect of one cannot be separated from the effect of the other.

 

Example:    Suppose we wish to compare two brands of gasoline (say SuperAmerica and Mobil) with respect to miles per gallon (MPG).  In our study, we use eight automobiles (4 large sedans and 4 small compacts).  The 4 large sedans use Mobil exclusively for a month and the 4 compacts use SuperAmerica exclusively for a month.  After the month, we compare the MPGs for the two brands.

In this context note that if we observe a difference in mean MPG for the two brands, WE CANNOT CONCLUDE that the difference is due to the brands, since the SIZE of the car is confounded with gasoline brand.  Any difference we observe could be due to the different brands of gas OR the different sizes of cars.  We do not know which!

 

Now while confounding is usually undesirable, it can serve a purpose when the number of factors is large.

 

FRACTIONAL FACTORIAL DESIGN

A Fractional Factorial Design is an experiment in which there are k factors of interest, each factor having exactly two levels, and only a fraction of the entire design is run and analyzed.  Different fractional designs choose different combinations of the design points, resulting in

 

1.   a smaller number of observations to be run ( ½, ¼, etc) and

 

2.   certain higher-order interaction effects confounded with main effects and lower order interaction effects.

 

For example, suppose we decide to take a ½ fraction of the 2^3 design seen earlier with our goal of investigating the Main Effects of A, B, and C.  Further suppose that the 4 ( = ½ of 8 ) design points we choose are 1, 4, 5, and 8.

 

Design Point   Level of   Level of   Level of
Number         Factor A   Factor B   Factor C   Label   Y
1              L          L          L          (1)     Y(1)
4              H          H          L          ab      Yab
5              L          L          H          c       Yc
8              H          H          H          abc     Yabc

 

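While we only need to obtain 4 observations in this fractional design rather than the original 8, we have not met our goal; we have confounded the effects of factors A and B.  Notice that when we estimate the effect of factor A we obtain:

$$\text{Effect}_A = \frac{Y_{ab} + Y_{abc}}{2} - \frac{Y_{(1)} + Y_c}{2}$$

And if we estimate the effect of factor B we obtain:

$$\text{Effect}_B = \frac{Y_{ab} + Y_{abc}}{2} - \frac{Y_{(1)} + Y_c}{2}$$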

 

 

Note that our estimates of the effects of factors A and B are identical.  Hence, if we use the four design points above, we have confounded the main effects of factors A and B, which defeats the purpose of the experiment.
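A quick way to see this confounding is to code Low as -1 and High as +1 (a coding we assume here for illustration) and compare the factor columns of the four chosen points.  The short Python sketch below does exactly that.

```python
# Levels (A, B, C) of the four chosen design points, coded Low = -1, High = +1
half_fraction = {
    "(1)": (-1, -1, -1),
    "ab":  (+1, +1, -1),
    "c":   (-1, -1, +1),
    "abc": (+1, +1, +1),
}

col_a = [levels[0] for levels in half_fraction.values()]
col_b = [levels[1] for levels in half_fraction.values()]
col_c = [levels[2] for levels in half_fraction.values()]

# Identical columns give identical High-minus-Low comparisons
a_confounded_with_b = col_a == col_b
a_confounded_with_c = col_a == col_c
print("A confounded with B:", a_confounded_with_b)
print("A confounded with C:", a_confounded_with_c)
```

In every chosen run A and B are at the same level, so no comparison within these four points can tell the two factors apart.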

 

The goal of fractional factorial designs is to choose the design points to be used in the experiment, so that effects of interest (main and lower-order interactions) are confounded with higher order interactions which are usually small and relatively uninteresting in this preliminary screening phase of the analysis.

 

DESIGN RESOLUTIONS

 

Fractional factorial designs are categorized by their Resolution.  The resolution of a fractional factorial design determines the degree of confounding.  Fractional factorials can be of Resolution III, IV, or V; these are the most common.  The definitions of these degrees of resolution are:

 

Resolution III:         These designs have no main effects confounded with other main effects.  However, main effects are confounded with two-way interactions, and two-way interactions may be confounded with other two-way interactions.

So in a Resolution III design, the main effects can be interpreted only if the two-way (and higher-order) interactions confounded with them are assumed to be negligible.

 

Resolution IV:         These designs have no main effects confounded with any other main effect or with any two-way interaction.  Two-way interactions, however, are confounded with other two-way interactions.

 

Resolution V:          These designs have no main effect or two-way interaction confounded with any other main effect or two-way interaction.  Two-way interactions are instead confounded with three-way interactions.

 

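Of these three, the “best” design is a Resolution V design, since it has the least confounding among the effects of interest; the price is that it generally requires more runs for the same number of factors.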
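The definitions above can be checked by brute force for small designs.  The Python sketch below codes Low as -1 and High as +1, forms the contrast column of every effect, and reports the smallest combined order of any two effects that share a column (3 for Resolution III, 4 for IV, 5 for V).  The function names and the generator E = A*B*C*D used for the 16-run example are our assumptions for illustration; this is not how SAS constructs its designs.

```python
from itertools import combinations, product
from math import prod

def effect_column(design, effect):
    """Contrast column of an effect: product of the coded levels of its factors."""
    return tuple(prod(run[i] for i in effect) for run in design)

def resolution(design):
    """Smallest total order of two distinct confounded effects, or None.

    A result of 3 means a main effect shares its column with a two-way
    interaction (Resolution III); 4 and 5 correspond to Resolution IV and V.
    """
    k = len(design[0])
    effects = [e for r in range(1, k + 1) for e in combinations(range(k), r)]
    columns = {e: effect_column(design, e) for e in effects}
    best = None
    for e1, e2 in combinations(effects, 2):
        c1, c2 = columns[e1], columns[e2]
        if c1 == c2 or c1 == tuple(-v for v in c2):
            order = len(e1) + len(e2)
            best = order if best is None else min(best, order)
    return best

# Half fraction of a 2^3 design using the points a, b, c, abc
res_3 = resolution([(1, -1, -1), (-1, 1, -1), (-1, -1, 1), (1, 1, 1)])

# 16-run half fraction of a 2^5 design built by setting E = A*B*C*D
runs_5 = [(a, b, c, d, a * b * c * d) for a, b, c, d in product((-1, 1), repeat=4)]
res_5 = resolution(runs_5)

# Full 2^3 factorial: no two effects are confounded
res_full = resolution(list(product((-1, 1), repeat=3)))
print(res_3, res_5, res_full)
```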

 

EXAMPLE

 

Suppose now instead of the above design we use the following design.  Like the one above we have only 4 points, so this design is also a ½ fraction of the 2^3 design, or a 2^(3-1) fractional factorial.

 

Design Point   Level of   Level of   Level of
Number         Factor A   Factor B   Factor C   Label   Y
2              H          L          L          a       Ya
3              L          H          L          b       Yb
5              L          L          H          c       Yc
8              H          H          H          abc     Yabc

 

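Notice that when we estimate the effect of factors A, B, and C we obtain:

$$\text{Effect}_A = \frac{Y_a + Y_{abc}}{2} - \frac{Y_b + Y_c}{2}$$

$$\text{Effect}_B = \frac{Y_b + Y_{abc}}{2} - \frac{Y_a + Y_c}{2}$$

$$\text{Effect}_C = \frac{Y_c + Y_{abc}}{2} - \frac{Y_a + Y_b}{2}$$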

 

Note that each is different and we have not confounded any of the main effects with any other main effect.  But now the question arises: what effect is confounded with a main effect?

 

To investigate this question, let’s consider our present design of four observations in a simple Two-Way ANOVA with factors A and B.  Presenting the data in terms of 2x2 table of factors A and B alone we obtain:

 

 

 

               B
A          L       H
L          Yc      Yb
H          Ya      Yabc

 

Notice that the Yc value is an observation where Factor C was observed at the High value, while Factors A and B were observed at the Low value.

 

Further recall that an interaction effect between two factors is said to exist if the effect of one factor is different within different levels of the other factor.  In the table above, A and B would have an interactive effect if the effect of B (response at the High value of B minus response at the Low value of B) is different for Low and High values of A.  The estimate of this interaction effect between A and B is

$$\text{Effect}_{AB} = \frac{(Y_{abc} - Y_a) - (Y_b - Y_c)}{2} = \frac{Y_c + Y_{abc}}{2} - \frac{Y_a + Y_b}{2}$$

 

 

Comparing this estimate to the estimated C effect, we note they are identical.  So in this case, the C effect is confounded with the AB interaction.  Since we have a main effect confounded with a two-way interaction, the four design points we used form a Resolution III design.  Unfortunately, when k = 3, this is the best a half fraction can do in terms of resolution.  For larger values of k, one can find (and should prefer) designs with higher resolution.
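The same comparison can be carried out for all three main effects at once.  The Python sketch below (with Low coded as -1 and High as +1, an assumption made for illustration) lists, for each main effect in the design with points a, b, c, abc, the two-way interaction whose contrast column is identical.

```python
from itertools import combinations
from math import prod

# Levels (A, B, C) of the design points a, b, c, abc, coded Low = -1, High = +1
names = "ABC"
design = [(+1, -1, -1), (-1, +1, -1), (-1, -1, +1), (+1, +1, +1)]

def column(effect):
    """Contrast column for an effect given as a string of factor names."""
    return [prod(run[names.index(f)] for f in effect) for run in design]

two_way = ["".join(pair) for pair in combinations(names, 2)]  # AB, AC, BC

# For each main effect, find the two-way interactions with the same column
aliases = {
    main: [inter for inter in two_way if column(inter) == column(main)]
    for main in names
}
for main, partners in aliases.items():
    print(main, "is confounded with", partners)
```

Each main effect is paired with the interaction of the other two factors, which is the Resolution III pattern described above.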

 

While your present text does not contain such designs, many more advanced design texts will contain tables with different designs of varying resolution for different values of k.  One such text is:

 

Design and Analysis of Experiments, Douglas C. Montgomery, Wiley.

 

Alternatively, if you give SAS the value of k and the desired resolution, SAS will (when such a design exists) provide the design points you need to run.  As we saw above, if k = 3, no fractional design of Resolution IV or higher is possible.  However, as k, the number of factors, increases, higher-resolution designs become possible.

 

SUMMARY

 

Fractional Factorial Designs are used to screen a large number of factors and identify those factors to be used in a traditional factorial design.  Once the important factors are identified, a factorial model in these factors alone is fit to the data from the fractional design, and the effects can then be formally tested using ANOVA techniques.

 


FRACTIONAL FACTORIAL ANALYSIS USING SAS

 

OPTIONS LS=110 PS=60 NODATE PAGENO=1;

TITLE 'FRACTIONAL.SAS';

TITLE2 'MM EXAMPLE 4.2';

DATA ONE;

    INPUT RESIST A B C D E;

    CARDS;

 15.1 -1 -1 -1 -1  1

 20.6  1 -1 -1 -1 -1

 68.7 -1  1 -1 -1 -1

101.0  1  1 -1 -1  1

 32.9 -1 -1  1 -1 -1

 46.1  1 -1  1 -1  1

 87.5 -1  1  1 -1  1

119.0  1  1  1 -1 -1

 11.3 -1 -1 -1  1 -1

 19.6  1 -1 -1  1  1

 62.1 -1  1 -1  1  1

103.2  1  1 -1  1 -1

 27.1 -1 -1  1  1  1

 40.3  1 -1  1  1 -1

 87.7 -1  1  1  1 -1

128.3  1  1  1  1  1

;

PROC PRINT;

PROC FACTEX;

TITLE3 'STEP 1:  CREATING A FRACTIONAL FACTORIAL DESIGN';

    FACTORS A B C D E;

    SIZE DESIGN=16;

    MODEL RESOLUTION=5;

    EXAMINE ALIASING;

PROC GLM DATA=ONE;

TITLE3 'STEP 2:  ANALYZING THE FRACTIONAL FACTORIAL USING PROC GLM';

    MODEL RESIST=A|B|C|D|E @2/SOLUTION;

PROC GLM DATA=ONE;

TITLE3 'STEP 3:  ANALYZING ONLY THE IMPORTANT EFFECTS';

    MODEL RESIST=A B A*B C/SOLUTION;

RUN;

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

                                                FRACTIONAL.SAS                                               1

                                                MM EXAMPLE 4.2

 

                                 Obs    RESIST     A     B     C     D     E

 

                                   1      15.1    -1    -1    -1    -1     1

                                   2      20.6     1    -1    -1    -1    -1

                                   3      68.7    -1     1    -1    -1    -1

                                   4     101.0     1     1    -1    -1     1

                                   5      32.9    -1    -1     1    -1    -1

                                   6      46.1     1    -1     1    -1     1

                                   7      87.5    -1     1     1    -1     1

                                   8     119.0     1     1     1    -1    -1

                                   9      11.3    -1    -1    -1     1    -1

                                  10      19.6     1    -1    -1     1     1

                                  11      62.1    -1     1    -1     1     1

                                  12     103.2     1     1    -1     1    -1

                                  13      27.1    -1    -1     1     1     1

                                  14      40.3     1    -1     1     1    -1

                                  15      87.7    -1     1     1     1    -1

                                  16     128.3     1     1     1     1     1

 

                                   FRACTIONAL.SAS                                               2

                                                MM EXAMPLE 4.2

                               STEP 1:  CREATING A FRACTIONAL FACTORIAL DESIGN

 

                                             The FACTEX Procedure

 

Aliasing Structure

 

 A

 B

 C

 D

 E

 A*B = C*D*E

 A*C = B*D*E

 A*D = B*C*E

 A*E = B*C*D

 B*C = A*D*E

 B*D = A*C*E

 B*E = A*C*D

 C*D = A*B*E

 C*E = A*B*D

 D*E = A*B*C

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

 

                                   FRACTIONAL.SAS                                               3

                                                MM EXAMPLE 4.2

                          STEP 2:  ANALYZING THE FRACTIONAL FACTORIAL USING PROC GLM

 

                                              The GLM Procedure

 

                                   Number of Observations Read          16

                                   Number of Observations Used          16

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

 

                                   FRACTIONAL.SAS                                               4

                                                MM EXAMPLE 4.2

                          STEP 2:  ANALYZING THE FRACTIONAL FACTORIAL USING PROC GLM

 

                                              The GLM Procedure

Dependent Variable: RESIST

                                                     Sum of

             Source                      DF         Squares     Mean Square    F Value    Pr > F

             Model                       15     23260.21937      1550.68129        .       .

             Error                        0         0.00000          .

             Corrected Total             15     23260.21937

 


 

                             R-Square     Coeff Var      Root MSE    RESIST Mean

 

                             1.000000           .               .       60.65625

 

             Source                      DF       Type I SS     Mean Square    F Value    Pr > F

 

             A                            1      2155.28062      2155.28062        .       .

             B                            1     18530.01563     18530.01563        .       .

             A*B                          1       693.00563       693.00563        .       .

             C                            1      1749.33063      1749.33063        .       .

             A*C                          1         7.98063         7.98063        .       .

             B*C                          1         3.70563         3.70563        .       .

             D                            1         7.98062         7.98062        .       .

             A*D                          1        26.78063        26.78063        .       .

             B*D                          1        28.89063        28.89063        .       .

             C*D                          1         3.15062         3.15062        .       .

             E                            1         0.60063         0.60063        .       .

             A*E                          1        26.78063        26.78063        .       .

             B*E                          1         0.39063         0.39063        .       .

             C*E                          1        14.25063        14.25063        .       .

             D*E                          1        12.07563        12.07563        .       .

 

             Source                      DF     Type III SS     Mean Square    F Value    Pr > F

 

             A                            1      2155.28063      2155.28063        .       .

             B                            1     18530.01563     18530.01563        .       .

             A*B                          1       693.00563       693.00563        .       .

             C                            1      1749.33063      1749.33063        .       .

             A*C                          1         7.98063         7.98063        .       .

             B*C                          1         3.70563         3.70563        .       .

             D                            1         7.98062         7.98062        .       .

             A*D                          1        26.78063        26.78063        .       .

             B*D                          1        28.89063        28.89063        .       .

             C*D                          1         3.15062         3.15062        .       .

             E                            1         0.60062         0.60062        .       .

             A*E                          1        26.78063        26.78063        .       .

             B*E                          1         0.39063         0.39063        .       .

             C*E                          1        14.25063        14.25063        .       .

             D*E                          1        12.07563        12.07563        .       .

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

                                              The GLM Procedure

 

Dependent Variable: RESIST

                                                        Standard

                      Parameter         Estimate           Error    t Value    Pr > |t|

 

                      Intercept      60.65625000               .        .         .

                      A              11.60625000               .        .         .

                      B              34.03125000               .        .         .

                      A*B             6.58125000               .        .         .

                      C              10.45625000               .        .         .

                      A*C             0.70625000               .        .         .

                      B*C             0.48125000               .        .         .

                      D              -0.70625000               .        .         .

                      A*D             1.29375000               .        .         .

                      B*D             1.34375000               .        .         .

                      C*D             0.44375000               .        .         .

                      E               0.19375000               .        .         .

                      A*E             1.29375000               .        .         .

                      B*E            -0.15625000               .        .         .

                      C*E             0.94375000               .        .         .

                      D*E            -0.86875000               .        .         .

 

                                   FRACTIONAL.SAS                                               6

                                                MM EXAMPLE 4.2

                                STEP 3:  ANALYZING ONLY THE IMPORTANT EFFECTS

 

                                              The GLM Procedure

 

                                   Number of Observations Read          16

                                   Number of Observations Used          16

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

 

                                   FRACTIONAL.SAS                                               7

                                                MM EXAMPLE 4.2

                                STEP 3:  ANALYZING ONLY THE IMPORTANT EFFECTS

 

                                              The GLM Procedure

 

Dependent Variable: RESIST

                                                     Sum of

             Source                      DF         Squares     Mean Square    F Value    Pr > F

             Model                        4     23127.63250      5781.90813     479.69    <.0001

             Error                       11       132.58687        12.05335

             Corrected Total             15     23260.21937

 

                             R-Square     Coeff Var      Root MSE    RESIST Mean

                             0.994300      5.723720      3.471794       60.65625

 

             Source                      DF       Type I SS     Mean Square    F Value    Pr > F

             A                            1      2155.28062      2155.28062     178.81    <.0001

             B                            1     18530.01563     18530.01563    1537.33    <.0001

             A*B                          1       693.00563       693.00563      57.49    <.0001

             C                            1      1749.33063      1749.33063     145.13    <.0001

 

 

             Source                      DF     Type III SS     Mean Square    F Value    Pr > F

             A                            1      2155.28063      2155.28063     178.81    <.0001

             B                            1     18530.01563     18530.01563    1537.33    <.0001

             A*B                          1       693.00563       693.00563      57.49    <.0001

             C                            1      1749.33063      1749.33063     145.13    <.0001

 

 

                                                        Standard

                      Parameter         Estimate           Error    t Value    Pr > |t|

                      Intercept      60.65625000      0.86794845      69.88      <.0001

                      A              11.60625000      0.86794845      13.37      <.0001

                      B              34.03125000      0.86794845      39.21      <.0001

                      A*B             6.58125000      0.86794845       7.58      <.0001

                      C              10.45625000      0.86794845      12.05      <.0001
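The estimates in this listing can be reproduced by hand.  Because every factor column in the data step is coded -1/+1 and the columns are mutually orthogonal, each regression coefficient is the signed sum of the responses divided by 16, and each sum of squares is 16 times the squared coefficient.  The following Python sketch (our own check, not part of the SAS program) verifies this for the effects kept in Step 3; it also checks that the E column equals the product A*B*C*D, which is consistent with the aliasing structure printed in Step 1.

```python
# (RESIST, A, B, C, D, E) for the 16 runs in the SAS data step
runs = [
    (15.1, -1, -1, -1, -1,  1), (20.6,  1, -1, -1, -1, -1),
    (68.7, -1,  1, -1, -1, -1), (101.0, 1,  1, -1, -1,  1),
    (32.9, -1, -1,  1, -1, -1), (46.1,  1, -1,  1, -1,  1),
    (87.5, -1,  1,  1, -1,  1), (119.0, 1,  1,  1, -1, -1),
    (11.3, -1, -1, -1,  1, -1), (19.6,  1, -1, -1,  1,  1),
    (62.1, -1,  1, -1,  1,  1), (103.2, 1,  1, -1,  1, -1),
    (27.1, -1, -1,  1,  1,  1), (40.3,  1, -1,  1,  1, -1),
    (87.7, -1,  1,  1,  1, -1), (128.3, 1,  1,  1,  1,  1),
]
n = len(runs)

def coefficient(*cols):
    """Coefficient of a main effect or interaction of the +/-1 columns.

    cols are positions of the factors within a run (A=1, B=2, ..., E=5).
    """
    total = 0.0
    for run in runs:
        sign = 1
        for c in cols:
            sign *= run[c]
        total += sign * run[0]
    return total / n

intercept = sum(run[0] for run in runs) / n
coef = {"A": coefficient(1), "B": coefficient(2),
        "A*B": coefficient(1, 2), "C": coefficient(3)}

# Sum of squares of each single-degree-of-freedom effect
ss = {name: n * b ** 2 for name, b in coef.items()}

# Does the fifth factor follow E = A*B*C*D in every run?
follows_generator = all(r[5] == r[1] * r[2] * r[3] * r[4] for r in runs)

print(intercept, coef, ss, follows_generator)
```

Note that the "effect" in the sense used earlier (average at High minus average at Low) is twice the coefficient, since the coded levels are two units apart.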

 


ii. Response Surface Methods

From Dr. R. Noble

 

Goal:  Determine factor level combination that leads to an optimal response.

 

Natural units:  X1: L1 to H1, X2: L2 to H2

 

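Conversion to coded units from natural units:

$$C_j = \frac{2X_j - L_j - H_j}{H_j - L_j}$$

Conversion to natural units from coded units:

$$X_j = \frac{L_j + H_j}{2} + C_j \, \frac{H_j - L_j}{2}$$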
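The two conversions can be written as a pair of small Python functions.  This is a sketch with function names of our own choosing; the coded-unit formula is the same one used for code1 and code2 in the SAS data steps in these notes.

```python
def to_coded(x, low, high):
    """Map a natural-unit value to coded units (low -> -1, high -> +1)."""
    return (2 * x - low - high) / (high - low)

def to_natural(c, low, high):
    """Map a coded value back to natural units."""
    return (low + high) / 2 + c * (high - low) / 2

# Ranges from the first data step: x1 from 11 to 16, x2 from 25 to 32
print(to_coded(11, 11, 16), to_coded(13.5, 11, 16), to_coded(16, 11, 16))
print(to_natural(0, 25, 32))
```

The center of each range maps to 0 in coded units, which is why the center runs appear with C1 = C2 = 0.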

 

 

   C1  C2   d     X1           X2

  -1  -1   0     L1           L2

   1  -1   0     H1           L2

  -1   1   0     L1           H2

   1   1   0     H1           H2

   0   0   1   0.5(L1+H1)    0.5(L2+H2)

   0   0   1   0.5(L1+H1)    0.5(L2+H2)

   0   0   1   0.5(L1+H1)    0.5(L2+H2)

   0   0   1   0.5(L1+H1)    0.5(L2+H2)

 

 

Model

$$Y_i = \beta_0 + \beta_1 C_{1i} + \beta_2 C_{2i} + \beta_3 d_i + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2)$$

 

 

 

 


/* x1: 11 to 16

   x2: 25 to 32 */

data phase1;

  input x1 x2 center y;

  code1 = (2*x1 - 11 - 16)/(16 - 11);

  code2 = (2*x2 - 25 - 32)/(32 - 25);

  cards;

11   25   0 7.52

11   32   0 5.01

16   25   0 6.01

16   32   0 9.27

13.5 28.5 1 5.45

13.5 28.5 1 8.6

13.5 28.5 1 7.11

13.5 28.5 1 5.12

;

proc reg;

  model y = code1 code2 center;

run;

 

 

------------------------------------------------------------------------------

 

                              The REG Procedure

                                Model: MODEL1

                            Dependent Variable: y

 

                             Analysis of Variance

 

                                    Sum of           Mean

Source                   DF        Squares         Square    F Value    Pr > F

 

Model                     3        2.32386        0.77462       0.19    0.8964

Error                     4       16.09263        4.02316

Corrected Total           7       18.41649

 

 

             Root MSE              2.00578    R-Square     0.1262

             Dependent Mean        6.76125    Adj R-Sq    -0.5292

             Coeff Var            29.66583

 

 

                             Parameter Estimates

 

                          Parameter       Standard

     Variable     DF       Estimate          Error    t Value    Pr > |t|

 

     Intercept     1        6.95250        1.00289       6.93      0.0023

     code1         1        0.68750        1.00289       0.69      0.5307

     code2         1        0.18750        1.00289       0.19      0.8608

     center        1       -0.38250        1.41830      -0.27      0.8007
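
Gradient vector (direction of steepest ascent in coded units, scaled to unit length):

$$g = \frac{(b_1,\, b_2)}{\sqrt{b_1^2 + b_2^2}} = \frac{(0.6875,\, 0.1875)}{\sqrt{0.6875^2 + 0.1875^2}} = [0.9648 \;\; 0.2631]$$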

 

 

Multiplier   C1       C2      X1     X2      Y

   2       1.9295   0.5262   18.3   30.3   14.83

   3       2.8943   0.7894   20.7   31.3   11.75

   4       3.8591   1.0525   23.1   32.2   13.39

   5       4.8238   1.3156   25.6   33.1   20.23

   6       5.7886   1.5787   28.0   34.0   31.15

   7       6.7533   1.8418   30.4   34.9   38.81

   8       7.7181   2.1049   32.8   35.9   43.55

   9       8.6829   2.3681   35.2   36.8   41.49
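The C1, C2, X1, and X2 columns of this table follow directly from the fitted coefficients; only the Y column requires new experimental runs.  The Python sketch below (variable names are ours) rebuilds the path from the code1 and code2 estimates and the original factor ranges.

```python
from math import hypot

# First-order coefficients for code1 and code2 from the fit above
b1, b2 = 0.6875, 0.1875
norm = hypot(b1, b2)
g1, g2 = b1 / norm, b2 / norm  # unit-length direction of steepest ascent

# Factor ranges: x1 from 11 to 16, x2 from 25 to 32
center1, half1 = 13.5, 2.5
center2, half2 = 28.5, 3.5

path = []
for m in range(2, 10):
    c1, c2 = m * g1, m * g2                      # step in coded units
    x1 = center1 + half1 * c1                    # back to natural units
    x2 = center2 + half2 * c2
    path.append((m, c1, c2, x1, x2))
    print(f"{m}  {c1:.4f}  {c2:.4f}  {x1:.1f}  {x2:.1f}")
```

The observed response peaks at multiplier 8 and drops at multiplier 9, so the point for multiplier 8 becomes the center of the next design.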

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 



Phase 2 center: X1 = 32.8, X2 = 35.9

 

/* x1: 32.8 +/- 4

   x2: 35.9 +/- 4 */

data phase2;

  input x1 x2 center y;

  code1 = (2*x1 - 28.8 - 36.8)/(36.8 - 28.8);

  code2 = (2*x2 - 31.9 - 39.9)/(39.9 - 31.9);

  cards;

32.8 35.9 1 43.55

28.8 31.9 0 21.89

36.8 31.9 0 17.22

28.8 39.9 0 66.87

36.8 39.9 0 32.89

32.8 35.9 1 41.64

32.8 35.9 1 47.43

32.8 35.9 1 44.54

;

proc reg;

  model y = code1 code2 center;

run;

 

 

 

------------------------------------------------------------------------------

 

                              The REG Procedure

                                Model: MODEL1

                            Dependent Variable: y

 

                             Analysis of Variance

 

                                    Sum of           Mean

Source                   DF        Squares         Square    F Value    Pr > F

 

Model                     3     1476.32676      492.10892       8.48    0.0330

Error                     4      232.26122       58.06531

Corrected Total           7     1708.58799

 

 

             Root MSE              7.62006    R-Square     0.8641

             Dependent Mean       39.50375    Adj R-Sq     0.7621

             Coeff Var            19.28946

 

 

                             Parameter Estimates

 

                          Parameter       Standard

     Variable     DF       Estimate          Error    t Value    Pr > |t|

 

     Intercept     1       34.71750        3.81003       9.11      0.0008

     code1         1       -9.66250        3.81003      -2.54      0.0642

     code2         1       15.16250        3.81003       3.98      0.0164

     center        1        9.57250        5.38820       1.78      0.1503

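Gradient vector:

$$g = \frac{(b_1,\, b_2)}{\sqrt{b_1^2 + b_2^2}} = \frac{(-9.6625,\, 15.1625)}{\sqrt{9.6625^2 + 15.1625^2}} = [-0.5374 \;\; 0.8433]$$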

 

 

Multiplier   C1       C2      X1     X2      Y

   2      -1.0748   1.6866   28.5   42.6   70.84

   3      -1.6122   2.5300   26.4   46.0   67.84

 

 

 

 

 

 

 

 

 

 

 

 

 



Phase 3 center: X1 = 28.5, X2 = 42.6

 

/* x1: 28.5 +/- 4

   x2: 42.6 +/- 4 */

data phase3;

  input x1 x2 center y;

  code1 = (2*x1 - 24.5 - 32.5)/(32.5 - 24.5);

  code2 = (2*x2 - 38.6 - 46.6)/(46.6 - 38.6);

  cards;

28.5 42.6 1 70.84

24.5 38.6 0 34.47

32.5 38.6 0 57.62

24.5 46.6 0 53.78

32.5 46.6 0 47.79

28.5 42.6 1 69.68

28.5 42.6 1 75.45

28.5 42.6 1 71.08

;

proc reg;

  model y = code1 code2 center;

run;

 

 

------------------------------------------------------------------------------

 

                                The SAS System                             

 

                              The REG Procedure

                                Model: MODEL1

                            Dependent Variable: y

 

                             Analysis of Variance

 

                                    Sum of           Mean

Source                   DF        Squares         Square    F Value    Pr > F

 

Model                     3     1186.29551      395.43184       6.83    0.0472

Error                     4      231.53617       57.88404

Corrected Total           7     1417.83169

 

 

             Root MSE              7.60816    R-Square     0.8367

             Dependent Mean       60.08875    Adj R-Sq     0.7142

             Coeff Var            12.66153

 

 

                             Parameter Estimates

 

                          Parameter       Standard

     Variable     DF       Estimate          Error    t Value    Pr > |t|

 

     Intercept     1       48.41500        3.80408      12.73      0.0002

     code1         1        4.29000        3.80408       1.13      0.3225

     code2         1        2.37000        3.80408       0.62      0.5670

     center        1       23.34750        5.37978       4.34      0.0123

 

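The center coefficient is significant (t = 4.34, p = 0.0123), which indicates significant curvature: a first-order model no longer describes the surface near this center.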
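The curvature check can be reproduced by hand because the design is so simple: the center coefficient is the mean of the center runs minus the mean of the four corner runs.  The Python sketch below (our own check, using the data from the phase3 data step) recomputes the estimate, its standard error, and the t statistic.  The p-value is not recomputed, since that needs the t distribution.

```python
from math import sqrt

# Four factorial corners (code1, code2) -> y, and four center replicates
corners = {(-1, -1): 34.47, (1, -1): 57.62, (-1, 1): 53.78, (1, 1): 47.79}
centers = [70.84, 69.68, 75.45, 71.08]

n_f, n_c = len(corners), len(centers)
b0 = sum(corners.values()) / n_f
b1 = sum(c1 * y for (c1, _), y in corners.items()) / n_f
b2 = sum(c2 * y for (_, c2), y in corners.items()) / n_f
center_mean = sum(centers) / n_c
b3 = center_mean - b0  # curvature estimate (the "center" coefficient)

# Residual sum of squares of the first-order model with the center indicator
sse = sum((y - (b0 + b1 * c1 + b2 * c2)) ** 2 for (c1, c2), y in corners.items())
sse += sum((y - center_mean) ** 2 for y in centers)
mse = sse / (n_f + n_c - 4)  # four parameters were estimated

se_b3 = sqrt(mse * (1 / n_f + 1 / n_c))
t_curvature = b3 / se_b3
print(f"b3 = {b3:.4f}, SE = {se_b3:.4f}, t = {t_curvature:.2f}")
```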

 

Add axial runs to the design in order to estimate the quadratic effects.

 

 

  C1       C2      X1     X2    

-1.4142   0        22.8   42.6

 1.4142   0        34.2   42.6

 0       -1.4142   28.5   36.9

 0        1.4142   28.5   48.3
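The axial points sit at a coded distance of 1.4142 (the square root of 2) from the center along each axis.  The Python sketch below converts them to natural units, assuming, as in the phase3 data step, that one coded unit corresponds to 4 natural units for both factors.

```python
from math import sqrt

alpha = sqrt(2)            # axial distance in coded units
center = (28.5, 42.6)      # design center in natural units
half_range = (4.0, 4.0)    # natural units per coded unit

axial_coded = [(-alpha, 0.0), (alpha, 0.0), (0.0, -alpha), (0.0, alpha)]
axial_natural = [
    tuple(c + h * a for c, h, a in zip(center, half_range, point))
    for point in axial_coded
]
for coded, natural in zip(axial_coded, axial_natural):
    print(f"{coded[0]:8.4f} {coded[1]:8.4f}   {natural[0]:.1f} {natural[1]:.1f}")
```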

 

 

Model

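$$Y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \beta_3 x_{1i}^2 + \beta_4 x_{1i} x_{2i} + \beta_5 x_{2i}^2 + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2)$$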
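To connect this model back to the goal of finding an optimal factor combination, the sketch below fits the second-order model by least squares in plain Python and then solves for the stationary point of the fitted surface.  It uses the twelve runs from the quadratic data step in these notes.  Centering the factors at (28.5, 42.6) is our choice for numerical stability only; it does not change the quadratic or cross-product coefficients.  The helper solve and all variable names are ours, and this is an illustration, not the algorithm PROC RSREG uses.

```python
# Twelve runs (x1, x2, y): corner, center, and axial points
data = [
    (28.5, 42.6, 70.84), (24.5, 38.6, 34.47), (32.5, 38.6, 57.62),
    (24.5, 46.6, 53.78), (32.5, 46.6, 47.79), (28.5, 42.6, 69.68),
    (28.5, 42.6, 75.45), (28.5, 42.6, 71.08), (22.8, 42.6, 30.69),
    (34.2, 42.6, 41.58), (28.5, 36.9, 46.21), (28.5, 48.3, 63.77),
]
C1, C2 = 28.5, 42.6  # centering constants

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

# Design matrix in centered units: 1, u1, u2, u1^2, u1*u2, u2^2
rows = []
for x1, x2, _ in data:
    u1, u2 = x1 - C1, x2 - C2
    rows.append([1.0, u1, u2, u1 * u1, u1 * u2, u2 * u2])
ys = [y for _, _, y in data]

# Normal equations (X'X) b = X'y
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(6)] for i in range(6)]
xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(6)]
b0, b1, b2, b11, b12, b22 = solve(xtx, xty)

# Stationary point: both partial derivatives of the fitted surface equal zero
u1_s, u2_s = solve([[2 * b11, b12], [b12, 2 * b22]], [-b1, -b2])
x1_s, x2_s = C1 + u1_s, C2 + u2_s
y_s = (b0 + b1 * u1_s + b2 * u2_s
       + b11 * u1_s ** 2 + b12 * u1_s * u2_s + b22 * u2_s ** 2)

# The stationary point is a maximum when the quadratic part is negative definite
is_maximum = b11 < 0 and 4 * b11 * b22 - b12 ** 2 > 0

print(f"x1*x1 = {b11:.6f}, x1*x2 = {b12:.6f}, x2*x2 = {b22:.6f}")
print(f"stationary point: x1 = {x1_s:.2f}, x2 = {x2_s:.2f}, predicted y = {y_s:.2f}")
print("maximum:", is_maximum)
```

The three second-order coefficients printed here can be compared directly with the x1*x1, x2*x1, and x2*x2 estimates in the RSREG listing.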

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 



data quadratic;

  input x1 x2 y;

  cards;

28.5 42.6 70.84

24.5 38.6 34.47

32.5 38.6 57.62

24.5 46.6 53.78

32.5 46.6 47.79

28.5 42.6 69.68

28.5 42.6 75.45

28.5 42.6 71.08

22.8 42.6 30.69

34.2 42.6 41.58

28.5 36.9 46.21

28.5 48.3 63.77

;

proc rsreg;

  model y = x1 x2 / nocode;

run;

 

 

------------------------------------------------------------------------------

                             The RSREG Procedure

 

                       Response Surface for Variable y

 

                   Response Mean                  55.246667

                   Root MSE                        3.186475

                   R-Square                          0.9761

                   Coefficient of Variation          5.7677

 

                               Type I Sum

   Regression          DF      of Squares    R-Square    F Value    Pr > F

 

   Linear               2      280.145763      0.1099      13.80    0.0057

   Quadratic            2     1996.161673      0.7830      98.30    <.0001

   Crossproduct         1      212.284900      0.0833      20.91    0.0038

   Total Model          5     2488.592337      0.9761      49.02    <.0001

 

                                           Sum of

            Residual           DF         Squares     Mean Square

 

            Total Error         6       60.921730       10.153622

 

 

                                           Standard

    Parameter    DF        Estimate           Error    t Value    Pr > |t|

 

    Intercept     1    -2285.208774      205.175353     -11.14      <.0001

    x1            1       80.792050        6.146719      13.14      <.0001

    x2            1       54.857762        7.222680       7.60      0.0003

    x1*x1         1       -1.059339        0.077885     -13.60      <.0001

    x2*x1         1       -0.455313        0.099577      -4.57      0.0038

    x2*x2         1       -0.479006        0.077885      -6.15      0.0008

 

 


                             Sum of

      Factor     DF         Squares     Mean Square    F Value    Pr > F

 

      x1          3     2223.109257      741.036419      72.98    <.0001

      x2          3      744.013683      248.004561      24.43    0.0009

 

 

                             The RSREG Procedure

                    Canonical Analysis of Response Surface

 

                                          Critical

                            Factor           Value

 

                            x1           28.765411

                            x2           43.590782

 

                Predicted value at stationary point: 72.445880

 

 

                                         Eigenvectors

                  Eigenvalues              x1              x2

 

                    -0.400358       -0.326531        0.945186

                    -1.137986        0.945186        0.326531

 

                        Stationary point is a maximum.
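The canonical analysis can be checked directly from the fitted coefficients. The sketch below is an illustration in Python, not what PROC RSREG runs internally: it takes the six parameter estimates from the listing (rounded as printed, so the last digits may differ slightly), solves the two first-derivative equations for the stationary point, and classifies it by the eigenvalues of the quadratic-coefficient matrix.

```python
from math import sqrt

# Coefficients copied from the PROC RSREG parameter estimates (natural units).
b0 = -2285.208774
b1, b2 = 80.792050, 54.857762
b11, b12, b22 = -1.059339, -0.455313, -0.479006

# Stationary point: set both partial derivatives of the fitted surface to zero,
#   2*b11*x1 +   b12*x2 = -b1
#     b12*x1 + 2*b22*x2 = -b2
# and solve the 2x2 linear system by Cramer's rule.
det = (2 * b11) * (2 * b22) - b12 * b12
x1_s = (-b1 * (2 * b22) - b12 * (-b2)) / det
x2_s = ((2 * b11) * (-b2) - b12 * (-b1)) / det

# Predicted response at the stationary point.
y_s = b0 + 0.5 * (b1 * x1_s + b2 * x2_s)

# Eigenvalues of B = [[b11, b12/2], [b12/2, b22]] from its characteristic polynomial.
trace = b11 + b22
det_B = b11 * b22 - (b12 / 2) ** 2
disc = sqrt(trace ** 2 - 4 * det_B)
eig_large = (trace + disc) / 2
eig_small = (trace - disc) / 2

if eig_large < 0:
    kind = "maximum"        # both eigenvalues negative
elif eig_small > 0:
    kind = "minimum"        # both eigenvalues positive
else:
    kind = "saddle point"   # mixed signs

print("stationary point:", round(x1_s, 3), round(x2_s, 3))
print("predicted value :", round(y_s, 2))
print("eigenvalues     :", round(eig_large, 4), round(eig_small, 4))
print("stationary point is a", kind)
```

Because both eigenvalues are negative, the fitted surface curves downward in every direction from the stationary point, which is why SAS reports a maximum.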

 


Summary:

 

D1 = Design 1

P1 = Path of steepest ascent 1

D2 = Design 2

P2 = Path of steepest ascent 2


D3 = Design 3

 


Truth (always unknown in practice)

 

     

 

$\varepsilon \sim N(0, 4)$

Predicted max @ X1 = 28.765411, X2 = 43.590782

 

 

Actual max @ X1 = 28.80, X2 = 43.97


iii.   Nesting/Crossing and Split-Plot Designs

 

Factorial treatment structures represent CROSSING of factors – all levels of one factor are combined with all levels of other factors.

 

If factors are NESTED, then the levels of one factor occur only within particular levels of another factor.

 

Classic example:  Split-Plot Designs (OL 17.6, p. 1014+)

 

One treatment can only be applied to an entire WHOLE PLOT, while the other treatments can be randomly assigned to SUB-PLOTS within each whole plot.

 

Factor A:  fertilizer (2 levels A1 and A2) – can only be applied to WHOLE PLOT

 

Factor T:  varieties (3 levels T1, T2, and T3) – can be randomly assigned to SUB-PLOTS

 

Fertilizer applied to WHOLE PLOTS

   A=A1     A=A2     A=A2     A=A1
    T2       T3       T1       T3
    T1       T2       T3       T1
    T3       T1       T2       T2

Treatments/varieties randomly assigned to SUB-PLOTS within the WHOLE PLOTS
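The two-stage randomization behind this layout can be sketched as follows. This is an illustrative Python sketch, not taken from the text: the function name `randomize_split_plot`, the choice of two whole plots per fertilizer level, and the seed are all assumptions made for the example.

```python
import random

def randomize_split_plot(whole_levels, sub_levels, reps, seed=None):
    """Return a list of (whole-plot level, [sub-plot levels in field order])."""
    rng = random.Random(seed)

    # Stage 1: randomize the whole-plot factor over the whole plots.
    whole = [level for level in whole_levels for _ in range(reps)]
    rng.shuffle(whole)

    # Stage 2: within each whole plot, independently randomize the
    # sub-plot factor over the sub-plots.
    layout = []
    for level in whole:
        subs = list(sub_levels)
        rng.shuffle(subs)
        layout.append((level, subs))
    return layout

# Fertilizer (A1, A2) on whole plots; varieties (T1, T2, T3) on sub-plots.
layout = randomize_split_plot(["A1", "A2"], ["T1", "T2", "T3"], reps=2, seed=612)
for plot_number, (fertilizer, varieties) in enumerate(layout, start=1):
    print(f"whole plot {plot_number}: {fertilizer} -> {varieties}")
```

Every whole plot contains each variety exactly once, but the fertilizer level is constant within a whole plot. This restriction on randomization is what produces two separate error terms in the analysis.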

 

Notice that the VARIETY sub-plots are NESTED within the FERTILIZER whole plots.  A model for this design (p. 1015) is

 

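$y_{ijk} = \mu + \alpha_i + \tau_j + (\alpha\tau)_{ij} + \delta_{ik} + \varepsilon_{ijk}$

 where $\sum_i \alpha_i = \sum_j \tau_j = \sum_i (\alpha\tau)_{ij} = \sum_j (\alpha\tau)_{ij} = 0$,  $\delta_{ik} \sim$ ind. $N(0, \sigma_\delta^2)$,  and $\varepsilon_{ijk} \sim$ ind. $N(0, \sigma_\varepsilon^2)$ [$\delta_{ik}$ and $\varepsilon_{ijk}$ independent]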

 

* Error terms may change for testing hypotheses about model parameters:

 

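 Between WHOLE PLOTS:  $H_0: \alpha_1 = \alpha_2 = \dots = \alpha_a = 0$ tested by $F_{obs} = MSA / MS(A)$, where MS(A) is the whole-plot error mean square (whole plots within levels of A)

 

Within WHOLE PLOTS:  $H_0: (\alpha\tau)_{ij} = 0$ for all i, j tested by $F_{obs} = MSAT / MSE$

 

Within WHOLE PLOTS:  $H_0: \tau_1 = \tau_2 = \dots = \tau_t = 0$ tested by $F_{obs} = MST / MSE$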

 

* You could also define WHOLE PLOTS within BLOCKS as in Example 17.11 (p. 1017).

*  This type of design might be appropriate for the analysis of artificial mesocosms/ponds.  For example, you may need to apply a sediment treatment to an entire pond that is then subdivided into sections for other treatments.

*  MORE MAY BE ADDED

 

 


iv.   Fixed/Random/Mixed Effects Models

 

Fixed Effect = levels of Factor are of particular interest for inference [how do mean responses differ at different factor levels?]

 

Random Effect = levels of Factor are selected from a population of possible factor levels – the particular levels are not of interest for inference [is this factor an important source of variation in the distribution of responses?]

 

Fixed Effects models = comprised of ONLY Fixed factors

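e.g. $y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$  where $\sum_i \alpha_i = 0$ and $\varepsilon_{ij} \sim$ ind. $N(0, \sigma_\varepsilon^2)$

$H_0: \alpha_1 = \alpha_2 = \dots = \alpha_t = 0$

$E[y_{ij}] = \mu + \alpha_i$

$V[y_{ij}] = \sigma_\varepsilon^2$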

 

Random Effects models = comprised of ONLY Random factors

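e.g. $y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$  where $\alpha_i \sim N(0, \sigma_\alpha^2)$ and $\varepsilon_{ij} \sim$ ind. $N(0, \sigma_\varepsilon^2)$

$H_0: \sigma_\alpha^2 = 0$

$E[y_{ij}] = \mu$

$V[y_{ij}] = \sigma_\alpha^2 + \sigma_\varepsilon^2$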

 

Mixed Effects models = comprised of BOTH Fixed and Random factors

 

Random Effects Models

 

$y_{ij} = \mu + \alpha_i + \varepsilon_{ij}$  where $\alpha_i \sim N(0, \sigma_\alpha^2)$ and $\varepsilon_{ij} \sim$ ind. $N(0, \sigma_\varepsilon^2)$

 

title "Random effect";

title2 "Ott/Longnecker p. 981 - example 17.1";

data draneff;

  input station intensity @@;

  datalines;

1 20 1 1050 1 3200 1 5600 1 50

2 4300 2 70 2 2560 2 3650 2 80

3 100 3 7700 3 8500 3 2960 3 3340

;

proc glm;

  class station;

  model intensity=station;

  random station;

run;

 

ods html close;

 

Random effect
Ott/Longnecker p. 981 - example 17.1

The GLM Procedure

Class Level Information

Class      Levels    Values
Station         3    1 2 3

Number of Observations Read    15
Number of Observations Used    15

Dependent Variable: intensity

Source             DF    Sum of Squares    Mean Square    F Value    Pr > F
Model               2        20259573.3     10129786.7       1.38    0.2884
Error              12        87989600.0      7332466.7
Corrected Total    14       108249173.3

R-Square    Coeff Var    Root MSE    intensity Mean
0.187157     94.06622    2707.853          2878.667

Source     DF      Type I SS    Mean Square    F Value    Pr > F
Station     2    20259573.33    10129786.67       1.38    0.2884

Source     DF    Type III SS    Mean Square    F Value    Pr > F
Station     2    20259573.33    10129786.67       1.38    0.2884

Source     Type III Expected Mean Square
Station    Var(Error) + 5 Var(station)
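The expected mean square line suggests a simple estimate of the station variance component: since E[MS station] = Var(Error) + 5 Var(station), equating mean squares to their expectations gives Var(station) estimated by (MS station - MS error)/5. The sketch below reproduces the ANOVA quantities from the raw data and applies that method-of-moments formula. It is an illustration in Python; the estimate itself is not printed in the SAS listing, and truncating a negative estimate at zero is a common convention assumed here.

```python
# Intensity readings copied from the DATA step, grouped by station.
stations = {
    1: [20, 1050, 3200, 5600, 50],
    2: [4300, 70, 2560, 3650, 80],
    3: [100, 7700, 8500, 2960, 3340],
}

t = len(stations)   # number of stations (levels of the random factor)
n = 5               # readings per station (balanced design)

all_y = [y for ys in stations.values() for y in ys]
grand_mean = sum(all_y) / len(all_y)
means = {s: sum(ys) / len(ys) for s, ys in stations.items()}

# Between-station and within-station sums of squares.
ss_station = n * sum((m - grand_mean) ** 2 for m in means.values())
ss_error = sum((y - means[s]) ** 2 for s, ys in stations.items() for y in ys)

ms_station = ss_station / (t - 1)
ms_error = ss_error / (t * (n - 1))
f_value = ms_station / ms_error

# Method of moments: E[MS station] = Var(Error) + n * Var(station).
var_error_hat = ms_error
var_station_hat = max(0.0, (ms_station - ms_error) / n)

print("MS station      :", round(ms_station, 2))
print("MS error        :", round(ms_error, 2))
print("F value         :", round(f_value, 2))
print("Var(station) est:", round(var_station_hat, 1))
```

The mean squares and F value should match the GLM table; the variance component estimate comes out near 559,464, small relative to the error variance, which is consistent with the non-significant F test.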

 

 

Mixed Effects Models

 

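$y_{ij} = \mu + \alpha_i + \beta_j + \varepsilon_{ij}$  where $\sum_i \alpha_i = 0$,  $\beta_j \sim$ ind. $N(0, \sigma_\beta^2)$,  and $\varepsilon_{ij} \sim$ ind. $N(0, \sigma_\varepsilon^2)$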

 

Repeated measurements Examples:

1.   Respondents in the same household

2.   Students in the same classroom

3.   Pups in the same litter

4.   Multiple measurements of the same individual [repeated measurements, longitudinal, time series, growth curves]

 

options nocenter formdlim="-";

/* data from Verbeke and Molenberghs 2.5

   age(yrs); dist (center of pituitary to maxillary fissure)

*/

data growth;

  input girl age dist @@;

  datalines;

1 8 21 1 10 20 1 12 21.5 1 14 23

2 8 21 2 10 21.5 2 12 24 2 14 25.5

3 8 20.5 3 10 24.0 3 12 24.5 3 14 26

4 8 23.5 4 10 24.5 4 12 25.0 4 14 26.5

5 8 21.5 5 10 23 5 12 22.5 5 14 23.5

6 8 20 6 10 21 6 12 21 6 14 22.5

7 8 21.5 7 10 22.5 7 12 23.0 7 14 25

8 8 23 8 10 23 8 12 23.5 8 14 24

9 8 20 9 10 21 9 12 22 9 14 21.5

10 8 16.5 10 10 19 10 12 19 10 14 19.5

11 8 24.5 11 10 25 11 12 28 11 14 28

;

proc gplot;

title "Growth data for 11 girls - pituitary to maxillary fissure";

  plot dist*age=girl;

  run;

 

proc reg data = growth;

title2 "OLS ignoring multiple measurements per girl";

  model dist = age;

  run;

 

proc mixed data=growth;

  title2 "Random intercept for each girl";

  class girl;

  model dist = age / solution;

  random intercept / type=un subject=girl;

  run;

 

proc mixed data=growth;

  title2 "Random intercept and slope for each girl";

  class girl;

  model dist = age / solution;

  random intercept age / type=un subject=girl;

  run;
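To see why a random intercept (and possibly a random slope) is worth considering, it helps to compare the pooled least-squares line with a separate line fitted to each girl. The sketch below does this in plain Python. It is only an exploratory illustration, not the REML fit that PROC MIXED performs, and it assumes that all four values on the fifth data line belong to girl 5. The helper `ols` is a hypothetical name.

```python
AGES = [8, 10, 12, 14]

# Distances copied from the DATA step, one list per girl (ages 8, 10, 12, 14).
# Assumption: all four values on the fifth data line belong to girl 5.
DIST = {
    1: [21, 20, 21.5, 23],
    2: [21, 21.5, 24, 25.5],
    3: [20.5, 24.0, 24.5, 26],
    4: [23.5, 24.5, 25.0, 26.5],
    5: [21.5, 23, 22.5, 23.5],
    6: [20, 21, 21, 22.5],
    7: [21.5, 22.5, 23.0, 25],
    8: [23, 23, 23.5, 24],
    9: [20, 21, 22, 21.5],
    10: [16.5, 19, 19, 19.5],
    11: [24.5, 25, 28, 28],
}

def ols(x, y):
    """Simple linear regression of y on x; returns (intercept, slope)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return ybar - slope * xbar, slope

# Pooled fit: what PROC REG estimates when the grouping by girl is ignored.
pooled_x = AGES * len(DIST)
pooled_y = [d for girl in sorted(DIST) for d in DIST[girl]]
pooled_intercept, pooled_slope = ols(pooled_x, pooled_y)

# Separate fit for each girl: the spread of these intercepts and slopes is
# what the RANDOM statements in PROC MIXED describe with variance components.
per_girl = {girl: ols(AGES, d) for girl, d in DIST.items()}
intercepts = [fit[0] for fit in per_girl.values()]
slopes = [fit[1] for fit in per_girl.values()]

print("pooled line: intercept", round(pooled_intercept, 4),
      "slope", round(pooled_slope, 4))
print("per-girl intercepts range:", round(min(intercepts), 2),
      "to", round(max(intercepts), 2))
print("per-girl slopes range    :", round(min(slopes), 3),
      "to", round(max(slopes), 3))
```

Because every girl is measured at the same four ages, the average of the per-girl slopes equals the pooled slope. What the pooled fit hides is the girl-to-girl spread around that average, which is the variation the random effects are meant to capture.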

 

 

*  MORE MAY BE ADDED