Course Code:               STAT 3001

Course Title:                 Experimental Design and Sampling Theory

Course Type:                Core

Level:                            3

Semester:                      1

No. of Credits:                3

Pre-requisites:                Statistics I (MATH2275)

Course Rationale

This course is designed for individuals interested in understanding survey sampling methods and experimental designs and in applying them in practice. One of the greatest statistical needs in the Caribbean is data collection through the development of appropriate sampling designs. Introductory coursework in applied statistical methods (at least one, and possibly two, semesters of basic statistics) is strongly recommended. Students should be familiar with descriptive statistics, the normal and binomial distributions, chance selection, expected values, standard error, and confidence intervals. Although the mathematical aspects of sampling theory will not be developed rigorously, statistical notation and algebraic derivations will be used for key formulas, and a comfort level with algebraic arguments as used in introductory applied statistics courses will be helpful. The course is highly practical in nature.

Experiments are carried out to find the effects of factors on a dependent variable. Experiments abound in all disciplines, from manufacturing to agriculture. By controlling extraneous factors, we can isolate and measure the effect that the factors of interest have on the dependent variable. Experimental designs are becoming more complicated, with some designs having unbalanced and/or missing data. Knowledge of the underlying structure and design of an experiment is therefore crucial to the proper analysis of data.

This course seeks to give the student the tools necessary first of all to design experiments and then to use appropriate statistical methods to analyze experimental data. This training is helpful in the physical sciences as well as the social sciences.

The course provides the practical foundation needed to execute sample surveys and experimental designs and to analyze the resulting data properly using suitable software. Exposure to this course helps students to think critically, communicate statistical results to a non-statistical audience, consider the ethical issues involved in data collection and the dissemination of results, and become IT-skilled through the use of statistical and word processing software. This course is therefore one that captures many of the key attributes of our graduates.

Course Description

This course aims to deliver the basic ideas of sampling and experimental design from an applied perspective and to provide experience with realistic problems and data. The course will cover the main techniques used in actual sampling practice: simple random sampling, stratification, systematic selection and cluster sampling.

This is an applied statistical methods course. It differs from most statistics courses because it is concerned as much with the design of data collection as with the analysis of data. The course will concentrate on problems of applying sampling methods to human populations, because survey practices are widely used in that area, and because sampling human populations poses particular problems not found in sampling other types of units. However, the principles of sample selection can be applied to many other types of populations.

The experimental designs covered are sufficient to provide students with the knowledge and capability to execute and advise on experiments in any of the sciences. Students gain exposure to the analysis of real datasets, including survey data, using appropriate statistical software such as SPSS and R.

At the end of this course students will be able to design a sample survey, execute the survey and analyze the data using a suitable statistics package such as SPSS.

All lectures, assignments, handouts, and review materials are available online through myeLearning to all students. Blended learning techniques will be employed. Lectures will be supplemented with laboratory work and group discussions.

Assessment is designed to encourage students to work continuously with the course materials. Active learning will be achieved through assignments and problem sheets, allowing continuous feedback and guidance on problem-solving techniques in tutorials and lectures. Assessment will be based on the assignments and in-course tests, followed by a comprehensive final examination.

Learning Outcomes

Upon successful completion of the course, students will be able to:

• Analyze data from sample surveys using appropriate software such as SPSS, R and Stata.
• Provide a suitable design and canvass for a sample survey for different data collection scenarios.
• State the advantages and disadvantages of both probability and non-probability designs.
• Derive confidence intervals for some common estimators, such as those of the population mean and population total, under different sampling designs.
• Compute the relative efficiencies of the ratio, regression and difference estimators.
• Apply ratio, regression and difference estimation to suitable sampling problems.
• Identify the different types of experimental designs such as the Completely Randomized Design, the Randomized Complete Block Design (as well as the Incomplete Block Design), Latin Square and Graeco-Latin Square designs, Factorial Designs including Fractional Factorial designs, and Nested Designs.
• Use the statistical software R, Minitab and SPSS to analyze designs.
• Construct an analysis of variance table for various designs and perform appropriate F-tests in order to determine significant differences between levels of factors and the significance of interactions.
• Differentiate between fixed effects and random effects models and give situations when each is appropriate.
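
The outcome on constructing an analysis of variance table can be illustrated with a minimal sketch for a one-way (completely randomized) layout. This is only an illustration written in Python with the standard library, not the course's prescribed software (R, Minitab or SPSS); the function name `one_way_anova` and the data are hypothetical.

```python
from statistics import mean


def one_way_anova(groups):
    """Return the ANOVA table entries for a one-way layout.

    `groups` is a list of lists, one list of observations per treatment.
    """
    all_obs = [y for g in groups for y in g]
    grand_mean = mean(all_obs)
    k = len(groups)    # number of treatments
    n = len(all_obs)   # total number of observations

    # Between-treatment and within-treatment (error) sums of squares.
    ss_treat = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_error = sum((y - mean(g)) ** 2 for g in groups for y in g)

    df_treat, df_error = k - 1, n - k
    ms_treat = ss_treat / df_treat
    ms_error = ss_error / df_error
    return {
        "ss_treat": ss_treat, "df_treat": df_treat, "ms_treat": ms_treat,
        "ss_error": ss_error, "df_error": df_error, "ms_error": ms_error,
        "ss_total": ss_treat + ss_error,
        "F": ms_treat / ms_error,
    }


# Hypothetical data: three treatments with three replicates each.
table = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(table)
```

The F statistic would then be compared with the tabulated critical value of the F distribution on (df_treat, df_error) degrees of freedom; the standard library has no F distribution, so the sketch stops at the test statistic.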

Content

The main topics to be covered in this course involve:

• Elements of the Sampling Problem: Discussion of some key factors to be cognizant of when carrying out sample surveys. Non-probability samples – their advantages and disadvantages.

• Review of some Basic Concepts of Statistics. Use of statistics in summarizing information. Sampling distributions. Estimation of population parameters.

• Simple random sampling, estimators of associated parameters and their properties. Extension to sampling for proportions and percentages.

• Stratified random sampling, estimators of associated parameters and their properties. Extension to sampling for proportions and percentages.

• Ratio estimator, bias and variance of ratio estimator, sample estimation of variance of ratio estimate. Comparison with mean per unit estimator.

• Systematic sampling, comparison with stratified random sampling, problems of linear trend, and/or periodic variations. Variance of the estimated mean and sample estimate thereof.

• Collecting data by experiment. Principles of experimental design, simple design ideas, and a quick look at ANOVA.

• Models, matrix formulation, parameter estimation, contrasts, inference, subdivision of Total Sum of Squares (TSS), parameterisations.

• Fixed and Random effects model, residual analysis, contrasts. Multiple hypothesis tests.

• Fixed, Random and Mixed models, randomised block designs, efficiency, additivity, interaction, missing values, balanced incomplete block designs, Latin squares, Graeco-Latin squares.
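
The simple random sampling topic in the list above can be sketched as follows. This is a minimal illustration in Python (standard library only), not the course's own software or notation: the function name `srs_estimates`, the sample values and the population size are hypothetical, and the multiplier z = 1.96 is an assumption (some texts use 2 for an approximate 95% bound).

```python
import math
from statistics import mean, variance


def srs_estimates(sample, N, z=1.96):
    """Estimate the population mean and total from a simple random sample
    drawn without replacement from a finite population of size N."""
    n = len(sample)
    ybar = mean(sample)
    s2 = variance(sample)          # sample variance with divisor n - 1
    fpc = 1 - n / N                # finite population correction
    se_mean = math.sqrt(fpc * s2 / n)
    return {
        "mean": ybar,
        "se_mean": se_mean,
        "ci_mean": (ybar - z * se_mean, ybar + z * se_mean),
        "total": N * ybar,
        "se_total": N * se_mean,
        "ci_total": (N * (ybar - z * se_mean), N * (ybar + z * se_mean)),
    }


# Hypothetical example: n = 5 units sampled from a population of N = 50.
est = srs_estimates([10, 12, 14, 16, 18], N=50)
print(est)
```

The same pattern extends to stratified sampling by applying the calculation within each stratum and combining the stratum estimates with weights proportional to stratum size.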

Teaching Methodology

Lectures, tutorials, assignments and problem papers.

Lectures: Two lectures per week.

Labs: One two-hour computer lab per week.

Assignments: One assignment (marked) per week.

Additional problems will be given during lectures and tutorials but will not be marked. However, students will need to do some of these, as well as the assignments, in order to learn the material properly and to prepare adequately for examinations and quizzes. Tutorials will be interspersed with the lectures by having students discuss exercises, revise material as needed, and cover new content each day. Course materials such as exercises, assignments and solutions will be posted on myeLearning.

Assessment

This course is assessed by coursework accounting for 50% of the final grade and a two-hour final examination worth 50%. The coursework is broken down as follows:

1. Three mid-term tests (1 hour each), worth 30% of the student’s final grade
2. Problem papers and lab reports, worth 10%
3. Group Survey Project Report and Presentation worth 10%

Course Calendar

Week 1: Introduction/Course Overview
Lecture topics: Objectives and mechanics of the course; introduction to sample surveys and survey methodology; concepts relating to populations; probability and non-probability sampling; sampling frames, sampling units, analytical units; sampling measurements and summary statistics.
Assignments: Assignment 1 handed out.
Tests: None.

Week 2: ELEMENTS OF THE SAMPLING PROBLEM
Lecture topics: Technical Terms. How to Select the Sample: The Design of the Sample Survey. Sources of Errors in Surveys. Designing a Questionnaire. Planning a Survey.
Tests: None.

Week 3: SOME BASIC CONCEPTS OF STATISTICS
Lecture topics: Summarizing Information in Populations and Samples: The Infinite Population Case. Summarizing Information in Populations and Samples: The Finite Population Case. Sampling Distributions. Covariance and Correlation. Estimation.
Assignments: Assignment 3. Assignment 2 returned.
Tests: None.

Week 4: SIMPLE RANDOM SAMPLING
Lecture topics: How to Draw a Simple Random Sample. Estimation of a Population Mean and Total. Selecting the Sample Size for Estimating Population Means and Totals. Estimation of a Population Proportion. Comparing Estimates.
Assignments: Assignment 4. Assignment 3 returned.
Tests: None.

Week 5: STRATIFIED RANDOM SAMPLING
Lecture topics: How to Draw a Stratified Random Sample. Estimation of a Population Mean and Total. Selecting the Sample Size for Estimating Population Means and Totals. Allocation of the Sample. Estimation of a Population Proportion. Selecting the Sample Size and Allocating the Sample to Estimate Proportions. Additional Comments on Stratified Sampling. An Optimal Rule for Choosing Strata. Stratification after Selection of the Sample. Double Sampling for Stratification.
Assignments: Assignment 5. Assignment 4 returned.
Tests: Test 1 on material in weeks 1 to 3.

Week 6: RATIO, REGRESSION, AND DIFFERENCE ESTIMATION
Lecture topics: Surveys that Require the Use of Ratio Estimators. Ratio Estimation Using Simple Random Sampling. Selecting the Sample Size. Ratio Estimation in Stratified Random Sampling. Regression Estimation. Difference Estimation. Relative Efficiency of Estimators.
Assignments: Assignment 6. Assignment 5 returned.

Week 7: INTRODUCTION TO BASIC CONCEPTS OF EXPERIMENTAL DESIGNS
Lecture topics: Introduction to the terminology of designs of experiments. Review of some basic statistical concepts. Review of some more basic statistics and statistics of the Completely Randomized Design.
Computing: Installation and review of R. Introduction to Minitab and SPSS.
Assignments: Assignment 7. Assignment 6 returned.

Week 8: INTRODUCTION TO COMPLETELY RANDOMIZED DESIGNS
Lecture topics: The two-sample t test with equal variance. The Completely Randomized Design as an extension of the two-sample t test.
Computing: Use of statistical software to analyze data from completely randomized designs.
Assignments: Assignment 8. Assignment 7 returned.
Tests: Test 2 on material in weeks 4, 5 and 6.

Week 9: FIXED AND RANDOM ONE-WAY ANOVA
Lecture topics: The construction of the ANOVA table for the Completely Randomized Design. The randomized block design as a design for controlling one source of variation.
Computing: Use of statistical software to analyze data from randomized block designs.
Assignments: Assignment 9.

Week 10: INTRODUCTION TO THE RANDOMIZED COMPLETE BLOCK DESIGN
Lecture topics: Statistics of the Randomized Complete Block Design and other additive blocking models. The randomized block design as a design for controlling one source of variation.
Assignments: Assignment 10. Assignment 9 returned.

Week 11: RANDOMIZED COMPLETE BLOCK DESIGNS
Lecture topics: The ANOVA table for the Randomized Complete Block Design. Incomplete Block Designs. Analysis of Incomplete Block designs.
Computing: Use of statistical software to analyze data from randomized block designs, Latin Squares and Graeco-Latin squares. Coding of Incomplete Block designs in statistical software.
Assignments: Assignment 11.

Week 12: INTRODUCTION TO FACTORIAL DESIGNS
Lecture topics: The advantages of factorial designs. The 2-factor factorial design.
Computing: Use of statistical software to analyze data from factorial designs.
Assignments: Assignment 12.

Week 13: Revision
Lecture topics: Revision.
Tests: Test 3 on material in weeks 7 to 12.

Essential Texts:

Elementary Survey Sampling, 6th Edition, by Scheaffer, Mendenhall and Ott (Duxbury).

Design and Analysis of Experiments, 5th edition, by Douglas C. Montgomery (Wiley).

Extra course material will be provided in class.

Other Recommended Texts:

Sampling: Design and Analysis, 1st edition, by Sharon L. Lohr (Duxbury Press).

Sampling Techniques, 3rd ed., by William G. Cochran (J. Wiley and Sons, Inc., New York, 1977).

Survey Sampling, by Leslie Kish (J. Wiley & Sons, New York, 1965).

Applied Linear Statistical Models, 5th edition, by M. Kutner, C. Nachtsheim, J. Neter and W. Li (McGraw-Hill/Irwin).

Design of Experiments: An Introduction Based on Linear Models, 1st edition, by M. D. Morris (Chapman and Hall/CRC, 2010).

Planning of Experiments, 1st edition, by D. R. Cox (Wiley-Interscience, 1992).

SOFTWARE

Some of the assigned exercises will require the use of R, Minitab and SPSS statistical packages. R is free statistical software. Students can download R and use it on their personal computers. Minitab is free for UWI students. SPSS can be accessed at the computer labs.