Course Title 
COMPUTATIONAL STATISTICS I 
Course code 
STAT 6181 
Level
Graduate 
Course Type 
Elective 
Credits 
3 
Prerequisites 
Multivariate calculus, familiarity with basic matrix algebra, and a graduate course in probability and statistics (MATH 2140).
Rationale:
In today’s world, many of the problems that statisticians face cannot be handled analytically; alternative, computational methods must therefore be sought to find approximate solutions to real-world problems. This area has burgeoned over the last two decades as the cost of computing power has decreased and the speed of computers has increased. It is thus necessary to have courses within the MSc and PhD programs in Statistics that cater to the need for computational skills to solve statistical problems that require them.
Course Description
This course covers the basic methods in computational statistics. Techniques such as the bootstrap, the jackknife, and MCMC, with particular reference to both hierarchical Bayesian and empirical Bayes approaches, will be covered. The theoretical underpinnings of the course will be presented in conjunction with the relevant computational aspects. The course will be hands-on and practical and will rely heavily on the statistical software R; Matlab will be used where there is a need for numerical computation. Both real and simulated data, drawn from different subject areas, will be used to illustrate the main concepts of the course. The course is the first in a sequence of two computational statistics courses.
Content
Computational statistics is a branch of the mathematical sciences concerned with efficient methods for obtaining numerical solutions to statistically formulated problems. This course will introduce students to a variety of computationally intensive statistical techniques and to the role of computation as a tool of discovery. Topics include numerical optimization in statistical inference (the expectation–maximization (EM) algorithm, Fisher scoring, etc.), random number generation, Monte Carlo methods, randomization methods, jackknife methods, and bootstrap (parametric and nonparametric) methods.
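As a flavour of the resampling methods listed above, here is a minimal nonparametric bootstrap estimate of the standard error of a sample mean. The sketch is in Python for self-contained illustration (the course itself works in R), and the dataset and function names are invented for the example.

```python
import random

def bootstrap_se(data, stat, n_boot=2000, seed=0):
    """Nonparametric bootstrap standard error of a statistic:
    resample the data with replacement, recompute the statistic,
    and take the standard deviation of the replicates."""
    rng = random.Random(seed)
    n = len(data)
    reps = []
    for _ in range(n_boot):
        resample = [data[rng.randrange(n)] for _ in range(n)]
        reps.append(stat(resample))
    mean_rep = sum(reps) / n_boot
    var = sum((r - mean_rep) ** 2 for r in reps) / (n_boot - 1)
    return var ** 0.5

data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.9]
se = bootstrap_se(data, lambda x: sum(x) / len(x))
```

For the sample mean, the bootstrap standard error should land close to the familiar analytic value s/√n, which is a useful check before applying the same machinery to statistics with no closed-form standard error.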
Aims and Goals
The main goals of the course are:
- Introduce and explore modern computational methods used in statistics.
- Numerically solve problems associated with statistical routines, such as those tackled by the Newton–Raphson method.
- Review methods for the simulation, estimation and visualization of statistical data.
- Understand the role of computation as a tool of discovery in data analysis.
- Perform simple Bayesian hierarchical modeling.
- Write the full conditionals for parameters in a hierarchical data setting.
- Use appropriate software in Bayesian model estimation.
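To give a concrete sense of the numerical routines mentioned in these goals, the following is a minimal Newton–Raphson sketch applied to a score equation whose root is known in closed form. It is written in Python for illustration (the course itself uses R and Matlab), and the exponential-sample example is invented.

```python
def newton_raphson(f, fprime, x0, tol=1e-10, max_iter=100):
    """Newton-Raphson iteration: x_{k+1} = x_k - f(x_k) / f'(x_k)."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x -= step
        if abs(step) < tol:
            return x
    raise RuntimeError("Newton-Raphson did not converge")

# Example: the MLE of the rate of an exponential sample solves the
# score equation n/lambda - sum(x) = 0, whose root is n / sum(x).
data = [0.5, 1.2, 0.8, 2.0, 1.1]
n, s = len(data), sum(data)
lam_hat = newton_raphson(lambda lam: n / lam - s,
                         lambda lam: -n / lam ** 2,
                         x0=1.0)
```

Checking the iterate against the closed-form answer n / sum(data) is exactly the kind of sanity check the course encourages before using the same routine on likelihoods without analytic maximizers.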
Objectives
Upon successful completion of this course, students must be able to:
- State and use appropriate optimization techniques for single- and multi-parameter problems.
- Apply appropriate bootstrap techniques in both the parametric and nonparametric settings.
- Use the jackknife in parameter estimation.
- Derive the expectation and maximization steps of the EM algorithm for a given statistical scenario.
- Generate random numbers from various distributions.
- Use R and Matlab to solve problems involving numerical approximation.
- Apply Bayesian hierarchical models to real-world data.
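As an illustration of the jackknife objective above, here is a minimal leave-one-out sketch (in Python for illustration; the dataset is invented). For the sample mean the jackknife bias estimate is exactly zero, which makes it a convenient first sanity check.

```python
def jackknife(data, stat):
    """Leave-one-out jackknife estimates of the bias and
    standard error of a statistic."""
    n = len(data)
    theta_hat = stat(data)
    # Recompute the statistic with each observation deleted in turn.
    loo = [stat(data[:i] + data[i + 1:]) for i in range(n)]
    loo_mean = sum(loo) / n
    bias = (n - 1) * (loo_mean - theta_hat)
    var = (n - 1) / n * sum((t - loo_mean) ** 2 for t in loo)
    return bias, var ** 0.5

data = [2.1, 3.4, 1.9, 5.0, 4.2]
bias, se = jackknife(data, lambda x: sum(x) / len(x))
```

Note the computational disadvantage flagged in the week 6 material: the statistic must be recomputed n times, which becomes expensive for large samples or costly estimators.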
Mode of Delivery
Lectures are delivered face-to-face. All lectures, assignments, handouts, and review materials are available online to all students. Each face-to-face session includes a practical computational component. Students are required to bring their laptops, with the appropriate software installed, to all classes.
Course content and structure
Week 1: Introduction to R and Matlab
Review of the main numerical recipes in R. An introduction to matrix algebra using R. Writing short routines in R.
Notes: Course introduction, format of delivery. Lab.

Week 2: Optimization Theory
Single- and multivariate methods.
Notes: Lectures, with lecture notes made available. *Project topics to be investigated and finalised. Lab.

Week 3: The Basic Bootstraps
Parametric and nonparametric simulation.
Notes: Lectures, with lecture notes made available. *Students start research on their projects and report progress to their supervisor on a weekly basis. Lab.

Week 4: Bootstrap
Further ideas related to semiparametric models and censoring.
Notes: Lectures, with lecture notes made available. Lab.

Week 5: Bootstrap
Applications of the bootstrap to hypothesis testing. Permutation tests.
Notes: Lectures, with lecture notes made available. Lab.

Week 6: Jackknife
An introduction to the jackknife. Illustration of the computational disadvantages of the jackknife.
Notes: Lectures, with lecture notes made available. Lab.

Week 7: EM Algorithm
Missing data; marginalization and notation. Simple examples, e.g. using Fisher's data.
Notes: Lectures, with lecture notes made available. Lab.

Week 8: EM Algorithm
Application of the EM algorithm and its variants to problems such as finding the number of components in a mixture of normals.
Notes: Lectures, with lecture notes made available. Lab.

Week 9: EM Algorithm
Applications to multivariate data.
Notes: Lectures, with lecture notes made available. Lab.

Week 10: Bayesian Modelling
MCMC; simple hierarchical models; Gibbs sampling; implementation in R.
Notes: Lectures, with lecture notes made available. Lab.

Week 11: Bayesian Modelling
Application of Bayesian models to some advanced MCMC; simple hierarchical models; Gibbs sampling; implementation in R and WinBUGS.
Notes: Lectures, with lecture notes made available. Lab.

Week 12: Bayesian Modelling
Examples of some more hierarchical models using a combination of R, WinBUGS, OpenBUGS and RBugs.
Notes: Lectures, with lecture notes made available. Lab.

Week 13: Revision and Group Presentations
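To indicate the kind of Bayesian computation covered in weeks 10–12, here is a minimal two-block Gibbs sampler for a normal model with unknown mean and precision, under a flat prior on the mean and a Gamma(a, b) prior on the precision. It is sketched in Python using only the standard library (the course itself implements such samplers in R and WinBUGS); the data and hyperparameter values are invented.

```python
import random

def gibbs_normal(y, a=1.0, b=1.0, n_iter=5000, seed=1):
    """Gibbs sampler for y_i ~ Normal(mu, 1/tau), alternating
    between the two full conditionals."""
    rng = random.Random(seed)
    n, ybar = len(y), sum(y) / len(y)
    mu, tau = ybar, 1.0
    draws = []
    for _ in range(n_iter):
        # mu | tau, y  ~  Normal(ybar, 1 / (n * tau))   (flat prior on mu)
        mu = rng.gauss(ybar, (1.0 / (n * tau)) ** 0.5)
        # tau | mu, y  ~  Gamma(a + n/2, rate = b + sum((y_i - mu)^2)/2)
        ss = sum((yi - mu) ** 2 for yi in y)
        tau = rng.gammavariate(a + n / 2, 1.0 / (b + ss / 2))
        draws.append((mu, tau))
    return draws

y = [4.9, 5.3, 4.7, 5.1, 5.4, 4.8, 5.2, 5.0]
draws = gibbs_normal(y)
# Posterior mean of mu after discarding a burn-in period
post_mu = sum(m for m, _ in draws[1000:]) / len(draws[1000:])
```

Writing out the two full conditionals before coding the loop is precisely the skill listed in the course aims; the same pattern extends to the hierarchical models of weeks 11 and 12.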
Assessment
Coursework 100%
This course will be assessed entirely through four individual assignments and one group project. Each assignment, and the project, will involve both theoretical and computer-based problems.
Individual Assignments (4) – 60%
Four homework assignments will be given, collected and graded throughout the semester.
While discussion of the homework is allowed, you must prepare your solutions separately. Direct copying of written work or computer code is considered cheating and will result in a zero on the assignment. Assignments are worth 60% of the course grade.
Group Project (1) – 40%
Each student will be required to take part in a group project during the second half of the semester. The minimum group size is three; larger groups are encouraged. Topics will vary and can be discussed with the instructor. Groups will be required to present their projects in class during the last week of classes. Full details will be given around the fourth class session. The project is worth 40% of the course grade.
Resource requirements
The statistical computing lab already has Stata and Matlab. Open-source statistical software such as R, WinBUGS and OpenBUGS will be used as far as possible.
PRESCRIBED TEXTS AND READING MATERIALS
Required reading
Givens, G. H. and Hoeting, J. A. 2005. Computational Statistics. Wiley.
Rizzo, M. Statistical Computing with R. Chapman and Hall.
Recommended reading
Hastie, T., Tibshirani, R. and Friedman, J. 2009. The Elements of Statistical Learning. Springer.