Fall 2010/11
Computational Statistics - Course (52609) and Seminar (52804)
Benjamin Yakir
Mondays 10:30-13:15, Soc. 2205
Contact Info:
Announcements
- We will hold a review session on Sunday, January 16, 2011, 10:00AM in the department's seminar room.
- The final exam will take place on Monday, January 17, 2011.
- Notes regarding the application of the EM algorithm for variance components can be found here.
Requirements
- You are required to read the relevant bibliography before class. Instructions regarding the required reading for each class are given below.
- Homework assignments will be handed out in class. It is recommended that you try to address the assignments before class. Solutions to the assignments will be discussed in class and/or posted on the web. You are required to go over the solutions and understand them.
- Two take-home assignments will be given during the first semester. At the end of the course there will be a final exam. Each of the two take-home (midterm) assignments will determine 20% of the final score, and the final exam will determine the remaining 60%.
- The second semester will be structured as a seminar. Reading material will be assigned to the participants. The material will be presented in class and related statistical issues will be investigated. The outcome of the investigation will be summarized in writing, in the structure of a scientific research article.
Bibliography
- R for Beginners by Emmanuel Paradis.
- Bootstrap Methods for Standard Errors, Confidence Intervals, and Other Measures of Statistical Accuracy by B. Efron and R. Tibshirani (Statistical Science, 1986, Vol. 1, No. 1, 54-77).
- Maximum Likelihood from Incomplete Data via the EM Algorithm by A. P. Dempster, N. M. Laird and D. B. Rubin (JRSS B, 1977, Vol. 39, No. 1, 1-38). Here is an alternative link to the paper.
- Explaining the Gibbs Sampler by G. Casella and E. I. George (The American Statistician, 1992, Vol. 46, No. 3, 167-174). Here is an alternative link to the paper.
- Regression Shrinkage and Selection via the LASSO by R. Tibshirani (Journal of the Royal Statistical Society, Series B (Methodological), 1996, Vol. 58, No. 1, 267-288). Here is an alternative link to the paper.
Reading
- For the class of 18-10-10: Read the first chapter of the class notes on Background in Statistics and R and do Homework 1. (A solution to the homework can be found here.)
- For the class of 25-10-10: Read Sections 1 and 2 of the paper on the bootstrap. For homework: program the parametric and non-parametric bootstrap of the bi-normal distribution (a sketch is given after this reading list). It is recommended to investigate the statistical properties of the procedures. For that, you can generate samples under known conditions and apply the procedure to them. Iterate several times and base the assessment on the sampling distribution.
- For the class of 01-11-10: We will read the class notes on the bootstrap in a regression example and gene mapping. We will run the following code in class (a regression-bootstrap sketch is given after this reading list). For homework: continue with the assignment from the previous class; program the parametric and non-parametric bootstrap of the bi-normal distribution and investigate its statistical properties as described above.
- For the class of 8-11-10: Read Sections 3, 4 and 5 of the bootstrap paper. You may use existing R functions for the application of the statistical procedure in the example that you are analyzing. In class we considered the example of Cox regression using this code. For homework: program the bootstrap for at least one of the procedures (a sketch is given after this reading list). You may investigate the statistical properties of the bootstrap procedure and compare them to the asymptotic estimator which is typically produced by the function that applies the procedure. We also considered this code, which deals with Cox regression and Projection Pursuit Regression.
- For the class of 15-11-10: We will discuss the EM algorithm and demonstrate it for the normal mixture problem. We will run this code as a demonstration (a sketch is given after this reading list). Read Sections 1-3 of the EM paper by Dempster et al. Some may find the article by Bilmes to be a gentler tutorial on the EM algorithm.
- For the class of 6-12-10: We will finish the discussion of the EM analysis in the context of missing observations. We will start dealing with the Gibbs sampler.
- For the class of 13-12-10: We will go over this code for the implementation of the EM algorithm in the context of missing normal data (a sketch is given after this reading list). We will continue the discussion of the Gibbs sampler for MCMC computation. You may read some old class notes on Markov chains and MCMC algorithms in general, which discuss the Gibbs sampler in particular.
- For the class of 27-12-10: We will go over this code for the application of the Gibbs sampler in the Binomial-Beta setting (a sketch is given after this reading list). We will also look at other applications of the algorithm given in this old code. We intend to give a proof of the validity of the Gibbs sampler and discuss it in other applications. Time permitting, we will start discussing the LASSO paper.
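Below are minimal R sketches for the homework problems and demonstrations referenced in the reading list above. They are illustrations under stated assumptions, not the course code.

For the class of 25-10-10: a sketch of the bi-normal bootstrap homework, assuming (as an illustration) that the statistic of interest is the correlation coefficient. The non-parametric version resamples the observed pairs; the parametric version re-draws samples from the fitted bi-normal distribution.

library(MASS)  # for mvrnorm

set.seed(1)
n <- 50; B <- 1000
Sigma <- matrix(c(1, 0.6, 0.6, 1), 2, 2)
x <- mvrnorm(n, mu = c(0, 0), Sigma = Sigma)  # one "observed" sample

## Non-parametric bootstrap: resample the observed pairs with replacement.
theta.np <- replicate(B, {
  i <- sample(n, replace = TRUE)
  cor(x[i, 1], x[i, 2])
})

## Parametric bootstrap: re-draw samples from the fitted bi-normal law.
mu.hat <- colMeans(x); Sigma.hat <- cov(x)
theta.p <- replicate(B, {
  x.star <- mvrnorm(n, mu = mu.hat, Sigma = Sigma.hat)
  cor(x.star[, 1], x.star[, 2])
})

## Bootstrap estimates of the standard error of the sample correlation.
sd(theta.np); sd(theta.p)

To assess the sampling properties of either procedure, wrap the whole computation in an outer loop that regenerates x under known conditions and records the bootstrap output each time.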
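For the class of 01-11-10: the class code is linked above and is not reproduced here. The following is only an illustration, assuming a simple linear regression, of two standard ways to bootstrap a regression: resampling (x, y) pairs and resampling residuals.

set.seed(1)
n <- 40; B <- 1000
x <- runif(n); y <- 2 + 3 * x + rnorm(n)   # simulated data, true slope 3
fit <- lm(y ~ x)

## Bootstrapping pairs (non-parametric): resample (x, y) jointly.
slope.pairs <- replicate(B, {
  i <- sample(n, replace = TRUE)
  coef(lm(y[i] ~ x[i]))[2]
})

## Bootstrapping residuals: keep x fixed, resample the fitted residuals.
e <- resid(fit); yhat <- fitted(fit)
slope.resid <- replicate(B, {
  y.star <- yhat + sample(e, replace = TRUE)
  coef(lm(y.star ~ x))[2]
})

sd(slope.pairs); sd(slope.resid)  # compare with summary(fit)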
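For the class of 8-11-10: a sketch of the homework for the Cox regression case, using a case-resampling bootstrap of the regression coefficient on simulated survival data (the data set analyzed in class is not reproduced here). The bootstrap standard error is compared with the asymptotic one reported by coxph.

library(survival)

set.seed(1)
n <- 100; B <- 500
z <- rnorm(n)
event <- rexp(n, rate = exp(0.5 * z))  # event times; true coefficient 0.5
cens <- rexp(n, rate = 0.3)            # independent censoring times
dat <- data.frame(time = pmin(event, cens),
                  status = as.numeric(event <= cens), z = z)

fit <- coxph(Surv(time, status) ~ z, data = dat)

## Case-resampling bootstrap of the regression coefficient.
beta.star <- replicate(B, {
  d <- dat[sample(n, replace = TRUE), ]
  coef(coxph(Surv(time, status) ~ z, data = d))
})

## Compare the bootstrap standard error with the asymptotic one.
sd(beta.star); sqrt(diag(vcov(fit)))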
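For the class of 15-11-10: a sketch of EM for a two-component normal mixture, assuming for simplicity that both components have unit variance (the class demonstration may estimate the variances as well).

set.seed(1)
x <- c(rnorm(100, 0), rnorm(100, 3))   # simulated mixture data
p <- 0.5; mu1 <- -1; mu2 <- 1          # initial values

for (iter in 1:100) {
  ## E-step: posterior probability that each point comes from component 2.
  d1 <- (1 - p) * dnorm(x, mu1)
  d2 <- p * dnorm(x, mu2)
  w <- d2 / (d1 + d2)
  ## M-step: update the mixing proportion and the component means.
  p <- mean(w)
  mu1 <- sum((1 - w) * x) / sum(1 - w)
  mu2 <- sum(w * x) / sum(w)
}
c(p = p, mu1 = mu1, mu2 = mu2)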
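For the class of 13-12-10: a sketch of EM for bivariate normal data in which some of the y-values are missing at random (an assumed setup; the linked class code may organize the computation differently). The E-step fills in the conditional moments of the missing y's given x; the M-step recomputes the mean and covariance from the completed sufficient statistics.

library(MASS)

set.seed(1)
n <- 200
xy <- mvrnorm(n, c(0, 0), matrix(c(1, 0.7, 0.7, 1), 2))
x <- xy[, 1]; y <- xy[, 2]
miss <- sample(n, 60); y[miss] <- NA   # delete some y-values at random

## Initial values from the complete cases.
mu <- c(mean(x), mean(y, na.rm = TRUE))
S <- cov(cbind(x, y), use = "complete.obs")

for (iter in 1:50) {
  ## E-step: conditional mean and second moment of the missing y's given x.
  b <- S[1, 2] / S[1, 1]
  Ey <- y; Ey[miss] <- mu[2] + b * (x[miss] - mu[1])
  Ey2 <- y^2; Ey2[miss] <- Ey[miss]^2 + S[2, 2] - b * S[1, 2]
  ## M-step: maximum-likelihood update of the mean and covariance.
  mu <- c(mean(x), mean(Ey))
  S[1, 1] <- mean(x^2) - mu[1]^2
  S[1, 2] <- S[2, 1] <- mean(x * Ey) - mu[1] * mu[2]
  S[2, 2] <- mean(Ey2) - mu[2]^2
}
mu; S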
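For the class of 27-12-10: a sketch of the Binomial-Beta Gibbs sampler as in Casella and George, alternating between the conditionals x | y ~ Binomial(m, y) and y | x ~ Beta(x + a, m - x + b); the values of m, a and b are arbitrary choices for illustration.

m <- 16; a <- 2; b <- 4
K <- 5000
x <- numeric(K); y <- numeric(K)
y[1] <- 0.5; x[1] <- rbinom(1, m, y[1])

for (t in 2:K) {
  x[t] <- rbinom(1, m, y[t - 1])            # draw x from its conditional
  y[t] <- rbeta(1, x[t] + a, m - x[t] + b)  # draw y from its conditional
}

## After a burn-in, the x-chain approximates the Beta-Binomial marginal.
table(x[-(1:500)]) / (K - 500)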
Assignments
- Project 1: To be submitted no later than 29-11-2010.
- Project 2: To be submitted no later than 27-12-2010.
Old Exams
Seminar
Useful Links