January 16, 2001 Introduction and administrivia. Beginning-of-course questionnaire. Examples used: from [DH], ash vs. birch; from Fu, structural vs. statistical pattern recognition. Chapter 1 [DHS01] assigned as reading.
January 18, 2001 Bayesian Decision Theory. Prior, posterior, class-conditional density function, likelihood, evidence (probability of the evidence), Bayes decision rule. All introduced for the special one-feature, two-class, classification-only case. (Section 2.1 [DHS]) Homework 1 assigned: problems 10, 11, 12, 13, and 14, Ch.2 (pp.68-69) [DHS]. Later changed: HW1 is problems 10 and 11, Ch.2 [DHS]; the other problems will be assigned as HW2.
January 23, 2001 (Tuesday) Bayesian Decision Theory, ctd. Continuous features: multiple states of nature, feature vectors, extra actions, loss function. Expected loss, conditional risk. Minimum expected loss leads to the Bayes decision rule (i.e., maximum expected utility principle), which minimizes the overall risk. Illustration in the two-category classification case: likelihood ratio and prior odds ratio formulation of the Bayes decision rule.
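As an illustration (my own sketch, not from the course materials), the likelihood-ratio form of the two-category Bayes decision rule can be coded directly; here the class-conditional densities are assumed to be 1-D Gaussians and the loss is zero-one, so the threshold is just the prior odds ratio:

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def bayes_decide(x, mu1, sigma1, p1, mu2, sigma2, p2):
    """Decide w1 iff the likelihood ratio exceeds the prior odds:
       p(x|w1)/p(x|w2) > P(w2)/P(w1)   (zero-one loss)."""
    lr = gaussian_pdf(x, mu1, sigma1) / gaussian_pdf(x, mu2, sigma2)
    return "w1" if lr > p2 / p1 else "w2"
```

Deciding w1 when the ratio exceeds P(w2)/P(w1) is exactly the minimum-error-rate rule for this loss; unequal losses would only change the threshold.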
January 25, 2001 (Thursday) Bayesian Decision Theory, ctd. Minimum-error-rate classification as a special case of minimum-expected-loss (i.e., minimum-risk) classification. Minimax criterion and Neyman-Pearson criterion (basics only). Classifiers, discriminant functions, and decision surfaces. Sections 2.3 and 2.4 [DHS].
January 30, 2001 (Tuesday) Bayesian Decision Theory, ctd. The normal (Gaussian) density function. Informal justification of its use for class-conditional densities. Expectation, variance, cumulative distribution. One-dimensional case. Standard normal (mean 0, standard deviation 1). Tables for the cumulative standard normal. How to transform a normal to a standard normal.
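A small sketch of the standardization trick from this lecture: any X ~ N(mu, sigma^2) maps to a standard normal via Z = (X - mu)/sigma, so one table (or here, math.erf) for the cumulative standard normal suffices. The numbers in the example are invented for illustration:

```python
import math

def standardize(x, mu, sigma):
    """Map X ~ N(mu, sigma^2) to Z = (X - mu)/sigma ~ N(0, 1)."""
    return (x - mu) / sigma

def std_normal_cdf(z):
    """Cumulative standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Example: P(X <= 112) for X ~ N(100, 8^2) equals Phi((112-100)/8) = Phi(1.5)
p = std_normal_cdf(standardize(112.0, 100.0, 8.0))
```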
February 1, 2001 (Thursday) Bayesian Decision Theory, ctd. Discussion of exercises 10 and 11, Ch.2 [DHS]. They are due (as HW1) on Tuesday, February 6. The area under the Gaussian is 1. Moment-generating functions; the Fourier transform of a density, a.k.a. the characteristic function of a random variable. Use of the Fourier transform to show that the mean of a Gaussian is mu. A good reference on moment-generating functions, etc., is: _Probability and Statistics with Reliability, Queuing, and Computer Science Applications_, Kishor Shridhabhai Trivedi, Prentice-Hall, Inc., Englewood Cliffs, New Jersey 07632, 1982.
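The transform argument sketched in class can be written out via the moment-generating function (a standard result, not transcribed from the lecture):

```latex
M_X(t) = E\!\left[e^{tX}\right]
       = \int_{-\infty}^{\infty} e^{tx}\,
         \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^2/(2\sigma^2)}\,dx
       = e^{\mu t + \sigma^2 t^2/2},
\qquad
E[X] = M_X'(0)
     = \left.(\mu + \sigma^2 t)\, e^{\mu t + \sigma^2 t^2/2}\right|_{t=0}
     = \mu .
```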
February 6, 2001 (Tuesday) Bayesian Decision Theory, ctd. HW 1 collected: Exercises 10 and 11, Ch.2, DHS01. The multivariate normal density. Section 2.5 completed.
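For concreteness, a sketch (mine, not the textbook's) of the multivariate normal density in the 2-D case, where the inverse and determinant of the covariance matrix can be written out by hand:

```python
import math

def mvn2_pdf(x, mu, cov):
    """Density of a 2-D normal N(mu, cov) at x, with the 2x2 inverse
    and determinant computed explicitly."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx = (x[0] - mu[0], x[1] - mu[1])
    # Squared Mahalanobis distance: (x - mu)^T Sigma^{-1} (x - mu)
    m2 = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
          + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    return math.exp(-0.5 * m2) / (2.0 * math.pi * math.sqrt(det))
```

At the mean with identity covariance this reduces to 1/(2 pi), a quick sanity check.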
February 8, 2001 (Thursday) Bayesian Decision Theory, ctd. HW 1 returned and discussed. HW 2 assigned: exercises 12, 13, and 14 part a, Ch2 DHS. Discriminant functions for the normal density: the case of statistically independent features of equal variance. DHS Section 2.6.1.
February 13, 2001 (Tuesday) Bayesian Decision Theory, ctd. HW 2 collected. Discriminant functions for the normal density: boundary regions for the first case; case 2: covariance matrices for all classes are the same but otherwise arbitrary; case 3: the general case. DHS Section 2.6 completed.
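A sketch of case 1 from Section 2.6 (Sigma_i = sigma^2 I), where the discriminant reduces to a squared-distance-to-mean term plus a log-prior; the numbers in the test are invented:

```python
import math

def g_linear(x, mu_i, sigma2, prior_i):
    """Discriminant for Sigma_i = sigma^2 I:
       g_i(x) = -||x - mu_i||^2 / (2 sigma^2) + ln P(w_i)."""
    d2 = sum((xj - mj) ** 2 for xj, mj in zip(x, mu_i))
    return -d2 / (2.0 * sigma2) + math.log(prior_i)

def classify(x, mus, sigma2, priors):
    """Pick the class whose discriminant is largest (a minimum-distance
    classifier when the priors are equal)."""
    scores = [g_linear(x, mu, sigma2, p) for mu, p in zip(mus, priors)]
    return scores.index(max(scores))
```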
February 15, 2001 (Thursday) Bayesian Decision Theory, ctd. HW2 collected and discussed. HW3 assigned: Exercise 14 (b,c,d) Ch.2, due Tuesday, February 20. HW4 assigned: Exercises 8, 9, and 24 Ch.2, due Thursday, February 22. These exercises require the use of Maple, which is installed on the PCs in SUM 361 and (as xmaple) on the Suns with the names of constellations (e.g., pollux). Introduction to Maple with examples related to exercise 14 (b and c), in class, using the laptop gabbiano. (See handout linked to main course page.) Chernoff and Bhattacharyya error bounds for the normal distribution in the two-category case.
February 20, 2001 (Tuesday) Bayesian Decision Theory, ctd. HW3 collected. Discussion of problem 8 for HW4. Signal detection theory. Missing values.
February 22, 2001 (Thursday) Bayesian Decision Theory, ctd. HW4 collected. HW5 assigned: Problems 28 and 38 Ch.2, due Thursday, March 1. Noisy features. Compound Bayes decision theory and context. Bayesian nets and intro to HMMs. Chapter 2 completed!
February 27, 2001 (Tuesday) Parameter Estimation and Supervised Learning. Begin Ch.3: Maximum-Likelihood and Bayesian Parameter Estimation. Training data or design samples. Parametric vs. non-parametric estimation of class-conditional densities. Maximum-likelihood estimation (MLE) vs. Bayesian estimation. Bayesian learning. Supervised vs. unsupervised learning: clustering (intra-cluster and inter-cluster distances). Notation: p(x | w_i; theta_i). Independent and identically distributed (i.i.d.) samples. Likelihood and log-likelihood functions. Gradient operator (w.r.t. the parameters). The Gaussian (multivariate normal) case: unknown mu.
March 1, 2001 (Thursday) Parameter Estimation and Supervised Learning, ctd. HW4 returned and discussed. HW5 collected. Completed MLE for the Gaussian case with unknown mu. Some discussion of unbiased, consistent, and asymptotically unbiased estimators.
March 6, 2001 (Tuesday) Parameter Estimation and Supervised Learning. HW5 returned and discussed. HW6 assigned: problems 1, 2, and 4, Ch.3. Andrei Caretnic-Pipa derives the MLE for the variance in the Gaussian case and discusses bias in estimates.
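The ML estimates derived in these two lectures can be sketched in a few lines (my sketch, 1-D case); the division by n rather than n-1 in the variance is exactly the bias discussed in class:

```python
def gaussian_mle(samples):
    """MLE for a 1-D Gaussian: mu_hat is the sample mean, and the ML
    variance divides by n (biased).  Rescaling by n/(n-1) gives the
    unbiased sample variance."""
    n = len(samples)
    mu_hat = sum(samples) / n
    var_ml = sum((x - mu_hat) ** 2 for x in samples) / n
    var_unbiased = var_ml * n / (n - 1)
    return mu_hat, var_ml, var_unbiased
```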
March 8, 2001 (Thursday) Parameter Estimation and Supervised Learning. HW6 is due on Tuesday, 3/20. HW7 assigned: computer exercises 1-4, Ch.3 (pp.80-81), due on Tuesday, 3/27. HW8 assigned: problem 15, Ch.3 (p.145), due on Thursday, 3/22.
March 20, 2001 (Tuesday) HW6 collected. Discussion of the generation of random samples from an arbitrary distribution. Bayesian estimation of the mean of a normal.
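One standard way to generate samples from an arbitrary distribution is inverse-transform sampling; a sketch (my example, not the one used in class) for the exponential distribution, whose CDF inverts in closed form:

```python
import math
import random

def sample_exponential(lam, rng=random):
    """Inverse-transform sampling: if U ~ Uniform(0,1) and F is the
    target CDF, then F^{-1}(U) is distributed according to F.
    For Exp(lam): F(x) = 1 - exp(-lam x), so F^{-1}(u) = -ln(1-u)/lam."""
    u = rng.random()
    return -math.log(1.0 - u) / lam

# Empirical check: the sample mean should approach 1/lam.
random.seed(0)
mean_est = sum(sample_exponential(2.0) for _ in range(20000)) / 20000
```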
March 22, 2001 (Thursday) Discussion of HW6 (esp. problem 2b). HW6 returned. Discussion of HW8 (esp. problem 15b). HW8 was not collected, but the students were asked to turn it in within a day. More on Bayesian parameter estimation; recursive Bayesian learning. (Ch.3 through Section 3.5: p.102)
March 27, 2001 (Tuesday) I gave an extension on HW7. Some were turned in anyway. Discussion of some parts of HW7. Three kinds of error: Bayes (or indistinguishability) error; model error; (parameter) estimation error. The curse of dimensionality: complexity of (parameter) estimation (learning) and of model use; inability to do inference (lack of generality).
March 29, 2001 (Thursday) NSDP. My notes. (Motivation: evaluation of HMMs.)
April 3, 2001 (Tuesday) NSDP continued. My notes. (Motivation: evaluation of HMMs.)
April 5, 2001 (Thursday) HW 9 assigned: Exercise 17 p.145, due Thursday, April 12. HMMs. Rabiner's tutorial and my "Aalborg" notes. Link to notes added to web page.
April 10, 2001 (Tuesday) HMMs completed. Some discussion of HW9.
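A minimal sketch of the forward (evaluation) procedure for discrete-observation HMMs, in the spirit of Rabiner's tutorial; the O(TN^2) cost shows up as the sum over N states, for each of N states, at each of T time steps:

```python
def forward(A, B, pi, obs):
    """HMM evaluation problem: compute P(O | model).
    A[i][j] -- transition probability from state i to state j
    B[i][k] -- probability of emitting symbol k in state i
    pi[i]   -- initial state probabilities
    Runs in O(T * N^2) time for T observations and N states."""
    N = len(pi)
    # Initialization: alpha_1(i) = pi_i * b_i(O_1)
    alpha = [pi[i] * B[i][obs[0]] for i in range(N)]
    # Induction: alpha_{t+1}(j) = [sum_i alpha_t(i) a_ij] * b_j(O_{t+1})
    for t in range(1, len(obs)):
        alpha = [sum(alpha[i] * A[i][j] for i in range(N)) * B[j][obs[t]]
                 for j in range(N)]
    # Termination: P(O | model) = sum_i alpha_T(i)
    return sum(alpha)
```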
April 12, 2001 (Thursday) HW9 collected. HW10 assigned: exercises 49 and 50, pp.145-155, due Tuesday, April 17. HW11 assigned: exercise 44, p.153, due Thursday, April 19. EM and GEM.
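A toy illustration of the EM idea (my simplification, not the book's GEM treatment): a 1-D mixture of two unit-variance, equal-weight Gaussians in which only the means are re-estimated. The E-step computes responsibilities; the M-step takes responsibility-weighted means:

```python
import math

def em_two_gaussians(xs, mu1, mu2, iters=50):
    """Toy EM for a mixture of two unit-variance, equal-weight
    1-D Gaussians; only the means are unknown parameters."""
    for _ in range(iters):
        # E-step: responsibility of component 1 for each point
        r = []
        for x in xs:
            a = math.exp(-0.5 * (x - mu1) ** 2)
            b = math.exp(-0.5 * (x - mu2) ** 2)
            r.append(a / (a + b))
        # M-step: responsibility-weighted means
        s1 = sum(r)
        s2 = len(xs) - s1
        mu1 = sum(ri * x for ri, x in zip(r, xs)) / s1
        mu2 = sum((1.0 - ri) * x for ri, x in zip(r, xs)) / s2
    return mu1, mu2
```

Each iteration cannot decrease the likelihood, which is the key property behind both EM and Baum-Welch.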
April 17, 2001 (Tuesday) HW10 collected. (Only one student turned it in. There are some problems in the book's formulas for Baum-Welch, and maybe for the backward procedure too.) Begin Ch.4, Non-parametric Methods: general intro and intro to Parzen windows.
April 19, 2001 (Thursday) Parzen windows, through qualitative arguments for the choice of h_n; figures used. Cross-validation for parameter choice: informal discussion based on questions.
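A sketch of the 1-D Parzen-window estimate with a Gaussian kernel (my code, not the textbook's); the fixed width h here plays the role of the h_n discussed in class:

```python
import math

def parzen_estimate(x, samples, h):
    """Parzen-window density estimate with a Gaussian window:
       p_n(x) = (1/n) * sum_i (1/h) * phi((x - x_i) / h),
    where phi is the standard normal density."""
    n = len(samples)
    phi = lambda u: math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)
    return sum(phi((x - xi) / h) for xi in samples) / (n * h)
```

Small h gives a spiky estimate, large h an over-smoothed one; cross-validation is one informal way to pick it.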
April 24, 2001 (Tuesday) Q&A on the time complexity of the forward and Viterbi algorithms. Derivation of O(TN^2). (Asymptotic) unbiasedness (i.e., convergence of the mean) and consistency (i.e., vanishing variance) of the Parzen-window estimate. Begin PNNs.
April 26, 2001 (Thursday) PNNs (materials from Donald Specht's original paper as well as from the textbook); k-nearest-neighbors method: basics. Student survey. (Ms. Lafti is the designated student.)
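A bare-bones sketch of the k-nearest-neighbors rule as covered (Euclidean distance, simple majority vote, no tie-breaking refinements); the data in the test are invented:

```python
def knn_classify(x, data, labels, k):
    """Basic k-nearest-neighbors: majority label among the k training
    points closest to x in squared Euclidean distance."""
    d2 = lambda a, b: sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    nearest = sorted(range(len(data)), key=lambda i: d2(x, data[i]))[:k]
    votes = {}
    for i in nearest:
        votes[labels[i]] = votes.get(labels[i], 0) + 1
    return max(votes, key=votes.get)
```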
May 1, 2001 (Tuesday) Fisher's linear discriminant. Take-home final exam assigned. It is due on May 8 at noon. Meetings with 11 students for the oral part of the final exam set up.
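A sketch of Fisher's linear discriminant for two classes in 2-D (my code): the optimal projection direction is w = S_W^{-1}(m1 - m2), with the 2x2 within-class scatter matrix inverted by hand. The data in the test are invented:

```python
def fisher_direction(class1, class2):
    """Fisher's linear discriminant for two 2-D classes:
       w = S_W^{-1} (m1 - m2),  S_W = S_1 + S_2."""
    def mean(pts):
        n = len(pts)
        return (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)

    def scatter(pts, m):
        # Scatter matrix entries: [[sxx, sxy], [sxy, syy]]
        sxx = sum((p[0] - m[0]) ** 2 for p in pts)
        syy = sum((p[1] - m[1]) ** 2 for p in pts)
        sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in pts)
        return sxx, sxy, syy

    m1, m2 = mean(class1), mean(class2)
    a1, b1, c1 = scatter(class1, m1)
    a2, b2, c2 = scatter(class2, m2)
    a, b, c = a1 + a2, b1 + b2, c1 + c2   # S_W = [[a, b], [b, c]]
    det = a * c - b * b
    dm = (m1[0] - m2[0], m1[1] - m2[1])
    # w = S_W^{-1} (m1 - m2), 2x2 inverse written out
    return ((c * dm[0] - b * dm[1]) / det, (-b * dm[0] + a * dm[1]) / det)
```

Projecting the samples onto w maximizes the ratio of between-class to within-class scatter, the criterion covered in lecture.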