Stat Consortium

Seminar Events 2008-2009

Co-sponsored with AMSC Program Applied Statistics

SPEAKER: Miscellaneous Faculty from: MATH, ENEE, CMSC, JPSM, BMGT, ANSC, ISR, UMIACS

TITLE:     10-Minute Applied Statistics Madness

TIME AND PLACE:  Tuesday, May 5, 2009, 3:30pm-5:30pm
            Colloquium Room 3206, Math Building

This Event consists of 10-minute presentations of individual faculty members' applied statistical research, together with briefer overviews of applied statistical research by colleagues in the same academic units. See flyer here for more information.

The talks will be followed by a Reception with food from 5:30-6:30pm in MTH 3201, the Math Department Lounge.




Co-sponsored with Statistics Seminar, Mathematics Department

SPEAKER: Prof. Edward J. Wegman,   George Mason University

TITLE:     Mixture Models for Document Clustering

TIME AND PLACE:  Thursday, October 30, 2008, 3:30pm
            Colloquium Room 3206, Math Building

ABSTRACT: Automatic clustering and classification of documents within corpora is a challenging task. This is often done by comparing word usage within the corpus, the so-called bag-of-words methodology. The lexicon for a corpus can be very large: for the example of 503 documents that we consider, there are more than 7000 distinct terms and more than 91,000 bigrams. This means that a term vector characterizing a document will be approximately 7000-dimensional. In this talk, we use an adaptation of normal mixture models with 7000-dimensional data to locate centroids of clusters. The algorithm works surprisingly well and is linear in all the size metrics.
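
As a small illustration of the bag-of-words representation mentioned in the abstract, the sketch below builds a lexicon, term-count vectors, and bigrams for a toy corpus. The three documents, the whitespace tokenizer, and all function names are assumptions made for illustration; this shows only the term-vector idea and is not the mixture-model algorithm presented in the talk.

```python
from collections import Counter


def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (w.strip(".,;:!?\"'()") for w in text.lower().split())
    return [t for t in tokens if t]


def build_lexicon(docs):
    """Return the sorted list of distinct terms across the corpus."""
    return sorted({t for d in docs for t in tokenize(d)})


def term_vector(doc, lexicon):
    """Return a count vector with one coordinate per lexicon term."""
    counts = Counter(tokenize(doc))
    return [counts[t] for t in lexicon]


def bigrams(doc):
    """Return the list of adjacent token pairs in a document."""
    toks = tokenize(doc)
    return list(zip(toks, toks[1:]))


# Hypothetical toy corpus (the talk's corpus has 503 documents).
docs = [
    "The cat sat on the mat.",
    "The dog sat on the log.",
    "Cats and dogs are pets.",
]
lexicon = build_lexicon(docs)
vectors = [term_vector(d, lexicon) for d in docs]
```

Each document becomes a vector whose dimension equals the lexicon size, which is why a lexicon of about 7000 terms yields roughly 7000-dimensional data.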

PowerPoint slides for the talk can be found here.

Immediately following the talk, there will be a Reception and High Tea in the Mathematics Department Lounge, MTH 3201.



Co-sponsored with Statistics Seminar, Mathematics Department

SPEAKER: Prof. Gauri S. Datta,   University of Georgia, Department of Statistics

TITLE:     Estimation of Small Area Means under Measurement Error Models

TIME AND PLACE:  Tuesday, November 18, 2008, 3:30pm
            Room 1313, Math Building

ABSTRACT: In recent years demand for reliable estimates for characteristics of small domains (small areas) has greatly increased worldwide due to growing use of such estimates in formulating policies and programs, allocating government funds, planning regional development, and making marketing decisions at the local level. However, due to cost and operational considerations, it is seldom possible to procure a large enough overall sample size to support direct estimates of adequate precision for all domains of interest. It is often necessary to employ indirect estimates for small areas that can increase the effective domain sample size by borrowing strength from related areas through linking models, using census and administrative data and other auxiliary data associated with the small areas. To this end, the nested error regression model for unit-level data and the Fay-Herriot model for area-level data have been widely used in small area estimation. These models usually assume that the explanatory variables are measured without error. However, explanatory variables are often subject to measurement error. Both functional and structural measurement error models have been recently proposed by researchers in small area estimation to deal with this issue. In this talk, we consider both functional and structural measurement error models in discussing empirical Bayes (equivalently, empirical BLUP) estimation of small area means.
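
To make "borrowing strength" concrete, here is a minimal sketch of the shrinkage estimator under the standard area-level Fay-Herriot model, in which each direct estimate is pulled toward a regression prediction with weight gamma_i = A / (A + D_i). It assumes the regression predictions, the sampling variances D_i, and the model variance A are all known and that covariates are measured without error, so it does not reflect the measurement error models discussed in the talk. The function name and the numbers are hypothetical.

```python
def fay_herriot_blup(y, x_beta, D, A):
    """Shrink direct estimates toward regression predictions.

    y      : direct survey estimates, one per small area
    x_beta : regression predictions x_i' beta (assumed known here)
    D      : known sampling variances of the direct estimates
    A      : variance of the area-level random effects (assumed known)
    """
    estimates = []
    for y_i, xb_i, D_i in zip(y, x_beta, D):
        # Weight on the direct estimate; small when sampling variance is large.
        gamma_i = A / (A + D_i)
        estimates.append(gamma_i * y_i + (1.0 - gamma_i) * xb_i)
    return estimates


# Hypothetical inputs for two small areas.
blup = fay_herriot_blup(y=[10.0, 20.0], x_beta=[12.0, 18.0], D=[1.0, 3.0], A=1.0)
```

The second area has the larger sampling variance, so its estimate relies more heavily on the regression prediction. In practice A and the regression coefficients are estimated from the data, which gives the empirical BLUP.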

Immediately following the talk, there will be a Reception and High Tea in the Mathematics Department Lounge, MTH 3201.



DISTINGUISHED STATISTICS CONSORTIUM LECTURE

SPEAKER: Mitchell H. Gail, M.D., Ph.D.
  Senior Investigator, Biostatistics Branch, Div. Cancer Epidemiology & Genetics, National Cancer Institute

TITLE:     Absolute Risk: Clinical Applications and Controversies

TIME:  Friday, December 5, 2008, 3:15pm
            PLACE: TBA

ABSTRACT: Absolute risk is the probability that a disease will develop in a defined age interval in a person with specific risk factors. Sometimes absolute risk is called "crude" risk to distinguish it from the cumulative "pure" risk that might arise in the absence of competing causes of mortality. After defining absolute risk, I shall present a model for absolute breast cancer risk and illustrate its clinical applications. I will also describe the kinds of data and approaches that are used to estimate models of absolute risk and two criteria, calibration and discriminatory accuracy, that are used to evaluate absolute risk models. In particular, I will address whether well calibrated models with limited discriminatory accuracy can be useful.
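
The distinction between absolute ("crude") and "pure" risk can be sketched numerically. The example below assumes constant cause-specific hazards over the interval, a simplification chosen only for illustration and not the speaker's breast cancer risk model. The function names and hazard values are hypothetical.

```python
import math


def absolute_risk(h_disease, h_competing, t):
    """Probability of developing the disease within t years when
    competing causes of death are present (constant hazards assumed)."""
    total = h_disease + h_competing
    if total == 0.0:
        return 0.0
    return (h_disease / total) * (1.0 - math.exp(-total * t))


def pure_risk(h_disease, t):
    """Probability of developing the disease within t years if
    competing causes of death were absent (constant hazard assumed)."""
    return 1.0 - math.exp(-h_disease * t)


# Hypothetical hazards per year over a 10-year interval.
crude = absolute_risk(h_disease=0.01, h_competing=0.02, t=10.0)
pure = pure_risk(h_disease=0.01, t=10.0)
```

With a positive competing hazard the absolute risk is always below the pure risk, since some people die of other causes before the disease can develop.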

Immediately following the talk there will be a formal Discussion, followed by a Reception.
Details concerning the Discussant and the location of the Event will be posted here in the near future.