
Statistics Seminar

R.L. Anderson Lecture

Statistical Thinking About Home Run Hitting


 

Jim Albert

Emeritus Distinguished University Professor

Department of Mathematics and Statistics

Bowling Green State University


 

Abstract


 

Baseball is remarkable with respect to the amount of data collected over the seasons of Major League Baseball (MLB) beginning in 1871.  These data have provided an opportunity to address many questions of interest among baseball fans and researchers.  This talk will review several statistical studies on baseball home run hitting by the speaker over the last 30 years.  By modeling career trajectories, one learns about the greatest peak abilities of home run hitters.  We know that players exhibit streaky home run performances, but is there evidence that hitters exhibit streaky ability? MLB has been concerned about the abrupt rise in home run hitting in recent seasons.  What are the possible causes of the home run explosion, and in particular, is the explosion due to the composition of the baseball?

Location: The90 rm. 203 (Teal Classroom)

Enhancing the Study of Microbiome-Metabolome Interactions: A Transfer-Learning Approach for Precise Identification of Essential Microbes

Abstract: Recent research has revealed the essential role that microbial metabolites play in host-microbiome interactions. Although statistical and machine-learning methods have been employed to explore microbiome-metabolome interactions in multiview microbiome studies, most of these approaches focus solely on predicting microbial metabolites and offer little biological interpretation. Additionally, existing methods face limitations in either prediction or inference due to small sample sizes and highly correlated microbes and metabolites. To overcome these limitations, we present a transfer-learning method that evaluates microbiome-metabolome interactions. Our approach efficiently utilizes information from comparable metabolites obtained through external databases or data-driven methods, resulting in more precise predictions of microbial metabolites and identification of the essential microbes involved in each microbial metabolite. Our numerical studies demonstrate that our method enables a deeper understanding of the mechanism of host-microbiome interactions and establishes a statistical basis for potential microbiome-based therapies for various human diseases.

 

Location: MDS 220

Cost of Sequential Adaptation

Abstract: The possibility of early stopping or interim sample size re-estimation leads to random sample sizes. If these interim adaptations are informative, the sample size becomes part of a sufficient statistic. Consequently, statistical inference based solely on the observed sample or the likelihood function does not use all available statistical evidence. In this work, we quantify the loss of statistical evidence using (expected) Fisher Information (FI), because observed Fisher information, as a function of the likelihood, does not capture this loss. We decompose the total FI into the sum of the design FI and a conditional-on-design FI. Further, the conditional-on-design FI is represented as a weighted linear combination of FI conditional on realized decisions. The decomposition of total FI supports several practically useful conclusions for designing sequential experiments. In addition, this FI decomposition is used to derive a sequential version of the Cramer-Rao Lower Bound (CRLB) for estimators' mean squared errors. For a given sequential design, when the data are generated from a one-parameter exponential family with canonical parameterization, the sequential CRLB is attained. Theoretical results are illustrated with a simple normal case of a two-stage design with a possibility of early stopping.

 

Link to speaker bio: 

https://www.mcw.edu/departments/biostatistics/people/sergey-tarima-phd&…;

 

Location: MDS 220

Yuguo Chen (University of Illinois at Urbana-Champaign)

Sampling for Conditional Inference on Network Data

Random graphs with given vertex degrees have been widely used as a model for many real-world complex networks. We describe a sequential sampling method for sampling networks with a given degree sequence. These samples can be used to approximate closely the null distributions of a number of test statistics involved in such networks, and provide an accurate estimate of the total number of networks with given vertex degrees. We apply our method to a range of examples to demonstrate its efficiency in real problems.
 
Location: CB 102