Statistics Seminar

   
2017-03-06
12:00hrs.
Garritt Page. Brigham Young University
Estimation And Prediction In The Presence Of Spatial Confounding For Spatial Linear Models
Sala 2, Facultad de Matemáticas
Abstract:

In studies that produce data with spatial structure, it is common for covariates of interest to vary spatially in addition to the error. Because of this, the error and covariate are often correlated. When this occurs, it is difficult to distinguish the covariate effect from residual spatial variation. In an iid normal error setting, it is well known that this type of correlation produces biased coefficient estimates but predictions remain unbiased. In a spatial setting, recent studies have shown that coefficient estimates remain biased, but spatial prediction has not been addressed. The purpose of this paper is to provide a more detailed study of coefficient estimation from spatial models when covariate and error are correlated and then begin a formal study of spatial prediction. This is carried out by investigating properties of the generalized least squares estimator and the best linear unbiased predictor when a spatial random effect and a covariate are jointly modeled. Under this setup we demonstrate that the mean squared prediction error may be reduced when covariate and error are correlated.

2017-01-11
12:00hrs.
Marc G. Genton. King Abdullah University Of Science And Technology (KAUST), Saudi Arabia
Computational Challenges With Big Environmental Data
Sala 2, Facultad de Matemáticas
Abstract:

Two types of computational challenges arising from big environmental data are discussed. The first type occurs with multivariate or spatial extremes. Indeed, inference for max-stable processes observed at a large collection of locations is among the most challenging problems in computational statistics, and current approaches typically rely on less expensive composite likelihoods constructed from small subsets of data. We explore the limits of modern state-of-the-art computational facilities to perform full likelihood inference and to efficiently evaluate high-order composite likelihoods. With extensive simulations, we assess the loss of information of composite likelihood estimators with respect to a full likelihood approach for some widely-used multivariate or spatial extreme models. The second type of challenge occurs with the emulation of climate model outputs. We consider fitting a statistical model to over 1 billion global 3D spatio-temporal temperature data points using a distributed computing approach. The statistical model exploits the gridded geometry of the data and parallelization across processors. It is therefore computationally convenient and allows us to fit a non-trivial model to a data set with a covariance matrix comprising 10^{18} entries. We provide 3D visualization of the results. The talk is based on joint work with Stefano Castruccio and Raphael Huser.

2017-01-11
11:00hrs.
Ying Sun. King Abdullah University Of Science And Technology (KAUST), Saudi Arabia
Total Variation Depth For Functional Data
Sala 2, Facultad de Matemáticas
Abstract:

There has been extensive work on data depth-based methods for robust multivariate data analysis. Recent developments have moved to infinite-dimensional objects such as functional data. In this work, we propose a new notion of depth, the total variation depth, for functional data. As a measure of depth, its properties are studied theoretically, and the associated outlier detection performance is investigated through simulations. Compared to magnitude outliers, shape outliers are often masked among the rest of the samples and harder to identify. We show that the proposed total variation depth has many desirable features and is well suited for outlier detection. In particular, we propose to decompose the total variation depth into two components that are associated with shape and magnitude outlyingness, respectively. This decomposition allows us to develop an effective procedure for outlier detection and useful visualization tools, while naturally accounting for the correlation in functional data. Finally, the proposed methodology is demonstrated using real datasets of curves, images, and video frames. The talk is based on joint work with Huang Huang.

2016-12-16
12:00hrs.
Fernanda de Bastiani. Pontificia Universidad Católica de Chile
Flexible Regression And Smoothing: Gaussian Markov Random Field Models In GAMLSS
Sala 2, Facultad de Matemáticas
Abstract:

This work gives a brief history of GAMLSS and describes the modelling and fitting of Gaussian Markov random field components within a GAMLSS model. This allows modelling of any or all the parameters of the distribution for the response variable using explanatory variables and spatial effects. The response variable distribution is allowed to be a non-exponential family distribution. A new package developed in R to achieve this is presented. We use Gaussian Markov random fields to model the spatial effect in Munich rent data and explore some features and characteristics of the data. The potential of using spatial analysis within GAMLSS is discussed. We argue that the flexibility of parametric distributions, the ability to model all the parameters of the distribution, and the diagnostic tools of GAMLSS provide an ideal environment for modelling spatial features of data.

2016-12-02
12:00hrs.
Gabriel Martos. Pontificia Universidad Católica de Valparaíso
Discrimination Surfaces For Region-Specific Brain Asymmetry Analysis
Sala 2, Facultad de Matemáticas
Abstract:
Discrimination surfaces are introduced as a diagnostic for localizing brain regions where discrimination between diseased and non-diseased subjects is higher. An applied goal of interest is to conduct a brain asymmetry analysis so as to localize brain regions where schizophrenia patients differ most from healthy controls. Joint work with Miguel de Carvalho.
2016-11-25
12:00hrs.
Giovanni Motta. Pontificia Universidad Católica de Chile
Spatial Identification Of Epilepsy Regions
Sala 2, Facultad de Matemáticas
Abstract:

The surgical outcomes of patients suffering from neocortical epilepsy are not always successful. The main difficulty in the treatment of neocortical epilepsy is that current technology has limited accuracy in mapping neocortical epileptogenic tissue (see Haglund and Hochman 2004). It is known that the optical spectroscopic properties of brain tissue are correlated with changes in neuronal activity. The method of mapping these activity-evoked optical changes is known as imaging of intrinsic optical signals (ImIOS). Activity-evoked optical changes measured in neocortex are generated by changes in cerebral hemodynamics (i.e., changes in blood oxygenation and blood volume).

ImIOS has the potential to be useful for both clinical and experimental investigations of the human neocortex. However, its usefulness for human studies is currently limited because intra-operatively acquired ImIOS data is noisy. To improve the reliability and usefulness of ImIOS for human studies, it is desirable to find appropriate statistical methods for removing noise artifacts and analyzing the data (see Lavine et al. 2011).

In this paper we introduce a novel flexible tool, based on a spatial statistical representation of ImIOS, that allows for source localization of the epilepsy regions. In particular, our model incorporates spatial correlation between the location of the epileptic region(s) and the neighboring regions, non-stationarity of the observed time series, and heartbeat/respiration cyclical components. The final goal is clustering (dimension reduction) of the pixels into regions, in order to localize the epilepsy regions for the craniectomy.

The advantage of our approach compared with previous approaches is twofold. Firstly, we use a non-parametric specification, rather than the (more restrictive) parametric or polynomial-based specification. Secondly, we provide a statistical method, based on the spatial information, that is able to identify the clusters in a data-driven way, rather than relying on the (sometimes arbitrary) ad hoc approaches currently in use.

To demonstrate how our method might be used for intra-operative neurosurgical mapping, we provide an application of the technique to optical data acquired from a single human subject during direct electrical stimulation of the cortex.

2016-11-18
12:00hrs.
Karine Bertin. CIMFAV - Universidad de Valparaíso
Adaptive Density Estimation On Bounded Domains
Sala 2, Facultad de Matemáticas
Abstract:
We study the estimation, in $L_p$-norm, of a density function defined on $[0, 1]^d$. We construct a new family of kernel density estimators that do not suffer from the so-called boundary bias problem, and we propose a data-driven procedure based on the Goldenshluger and Lepski approach that jointly selects a kernel and a bandwidth. We derive two estimators that satisfy oracle-type inequalities and that are also proved to be adaptive over a scale of anisotropic or isotropic Sobolev-Slobodetskii classes. The main interest of the isotropic procedure is to obtain adaptive results without any restriction on the smoothness parameter.
2016-11-11
12:00hrs.
José Quinlan. Pontificia Universidad Católica de Chile
Parsimonious Hierarchical Modeling Using Repulsive Distributions
Sala 2, Facultad de Matemáticas
Abstract:

Employing nonparametric methods for density estimation has become routine in Bayesian statistical practice. In this regard, models based on discrete nonparametric priors such as Dirichlet Process Mixture (DPM) models are very attractive choices due to their flexibility and tractability. However, a common problem in fitting DPMs or other discrete models to data is that they tend to produce a large number of (sometimes) redundant clusters. In this work we propose a method that produces parsimonious mixture models (i.e. mixtures that avoid creating redundant clusters), without sacrificing flexibility or model fit. This method is based on the idea of repulsion, that is, that any two mixture components are encouraged to be well separated. We propose a family of d-dimensional probability densities whose coordinates tend to repel each other in a smooth way. The induced probability measure has a close relation with Gibbs measures, graph theory and point processes. We investigate its global properties and explore its use in the context of mixture models for density estimation. Computational techniques are detailed and we illustrate its utility with some well-known data sets.

2016-11-04
12:00hrs.
Rodrigo Rubio. Pontificia Universidad Católica de Chile
Similarity-Based Clustering For Stock Market Extremes
Sala 2, Facultad de Matemáticas
Abstract:
The analysis of the magnitude and dynamics of extreme losses in a stock market is essential from an investor's viewpoint. An important question of applied interest is: “How can we group into different categories stocks that are more similar from the viewpoint of those features?” In this talk we discuss methods of similarity-based clustering for statistics of heteroscedastic extremes, which allow us to group stocks that are more similar from the viewpoint of the scedasis and/or tail index.
2016-10-21
12:00hrs.
Ramsés Mena. Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, México
Some Markov Processes Derived From Exchangeability
Sala 2, Facultad de Matemáticas
Abstract:
Using the symmetry inherent in the concept of exchangeability together with the learning mechanism of the Bayesian method, an attractive construction of Markov processes emerges. This approach is considerably general as far as distributional and dependence assumptions are concerned. We will discuss several particular cases, in both discrete and continuous time. Time permitting, we will present some generalizations to processes taking values in spaces of probability measures, along with some of their applications.
2016-10-19
12:00hrs.
Mingan Yang. San Diego State University
Bayesian Semiparametric Latent Variable Model: An Application On Fibroid Tumor Study
Sala 5, Facultad de Matemáticas
Abstract:
In parametric hierarchical models, it is standard practice to place mean and variance constraints on the latent variable distributions for the sake of identifiability and interpretability. Because incorporation of such constraints is challenging in semiparametric models that allow latent variable distributions to be unknown, previous methods either constrain the median or avoid constraints. In this article, we propose a centered stick-breaking process (CSBP), which induces mean and variance constraints on an unknown distribution in a hierarchical model. This is accomplished by viewing an unconstrained stick-breaking process as a parameter-expanded version of a CSBP. An efficient blocked Gibbs sampler is developed for approximate posterior computation. The methods are illustrated through a simulated example and an epidemiologic application.
2016-10-14
12:00hrs.
Carlos Díaz Ávalos. Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas - UNAM, México
Spatial Statistics: Answering Relevant Questions In Environmental And Health Sciences
Sala 2, Facultad de Matemáticas
Abstract:
Spatial statistics is a branch of statistics that gained momentum in the late 1980s, when worldwide concern about environmental problems arose. This led to the development of methods applicable to environmental sciences that consider random fields of continuous, discrete, and point type. Today most countries face serious environmental problems, notably forest fires, pollution, and epidemiological problems.

This talk presents an overview of the spatial statistics methods suited to analysis under the three types of support and shows three applications to real data, along with a brief review of research topics that remain open.
2016-10-07
12:00hrs.
Giovanni Motta. Pontificia Universidad Católica de Chile
Local Polynomials For Time-Varying Correlations: Adaptivity Versus Positivity
Sala 2, Facultad de Matemáticas
Abstract:
In this paper we propose a new nonparametric method to estimate the time-varying correlation between two non-stationary time series. Linear smoothers of the cross-products are based on the same bandwidth for both numerator (covariance) and denominator (variances). This approach guarantees two important properties: the estimated correlation is bounded between minus one and one, and the resulting correlation matrix is positive semi-definite. However, the use of one common bandwidth for both numerator and denominator appears to be restrictive, as the covariance and the variances are in general characterized by different degrees of smoothness. On the other hand, a kernel-type estimator based on different smoothing parameters for numerator and denominator has two drawbacks. First, the ratio between time-varying numerators and denominators is not necessarily bounded between minus one and one; as a consequence, the resulting correlation matrix is not necessarily positive semi-definite. Second, the estimated bandwidths that are optimal for estimating the covariance and the variances are not necessarily optimal for estimating the ratio. The estimator we propose in this paper is based on local smoothing of the sign of the cross-products, which does not require distinguishing between numerator and denominator. Our novel method can be used to estimate the time-varying AR coefficients and time-varying spectra of locally stationary time series.
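
The boundedness property mentioned in the abstract can be illustrated with a minimal sketch. This is not the estimator proposed in the talk (which smooths the sign of the cross-products); it only shows the classical ratio of kernel-smoothed cross-products to kernel-smoothed squares. The Gaussian kernel, the zero-mean toy series, and all function names are assumptions made for illustration. With a common bandwidth the ratio stays in [-1, 1] by the Cauchy-Schwarz inequality; with different bandwidths for numerator and denominator it can leave that interval.

```python
import math


def kernel_weights(n, t, h):
    """Normalized Gaussian kernel weights centred at index t (bandwidth h)."""
    raw = [math.exp(-0.5 * ((s - t) / h) ** 2) for s in range(n)]
    total = sum(raw)
    return [w / total for w in raw]


def smooth(values, t, h):
    """Kernel-weighted local average of `values` at index t."""
    w = kernel_weights(len(values), t, h)
    return sum(wi * vi for wi, vi in zip(w, values))


def local_correlation(x, y, h_cov, h_var):
    """Smoothed cross-products divided by the root of smoothed squares.

    Assumes both series are already centred (zero mean).
    """
    xy = [a * b for a, b in zip(x, y)]
    xx = [a * a for a in x]
    yy = [b * b for b in y]
    out = []
    for t in range(len(x)):
        num = smooth(xy, t, h_cov)
        den = math.sqrt(smooth(xx, t, h_var) * smooth(yy, t, h_var))
        out.append(num / den)
    return out


# Deterministic toy data (hypothetical): a sine wave with one large spike,
# and a second series that tracks the first with a small perturbation.
n = 200
x = [math.sin(t) + (3.0 if t == 100 else 0.0) for t in range(n)]
y = [a + 0.1 * math.cos(2 * t) for t, a in enumerate(x)]

# Same bandwidth for covariance and variances: bounded by Cauchy-Schwarz.
rho_common = local_correlation(x, y, h_cov=10.0, h_var=10.0)

# Different bandwidths: the ratio is no longer guaranteed to be a correlation.
rho_mixed = local_correlation(x, y, h_cov=1.0, h_var=30.0)

print("common bandwidth, max |rho|:", max(abs(r) for r in rho_common))
print("mixed bandwidths, max rho:  ", max(rho_mixed))
```

In this toy example the narrow numerator bandwidth concentrates on the spike while the wide denominator bandwidth averages it away, which is what pushes the ratio above one.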
2016-07-07
Pedro Jodrá. Universidad de Zaragoza
On The Log-Extended Exponential-Geometric Distribution And Applications In Business Research
Sala 2 (Víctor Ochsenius), Facultad de Matemáticas, at 12:00hrs.
2016-07-07
María Dolores Jiménez-Gamero. Universidad de Sevilla
Penalized Estimation Of The Finite Population Distribution Function Using Auxiliary Information
Sala 2 (Víctor Ochsenius), Facultad de Matemáticas, at 12:00hrs.
2016-03-28
Eduardo Engel. Facultad de Economía de la Universidad de Chile
Missing Aggregate Dynamics: On The Slow Convergence Of Lumpy Adjustment Models
Sala 2, Facultad de Matemáticas PUC, at 12:00hrs.
2016-03-18
Paula Brito. Universidad de Porto
Taking Variability In Data Into Account: Symbolic Data Analysis, The Special Case Of Interval Data
Sala 2, Facultad de Matemáticas, at 12:00hrs.
2016-01-22
Hamdi Raissi
Statistical Analysis Of The Non-Constant Covariance Structure Of Time Series
Sala 2, Facultad de Matemáticas, at 12:00hrs.
2016-01-15
Mattieu Saumard
Two Applications Of Functional Data
Sala 2, Facultad de Matemáticas, at 12:00hrs.