Francisco Kuncar. Banco Santander Applications of Statistics and Data Mining in Financial Institutions Sala 2 (Víctor Ochsenius) - 12:00 hrs. Abstract: In this talk we illustrate some applications of statistics and data mining in financial institutions. After a brief introduction and context, we address the main problems and present some algorithms and methodologies for solving them. Finally, we present some applications to practical cases.
2011-10-07
Garrit Page. Pontificia Universidad Católica de Chile Density Estimation and Classification via Bayesian Nonparametric Learning of Affine Subspaces Sala 2 (Víctor Ochsenius) - Facultad de Matemáticas - 12:00 Hrs. Abstract: It is now practically the norm for data to be very high dimensional in areas such as genetics, machine vision, image analysis and many others. When analyzing such data, parametric models are often too inflexible while nonparametric procedures tend to be non-robust because of insufficient data on these high dimensional spaces. It is often the case with high-dimensional data that most of the variability tends to be along a few directions, or more generally along a much smaller dimensional submanifold of the data space. In this article, we propose a class of models that flexibly learn about this submanifold and its dimension which simultaneously performs dimension reduction. As a result, density estimation is carried out effic
2011-09-30
Fernando Quintana. Pontificia Universidad Católica de Chile Sharing information across subpopulations using non-exchangeable priors Sala 2 (Víctor Ochsenius) - Facultad de Matemáticas - 12:00 Abstract: We present a Bayesian nonparametric model for a phase II clinical trial with patients who present different subtypes of the disease under study. The objective is to estimate the probability of success of an experimental therapy for each subtype. We consider the case in which small sample sizes require sharing information across subtypes, but the subtypes are not exchangeable a priori. The lack of a priori exchangeability prevents the direct use of traditional hierarchical models for this purpose. Our approach is based on a random partition model in which subtypes within the same group share a common success probability. Unlike the usual hierarchical models, the model p
2011-09-23
Mauricio Tejo. Pontificia Universidad Católica de Chile Hodgkin-Huxley Equations Arising From Voltage-Gated Processes Sala 2 (Víctor Ochsenius) - Facultad de Matemáticas - 12:00 Hrs.
2011-09-16
Karine Bertin. Universidad de Valparaíso, CIMFAV Nonparametric Estimation Using Beta Kernel Density Estimators Sala 2 (Víctor Ochsenius) - Facultad de Matemáticas
2011-09-02
Luis E. Nieto Barajas. Departamento de Estadística, ITAM Exchangeable claims in a compound Poisson-type process Sala 2 (Víctor Ochsenius) - Facultad de Matemáticas PUC - 12:00 Hrs. Abstract: In risk models, the assumption of independent claims is not always satisfied. In this talk we consider the case in which the claims are exchangeable and study the implications when these claims are aggregated through a compound Poisson-type process. In particular, exchangeability is achieved through conditional independence, using parametric and nonparametric measures for the conditioning variable/distribution. Bayes' theorem is used to guarantee an arbitrary, but fixed, distribution for the claims. Finally, we carry out a Bayesian analysis of the model and illustrate it with a dataset from the 2005 medical expenditure panel survey in the U.S.
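As a rough illustration of exchangeability obtained through conditional independence, the sketch below simulates aggregate claims from a compound Poisson-type sum. The Gamma mixing distribution, the exponential claim sizes, and the function name `simulate_aggregate_claims` are assumptions made for this example only; the abstract does not specify them, and the sketch does not reproduce the step that fixes the marginal claim distribution.

```python
import numpy as np

def simulate_aggregate_claims(n_paths, rate, shape, scale, rng):
    """Simulate S = X_1 + ... + X_N with N ~ Poisson(rate) and exchangeable claims.

    Exchangeability comes from conditional independence: given a latent
    theta ~ Gamma(shape, scale), the claims are i.i.d. Exponential with rate theta.
    (Gamma/Exponential are illustrative assumptions, not the talk's model.)
    """
    totals = np.empty(n_paths)
    for k in range(n_paths):
        theta = rng.gamma(shape, scale)      # latent conditioning variable
        n_claims = rng.poisson(rate)         # number of claims in the period
        # Claims are i.i.d. given theta, hence exchangeable (and dependent) marginally.
        claims = rng.exponential(1.0 / theta, size=n_claims)
        totals[k] = claims.sum()             # an empty sum gives 0.0
    return totals

rng = np.random.default_rng(0)
totals = simulate_aggregate_claims(1000, rate=5.0, shape=3.0, scale=1.0, rng=rng)
```

Because all claims in a period share the same `theta`, they are positively dependent, which is the departure from the usual independence assumption that the abstract describes.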
2011-08-25
Luis M. Castro Cepero. Universidad de Concepción Change point detection in the skew-normal model parameters Sala de Seminarios (Octavo piso) MIDE UC - 13:00 Hrs. Abstract: Bayesian inference under the skew-normal family of distributions is discussed using an arbitrary proper prior for the skewness parameter. In particular, we review some results when a skew-normal prior distribution is considered. Considering this particular prior, we provide a stochastic representation of the posterior of the skewness parameter. Moreover, we obtain analytical expressions for the posterior mean and variance of the skewness parameter. The ultimate goal is to apply these results to the identification of a single change point in the parameters of the location-scale skew-normal model. Some Latin American emerging market datasets are used to illustrate the methodology developed in this work.
2011-08-18
Ricardo Olea O. Pontificia Universidad Católica de Chile An Efficient Estimator for Locally Stationary Gaussian Long-Memory Processes Auditorio College UC - Pontificia Universidad Católica de Chile - 13:00 to 14:00 Hrs.
2011-08-17
Alejandro Rodríguez. Departamento de Estadística - Universidad de Concepción Parameter Estimation Uncertainty in Unobserved Component Models Auditorio College - UC 13:10 Hrs.
2011-08-11
Terrance Savitsky. RAND Corporation, USA Variable Selection for Nonparametric Gaussian Process Priors: Models and Computational Strategies Auditorio College Abstract: This paper presents a unified treatment of Gaussian process models that extends to data from the exponential dispersion family and to survival data. Our specific interest is in the analysis of data sets with predictors that have an a priori unknown form of possibly nonlinear associations to the response. The modeling approach we describe incorporates Gaussian processes in a generalized linear model framework to obtain a class of nonparametric regression models where the covariance matrix depends on the predictors. We consider, in particular, continuous, categorical and count responses. We also look into models that account for survival outcomes. We explore alternative covariance formulations for the Gaussian process pr
2011-06-23
Guido del Pino M.. Pontificia Universidad Católica de Chile Second order partial exchangeability within a factorial structure and variances of linear combinations Sala 2 (Víctor Ochsenius) - Facultad de Matemáticas - 12:00 Hrs. Abstract:
Exchangeability conditions on models with a factorial structure arise in many statistical areas: balanced experimental designs, random effects models, sampling from finite populations, and Bayesian analysis. In this talk we deal with second order notions which we call second order partial exchangeability. If the combinations of factor levels are arranged lexicographically, this structure translates into invariance properties on the covariance matrix of the observations. An important property is that the covariance between any two observations depends just on which factor levels they share. The main result is a procedure to compute the variance of any linear combination of the observations. This is of theoretical interest if the entries of the covariance matrix are analytical expressions. It also provides a computationally efficient algorithm which is particularly relevant for large covariance matrices.
The covariance structure of the original model is associated with a symbolic model formula, which is mathematically represented by a lattice. The lattice determines both a random effects and a fixed effects model. The variance of a linear combination of the observations is then a linear combination of the expected mean squares (EMS) in an ANOVA table corresponding to a random effects model. Furthermore, the coefficient of each EMS coincides with a sum of squares obtained by using the coefficients of the linear combination as artificial data.
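The EMS statement can be checked numerically in the simplest special case, a balanced one-way random effects layout. The sketch below is an independent illustration under that assumption, not the talk's general procedure; the function names and the variance components `sigma2_a` and `sigma2_e` are hypothetical. The between-groups sum of squares is left uncorrected for the grand mean, so it also absorbs the grand-mean term.

```python
import numpy as np

def variance_direct(c, sigma2_a, sigma2_e):
    """Var(c'y) computed as c' Sigma c for a balanced one-way layout.

    c has shape (I, J): I groups, J observations per group.
    """
    n_groups, n_rep = c.shape
    # Two distinct observations have covariance sigma2_a when they share the group level.
    sigma = sigma2_a * np.kron(np.eye(n_groups), np.ones((n_rep, n_rep)))
    sigma += sigma2_e * np.eye(n_groups * n_rep)
    flat = c.ravel()
    return flat @ sigma @ flat

def variance_via_ems(c, sigma2_a, sigma2_e):
    """Same variance written as a linear combination of expected mean squares."""
    n_groups, n_rep = c.shape
    group_means = c.mean(axis=1)
    # Sums of squares of the coefficients, treated as artificial data.
    ss_between = n_rep * np.sum(group_means ** 2)   # uncorrected: includes grand-mean term
    ss_within = np.sum((c - group_means[:, None]) ** 2)
    ems_between = sigma2_e + n_rep * sigma2_a
    ems_within = sigma2_e
    return ems_between * ss_between + ems_within * ss_within

rng = np.random.default_rng(1)
coeffs = rng.normal(size=(3, 4))
v_direct = variance_direct(coeffs, sigma2_a=2.0, sigma2_e=0.5)
v_ems = variance_via_ems(coeffs, sigma2_a=2.0, sigma2_e=0.5)
```

The EMS route never forms the covariance matrix, which is where the computational saving for large matrices would come from.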
2011-06-16
José Romeo. Universidad de Santiago de Chile Temporal modeling of dependence in bivariate lifetimes using copulas Sala 2 (Víctor Ochsenius) - Facultad de Matemáticas Abstract: The main objective of this work is to estimate the temporal dependence structure between bivariate survival times from a Bayesian point of view. In particular, we consider Archimedean copula models as the dependence structure, allowing the association parameter between the lifetimes to vary over time. We adopt a two-stage estimation procedure. In the first stage, from a Bayesian approach, we estimate the marginal survival functions assuming independence between the bivariate lifetimes and a piecewise exponential distribution. Then, the dependence parameter is estimated using a temporal factorization of the likelihood function and a distrib
2011-06-02 9:00hrs.
Ernesto San Martín. Facultad de Matemáticas & Centro de Medición MIDE UC, Pontificia Universidad Católica de Chile What about the identification of parametric and semi-parametric generalized non-linear mixed models? The case of a simple Sala 2 (Víctor Ochsenius) - Facultad de Matemáticas - 12:00 Hrs. Abstract:
IRT models are widely used to specify the probability that a person correctly answers an item. This type of model includes two types of effects: a fixed effect which characterizes item properties, and a random effect which characterizes person features. In the statistical parlance, these models belong to the family of generalized (non-)linear mixed models. In 1968, A. Birnbaum introduced a specific IRT model which considers the possibility of answering an item correctly by guessing. In the psychometric parlance, such a model is called 3PL, whereas in the statistical parlance the 3PL is an example of a generalized non-linear mixed model. In spite of its wide use, its identifiability is still an open problem. Similarly, in the statistical literature, the identifiability of generalized non-linear mixed models is systematically not considered. In this talk, we want to discuss the identifiability of the 3PL in three different contexts:
1. When both item characteristics and person characteristics are considered as fixed effects. In this set-up, we show some identification results, and also we formulate the general problem which is necessary to solve in order to obtain the eventual identifiability of the 3PL.
2. When the person characteristics are considered as random effects distributed according to a probability distribution known up to a scale parameter. In this set-up, we obtain the identifiability under a restriction which, for applications, implies practical limitations.
3. When the probability distribution generating the person characteristics is considered as a parameter. In this set-up, we show that such a semi-parametric model does not have empirical meaning in the sense that such a distribution is identified under a non-realistic scenario.
Joint work with Jean-Marie Rolin (ISBA, UCL, Belgium)
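For readers unfamiliar with the 3PL, the following minimal sketch evaluates its item response function in the common logistic parametrization. The abstract does not state a parametrization, so the parameter names `a` (discrimination), `b` (difficulty) and `c` (guessing), and the function name itself, are assumptions for illustration.

```python
import math

def prob_correct_3pl(theta, a, b, c):
    """Probability of a correct answer under a 3PL item response function.

    theta: person ability; a: discrimination; b: difficulty; c: guessing level.
    P = c + (1 - c) * logistic(a * (theta - b))
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# A person whose ability equals the item difficulty, with guessing level 0.2.
p_mid = prob_correct_3pl(theta=0.0, a=1.0, b=0.0, c=0.2)
```

The guessing parameter sets a floor on the success probability for low-ability persons, which is the feature that distinguishes the 3PL from simpler logistic IRT models.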
2011-05-19
Mohsen Pourahmadi. Department of Statistics, Texas A&M University Banded Covariance Estimators for High-Dimensional Data Sala 2 (Víctor Ochsenius) - Facultad de Matemáticas UC - 12:00 Hrs. Abstract: We present an overview of the history and some successes of banding as a way of estimating covariance matrices for high-dimensional multivariate as well as time series data. Under a short-range dependence condition for a wide class of nonlinear stationary processes, we show that the banded covariance matrix estimates converge in operator norm to the true covariance matrix with explicit rates of convergence. Connections with the covariance estimation of high-dimensional data, spectral density estimation and order selection for AR(MA) models will be discussed. A sub-sampling approach is used to choose the optimal banding parameter, and simulation results reveal its satisfactory performance for linear and certain nonlinear processes. The procedure
2011-05-06
Peter Mueller. U. Texas, Austin Dirichlet process mixture models for count data from phage display experiments Sala 2 (Víctor Ochsenius) - Facultad de Matemáticas 12:00 Hrs. Abstract: We discuss inference for a phage display experiment with three stages. The data are tri-peptide counts by organ and stage. The primary aim of the experiment is to identify ligands that bind with high affinity to a given organ. We formalize the research question as inference about the monotonicity of mean counts over stages. The inference goal is then to identify a list of peptide and organ pairs with significant increase over stages. We develop a semi-parametric model as a mixture of Poisson distributions with a Dirichlet process prior on the mixing measure. The posterior distribution under this model allows the desired inference about the monotonicity of mean counts. However, the desired inference summary as a list of peptide and organ pairs with significant
2011-04-29
Luis Mauricio Castro. Universidad de Concepción On Modeling Heteroskedasticity and Non-Linearity: The Skew-Normal Regression Models Sala 2 (Víctor Ochsenius) - Facultad de Matemáticas 12:00 Hrs. Abstract:
Based on the results pointed out by Spanos in 1994, this paper proposes a probabilistic reduction approach to model heteroskedasticity and non-linearity in a family of skew-normal distributions. The starting point is to consider a marginal-conditional decomposition of the joint density function, studying the relations between the parametric spaces associated to the resulting marginal and conditional models. Necessary and sufficient conditions on the parameters of the joint distribution are analyzed in order to generate a cut, and therefore, an exogenous variable.
2011-04-15
Manuel Galea. Pontificia Universidad Católica de Chile Hypotheses testing for comparing means and variances of correlated responses in the symmetric non-normal case Sala 2 (Víctor Ochsenius) - 12:00 to 13:00 Hrs. Abstract: We consider hypothesis testing for the equality of means and variances of correlated responses with non-normal distributions. Specifically, we assume that the responses follow a symmetric multivariate distribution. Wald type statistics are considered which are asymptotically distributed according to a chi-square distribution. Statistics are based on the sample mean and the sample covariance matrix. Applications are made for comparing measurement methods and the performance of investment portfolios.
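A generic Wald-type statistic for equality of means of correlated responses can be sketched as follows. This is a textbook construction built from the sample mean and sample covariance matrix, offered only as an illustration; the abstract does not give the exact statistics used in the talk, and the function name `wald_equal_means` is hypothetical. A test for equality of variances would additionally need the asymptotic covariance of the sample covariance matrix, which is not sketched here.

```python
import numpy as np

def wald_equal_means(x):
    """Wald-type statistic for H0: all p response means are equal.

    x has shape (n, p): n subjects, p correlated responses per subject.
    Returns the statistic and its asymptotic chi-square degrees of freedom.
    """
    n, p = x.shape
    mean = x.mean(axis=0)
    cov = np.cov(x, rowvar=False)            # sample covariance matrix, p x p
    # Contrast matrix: each row compares response j+1 with response 1.
    contrasts = np.hstack([-np.ones((p - 1, 1)), np.eye(p - 1)])
    diff = contrasts @ mean
    stat = n * diff @ np.linalg.solve(contrasts @ cov @ contrasts.T, diff)
    return stat, p - 1

rng = np.random.default_rng(2)
data = rng.normal(size=(200, 3))
stat, dof = wald_equal_means(data)
```

Under the null hypothesis the statistic would be compared with a chi-square quantile on `dof` degrees of freedom.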
2011-03-30
Liliana López Kleine. Universidad Nacional de Colombia Integration of genomic data for the functional prediction of proteins Sala 2 (Víctor Ochsenius) - 16:00 Hrs. Facultad de Matemáticas - UC Abstract: Despite the existence of a large amount of genomic data, the precise role of many proteins has not been elucidated. To advance knowledge of protein function by making use of the genomic data available in databases, Systems Biology has emerged, which seeks to extract knowledge from the global analysis of different types of data using suitable statistical tools.
The stages of a Systems Biology study will be illustrated with two specific examples of determining the role of proteins of unknown function using a kernel-based canonical correlation analysis.
2011-03-25
Wolf-Dieter Richter. University of Rostock, Germany On continuous $l_{n,p}$-symmetric distributions Sala 2 (Víctor Ochsenius) - Facultad de Matemáticas UC - 12:00 to 13:00 Hrs. Abstract: Spherically symmetric distributed random vectors $X$ allow a stochastic representation $X \overset{d}{=} RU$, where $R$ is a positive random radius variable, $U$ is distributed according to the uniform probability distribution on the Euclidean unit sphere $S_{n,2}$, and $R$ and $U$ are independent. A closely related geometric measure representation makes it possible to measure a subset $A$ of $\mathbb{R}^n$ iteratively by first determining the Euclidean surface content of the intersections $S_{n,2} \cap [\frac{1}{r} A]$, $r > 0$, and then integrating w.r.t. the distribution of $R$. This generalization of Cavalieri's and Torricelli's method of indivisibles has found several applications in probability theory and mathematical statistics. The present talk deals with the more general case that the density of $X$ depends on $x = (x_1, \ldots, x_n) \in \mathbb{R}^n$ just through the function $|x|_p^p = \sum_{i=1}^n |x_i|^p$, $p > 0$. The stochastic and geometric representations are then based upon the generalized radius variable $R = (\sum_{i=1}^n |X_i|^p)^{1/p}$ and the notion of a $p$-generalized uniform distribution on the $l_{n,p}$-unit sphere $S_{n,p}$. The latter notion refers to a geometrically defined probability measure w.r.t. the surface content in a suitably chosen non-Euclidean geometry. Presented probabilistic applications concern the determination of exact distributions of functions (like the product, the ratio or a linear combination of the components) of the vector $X$ if $n = 2$. First statistical applications deal with generalizations of the classical $\chi^2$-, $t$- and $F$-distributions for arbitrary $n$ and $p$.
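The decomposition of a vector into a generalized radius and a direction on the p-unit sphere can be sketched in a few lines. This is only the deterministic decomposition, R = (sum of |x_i|^p)^(1/p) and U = x / R; it makes no claim about the distribution of U, and the function name `generalized_polar` is hypothetical.

```python
import numpy as np

def generalized_polar(x, p):
    """Split each row of x into a generalized radius and a direction.

    r = (sum_i |x_i|^p)^(1/p); u = x / r lies on the p-unit sphere,
    i.e. sum_i |u_i|^p = 1. Rows of x must be nonzero and p > 0.
    """
    r = np.sum(np.abs(x) ** p, axis=-1) ** (1.0 / p)
    u = x / r[..., None]
    return r, u

rng = np.random.default_rng(3)
points = rng.normal(size=(5, 3))
radius, direction = generalized_polar(points, p=1.5)
```

With `p = 2` this reduces to the usual Euclidean norm and unit vector of the spherically symmetric case.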
2011-01-06 9:00hrs.
Timothy Hanson. University of South Carolina, USA Bayesian survival analysis: An overview of models and methods Sala 1 de la Facultad de Matemáticas Abstract:
In these talks I will provide an overview of approaches for analyzing time-to-event data using semiparametric and nonparametric models. Popular models including proportional hazards, accelerated failure time, proportional odds, additive hazards, and proportional mean residual life will be discussed, along with common parametric and nonparametric priors on the baseline distribution. Nonparametric priors include the gamma process, beta process, Dirichlet process mixtures, Polya trees, penalized splines and extensions of these. Then I will discuss various model generalizations including time dependent covariates, joint longitudinal and survival modeling, various frailty structures, cure models, and completely nonparametric dependent process approaches. Various models will be fit and illustrated in several software packages and languages including WinBUGS, BayesX, DPpackage, and FORTRAN.