SciELO - Scientific Electronic Library Online

 

Revista de Matemática Teoría y Aplicaciones

Print version ISSN 1409-2433

Abstract

ARAYA ALPIZAR, Carlomagno. An alternative to classical latent class models selection methods for sparse binary data: an illustration with simulated data. Rev. Mat [online]. 2016, vol.23, n.1, pp.199-220. ISSN 1409-2433.

Within the context of a latent class model with manifest binary variables, we propose an alternative method that addresses the problem of estimating the empirical distribution when contingency tables are sparse and the chi-square approximation for goodness-of-fit statistics is therefore not valid. We analyze sparse binary data, in which many response patterns have very small expected frequencies, using several data sets whose degree of sparseness varies from 1 to 5. Sparseness is measured by the data density d = n/2^p = n/R, a factor mentioned in almost all prior literature as an important determinant of how well the distribution of the statistic is approximated by the chi-square distribution. The proposed approach produced results that were valid and reliable under these problematic data conditions. The results compare the Type I error rates of the proposed method with those of traditional goodness-of-fit tests. We also show that, with data density d ≤ 5, Pearson's statistic should not be used to select latent class models with the Patterns Method, because its probability of Type I error is greater than 5%. Comparing the Patterns Method and the Parametric Bootstrap at data density d = 2, we show that the Patterns Method has more accurate Type I error probabilities, since the likelihood ratio, Read-Cressie and Freeman-Tukey statistics yield values of α < 0.05. In contrast, the Parametric Bootstrap yields values for these statistics that exceed 5%.
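As a rough illustration of the quantities named in the abstract, the sketch below computes the data density and the four goodness-of-fit statistics in their standard textbook forms. It assumes that n is the sample size, p the number of binary manifest variables and R = 2^p the number of response patterns, which the abstract does not state explicitly. The function names and the observed counts are hypothetical and chosen only for illustration, and the sketch does not implement the Patterns Method or the Parametric Bootstrap.

```python
import math

def data_density(n, p):
    """Data density d = n / 2**p = n / R (assumed: p binary variables)."""
    return n / (2 ** p)

def pearson(obs, exp):
    """Pearson chi-square: sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))

def likelihood_ratio(obs, exp):
    """Likelihood ratio G^2 = 2 * sum of O * ln(O / E).
    Cells with O == 0 contribute 0, the limit of O * ln(O / E)."""
    return 2.0 * sum(o * math.log(o / e) for o, e in zip(obs, exp) if o > 0)

def freeman_tukey(obs, exp):
    """Freeman-Tukey: 4 * sum of (sqrt(O) - sqrt(E))^2."""
    return 4.0 * sum((math.sqrt(o) - math.sqrt(e)) ** 2
                     for o, e in zip(obs, exp))

def cressie_read(obs, exp, lam=2.0 / 3.0):
    """Cressie-Read power divergence with the usual lambda = 2/3.
    Assumes sum(obs) == sum(exp); cells with O == 0 contribute 0."""
    scale = 2.0 / (lam * (lam + 1.0))
    return scale * sum(o * ((o / e) ** lam - 1.0)
                       for o, e in zip(obs, exp) if o > 0)

# Hypothetical example: p = 3 binary items, so R = 8 response patterns,
# and n = 16 respondents, which gives a data density of d = 2.
n, p = 16, 3
observed = [4, 0, 3, 1, 2, 2, 3, 1]        # made-up pattern counts
expected = [n / 2 ** p] * (2 ** p)         # made-up model: uniform, E = 2

d = data_density(n, p)
stats = {
    "pearson": pearson(observed, expected),
    "likelihood_ratio": likelihood_ratio(observed, expected),
    "freeman_tukey": freeman_tukey(observed, expected),
    "cressie_read": cressie_read(observed, expected),
}
```

With expected frequencies this small, the four statistics can differ noticeably on the same table, which is one reason their reference distributions are studied separately under sparseness.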

Keywords: sparse data; latent class; goodness-of-fit; binary data.
