Yesterday, I read ‘Measurement error and the replication crisis’ by Eric Loken and Andrew Gelman, which left me puzzled. The first part of the paper consists of general statements about measurement error. The second part claims that, in the presence of measurement error, we overestimate the true effect when the sample size is small. This sounded wrong enough that I asked the authors for their simulation code and spent a couple of hours figuring out what they did in their paper. I am offering a short and a long version.
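To make the claim concrete, here is a minimal sketch of the kind of simulation one could use to probe it. It is not the authors' code. The true slope, the noise level, the sample size and the function name `simulate` are all assumptions chosen for illustration, and significance is approximated with a normal cutoff of 1.96 rather than the exact t quantile. The sketch adds classical measurement error to the predictor, fits a simple regression many times, and compares the average estimate over all runs with the average over only the runs that reach significance.

```python
import numpy as np

def simulate(n, n_sims=2000, true_slope=0.15, error_sd=0.5, seed=0):
    """Repeatedly fit y ~ x_obs, where x_obs is x measured with error.

    All parameter values are illustrative assumptions, not taken from the paper.
    Returns the slope estimates and a boolean mask of 'significant' runs.
    """
    rng = np.random.default_rng(seed)
    estimates = np.empty(n_sims)
    significant = np.empty(n_sims, dtype=bool)
    for i in range(n_sims):
        x = rng.normal(size=n)                           # true predictor
        y = true_slope * x + rng.normal(size=n)          # outcome
        x_obs = x + rng.normal(scale=error_sd, size=n)   # noisy measurement of x

        # Ordinary least squares slope and its standard error
        xc = x_obs - x_obs.mean()
        yc = y - y.mean()
        sxx = xc @ xc
        slope = (xc @ yc) / sxx
        resid = yc - slope * xc
        se = np.sqrt((resid @ resid) / (n - 2) / sxx)

        estimates[i] = slope
        # Normal approximation to the two-sided 5% test (an assumption)
        significant[i] = abs(slope / se) > 1.96
    return estimates, significant

estimates, significant = simulate(n=50)
print("true slope:                      0.15")
print("mean estimate, all runs:        ", round(estimates.mean(), 3))
print("mean |estimate|, significant:   ", round(np.abs(estimates[significant]).mean(), 3))
print("share of significant runs:      ", round(significant.mean(), 3))
```

In this setup the average over all runs falls below the true slope, since measurement error in the predictor attenuates it, while the average over the significant runs alone lies above the true slope. So whether we "overestimate" depends on whether we condition on significance.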
Network models have become a popular way to abstract complex systems and gain insights into relational patterns among observed variables in almost any area of science. The majority of these applications focus on analyzing the structure of the network. However, if the network is not directly observed (Alice and Bob are friends) but estimated from data (there is a relation between smoking and cancer), we can analyze - in addition to the network structure - the predictability of the nodes in the network. That is, we would like to know: how well can an arbitrarily picked node in the network be predicted by all remaining nodes in the network?
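As a rough illustration of the idea, the sketch below computes one possible predictability measure for continuous variables: the in-sample proportion of explained variance (R²) when each node is regressed on all other nodes. This is an assumption made for illustration and not necessarily the measure or method used in the post. The function name `node_predictability` and the toy data are hypothetical.

```python
import numpy as np

def node_predictability(data):
    """Return the in-sample R^2 of each column regressed on all other columns.

    `data` is an (n_samples, n_nodes) array of continuous variables.
    Plain linear regression is an illustrative choice, not the post's method.
    """
    n, p = data.shape
    r2 = np.empty(p)
    for j in range(p):
        y = data[:, j]
        # Design matrix: intercept plus all remaining nodes
        X = np.column_stack([np.ones(n), np.delete(data, j, axis=1)])
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ coef
        total = (y - y.mean()) @ (y - y.mean())
        r2[j] = 1 - (resid @ resid) / total
    return r2

# Toy data: nodes a and b are strongly related, node c is unrelated to both
rng = np.random.default_rng(1)
a = rng.normal(size=500)
b = a + 0.5 * rng.normal(size=500)
c = rng.normal(size=500)
r2 = node_predictability(np.column_stack([a, b, c]))
print(np.round(r2, 2))
```

Here the two connected nodes are highly predictable from the rest of the network, while the isolated node is not, which is exactly the information the network structure alone does not quantify.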
In a previous post we recovered the conditional independence structure in a dataset of mixed variables describing different aspects of the life of individuals diagnosed with Autism Spectrum Disorder, using the mgm package. While depicting the independence structure in a multivariate data set gives a first overview of the relations between variables, in most applications we are interested in the exact parameter estimates. For instance, for interactions between continuous variables, we would like to know the sign and the size of parameters - i.e., whether the nodes in the graph are positively or negatively related, and how strong these associations are. In the case of interactions between categorical variables, we are interested in the signs and sizes of the set of parameters that describes the exact non-linear relationship between variables.
Determining conditional independence relationships through undirected graphical models is a key component in the statistical analysis of complex observational data in a wide variety of disciplines. In many situations one seeks to estimate the underlying graphical model of a dataset that includes variables of different domains.