Yesterday, I read ‘Measurement error and the replication crisis’ by Eric Loken and Andrew Gelman, which left me puzzled. The first part of the paper consists of general statements about measurement error. The second part consists of the claim that in the presence of measurement error, we overestimate the true effect when having a small sample size. This sounded wrong enough to ask the authors for their simulation code and spend a couple of hours to figure out what they did in their paper. I am offering a short and a long version.
Network models have become a popular way to abstract complex systems and gain insights into relational patterns among observed variables in many areas of science. The majority of these applications focuses on analyzing the structure of the network. However, if the network is not directly observed (Alice and Bob are friends) but estimated from data (there is a relation between smoking and cancer), we can analyze - in addition to the network structure - the predictability of the nodes in the network. That is, we would like to know: how well can a given node in the network predicted by all remaining nodes in the network?
In a previous post we estimated a Mixed Graphical Model (MGM) on a dataset of mixed variables describing different aspects of the life of individuals diagnosed with Autism Spectrum Disorder, using the mgm package. For interactions between continuous variables, the weighted adjacency matrix fully describes the underlying interaction parameter. Correspondinly, the parameters are represented in the graph visualization: the width of the edges is proportional to the absolute value of the parameter, and the edge color indicates the sign of the parameter. This means that we can clearly interpret an edge between two continuous variables as a positive or negative linear relationship of some strength.
Determining conditional independence relationships through undirected graphical models is a key component in the statistical analysis of complex obervational data in a wide variety of disciplines. In many situations one seeks to estimate the underlying graphical model of a dataset that includes variables of different domains.