Variable (mathematics)
An Introduction to Variational Autoencoders Open
Variational autoencoders provide a principled framework for learning deep latent-variable models and corresponding inference models. In this work, we provide an introduction to variational autoencoders and some important extensions.
Variable selection – A review and recommendations for the practicing statistician Open
Statistical models support medical research by facilitating individualized outcome prognostication conditional on independent variables or by estimating effects of risk factors adjusted for covariates. Theory of statistical models is well‐…
All Models are Wrong, but Many are Useful: Learning a Variable's Importance by Studying an Entire Class of Prediction Models Simultaneously Open
Variable importance (VI) tools describe how much covariates contribute to a prediction model's accuracy. However, important variables for one well-performing model (for example, a linear model $f(\mathbf{x})=\mathbf{x}^{T}\beta$ with a fixed c…
A Survey of Predictive Modeling on Imbalanced Domains Open
Many real-world data-mining applications involve obtaining predictive models using datasets with strongly imbalanced distributions of the target variable. Frequently, the least-common values of this target variable are associated with even…
A Simple New Approach to Variable Selection in Regression, with Application to Genetic Fine Mapping Open
Summary: We introduce a simple new approach to variable selection in linear regression, with a particular focus on quantifying uncertainty in which variables should be selected. The approach is based on a new model—the ‘sum of single effect…
Direct and Indirect Effects Open
The direct effect of one event on another can be defined and measured by holding constant all intermediate variables between the two. Indirect effects present conceptual and practical difficulties (in nonlinear models), because they cannot be …
Variable selection strategies and its importance in clinical prediction modelling Open
Clinical prediction models are used frequently in clinical practice to identify patients who are at risk of developing an adverse outcome so that preventive measures can be initiated. A prediction model can be developed in a number of ways…
How Many Participants Do We Have to Include in Properly Powered Experiments? A Tutorial of Power Analysis with Reference Tables Open
Given that an effect size of d = .4 is a good first estimate of the smallest effect size of interest in psychological research, we already need over 50 participants for a simple comparison of two within-participants conditions if we want t…
How Much Should We Trust Estimates from Multiplicative Interaction Models? Simple Tools to Improve Empirical Practice Open
Multiplicative interaction models are widely used in social science to examine whether the relationship between an outcome and an independent variable changes with a moderating variable. Current empirical practice tends to overlook two imp…
Best Practices for Estimating, Interpreting, and Presenting Nonlinear Interaction Effects Open
Many effects of interest to sociologists are nonlinear. Additionally, many effects of interest are interaction effects—that is, the effect of one independent variable is contingent on the level of another independent variable. The proper w…
Generalized Synthetic Control Method: Causal Inference with Interactive Fixed Effects Models Open
Difference-in-differences (DID) is commonly used for causal inference in time-series cross-sectional data. It requires the assumption that the average outcomes of treated and control units would have followed parallel paths in the absence …
FINEMAP: efficient variable selection using summary data from genome-wide association studies Open
Motivation: The goal of fine-mapping in genomic regions associated with complex diseases and traits is to identify causal variants that point to molecular mechanisms behind the associations. Recent fine-mapping methods using summary data f…
A review of modelling tools for energy and electricity systems with large shares of variable renewables Open
This paper presents a thorough review of 75 modelling tools currently used for analysing energy and electricity systems. Increased activity within model development in recent years has led to several new models and modelling capabilities, …
Evaluation of variable selection methods for random forests and omics data sets Open
Machine learning methods and in particular random forests are promising approaches for prediction based on high dimensional omics data sets. They provide variable importance measures to rank predictors according to their predictive power. …
Sample size for binary logistic prediction models: Beyond events per variable criteria Open
Binary logistic regression is one of the most frequently applied statistical approaches for developing clinical prediction models. Developers of such models often rely on an Events Per Variable criterion (EPV), notably EPV ≥10, to determin…
mgm: Estimating Time-Varying Mixed Graphical Models in High-Dimensional Data Open
We present the R package mgm for the estimation of k-order mixed graphical models (MGMs) and mixed vector autoregressive (mVAR) models in high-dimensional data. These are useful extensions of graphical models for only one variable type, …
Decomposing Wage Distributions Using Recentered Influence Function Regressions Open
This paper provides a detailed exposition of an extension of the Oaxaca-Blinder decomposition method that can be applied to various distributional measures. The two-stage procedure first divides distributional changes into a wage structure…
A simple method to control over-alignment in the MAFFT multiple sequence alignment program Open
Motivation: We present a new feature of the MAFFT multiple alignment program for suppressing over-alignment (aligning unrelated segments). Conventional MAFFT is highly sensitive in aligning conserved regions in remote homologs, but the ris…
Variable selection with stepwise and best subset approaches Open
While purposeful selection is performed partly by software and partly by hand, the stepwise and best subset approaches are automatically performed by software. Two R functions stepAIC() and bestglm() are well designed for stepwise and best…
Neural Ordinary Differential Equations Open
We introduce a new family of deep neural network models. Instead of specifying a discrete sequence of hidden layers, we parameterize the derivative of the hidden state using a neural network. The output of the network is computed using a b…
Keep Calm and Learn Multilevel Logistic Modeling: A Simplified Three-Step Procedure Using Stata, R, Mplus, and SPSS Open
This paper aims to introduce multilevel logistic regression analysis in a simple and practical way. First, we introduce the basic principles of logistic regression analysis (conditional probability, logit transformation, odds ratio). Secon…
Evaluation of Variance Inflation Factors in Regression Models Using Latent Variable Modeling Methods Open
A procedure that can be used to evaluate the variance inflation factors and tolerance indices in linear regression models is discussed. The method permits both point and interval estimation of these factors and indices associated with expl…
A Comparative Study of Categorical Variable Encoding Techniques for Neural Network Classifiers Open
In classification analysis, the dependent variable is frequently influenced not only by ratio scale variables, but also by qualitative (nominal scale) variables. Machine Learning algorithms accept only numerical inputs, hence, it is necessa…
Large-scale analysis of test–retest reliabilities of self-regulation measures Open
The ability to regulate behavior in service of long-term goals is a widely studied psychological construct known as self-regulation. This wide interest is in part due to the putative relations between self-regulation and a range of real-wo…
Interpreting Multiple Linear Regression: A Guidebook of Variable Importance Open
Multiple regression (MR) analyses are commonly employed in social science fields. It is also common for interpretation of results to typically reflect overreliance on beta weights (cf. Courville & Thompson, 2001; Nimon, Roberts, & Gavrilov…
Partial least squares structural equation modeling-based discrete choice modeling: an illustration in modeling retailer choice Open
Commonly used discrete choice model analyses (e.g., probit, logit and multinomial logit models) draw on the estimation of importance weights that apply to different attribute levels. But directly estimating the importance weights of the at…
Mitigating the Multicollinearity Problem and Its Machine Learning Approach: A Review Open
Technologies have driven big data collection across many fields, such as genomics and business intelligence. This results in a significant increase in variables and data points (observations) collected and stored. Although this presents op…
Adequate sample size for developing prediction models is not simply related to events per variable Open
Higher EPV is needed when low-prevalence predictors are present in a model to eliminate bias in regression coefficients and improve predictive accuracy.
Conindex: Estimation of Concentration Indices Open
Concentration indices are frequently used to measure inequality in one variable over the distribution of another. Most commonly, they are applied to the measurement of socioeconomic-related inequality in health. We introduce the user-writt…
Model building strategy for logistic regression: purposeful selection Open
Logistic regression is one of the most commonly used models to account for confounders in medical literature. The article introduces how to perform purposeful selection model building strategy with R. I stress on the use of likelihood rati…