Nefel Tellioglu, Yaman Barlas
Bogazici University, Department of Industrial Engineering, Bebek, Istanbul, 34342, Turkey
As data analysis tools have advanced in recent years, constructing models grounded in data has become a promising research topic. Data may be used in four areas of model construction: discovering causal relations, the polarity of relations, stock variables, and mathematical expressions. In this work, we focus on discovering (1) the polarity of relations, (2) stock variables, and (3) mathematical expressions, by applying correlation analysis, curve fitting, and structural equation modeling (SEM) to “simulation-generated” data. We conclude that, for (1) extracting the polarity of relations, correlation analysis may return misleading outcomes because of unknown stock variables and/or perfectly correlated cause variables. The perfect correlation problem can be solved if the data involve randomness. For (2) discovering stock variables, determining accumulations from data is not possible, even for very simple models. For (3) discovering mathematical expressions, curve fitting for a single causal link is promising, although modelers must make sense of the constant values in the fitted equations. For multiple causal links, SEM can only be applied when the effect functions are linear. Moreover, when there is a one-to-one relationship between variables, SEM cannot converge because of multiple solutions and/or a zero-variance problem; these problems do not arise when the data involve randomness/noise.
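To illustrate how an unrecognized stock variable can mislead correlation analysis, the following minimal sketch simulates a single stock fed by one inflow. It is not taken from the models studied in this work: the sinusoidal inflow, the time step, the Euler integration, and all variable names are assumptions chosen only for illustration. The inflow has a strictly positive effect on the rate of change of the stock, yet its correlation with the stock level is close to zero, so a sign read from the level-to-level correlation would say little about the polarity of the link.

```python
import numpy as np

# Hypothetical setup (not from the paper): one stock, one inflow, no outflow.
dt = 0.01
t = np.arange(0.0, 4 * np.pi, dt)   # roughly two full cycles of the inflow
inflow = np.cos(t)                  # assumed cause variable

# Euler integration: stock[k+1] = stock[k] + inflow[k] * dt, with stock[0] = 0.
stock = np.concatenate(([0.0], np.cumsum(inflow[:-1]) * dt))

# Correlation between the cause and the LEVEL of the stock.
corr_level = np.corrcoef(inflow, stock)[0, 1]

# Correlation between the cause and the RATE OF CHANGE of the stock.
rate = np.diff(stock) / dt
corr_rate = np.corrcoef(inflow[:-1], rate)[0, 1]

print(f"corr(inflow, stock level)   = {corr_level:.3f}")
print(f"corr(inflow, stock change)  = {corr_rate:.3f}")
```

In this sketch the polarity becomes visible only after the stock is differenced, which presupposes that the modeler already knows which variable is the accumulation.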