But if you have a good enough model, you may be able to tell them apart. From medium.com:
…in the last few years, statisticians have begun to explore a number of ways to solve this problem. They say that in certain circumstances it is indeed possible to determine cause and effect based only on the observational data.
At first sight, that sounds like a dangerous statement. But today Joris Mooij at the University of Amsterdam in the Netherlands and a few pals show just how effective this new approach can be by applying it to a wide range of real and synthetic datasets. Their remarkable conclusion is that it is indeed possible to separate cause and effect in this way.
Mooij and co confine themselves to the simple case of data associated with two variables, X and Y. A real-life example might be a set of data of measured wind speed, X, and another set showing the rotational speed of a wind turbine, Y.
These datasets are clearly correlated. But which is the cause and which the effect? Without access to a controlled experiment, it is easy to imagine that it is impossible to tell.
The basis of the new approach is to assume that the relationship between X and Y is not symmetrical. In particular, they say that in any set of measurements there will always be noise from various causes. The key assumption is that the pattern of noise in the cause will be different to the pattern of noise in the effect. That’s because any noise in X can have an influence on Y but not vice versa.
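To make the noise-asymmetry idea concrete, here is a toy sketch in the spirit of an additive noise model. The quoted passage doesn't spell out an algorithm, so everything here is my own assumption for illustration: the synthetic data (Y is a cubic function of X plus uniform noise), the crude binned regression, and the hypothetical `residual_spread_variation` score. It is not the researchers' actual procedure, which would use more careful regression and proper statistical independence tests. The idea is that if X causes Y, the noise left over after predicting Y from X should look the same for every value of X, while the noise left over after predicting X from Y should not.

```python
import random
import statistics


def residual_spread_variation(cause, effect, n_bins=50):
    """Score how much the leftover noise in `effect` depends on `cause`.

    Uses a crude regression (the mean of `effect` inside equal-count bins
    of `cause`) and returns the coefficient of variation of the per-bin
    residual standard deviations. A value near 0 means the noise looks the
    same everywhere, which is what an additive noise model expects in the
    true causal direction.
    """
    pairs = sorted(zip(cause, effect))
    bin_size = len(pairs) // n_bins
    spreads = []
    for b in range(n_bins):
        values = [e for _, e in pairs[b * bin_size:(b + 1) * bin_size]]
        # pstdev measures spread around the bin mean, i.e. the residual spread
        spreads.append(statistics.pstdev(values))
    return statistics.pstdev(spreads) / statistics.fmean(spreads)


def infer_direction(x, y):
    """Pick the direction whose residual noise depends least on the input."""
    score_xy = residual_spread_variation(x, y)
    score_yx = residual_spread_variation(y, x)
    direction = "X -> Y" if score_xy < score_yx else "Y -> X"
    return direction, score_xy, score_yx


# Synthetic data (an assumption for this demo): X really does cause Y.
rng = random.Random(0)
x = [rng.uniform(-1.0, 1.0) for _ in range(20000)]
y = [xi ** 3 + xi + rng.uniform(-0.5, 0.5) for xi in x]

direction, score_xy, score_yx = infer_direction(x, y)
print(direction, round(score_xy, 3), round(score_yx, 3))
```

On this synthetic data the score for the X-to-Y direction comes out smaller than the score for Y-to-X, so the sketch picks the direction that actually generated the data. Real measurements are much messier, and a toy score like this can easily be fooled, for example when the relationship is linear with Gaussian noise, where the two directions look alike.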
And it appears to work. This is very exciting!