The theoretical and empirical sides of machine learning

Shallow Sea
3 min read · Nov 30, 2020


While watching an open course on computational social science that a friend recommended, I ran across an interesting categorization of the scientific method. It describes three approaches: the empirical, the analytical, and the theoretical. The empirical method generalizes from observed data; in other words, it is inductive reasoning. Darwin's theory of evolution is an example. The analytical sits between the empirical and the theoretical, focusing more on the "how" of a phenomenon. The theoretical is exemplified by Einstein: it goes in the opposite direction from the empirical method, crafting a "theory" first and checking it against real-world data later on. Einstein allegedly said something like: "If the theory of relativity turned out not to match the experiment, then I would feel sorry for the Creator, because the theory is right."

However, can it not be argued that the empirical and the theoretical are fundamentally the same? After all, Einstein crafted his theory to fit at least some of the data observed in physics experiments, just as Darwin tried to fit the data he observed in biology. He did not invent something out of nothing, that's for sure. Darwin's work is also dubbed a "theory" of evolution, so it's not clear whether this two-way split of methodology even makes sense. What I do see is a spectrum: at one end you use lots of data but not much rigorous, high-precision math, and at the other end the opposite. Either way, the categorization made enough sense to me at the time I was listening to the lecture, and I want to use it to intuitively classify the current fields of computer science, machine learning in particular. This is just to improve my own intuitive grasp of the subject.

The empirical method certainly sounds like data science, machine learning, and the rest of the statistical computer sciences. But are those things empirical? I'd say they are designed to replace human empiricist thinking. With machine learning, a computer can generalize from data even better than a human being can, provided the algorithm and the amount of data are both sufficient for the task, which has to be specific (e.g., recognizing a cat in an image). However, is the design of these algorithms empirical in nature? Machine learning researchers come up with ideas for new models, then run experiments on data to measure their effectiveness (accuracy, recall, running time, and so on). That sounds more like the theoretical method.

I think the confusion can be lifted by distinguishing between the "theory" of machine learning and the "models" of machine learning. The whole of machine learning rests on the theory of computer science, statistics, information theory, and linear algebra. From that theory, machine learning scientists develop more specific models that achieve something in a concrete context: classification, generating content, playing games. These models rely on the theory being correct, since theoretical results are used and assumed in building them. Does this mean models can only improve if the theory improves? No. What made machine learning so successful was not some miraculous new discovery in statistics or linear algebra, but better and better models (deep neural nets, CNNs, RNNs, and so on). This means the main guiding criterion for machine learning and data science is not theoretical but empirical: it's about using theoretical foundations to build very useful tools that solve problems, not about improving the theoretical foundations themselves so that more models become possible. That is not to say the latter is unimportant, or that no one is doing it, but it matters to realize what a given field prioritizes, and what it does not.
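To make that empirical loop concrete, here's a minimal sketch using scikit-learn. The dataset and the logistic regression model are just illustrative picks of mine; the point is only the shape of the workflow: propose a model, fit it, and let held-out data be the judge.

```python
# A minimal sketch of the empirical loop: propose a model, then measure
# its effectiveness on held-out data. The dataset and model choice are
# illustrative assumptions, not anything specific from the discussion above.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score

# Load a small benchmark dataset and hold out a test split.
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Theory supplies the model family; whether it is any good is decided
# empirically, by its measured performance, not by a proof.
model = LogisticRegression(max_iter=5000).fit(X_train, y_train)
preds = model.predict(X_test)

print("accuracy:", accuracy_score(y_test, preds))
print("recall:  ", recall_score(y_test, preds))
```

If the numbers are bad, the researcher tweaks the model and runs the experiment again; the theory underneath stays untouched.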
