Tree-Based vs. Deep Learning


Why tree-based models beat deep learning on tabular data

Why Tree-Based Models Beat Deep Learning on Tabular Data | by Devansh- Machine Learning Made Simple | Geek Culture | Aug, 2022 | Medium

The author thinks that Random Forests are very good in situations with missing data. The paper he reviews, however, handles missing data by removing it for each column. The author says that he does not like to do much preprocessing in data analysis.

Why do tree-based methods beat deep learning?

  1. NNs are biased toward overly smooth solutions. Neural nets are trained with gradient-based methods, so their decision boundaries tend to be smooth, whereas a Random Forest can learn irregular patterns and so make more precise decisions.
  2. Uninformative features affect MLP-like NNs more. Decision trees use criteria such as information gain and entropy to decide which path to follow, so features that carry little information are rarely chosen for a split.
  3. NNs are invariant to rotation, actual data is not. When the features are randomly rotated, MLP-like NNs keep their original performance, while all other learners lose quite a bit of performance. Each column of a real table has its own meaning, so a learner that ignores the orientation of the features throws that information away.
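
As a rough illustration of points 2 and 3, the sketch below computes the information gain of the best single-threshold split on a feature, which is the kind of criterion a decision tree uses at each node. Everything in it is an assumption made for illustration and not taken from the article or the paper: the helper names `entropy` and `best_split_gain`, the synthetic data (one informative column, one pure-noise column), and the fixed 45 degree rotation. It looks at a single split only, not a full tree or a trained network.

```python
import math
import random


def entropy(labels):
    """Shannon entropy, in bits, of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def best_split_gain(values, labels):
    """Largest information gain reachable with one threshold on one feature."""
    pairs = sorted(zip(values, labels))
    ys = [y for _, y in pairs]
    n = len(ys)
    parent = entropy(ys)
    best = 0.0
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between two equal values
        left, right = ys[:i], ys[i:]
        child = (len(left) * entropy(left) + len(right) * entropy(right)) / n
        best = max(best, parent - child)
    return best


# Synthetic data (hypothetical): the label depends only on `signal`.
rng = random.Random(0)
n = 400
signal = [rng.uniform(-1, 1) for _ in range(n)]
noise = [rng.uniform(-1, 1) for _ in range(n)]
labels = [1 if s > 0 else 0 for s in signal]

# Point 2: the split criterion separates informative from uninformative columns.
gain_signal = best_split_gain(signal, labels)
gain_noise = best_split_gain(noise, labels)

# Point 3: rotate the two columns by 45 degrees, so each new column mixes
# signal and noise, then look for the best axis-aligned split again.
c = s = math.sqrt(0.5)
rot0 = [c * a - s * b for a, b in zip(signal, noise)]
rot1 = [s * a + c * b for a, b in zip(signal, noise)]
gain_rotated = max(best_split_gain(rot0, labels), best_split_gain(rot1, labels))

print(f"gain on informative column:   {gain_signal:.3f}")
print(f"gain on uninformative column: {gain_noise:.3f}")
print(f"best gain after rotation:     {gain_rotated:.3f}")
```

The noise column earns almost no gain, so a tree would simply not split on it. After the rotation no single column separates the classes cleanly any more, which hints at why learners that split one feature at a time can lose performance when the data is rotated.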

Author: Wulilichao
Reprint policy: Unless otherwise stated, all articles in this blog are licensed under CC BY 4.0. If you reproduce this article, please credit the source Wulilichao!