Numerai Quant Club / Why do tree-based models still outperform deep learning on tabular data?

A discussion of the paper “Why do tree-based models still outperform deep learning on tabular data?” and how to apply its insights to the Numerai competition.

Thank you to Wigglemuse for joining the conversation.

Join Numerai’s Chief Scientist and Minister of Research, Michael Oliver, as he digs deep into the world of Numerai, research, data and everything in between.

This will be a recurring event through 2023. More details:

[13:42] The fine structure that you say that can’t be captured, does it just go away if you have a lot more parameters for the neural network?
[23:46] Do you think feature selection is a very important thing to do well?
[25:18] What is your best way to do feature selection? What do you think is sort of the state of the art of feature selection on Numerai?
[30:54] In this age of TC, the question is what is good performance?
[36:51] Are all features sort of inherently relative anyway?
[39:00] I know people now have access to past meta model scores and I wanted to ask if people have had luck in using them in trying to pursue TC-based models and been able to make more sense out of TC now that they have access to past meta model scores.
[40:08] What would be a perfect TC?
[41:40] Is there a solid suggestion there? Let’s say I build a model today, what can I do to see if it’s going to get TC, save submitting it and waiting? If I look at those meta model scores, and look at the residuals, and then the corr left over is good, would I then expect to get TC?
[43:49] Do you think that if you are not getting good corr, even negative corr, that you’re just not going to get TC, or whatever TC you are getting is spurious and random? Or is the good corr correlated with good TC just a subset of the good TC space?
[46:15] Well, I mean Nomi, moving from a flat target to Nomi Gaussian, did that to some extent, right? So now you’re talking about doing that to our side as well?
[51:20] Do you think that’s still a problem, maybe? Just a high dependence, if this feature, even if it’s non-linear, it’s just not behaving this week then there goes your model, in terms of robustness?

Want to join in on the hardest #datascience competition on the planet? Or are you a quant with your own data?
Comments

Pardon my ignorance.. what is the full form of TC ?

robinjacobroy

Will tabular transformer ever outperform tree-based models?

Daniel-tp