- 3 Idiots' Solution (1st)
- Code, docs and discussion thread can be found online.
- Basically it is a single FM model result. (!!!)
- "Empirically we observe using categorical features is always better than using numerical features." (!!!)
- "instance-wise data normalization makes the optimization problem easier to be solved." (!!!)
- Field-aware FM (each feature interacts with a field-specific latent vector).
- Use GBDT outputs (leaf indices) as new features.
- A per-coordinate learning-rate schedule looks very useful for SGD. Related paper. I think this method is also used in Vowpal Wabbit.
G = G + g*g
w = w - eta*(1/sqrt(G))*g
- Calibrate the final result based on local (validation) predictions.
- Some ideas of this solution come from paper "Practical Lessons from Predicting Clicks on Ads at Facebook".
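The per-coordinate update above (`G = G + g*g`, `w = w - eta*(1/sqrt(G))*g`) is the AdaGrad rule. A minimal NumPy sketch for logistic loss — function and parameter names here are illustrative, not from the winning code:

```python
import numpy as np

def adagrad_logistic_sgd(X, y, eta=0.1, eps=1e-8, epochs=50):
    """Per-coordinate (AdaGrad) SGD for logistic regression.

    Each weight i gets its own effective learning rate
    eta / sqrt(G_i), where G_i accumulates that coordinate's
    squared gradients.
    """
    n, d = X.shape
    w = np.zeros(d)
    G = np.full(d, eps)                    # accumulated squared gradients
    for _ in range(epochs):
        for i in range(n):
            p = 1.0 / (1.0 + np.exp(-X[i] @ w))
            g = (p - y[i]) * X[i]          # gradient of the log loss
            G += g * g                     # G = G + g*g
            w -= eta * g / np.sqrt(G)      # w = w - eta*(1/sqrt(G))*g
    return w
```

Rarely-seen coordinates keep a small `G`, so they still take large steps when they finally appear — exactly what you want for sparse CTR features.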
- Guocong Song's Solution (3rd)
- Code, docs and discussion thread can be found online.
- Linear combination of 4 models. All 4 models are trained with VW. (!!!)
- Group features before generating quadratic/polynomial features.
- Two tricks for using VW:
- -C [ --constant ] Set initial value of constant (Useful for faster convergence on data-sets where the label isn't centered around zero)
- --feature_mask allows one to specify, from a model file, the set of parameters that may be updated. This is useful in combination with --l1: use --l1 to discover which features should have a nonzero weight and save them with -f model, then use --feature_mask model without --l1 to learn a better regressor.
- Julian de Wit's Solution (4th)
- Code, docs and discussion thread can be found online.
- Ensemble of deep neural networks (!!!)
- Training data contained roughly 128K-200K features. (!!!)
- Rare and unseen test-set categorical values were all encoded as one category (almost all winners mentioned this). This is nearly the only feature engineering in this solution.
- Numeric features need to be standardized. A log-transform is applied to long-tail features (count-based features).
- 2 hidden layers with respectively 128 and 256 units.
- "Other challenges also reported simply averaged ensembles with neural networks are quite good already." (link)
Hopefully I will get more thoughts after reproducing those results and reading the code.