Axa challenge with DSS

Matthieu Scordia - Data Scientist @ Dataiku

Kaggle Meetup 13/01/2015

Goal: score insurance products for cross-selling.

Prizes:
1st: 5 000 €
2nd: 3 000 €
3rd: 2 000 €
4th: 1 500 €
5th: 1 000 €
6th: 500 €

Second phase of the competition with DSS!

Step 1: Let’s get started.

1 - Download & install the DSS community edition.

2 - Import the AXA project.

Step 2: Enter the flow.

[Flow screenshot. Labels: source datasets, preprocessed datasets, two models plugged in, submissions.]

The data.

Classification problem.
- Target: {0, 1}
- Some known variables: department, sex, age
- 19 more numerical variables, integers only
- 11 categorical variables with small cardinality

25 333 rows!

88 225 rows!

34 columns!

Unbalanced!!

Evaluation method: the lift (10%)
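
The slide does not spell out the formula, so the sketch below assumes the usual definition of lift at 10%: rank the rows by predicted score, keep the top 10%, and divide the positive rate in that slice by the positive rate over the whole dataset. The function name `lift_at` and its parameters are illustrative, not part of DSS or of the competition's scoring code.

```python
def lift_at(y_true, scores, fraction=0.10):
    """Lift in the top `fraction` of rows, ranked by descending score.

    Assumed definition (not confirmed by the competition rules):
    positive rate in the top slice / positive rate over all rows.
    """
    if len(y_true) != len(scores):
        raise ValueError("y_true and scores must have the same length")
    n = len(y_true)
    total_pos = sum(y_true)
    if n == 0 or total_pos == 0:
        raise ValueError("need at least one row and one positive target")

    # Size of the top slice, at least one row.
    k = max(1, int(n * fraction))

    # Row indices ordered from highest to lowest score.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)

    # Positives captured in the top slice.
    top_pos = sum(y_true[i] for i in ranked[:k])

    return (top_pos / k) / (total_pos / n)
```

A model that ranks at random scores about 1 under this definition. Because the target is unbalanced, the ceiling is high: with a positive rate p, the best possible value is min(10, 1/p).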

Step 3: Make your own model.

Two examples available in the DSS project:

- a Python script running a random forest
- a DSS model bench

In the model bench

You can try different algorithms and compare them:

Put your model in the flow, or customize it in a Python notebook.

2 - Retrieve features and reviews from allocine.com

If you want to submit on datascience.net: export and submit online.

Good luck!