24 AUGUST 2016 THE BID PROGRAMME IS FUNDED BY THE EUROPEAN UNION OpenRefine Nestor Beltran

BID CE workshop 1 session 10 - presentation - Open Refine

Page 2: BID CE workshop 1   session 10 - presentation - Open Refine

DATA CURATION, FORMATTING & TRANSFORMATION

Use of OpenRefine

Page 3: BID CE workshop 1   session 10 - presentation - Open Refine

DATA MANIPULATION, FORMATTING & TRANSFORMATION

Use of OpenRefine

Page 4: BID CE workshop 1   session 10 - presentation - Open Refine

Faceting

Clustering

Reconciling

What is it? A powerful tool for working with messy data
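As a rough illustration of the first two ideas on this slide, the sketch below counts distinct values in a column (a text facet) and groups spelling variants that share a normalized key (clustering). It is a simplified stand-in written in Python, not OpenRefine's own code: the function names and the sample species names are made up for the example, and the key function only approximates a fingerprint-style method.

```python
import string
from collections import Counter, defaultdict


def text_facet(values):
    """Count how often each distinct value occurs, most frequent first."""
    return Counter(values).most_common()


def fingerprint(value):
    """Simplified key: trim, lowercase, drop ASCII punctuation, sort unique tokens."""
    cleaned = value.strip().lower()
    cleaned = cleaned.translate(str.maketrans("", "", string.punctuation))
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


def cluster(values):
    """Group distinct values that share a key; keep groups with several variants."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]


# Hypothetical column with inconsistent spellings of the same name.
names = [
    "Puma concolor",
    "puma concolor",
    "Puma  concolor ",
    "Panthera onca",
    "Puma concolor",
]

for value, count in text_facet(names):
    print(repr(value), count)

for group in cluster(names):
    print("Candidate cluster:", group)
```

The facet exposes the variants at a glance, and the cluster step suggests which of them could be merged into one value. Reconciling, the third idea, depends on an external service and is not sketched here.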

Page 5: BID CE workshop 1   session 10 - presentation - Open Refine

Facets

Clustering

Reconciliation

What is it? A powerful tool for handling “messy” data

Page 6: BID CE workshop 1   session 10 - presentation - Open Refine

What is it not?

A database

An Excel-like experience

Page 7: BID CE workshop 1   session 10 - presentation - Open Refine

What is it not?

A database

Like Excel

Page 8: BID CE workshop 1   session 10 - presentation - Open Refine

Features comparison

Excel
● Usually edits one cell at a time
● Useful for documenting data and performing operations
● Data is not always visible
● Limited visualization

OpenRefine
● Edits multiple cells at once
● Easy exploration and transformation
● Interactive visualization

Database
● Requires infrastructure and programming skills to edit
● No easy visualization

Page 9: BID CE workshop 1   session 10 - presentation - Open Refine

Features comparison

Excel
● Editing “one cell at a time”
● Useful for documenting data and performing operations
● Data is not always visible
● Limited visualizations

OpenRefine
● Editing several cells simultaneously
● Simple exploration and transformation
● Interactive visualizations

Database
● Requires infrastructure and programming skills
● No easy/automatic visualization

Page 10: BID CE workshop 1   session 10 - presentation - Open Refine

Features comparison

OpenRefine
● Edits multiple cells at once
● Easy exploration and transformation
● Interactive visualization
● Free
● Desktop app (offline)
● Unlimited undo
● Use of APIs
● Several file types for import/export
● Faceting / filters
● Use of regular expressions
● Extensions
● Large community of developers (extensions, tutorials, etc.)
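Two of the features listed here, editing many cells at once and using regular expressions, can be sketched in a few lines of Python. This is only an analogy for what the tool does through its interface: the helper names are invented for the example, and the column names scientificName and eventDate and the sample rows are assumptions, not data from the workshop.

```python
import re


def transform_column(rows, column, func):
    """Return new rows with func applied to every cell of one column."""
    return [{**row, column: func(row[column])} for row in rows]


def collapse_whitespace(value):
    """Replace runs of whitespace with one space and trim both ends."""
    return re.sub(r"\s+", " ", value).strip()


def text_filter(rows, column, pattern):
    """Keep only the rows whose cell in the given column matches the pattern."""
    return [row for row in rows if re.search(pattern, row[column])]


# Hypothetical rows; the column names are assumed for illustration.
rows = [
    {"scientificName": "  Puma   concolor", "eventDate": "collected 1998-05-02"},
    {"scientificName": "Panthera onca  ", "eventDate": "2003"},
]

# One operation cleans every cell of the column.
cleaned = transform_column(rows, "scientificName", collapse_whitespace)

# A regular expression selects the rows whose date is a bare four-digit year.
year_only = text_filter(rows, "eventDate", r"^\d{4}$")

print(cleaned)
print(year_only)
```

The transformation returns new rows and leaves the input untouched, which loosely mirrors the undo history listed above: earlier states stay available.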

Page 11: BID CE workshop 1   session 10 - presentation - Open Refine

Features comparison

OpenRefine
● Editing several cells simultaneously
● Simple exploration and transformation
● Interactive visualizations
● Free and open source
● Desktop application (offline)
● Unlimited “undo”
● Use of APIs
● Several import/export formats
● Facets / filters
● Uses regular expressions
● Extensions
● Large community (extensions, tutorials, etc.)

Page 12: BID CE workshop 1   session 10 - presentation - Open Refine

DATA CURATION, FORMATTING & TRANSFORMATION

Practical session

Page 13: BID CE workshop 1   session 10 - presentation - Open Refine

DATA MANIPULATION, FORMATTING & TRANSFORMATION

Practical session

Page 14: BID CE workshop 1   session 10 - presentation - Open Refine

Conventions

● Formulas (copy-paste): cell.recon.match.id
● Commands in OpenRefine: Edit column
● Column names: nameRecon
● Useful hyperlinks: www.gbif.org
● Column menu
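The example formula on this slide, cell.recon.match.id, reads the identifier of the candidate that a cell was matched to during reconciliation. The Python sketch below models that lookup with small data classes. The class and function names are assumptions made for illustration and the identifier is a placeholder, so treat this as a mental model rather than the tool's actual data structures.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Match:
    id: str
    name: str


@dataclass
class Recon:
    match: Optional[Match] = None  # None until a candidate is accepted


@dataclass
class Cell:
    value: str
    recon: Optional[Recon] = None  # None for cells never reconciled


def recon_match_id(cell):
    """Return the matched identifier, or None if the cell has no match."""
    if cell.recon is None or cell.recon.match is None:
        return None
    return cell.recon.match.id


# Placeholder identifier, not a real record key.
matched = Cell("Puma concolor", Recon(Match(id="12345", name="Puma concolor")))
unmatched = Cell("Pumma concolor", Recon())

print(recon_match_id(matched))
print(recon_match_id(unmatched))
```

The point of the sketch is that the formula only yields a value for cells with an accepted match, so unmatched cells need to be handled or reviewed separately.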

Page 15: BID CE workshop 1   session 10 - presentation - Open Refine

Conventions

● Formulas (copy-paste): cell.recon.match.id
● OpenRefine commands: Edit column
● Column names: nameRecon
● Hyperlinks: www.gbif.org
● Drop-down menu
