Introduction to OpenRefine
|
OpenRefine is ‘a tool for working with messy data’
OpenRefine works best with data in a simple tabular format
OpenRefine can help you split data up into more granular parts
OpenRefine can help you match local data up to other data sets
OpenRefine can help you enhance a data set with data from other sources
|
Importing data into OpenRefine
|
Use the Create Project option to import data
You can control how data imports using options on the import screen
OpenRefine uses rows and columns to display data
Most options to work with data in OpenRefine are accessed through a drop-down menu at the top of a data column
When you select an option in a particular column (e.g. to make a change to the data), it will affect all the cells in that column
|
Faceting - Filtering - Splitting
|
You can use facets and filters to explore your data
You can use facets and filters to work with a subset of data in OpenRefine
You can easily correct common data issues from a Facet
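To make the idea behind a text facet concrete, the sketch below mimics it in Python: count the distinct values in one column, keep only the rows matching a chosen value, and correct a value everywhere it occurs. This is an illustration only, not how OpenRefine is implemented; the function names (`text_facet`, `filter_rows`, `edit_facet_value`) and the sample rows are hypothetical.

```python
from collections import Counter


def text_facet(rows, column):
    """Count how often each distinct value appears in one column."""
    return Counter(row[column] for row in rows)


def filter_rows(rows, column, value):
    """Return only the rows whose column equals the chosen facet value."""
    return [row for row in rows if row[column] == value]


def edit_facet_value(rows, column, old, new):
    """Return a copy of the rows with every occurrence of old replaced by new."""
    return [
        {**row, column: new} if row[column] == old else dict(row)
        for row in rows
    ]


# Hypothetical sample data with one inconsistent value ("english").
rows = [
    {"title": "A", "language": "English"},
    {"title": "B", "language": "english"},
    {"title": "C", "language": "French"},
    {"title": "D", "language": "English"},
]

facet = text_facet(rows, "language")              # distinct values with counts
subset = filter_rows(rows, "language", "english")  # work on a subset only
fixed = edit_facet_value(rows, "language", "english", "English")
fixed_facet = text_facet(fixed, "language")
```

Listing the counts makes the stray lowercase value visible at a glance, which is why a facet is a convenient place to spot and correct common data issues.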
|
Clustering
|
Clustering is a way of finding variant forms of the same piece of data within a dataset (e.g. different spellings of a name)
There are a number of different Clustering algorithms that work in different ways and will produce different results
The best clustering algorithm to use will depend on the data
Using clustering you can replace varying forms of the same data with a single consistent value
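The clustering points above can be sketched in Python with a simplified "fingerprint" key: values that reduce to the same normalised key are grouped as variants of one another. This only approximates the key collision approach and is not OpenRefine's own code; the function names and sample names are hypothetical, and choosing the first value in each cluster as the replacement is an assumption made for brevity.

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(value):
    """Build a normalised key: trimmed, lowercased, ASCII only,
    punctuation removed, tokens de-duplicated and sorted."""
    text = unicodedata.normalize("NFKD", value.strip().lower())
    text = text.encode("ascii", "ignore").decode("ascii")
    text = text.translate(str.maketrans("", "", string.punctuation))
    return " ".join(sorted(set(text.split())))


def cluster(values):
    """Group values sharing a fingerprint; keep groups with more than one distinct form."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(set(group)) > 1]


def merge(values, clusters):
    """Replace every variant with the first value of its cluster (an arbitrary choice)."""
    canonical = {variant: group[0] for group in clusters for variant in group}
    return [canonical.get(value, value) for value in values]


# Hypothetical sample data: three spellings of one name plus an unrelated name.
names = ["Smith, John", "John Smith", "john smith ", "Jane Doe"]
clusters = cluster(names)
merged = merge(names, clusters)
```

A different key function (for example one based on how words sound, or on edit distance between values) would group the same data differently, which is why the best method depends on the data.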
|
Undo and Redo
|
|
Introduction to Transformations
|
|
Writing Transformations
|
|
Transforming Strings, Numbers, Dates and Booleans
|
|
Exporting
|
|