Transforming Strings, Numbers, Dates and Booleans
Overview
Teaching: 5 min
Exercises: 5 minQuestions
How do I use transformations to programmatically edit my data?
How do I transform the various data types?
Objectives
Introduce date formats
Data types
Understanding data types and regular expressions will help you write more complex transformations using GREL.
Data types in OpenRefine
Every piece of data in OpenRefine has a ‘type’. The most common ‘type’ is a ‘string’ - that is a piece of text. However there are other data types available and transformations let you convert data from one type to another where appropriate. The data types supported are:
- String
- Number
- Date
- Boolean
- Array (covered in the next lesson)
Dates and Numbers
So far we’ve been looking only at ‘String’ type data. Much of the time it is possible to treat numbers and dates as strings. For example in the Date column we have the date of publication represented as a String. However, some operations and transformations only work on ‘number’ or ‘date’ type operations. The simplest example is sorting values in numeric or date order. To carry out these functions we need to convert the values to a date or number first.
Reformat the Date
- Make sure you remove all Facets and Filters
- On the Date column use the dropdown menu to select
Edit cells->Common transforms->To date
- Note how the values are now displayed in green and follow a standard convention for their display format (ISO8601) - this indicates they are now stored as date data types in OpenRefine. We can now carry out functions that are specific to Dates
- On the Date column dropdown select
Edit column->Add column based on this column
. Using this function you can create a new column, while preserving the old column- In the ‘New column name’ type “Formatted Date”
- In the ‘Expression’ box type the GREL expression
value.toString("dd MMMM yyyy")
Key Points
You can alter data in OpenRefine based on specific instructions