Hello everybody! I am going to write a tutorial on how to work with CSVs using TerminusDB. That’s right, TerminusDB 4.0 now has the feature in which you can import and export CSVs, Let’s dive into this.
How to create a database with CSVs
- Open up the TerminusDB console and create a database
- Fill in the Id, Name and description of your database
- You will find a button Create Database from CSVs
- Click on this button to load which ever CSVs you wish to import to your database, In this blog I am going to use The Bike Share Data
- I have added 2 CSVs into my database as shown below
- And that’s it, go ahead and click on Create New database, which includes the 2 CSVs in it 😊😊😊😊😊
What happens when you import CSVs into your Database?
On creating our database, Go to the Documents Page
Each of our CSV files that were loaded on create becomes a separate Document Type. All the columns in the CSV file have been converted into properties and a schema is auto-generated.
Each CSV is of doctype CSV and has rows hanging off them, we can use these auto generated column properties to query the database.
You can also add more CSVs once you have created your database by clicking the Add CSVs
Go to the Schema Page to view the property types which have been generated off the columns
Let’s Query the Database
Get CSV Id’s, Query for type scm:CSV
and (
triple('v:CSV ID', 'type','scm:CSV'),
triple('v:CSV ID', 'label', 'v:CSV Name')
)
This query simply gives you the ID and name of available CSVs in the Database. Take a note of the CSV Id you’re interested in, also visit the Schema page to get the Property Id’s your interested in as well.
Open a new Query Pane and try the below query
const id="doc:CSV_bike_tutorial-1.csv"
WOQL.and (
.limit(1).triple('v:CSV ID', 'type', 'scm:CSV')
.eq('v:CSV ID', id)
.triple('v:CSV ID', 'scm:csv_row', 'v:CSV Row')
.triple('v:CSV Row', 'type', 'v:Row Type'),
.quad('v:Property', 'domain', 'v:Row Type', 'schema/main')
.quad('v:Property', 'label', 'v:Property Name' ,'schema/main')
)
This query gets all the property Id and name from the schema graph.
Next, let’s query to display a few columns of our CSV, I plan to display Start Station Number, Start Station, End Station Number, End Station and Duration
const id="doc:CSV_bike_tutorial-1.csv"
const startStationNumber = "scm:column_Start%20station%20number"
const StartStation = "scm:column_Start%20station"
const endStationNumber = "scm:column_End%20station%20number"
const endStation = "scm:column_End%20station"
const duration = "scm:column_Duration"
and (
triple('v:CSV ID', 'type','scm:CSV').eq('v:CSV ID', id)
.triple('v:CSV ID', 'scm:csv_row', 'v:CSV Row')
.triple('v:CSV Row', startStationNumber, 'v:Start Station Number')
.triple('v:CSV Row', StartStation, 'v:Start Station')
.triple('v:CSV Row', endStationNumber, 'v:End Station Number')
.triple('v:CSV Row', endStation, 'v:End Station')
.triple('v:CSV Row', duration, 'v:Duration')
)
All the rows of CSV matching the column names will be displayed on running this query
So yeah that’s it, you can do all sorts of queries on your CSV
How to update CSV
I am going to show an example in which I change the duration of Bike number W22558 to 500 in my CSV file and I am going to update this change. So we go to Document Page and click on Add CSVs and add the CSV of interest
Since bike_tutorial-1.csv already exists, console will automatically prompt you to update the CSV. Input a commit message of what we have changed and hit on Upload CSV.
The brilliant part about the update is that only diffs are considered and updated which makes update of big sized files more efficient. To understand this, Go to DB Home Page -> Latest Updates
Click on your recent update, The only change on update was the row for bike W22558 of duration 417 was removed and a new row for bike W22558 with duration 500 was added instead. This makes updates of CSVs much more quick 😊😊😊😊😊
You can also time travel on the database to see the diffs on CSV files over time.
Export your CSV
Now that we have imported our CSVs into our database we can do whatever we like or modify it however we want and Export this CSV back.
Go to Documents, and filter for all CSV types to get a download option as shown and download the CSV.
So here we go, this tutorial shows a very simple way to import CSVs into TerminusDB and play with them and do all sorts of useful cool queries, whatever you like.
Earlier we had to do all the steps mentioned in TerminusDB: Importing, Reordering & Exporting a CSV not anymore!
Now it’s just a few button clicks and voilà!