Version: 2.0.1

CSV Import Tool

CSV is a universal and very versatile data format used to store large quantities of data. Each Memgraph database instance has a CSV import tool installed called mg_import_csv. The CSV import tool should be used for initial bulk ingestion of data into the database. Upon ingestion, the CSV importer creates a snapshot that will be used by the database to recover its state on its next startup.

If you are already familiar with the Neo4j bulk import tool, then using the mg_import_csv tool should be easy. The CSV import tool is fully compatible with the Neo4j CSV format. If you already have a pipeline set-up for Neo4j, you should only replace neo4j-admin import with mg_import_csv.

info

For more detailed information about the CSV Import Tool, check our Reference guide.

How to use the CSV Import Tool?

Docker 🐳
Linux

note

If you installed Memgraph through Docker Hub, the name of the Docker image memgraph should be replaced with memgraph/memgraph-platform if you didn't change the image tag.

If you installed Memgraph using Docker, you will need to run the importer using the following command:

docker run -v mg_lib:/var/lib/memgraph -v mg_import:/import-data --entrypoint=mg_import_csv memgraph

caution

This is an incomplete command as it's missing the files that need to be imported. It will result with a The --nodes flag is required! error. You can find a complete example below.

For information on other options, run:

docker run --entrypoint=mg_import_csv memgraph --help

The import tool is run from the console, using the mg_import_csv command. The tool should be run as user memgraph, using the following command:

sudo -u memgraph mg_import_csv

For information on other options, run:

sudo -u memgraph mg_import_csv --help

Below, you can find two examples of how to use the CSV Import Tool depending on the complexity of your data:

One type of nodes and relationships
Multiple types of nodes and relationships

caution

It is also important to note that importing CSV data using the mg_import_csv command should be a one-time operation before running Memgraph. In other words, this tool should not be used to import data into an already running Memgraph instance.

Examples

One type of nodes and relationships

Let's import a simple dataset.

1. people_nodes.csv
2. people_relationships.csv

Store the following in people_nodes.csv:

id:ID(PERSON_ID),name:string,:LABEL
100,Daniel,Person
101,Alex,Person
102,Sarah,Person
103,Mia,Person
104,Lucy,Person

Let's add relationships between people in people_relationships.csv:

:START_ID(PERSON_ID),:END_ID(PERSON_ID),:TYPE
100,102,IS_FRIENDS_WITH
103,101,IS_FRIENDS_WITH
102,103,IS_FRIENDS_WITH
101,104,IS_FRIENDS_WITH
104,100,IS_FRIENDS_WITH
101,102,IS_FRIENDS_WITH
100,103,IS_FRIENDS_WITH

Now, you can import the dataset using the CSV Import Tool.

caution

Your existing snapshot and WAL data will be considered obsolete, and Memgraph will load the new dataset.

Use the following command:

Docker 🐳
Linux

If using Docker, things are a bit more complicated. First you need to copy the CSV files where the Docker image can see them:

docker container create --user memgraph --name mg_import_helper -v mg_import:/import-data busybox
docker cp people_nodes.csv mg_import_helper:/import-data
docker cp people_relationships.csv mg_import_helper:/import-data
docker rm mg_import_helper

Then, run the importer with the following:

docker run -v mg_lib:/var/lib/memgraph -v mg_import:/import-data \
  --entrypoint=mg_import_csv memgraph \
  --nodes /import-data/people_nodes.csv \
  --relationships /import-data/people_relationships.csv

Next time you run Memgraph, the dataset will be loaded:

 docker run -p 7687:7687 -v mg_lib:/var/lib/memgraph memgraph

sudo -u memgraph mg_import_csv --nodes people_nodes.csv --relationships people_relationships.csv

Next time you run Memgraph, the dataset will be loaded.

Multiple types of nodes and relationships

The previous example is showcasing a simple graph with one node type and one relationship type. If we have more complex graphs, the procedure is similar. Let's define the following dataset:

1. people_nodes.csv
2. people_relationships.csv
3. restaurants_nodes.csv
4. restaurants_relationships.csv

Add the following to people_nodes.csv:

id:ID(PERSON_ID),name:string,age:int,city:string,:LABEL
100,Daniel,30,London,Person
101,Alex,15,Paris,Person
102,Sarah,17,London,Person
103,Mia,25,Zagreb,Person
104,Lucy,21,Paris,Person
105,Adam,23,New York,Person

Let's define the relationships between people in people_relationships.csv :

:START_ID(PERSON_ID),:END_ID(PERSON_ID),:TYPE, met_in:int
100,102,IS_FRIENDS_WITH,2014
103,105,IS_FRIENDS_WITH,2021
102,103,IS_FRIENDS_WITH,2005
101,104,IS_FRIENDS_WITH,2005
104,100,IS_FRIENDS_WITH,2018
105,102,IS_FRIENDS_WITH,2017
100,103,IS_FRIENDS_WITH,2001

Let's introduce another node type, restaurants, in restaurants_nodes.csv :

id:ID(REST_ID),name:string,menu:string[],:LABEL
200,Mc Donalds,Fries;BigMac;McChicken;Apple Pie,Restaurant
201,KFC,Fried Chicken;Fries;Chicken Bucket,Restaurant
202,Subway,Ham Sandwich;Turkey Sandwich;Foot-long,Restaurant
203,Dominos,Pepperoni Pizza;Double Dish Pizza;Cheese filled Crust,Restaurant

Let's define the relationships between people and restaurants in restaurants_relationships.csv:

:START_ID(PERSON_ID),:END_ID(REST_ID),:TYPE, liked:boolean
100,200,ATE_AT,true
103,201,ATE_AT,false
104,200,ATE_AT,true
101,202,ATE_AT,false
101,203,ATE_AT,false
101,200,ATE_AT,true
102,201,ATE_AT,true

After preparing the files above, you can import the dataset using the CSV Import tool.

Docker 🐳
Linux

If using Docker, things are a bit more complicated. First, you need to copy the CSV files where the Docker container can see them:

docker container create --user memgraph --name mg_import_helper -v mg_import:/import-data busybox
docker cp people_nodes.csv mg_import_helper:/import-data
docker cp people_relationships.csv mg_import_helper:/import-data
docker cp restaurants_nodes.csv mg_import_helper:/import-data
docker cp restaurants_relationships.csv mg_import_helper:/import-data
docker rm mg_import_helper

Then, run the importer with the following command:

docker run -v mg_lib:/var/lib/memgraph -v mg_etc:/etc/memgraph -v mg_import:/import-data \
  --entrypoint=mg_import_csv memgraph \
  --nodes /import-data/people_nodes.csv \
  --nodes /import-data/restaurants_nodes.csv \
  --relationships /import-data/people_relationships.csv \
  --relationships /import-data/restaurants_relationships.csv

Next time you run Memgraph, the dataset will be loaded:

 docker run -p 7687:7687 -v mg_lib:/var/lib/memgraph memgraph

sudo -u memgraph mg_import_csv --nodes people_nodes.csv --nodes restaurants_nodes.csv --relationships people_relationships.csv --relationships restaurants_relationships.csv

The next time you run Memgraph, the dataset will be loaded.

How to use the CSV Import Tool?​

Examples​

One type of nodes and relationships​

Multiple types of nodes and relationships​

How to use the CSV Import Tool?

Examples

One type of nodes and relationships

Multiple types of nodes and relationships