Version: 2.2.1

CSV import tool

Related - Reference Guide

CSV is a universal and very versatile data format used to store large quantities of data. Each Memgraph database instance includes a CSV import tool called mg_import_csv. The CSV import tool should be used for initial bulk ingestion of data into the database. Upon ingestion, the CSV importer creates a snapshot that will be used by the database to recover its state on its next startup.

If you are already familiar with the Neo4j bulk import tool, then using the mg_import_csv tool should be easy. The CSV import tool is fully compatible with the Neo4j CSV format. If you already have a pipeline set up for Neo4j, you only need to replace neo4j-admin import with mg_import_csv.

info

For more detailed information about the CSV import tool, check our Reference guide.

Importing CSV data using the mg_import_csv tool should be a one-time operation done before running Memgraph. In other words, this tool should not be used to import data into an already running Memgraph instance.

If you are using Docker, you need to copy the CSV files to a location where the Docker container can see them before starting the import.

Please check the examples below to find out how to use the import tool based on the complexity of your data.

Examples

Here are two examples of how to use the CSV import tool depending on the complexity of your data:

One type of nodes and relationships

Let's import a simple dataset.

Download the people_nodes.csv file with the following content:

id:ID(PERSON_ID),name:string,:LABEL
100,Daniel,Person
101,Alex,Person
102,Sarah,Person
103,Mia,Person
104,Lucy,Person

Download the people_relationships.csv file with the following content:

:START_ID(PERSON_ID),:END_ID(PERSON_ID),:TYPE
100,102,IS_FRIENDS_WITH
103,101,IS_FRIENDS_WITH
102,103,IS_FRIENDS_WITH
101,104,IS_FRIENDS_WITH
104,100,IS_FRIENDS_WITH
101,102,IS_FRIENDS_WITH
100,103,IS_FRIENDS_WITH
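If downloading the files is not convenient, the same content can be written locally. The following is a minimal sketch that recreates both files in the current directory with shell here-documents; the only assumption is that you run it from the directory you will later copy the files from.

```shell
# Write the node file: one header line plus five Person rows.
cat > people_nodes.csv <<'EOF'
id:ID(PERSON_ID),name:string,:LABEL
100,Daniel,Person
101,Alex,Person
102,Sarah,Person
103,Mia,Person
104,Lucy,Person
EOF

# Write the relationship file: one header line plus seven IS_FRIENDS_WITH rows.
cat > people_relationships.csv <<'EOF'
:START_ID(PERSON_ID),:END_ID(PERSON_ID),:TYPE
100,102,IS_FRIENDS_WITH
103,101,IS_FRIENDS_WITH
102,103,IS_FRIENDS_WITH
101,104,IS_FRIENDS_WITH
104,100,IS_FRIENDS_WITH
101,102,IS_FRIENDS_WITH
100,103,IS_FRIENDS_WITH
EOF

# Show the line counts as a quick sanity check.
wc -l people_nodes.csv people_relationships.csv
```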

Let's import the dataset using the CSV import tool. We will be importing 2 CSV files.

danger

Your existing snapshot and WAL data will be considered obsolete, and Memgraph will load the new dataset. This means that all of your existing data will be lost and replaced with the newly imported data.

If your Memgraph Docker container is running, you need to stop it before starting the import process.

If you are using Docker, first copy the CSV files to a location where the Docker container can see them:

docker container create --user memgraph --name mg_import_helper -v mg_import:/import-data busybox
docker cp people_nodes.csv mg_import_helper:/import-data
docker cp people_relationships.csv mg_import_helper:/import-data
docker rm mg_import_helper

Then, run the import tool with the following command, but check three things first:

  1. Check that the image name you are using is correct:
    • If you downloaded Memgraph Platform, leave the current image name memgraph/memgraph-platform.
    • If you downloaded MemgraphDB, replace the current image name with memgraph.
    • If you downloaded MAGE, replace the current image name with memgraph/memgraph-mage.

  2. If you are using Docker on Windows and execute commands in PowerShell, change the line continuation character from \ to `.
  3. Check that the paths of the files you want to import are correct.

docker run --user="memgraph" -v mg_lib:/var/lib/memgraph -v mg_import:/import-data \
--entrypoint=mg_import_csv memgraph/memgraph-platform \
--nodes /import-data/people_nodes.csv \
--relationships /import-data/people_relationships.csv

If you get a --nodes flag is required! error, the paths to the files are incomplete or you are missing them completely.
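One way to reduce the risk of missing or mistyped paths is to assemble the command from a list of file names and review it before running it. The sketch below only prints the command and does not start Docker. The function name build_import_cmd and the MG_IMAGE variable are hypothetical helpers introduced for illustration, and the sketch assumes file names without spaces.

```shell
# Assumption: set MG_IMAGE to the image you downloaded
# (memgraph/memgraph-platform, memgraph, or memgraph/memgraph-mage).
MG_IMAGE="${MG_IMAGE:-memgraph/memgraph-platform}"

# Hypothetical helper: print the mg_import_csv invocation for review.
# $1 is a space-separated list of node files, $2 a list of relationship files.
# All files are expected to live in the mg_import volume under /import-data.
build_import_cmd() {
  cmd="docker run --user=memgraph -v mg_lib:/var/lib/memgraph -v mg_import:/import-data --entrypoint=mg_import_csv $MG_IMAGE"
  for f in $1; do
    cmd="$cmd --nodes /import-data/$f"
  done
  for f in $2; do
    cmd="$cmd --relationships /import-data/$f"
  done
  printf '%s\n' "$cmd"
}

# Print the command for the two files used in this example.
build_import_cmd "people_nodes.csv" "people_relationships.csv"
```

Once the printed command looks right, copy it into the terminal to run the import.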

Next time you run Memgraph, the dataset will be loaded.

docker run -it -p 7687:7687 -p 7444:7444 -p 3000:3000 -v mg_lib:/var/lib/memgraph memgraph/memgraph-platform

For information on other options, run:

docker run --entrypoint=mg_import_csv memgraph/memgraph-platform --help

After the import, the graph in Memgraph should contain five Person nodes connected by seven IS_FRIENDS_WITH relationships.
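To know what to expect from the import, you can count the data rows in each CSV file and compare the numbers with what Memgraph reports, for example with a Cypher query such as MATCH (n) RETURN count(n);. The following is a minimal sketch; count_rows is a hypothetical helper, and it assumes each file has exactly one header line.

```shell
# Hypothetical helper: count non-empty data rows, skipping the header line.
count_rows() {
  awk 'NR > 1 && NF > 0 { n++ } END { print n + 0 }' "$1"
}

# Report the expected number of nodes and relationships if the files are present.
for f in people_nodes.csv people_relationships.csv; do
  if [ -f "$f" ]; then
    echo "$f: $(count_rows "$f") data rows"
  fi
done
```

For the two files in this example, the sketch should report 5 data rows for the nodes and 7 for the relationships.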

Multiple types of nodes and relationships

The previous example showcased a simple graph with one node type and one relationship type. If you have a more complex graph, the procedure is similar. Download the four CSV files that define the dataset: people_nodes.csv, people_relationships.csv, restaurants_nodes.csv and restaurants_relationships.csv.

You can check the contents of the files and their descriptions below.

The people_nodes.csv file contains the people nodes, each with name, age and city properties and a label.

id:ID(PERSON_ID),name:string,age:int,city:string,:LABEL
100,Daniel,30,London,Person
101,Alex,15,Paris,Person
102,Sarah,17,London,Person
103,Mia,25,Zagreb,Person
104,Lucy,21,Paris,Person
105,Adam,23,New York,Person
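The header above declares age as an int column, so it can be worth checking the values before the import. The following is a rough sketch, not part of the import tool: check_int_columns is a hypothetical helper that looks only at columns whose header ends in :int, and it assumes plain comma-separated fields without quoted commas.

```shell
# Hypothetical helper: report values in :int columns that are not integers.
# Returns a non-zero exit status if any invalid value is found.
check_int_columns() {
  awk -F',' '
    NR == 1 {
      # Remember which columns are declared as int in the header.
      for (i = 1; i <= NF; i++) if ($i ~ /:int$/) is_int[i] = 1
      next
    }
    {
      for (i in is_int)
        if ($i !~ /^-?[0-9]+$/) {
          printf "line %d: column %d is not an int: %s\n", NR, i, $i
          bad = 1
        }
    }
    END { exit bad + 0 }
  ' "$1"
}

# Check the node file from this example if it is present.
if [ -f people_nodes.csv ]; then
  check_int_columns people_nodes.csv && echo "all :int columns look valid"
fi
```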

Let's import 4 files using the CSV import tool.

If you are using Docker, first copy the CSV files to a location where the Docker container can see them:

docker container create --user memgraph --name mg_import_helper -v mg_import:/import-data busybox
docker cp people_nodes.csv mg_import_helper:/import-data
docker cp people_relationships.csv mg_import_helper:/import-data
docker cp restaurants_nodes.csv mg_import_helper:/import-data
docker cp restaurants_relationships.csv mg_import_helper:/import-data
docker rm mg_import_helper

Then, run the import tool with the following command, but check three things first:

  1. Check that the image name you are using is correct:
    • If you downloaded Memgraph Platform, leave the current image name memgraph/memgraph-platform.
    • If you downloaded MemgraphDB, replace the current image name with memgraph.
    • If you downloaded MAGE, replace the current image name with memgraph/memgraph-mage.

  2. If you are using Docker on Windows and execute commands in PowerShell, change the line continuation character from \ to `.
  3. Check that the paths of the files you want to import are correct.

docker run --user="memgraph" -v mg_lib:/var/lib/memgraph -v mg_etc:/etc/memgraph -v mg_import:/import-data \
--entrypoint=mg_import_csv memgraph/memgraph-platform \
--nodes /import-data/people_nodes.csv \
--nodes /import-data/restaurants_nodes.csv \
--relationships /import-data/people_relationships.csv \
--relationships /import-data/restaurants_relationships.csv

The next time you run Memgraph, the dataset will be loaded:

docker run -it -p 7687:7687 -p 7444:7444 -p 3000:3000 -v mg_lib:/var/lib/memgraph memgraph/memgraph-platform

For information on other options, run:

docker run --entrypoint=mg_import_csv memgraph/memgraph-platform --help

After the import, the graph in Memgraph should contain the people and restaurant nodes together with the relationships defined in the two relationship files.