Panama Papers: Import Data to Neo4J using Cypher

I downloaded the Panama Papers network data. I was hoping it would be all the data; sadly not. It is still interesting, however. The import process is not too tricky. The following Cypher commands will get the data into a running Neo4J database. Note there is a \" in the Addresses file that will break the import. Search for it and replace it with \ ". Data can be downloaded from here.
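The search-and-replace on the Addresses file can also be scripted. Here is a minimal Python sketch, assuming the offending sequence is a backslash directly followed by a straight double quote and that the file is UTF-8. The function names and the output path are my own, for illustration only.

```python
def fix_backslash_quote(text: str) -> str:
    """Put a space between a backslash and the double quote right after it."""
    return text.replace('\\"', '\\ "')


def fix_file(src: str, dst: str, encoding: str = "utf-8") -> int:
    """Write a repaired copy of src to dst; return how many sequences were fixed."""
    # newline="" keeps the original line endings untouched.
    with open(src, "r", encoding=encoding, newline="") as handle:
        raw = handle.read()
    with open(dst, "w", encoding=encoding, newline="") as handle:
        handle.write(fix_backslash_quote(raw))
    return raw.count('\\"')
```

Calling something like `fix_file('/path/Addresses.csv', '/path/Addresses_fixed.csv')` leaves the download untouched, so point the Addresses import at the fixed copy.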

To get the relationships in we have to do a bit of a hack, as you cannot generate a relationship type on the fly from a CSV file with Cypher. I will do this properly with a bit of Java soon.

Change the paths! This is for the Addresses:

USING PERIODIC COMMIT LOAD CSV WITH HEADERS FROM 'file:/path/Addresses.csv' AS line CREATE (:Addresses { address: line.address, icij_id: line.icij_id, valid_until: line.valid_until, country_codes: line.country_codes, countries: line.countries, node_id: toInt(line.node_id), sourceID: line.sourceID})

For the Intermediaries:

USING PERIODIC COMMIT LOAD CSV WITH HEADERS FROM 'file:/path/Intermediaries.csv' AS line CREATE (:Intermediaries { name: line.name, internal_id: line.internal_id, address: line.address, valid_until: line.valid_until, country_codes: line.country_codes, countries: line.countries, status: line.status, node_id: toInt(line.node_id), sourceID: line.sourceID})

Officers:

USING PERIODIC COMMIT LOAD CSV WITH HEADERS FROM 'file:/path/Officers.csv' AS line CREATE (:Officers { name: line.name, icij_id: line.icij_id, valid_until: line.valid_until, country_codes: line.country_codes, countries: line.countries, node_id: toInt(line.node_id), sourceID: line.sourceID})

Entities:

USING PERIODIC COMMIT LOAD CSV WITH HEADERS FROM 'file:/path/Entities.csv' AS line CREATE (:Entities { name: line.name, original_name: line.original_name, former_name: line.former_name, jurisdiction: line.jurisdiction, jurisdiction_description: line.jurisdiction_description, company_type: line.company_type, address: line.address, internal_id: line.internal_id, incorporation_date: line.incorporation_date, inactivation_date: line.inactivation_date, struck_off_date: line.struck_off_date, dorm_date: line.dorm_date, status: line.status, service_provider: line.service_provider, ibcRUC: toInt(line.ibcRUC) , country_codes: line.country_codes, countries: line.countries, note: line.note, valid_until: line.valid_until, node_id: toInt(line.node_id), sourceID: line.sourceID})

Finally the relationships, or edges. Note the hack: all relationships are of type ACCOC. This isn’t a big problem but offends me a little bit. I will post Java code that generates the graph directory from the files.

USING PERIODIC COMMIT LOAD CSV WITH HEADERS FROM 'file:/path/all_edges.csv' AS csvLine
MATCH (n1 { node_id: toInt(csvLine.node_1)}),(n2 { node_id: toInt(csvLine.node_2)})
CREATE (n1)-[:ACCOC {role: csvLine.rel_type}]->(n2)
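One possible way around the single relationship type, short of the Java route, is to generate one import statement per distinct rel_type value and run them one after another. The Python sketch below only builds the Cypher text. It assumes the edge file has the node_1, rel_type and node_2 columns used above, and it matches on node_id, the property set by the node imports. The helper names and the rule for turning a rel_type value into a type name are my own assumptions, not something the data or Neo4J prescribes.

```python
import csv
import re


def rel_type_name(raw: str) -> str:
    """Turn a free-text rel_type value into an upper-case identifier."""
    name = re.sub(r"[^0-9A-Za-z]+", "_", raw.strip()).strip("_").upper()
    if not name or name[0].isdigit():
        name = "REL_" + name  # a type name should not be empty or start with a digit
    return name


def distinct_rel_types(csv_file) -> list:
    """Collect the distinct, non-empty rel_type values from an open edge file."""
    reader = csv.DictReader(csv_file)
    return sorted({row["rel_type"] for row in reader if row.get("rel_type")})


def statement_for(raw: str, csv_url: str) -> str:
    """Build a LOAD CSV statement that imports only the rows with this rel_type."""
    # Escape backslashes and single quotes for the Cypher string literal.
    literal = raw.replace("\\", "\\\\").replace("'", "\\'")
    return (
        f"USING PERIODIC COMMIT LOAD CSV WITH HEADERS FROM '{csv_url}' AS csvLine\n"
        f"WITH csvLine WHERE csvLine.rel_type = '{literal}'\n"
        "MATCH (n1 { node_id: toInt(csvLine.node_1)}),(n2 { node_id: toInt(csvLine.node_2)})\n"
        f"CREATE (n1)-[:{rel_type_name(raw)}]->(n2)"
    )
```

To use it, open all_edges.csv, pass the file object to `distinct_rel_types`, and print `statement_for(value, 'file:/path/all_edges.csv')` for each value. Two different rel_type values could collapse to the same type name under this naming rule, so check the generated names before running anything.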

About phil

Complex systems scientist.