What do you do with the Panama Data?

The released Panama data comes as a Neo4J database, or as the files you can build one from, and it seems to me a little tricky to do much with. There is no detail beyond the attributes of the different entities, which limits us to looking at the relationships alone, and it is hard to judge the significance of those relationships without the context. That said, it's a fun data set to play with.

I decided to draw out some graphs of how things are connected via other things. Below is one of Officers connected to other Officers via *something* else, generated in R using igraph from the Neo4J data set. This produces a few clusters, each containing a relatively small number of nodes connected to others. The query that produces the graph is:

MATCH (n:Officers)-[:`officer of`]->(o)<-[:`officer of`]-(m:Officers) WHERE NOT id(n)=id(m) AND id(n)<id(m) RETURN n.name AS Officer1, m.name AS Officer2, count(o) AS Weight
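To make the Weight column concrete, here is a minimal Python sketch of the same idea: for every pair of officers, count how many things they are both an officer of. It is not the R/igraph code used for the graph. It assumes the edges have been exported as (officer, target) pairs, it keys on ids where the query returns names, and it ignores duplicate edges, so its counts could differ from the query's on messy data. The function name and the sample edges are made up for illustration.

```python
from collections import Counter, defaultdict
from itertools import combinations


def co_officer_weights(officer_of):
    """Count, for each pair of officers, how many targets they share.

    officer_of: iterable of (officer_id, target_id) pairs, one per edge.
    Returns a Counter keyed by (smaller_id, larger_id).
    """
    officers_by_target = defaultdict(set)
    for officer, target in officer_of:
        officers_by_target[target].add(officer)

    weights = Counter()
    for officers in officers_by_target.values():
        # sorted() plus combinations() yields each unordered pair once,
        # which mirrors the id(n) < id(m) condition in the Cypher query.
        for pair in combinations(sorted(officers), 2):
            weights[pair] += 1
    return weights


# Made-up edges, purely for illustration.
edges = [(1, "A"), (2, "A"), (3, "A"), (1, "B"), (2, "B"), (4, "C")]
weights = co_officer_weights(edges)
for (first, second), weight in sorted(weights.items()):
    print(first, second, weight)
```

Officer 4 shares nothing with anyone, so it never appears in a pair, just as it would be absent from the query result.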

[Figure: Panama Data - Officers Rels. Officers connected to other Officers via anything else.]

Panama Revisited

The people over at The International Consortium of Investigative Journalists have updated the released Panama data. It's not clear to me whether there is more data than they had already released, or whether the difference is that this time it is a ready-made Neo4J database. They provide two versions of the database, Windows and Mac. It's easy to get it to work in Linux: just copy the graph.db directory out of the archive into the databases directory of your Neo4J install.
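The copy step could be scripted along these lines. This is only a sketch: it assumes graph.db is a directory sitting directly under wherever the archive was unpacked, both paths in the example call are placeholders, and the function name is mine. It is best run while Neo4J is stopped.

```python
import shutil
from pathlib import Path


def install_graph_db(extracted_dir, databases_dir):
    """Copy graph.db from an unpacked archive into a Neo4J databases directory."""
    source = Path(extracted_dir) / "graph.db"
    target = Path(databases_dir) / "graph.db"
    if not source.is_dir():
        raise FileNotFoundError(f"no graph.db directory under {extracted_dir}")
    if target.exists():
        # Refuse to overwrite an existing database; move it aside by hand first.
        raise FileExistsError(f"{target} already exists")
    shutil.copytree(source, target)
    return target


# Example call (both paths are placeholders, adjust to your machine):
# install_graph_db("/path/to/unpacked-archive", "/path/to/neo4j/databases")
```

Refusing to overwrite is a deliberate choice here, so that an existing database is never clobbered by accident.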

I made a quick query to look for officers with the same address. It seems there are some, but it would need something more sophisticated to dig any deeper.

MATCH (n:Officer)--(a:Address)--(m:Officer) RETURN n,a,m LIMIT 25

[Figure: graph]

Java Panama Papers Neo4J Network Generator

Further to the first attempt at importing the Panama Papers network data into Neo4J, I wrote a very quick Java program that creates an embedded Neo4J database. It needs a bit of checking, as it finds nodes that have the same node_id, which I assume is some sort of mistake in the program or the data. It also looks like there are some duplicate relationships.
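One way to tell whether the duplicates come from the data rather than the program is to count them in the CSV files directly. The sketch below is in Python rather than Java and is not part of the generator. It assumes the column names used elsewhere in these posts (node_id in the node files; node_1, rel_type and node_2 in the edges file), and the sample data is made up.

```python
import csv
import io
from collections import Counter


def duplicated(values):
    """Return the values that occur more than once, with their counts."""
    return {value: n for value, n in Counter(values).items() if n > 1}


def duplicate_node_ids(node_file):
    """node_file: an open CSV file with a node_id column."""
    return duplicated(row["node_id"] for row in csv.DictReader(node_file))


def duplicate_edges(edge_file):
    """edge_file: an open CSV file with node_1, rel_type and node_2 columns."""
    return duplicated(
        (row["node_1"], row["rel_type"], row["node_2"])
        for row in csv.DictReader(edge_file)
    )


# Tiny made-up samples standing in for the real files.
nodes = io.StringIO("name,node_id\nA,1\nB,2\nC,1\n")
edges = io.StringIO(
    "node_1,rel_type,node_2\n1,officer_of,2\n1,officer_of,2\n2,officer_of,1\n"
)
print(duplicate_node_ids(nodes))
print(duplicate_edges(edges))
```

To catch a node_id that is reused across different node files, the ids from all of the files would have to be fed through duplicated() together.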

This program generates relationships of the different types, such as ‘officer_of’, rather than using the hack needed to get Cypher to import the data (see earlier post).

The code can be found on my new GitHub.

Below is Blairmore, Ian Cameron, the intermediary, and loads of other companies that use the same intermediary.

[Figure: Fig2]

Not many direct links to Blairmore.

Panama Papers: Import Data to Neo4J using Cypher

I downloaded the Panama Papers network data. I was hoping it would be all the data; sadly not. It is still interesting, however. The import process is not too tricky. The following Cypher commands will get the data into a running Neo4J database. Note there is a \" in the Addresses file that will break the import. Search for it and replace it with \ " (a backslash, a space, then the quote). Data can be downloaded from here.
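The search and replace can be done in any editor; as a small sketch, the same fix in Python looks like this. The UTF-8 encoding is an assumption about the file, the path in the example call is a placeholder, and the function names are mine.

```python
def fix_escaped_quotes(text):
    """Put a space between a backslash and the double quote that follows it."""
    return text.replace('\\"', '\\ "')


def fix_file(path, encoding="utf-8"):
    """Rewrite the file at path in place with the replacement applied."""
    # newline="" keeps the file's original line endings untouched.
    with open(path, encoding=encoding, newline="") as handle:
        original = handle.read()
    with open(path, "w", encoding=encoding, newline="") as handle:
        handle.write(fix_escaped_quotes(original))


# Example call (placeholder path):
# fix_file("/path/Addresses.csv")
```

It overwrites the file, so keep a copy of the original download.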

To get the relationships in we have to do a bit of a hack, as you cannot generate a relationship type on the fly from a CSV file with Cypher. I will do this properly with a bit of Java soon.

Change the paths! This is for the Addresses:

USING PERIODIC COMMIT LOAD CSV WITH HEADERS FROM 'file:/path/Addresses.csv' AS line CREATE (:Addresses { address: line.address, icij_id: line.icij_id, valid_until: line.valid_until, country_codes: line.country_codes, countries: line.countries, node_id: toInt(line.node_id), sourceID: line.sourceID})

For the Intermediaries:

USING PERIODIC COMMIT LOAD CSV WITH HEADERS FROM 'file:/path/Intermediaries.csv' AS line CREATE (:Intermediaries { name: line.name, internal_id: line.internal_id, address: line.address, valid_until: line.valid_until, country_codes: line.country_codes, countries: line.countries, status: line.status, node_id: toInt(line.node_id), sourceID: line.sourceID})

Officers:

USING PERIODIC COMMIT LOAD CSV WITH HEADERS FROM 'file:/path/Officers.csv' AS line CREATE (:Officers { name: line.name, icij_id: line.icij_id, valid_until: line.valid_until, country_codes: line.country_codes, countries: line.countries, node_id: toInt(line.node_id), sourceID: line.sourceID})

Entities:

USING PERIODIC COMMIT LOAD CSV WITH HEADERS FROM 'file:/path/Entities.csv' AS line CREATE (:Entities { name: line.name, original_name: line.original_name, former_name: line.former_name, jurisdiction: line.jurisdiction, jurisdiction_description: line.jurisdiction_description, company_type: line.company_type, address: line.address, internal_id: line.internal_id, incorporation_date: line.incorporation_date, inactivation_date: line.inactivation_date, struck_off_date: line.struck_off_date, dorm_date: line.dorm_date, status: line.status, service_provider: line.service_provider, ibcRUC: toInt(line.ibcRUC) , country_codes: line.country_codes, countries: line.countries, note: line.note, valid_until: line.valid_until, node_id: toInt(line.node_id), sourceID: line.sourceID})

Finally the relationships, or edges. Note the hack: all relationships are of type ACCOC, with the real type stored in a role property. This isn’t a big problem but offends me a little bit. I will post some Java code that generates the graph directory from the files.

USING PERIODIC COMMIT LOAD CSV WITH HEADERS FROM 'file:/path/all_edges.csv' AS csvLine
// The nodes were created with a node_id property, so match on that rather than id
MATCH (n1 { node_id: toInt(csvLine.node_1)}),(n2 { node_id: toInt(csvLine.node_2)})
CREATE (n1)-[:ACCOC {role: csvLine.rel_type}]->(n2)
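Short of writing the Java, one possible way around the fixed relationship type is to generate a separate import statement for each distinct rel_type, each filtering the CSV down to its own rows. The Python sketch below only builds the Cypher text; it does not talk to Neo4J, and I have not run its output against the real data. It assumes the nodes carry the node_id property set by the CREATE statements above. The helper names and the sample rows are made up.

```python
import csv
import io


def distinct_rel_types(edge_file):
    """Collect the distinct rel_type values from an open edges CSV file."""
    return sorted({row["rel_type"] for row in csv.DictReader(edge_file)})


def import_statement(rel_type, csv_url="file:/path/all_edges.csv"):
    """Build one LOAD CSV statement that creates only edges of rel_type."""
    # Escape for a single-quoted Cypher string literal.
    literal = rel_type.replace("\\", "\\\\").replace("'", "\\'")
    # Escape for a backtick-quoted relationship type name.
    name = rel_type.replace("`", "``")
    return (
        f"USING PERIODIC COMMIT LOAD CSV WITH HEADERS FROM '{csv_url}' AS csvLine\n"
        f"WITH csvLine WHERE csvLine.rel_type = '{literal}'\n"
        "MATCH (n1 { node_id: toInt(csvLine.node_1)}),"
        "(n2 { node_id: toInt(csvLine.node_2)})\n"
        f"CREATE (n1)-[:`{name}`]->(n2)"
    )


# Made-up rows standing in for all_edges.csv.
sample = io.StringIO(
    "node_1,rel_type,node_2\n1,officer_of,2\n3,intermediary_of,2\n4,officer_of,2\n"
)
for rel_type in distinct_rel_types(sample):
    print(import_statement(rel_type) + ";\n")
```

Each generated statement reads the whole edges file again, so this trades import time for properly typed relationships.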