Neo4j in Docker - Max Heap Size Causes Hard crash 137

  1. Is this due to a Docker or an OSX limitation?

    NO Increase the amount of available RAM to Docker to resolve this issue.

  2. Is there a way I can modify these limits? If I drop the requested limit to 1GB, it will spin up, but still crashes once I run my heavy query (which is what caused the need for increased Heap space anyway).

    Docker Memory Settings

  3. The query that I'm running is a large-scale update across a lot of nodes (>150k) containing full-text attributes, so that they can be syncronised to ElasticSearch using the plug-in. Is there a way I can get Neo to step through doing, say, 500 nodes at a time, using only cypher (I'd rather avoid writing a script if I can, feels a little dirty for this).

    N/A This is a NEO4J specific question. It might be better to seperate this from the Docker questions listed above.


3.The query that I'm running is a large-scale update across a lot of nodes (>150k) containing full-text attributes, so that they can be syncronised to ElasticSearch using the plug-in. Is there a way I can get Neo to step through doing, say, 500 nodes at a time, using only cypher (I'd rather avoid writing a script if I can, feels a little dirty for this).

You can do this with the help of apoc plugin for neo4j, more specifically apoc.periodic.iterate or apoc.periodic.commit .

If you will use apoc.periodic.commit your first match should be specific like in example you mark which nodes have you already synced, because it sometimes fall in the loop:

call apoc.periodic.commit("
match (user:User) WHERE user.synced = false
with user limit {limit}
MERGE (city:City {name:user.city})
MERGE (user)-[:LIVES_IN]->(city)
SET user.synced =true
RETURN count(*)
",{limit:10000})

If you use apoc.periodic.iterate you can run it in parallel mode:

CALL apoc.periodic.iterate(
"MATCH (o:Order) WHERE o.date > '2016-10-13' RETURN o",
"with {o} as o MATCH (o)-[:HAS_ITEM]->(i) WITH o, sum(i.value) as value 
CALL apoc.es.post(host-or-port,index-or-null,type-or-null,
query-or-null,payload-or-null) yield value return *", {batchSize:100, parallel:true})

Note that there is no need for second MATCH clause and apoc.es.post is a function for apoc that can send post requests to elastic search. see documentation for more info