NodeSet API Reference

The NodeSet class is used for bulk loading nodes into Neo4j. It provides methods for adding nodes, creating indexes, and loading data efficiently.

Class Definition

NodeSet

NodeSet(labels: list[str] = None, merge_keys: list[str] = None, default_props: dict = None, preserve: list[str] = None, append_props: list[str] = None, additional_labels: list[str] = None, indexed: bool = False, deduplicate: bool = False)

Container for a set of Nodes with the same labels and the same properties that define uniqueness.

Initialize a NodeSet for bulk loading nodes to Neo4j.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| labels | list[str] | Node labels to apply to all nodes in this set | None |
| merge_keys | list[str] | Properties that define node uniqueness for MERGE operations | None |
| default_props | dict | Properties applied to all nodes by default | None |
| preserve | list[str] | Properties that should not be overwritten during MERGE operations | None |
| append_props | list[str] | Properties that should be appended as arrays during MERGE | None |
| additional_labels | list[str] | Additional labels to apply during CREATE/MERGE operations | None |
| indexed | bool | Whether indexes have been created for this NodeSet | False |
| deduplicate | bool | If True, automatically prevents duplicate nodes based on merge_keys | False |

add_node

add_node(properties, force=False)

Create a node in this NodeSet.

When deduplicate=True, nodes with duplicate merge_keys are skipped unless force=True.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| properties | dict or NodeModel instance | Node properties as dict or OGM instance. | required |
| force | bool | Force adding the node even if deduplicate is True and a duplicate exists. | False |

add

add(properties, force=False)

Add a node to this NodeSet (alias for add_node).

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| properties | dict or NodeModel instance | Node properties as dict or OGM instance. | required |
| force | bool | Force adding the node even if deduplicate is True and a duplicate exists. | False |

add_nodes

add_nodes(list_of_properties, force=False)

create

create(graph: Driver, database: str = DEFAULT_DATABASE, batch_size=None)

Create all nodes from NodeSet.
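
The batch_size parameter suggests that nodes are sent to the database in chunks rather than in a single request. This page does not describe how the chunks are formed, so the following is only a minimal sketch of the general idea. The helper name `batches` is hypothetical and not part of NodeSet.

```python
from itertools import islice


def batches(items, batch_size):
    """Yield consecutive lists of at most batch_size items (hypothetical helper)."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, batch_size))
        if not chunk:
            return
        yield chunk


# Five node property dicts split into chunks of two.
nodes = [{"id": i} for i in range(5)]
chunks = list(batches(nodes, 2))
```

Each chunk could then be passed as the parameter of one write query, which keeps individual transactions small when a NodeSet holds many nodes.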

merge

merge(graph, merge_properties=None, batch_size=None, preserve=None, append_props=None, database=None)

Merge nodes from NodeSet on merge properties.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| merge_properties | | The merge properties. | None |
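
The preserve and append_props options change what happens when a merged node already exists. The sketch below models that behavior on plain Python dicts, based only on the parameter descriptions on this page. It is not the library's implementation, which runs inside the database. The function name `merge_on_match` and the handling of edge cases (for example, wrapping a non-list value before appending) are assumptions.

```python
def merge_on_match(existing, incoming, preserve=(), append_props=()):
    """Model how properties might combine when a node already exists.

    Hypothetical illustration: preserved properties keep their existing
    value, appended properties accumulate in a list, everything else is
    overwritten by the incoming value.
    """
    merged = dict(existing)
    for key, value in incoming.items():
        if key in append_props:
            current = merged.get(key, [])
            if not isinstance(current, list):
                current = [current]  # assumption: wrap scalars before appending
            merged[key] = current + [value]
        elif key in preserve and key in merged:
            continue  # keep the value already stored on the node
        else:
            merged[key] = value
    return merged


existing = {"email": "alice@example.com", "name": "Alice", "tags": ["a"]}
incoming = {"email": "alice@example.com", "name": "Alice Updated", "tags": "b", "age": 30}
result = merge_on_match(existing, incoming, preserve=["name"], append_props=["tags"])
```

In this model the name stays "Alice" because it is preserved, "b" is appended to tags, and age is added as a new property.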

create_index

create_index(graph, database=None)

Create indexes for all label/merge key combinations, plus a composite index if multiple merge keys exist.
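
To make "all label/merge key combinations" concrete, the sketch below lists the index statements that such a scheme would produce for a given set of labels and merge keys. The helper `index_statements` is hypothetical, and the Cypher text uses the Neo4j 4.x+ `CREATE INDEX ... FOR ... ON` form as an assumption. The statements that create_index actually issues may differ.

```python
def index_statements(labels, merge_keys):
    """Build illustrative Cypher index statements (hypothetical helper)."""
    statements = []
    for label in labels:
        # One single-property index per label/merge key combination.
        for key in merge_keys:
            statements.append(
                f"CREATE INDEX IF NOT EXISTS FOR (n:{label}) ON (n.{key})"
            )
        # One composite index per label when there are several merge keys.
        if len(merge_keys) > 1:
            props = ", ".join(f"n.{key}" for key in merge_keys)
            statements.append(
                f"CREATE INDEX IF NOT EXISTS FOR (n:{label}) ON ({props})"
            )
    return statements


statements = index_statements(["Person"], ["first_name", "last_name"])
```

For one label and two merge keys this yields two single-property statements and one composite statement.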

Deduplication

Starting with the latest version, NodeSet supports built-in deduplication to prevent duplicate nodes based on merge keys:

# Assumes NodeSet is importable from the graphio package
from graphio import NodeSet

# Enable deduplication
nodeset = NodeSet(['Person'], merge_keys=['email'], deduplicate=True)

# Add nodes - duplicates are automatically skipped
nodeset.add({'name': 'Alice', 'email': 'alice@example.com'})
nodeset.add({'name': 'Alice Updated', 'email': 'alice@example.com'})  # Skipped

# Override deduplication for specific cases
nodeset.add({'name': 'Alice Forced', 'email': 'alice@example.com'}, force=True)  # Added

The deduplication feature:

- Uses an efficient internal index for O(1) duplicate detection
- Works with single or multiple merge keys
- Can be overridden on a per-node basis using force=True
- Applies to add_node(), add(), and add_nodes() methods
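
One common way to get O(1) duplicate detection is to keep a set of tuples built from the merge key values. The following is a minimal sketch of that technique, not the library's actual code. The class `DedupIndex` and its attributes are hypothetical, and it assumes merge key values are hashable.

```python
class DedupIndex:
    """Toy stand-in for a deduplicating node container (hypothetical)."""

    def __init__(self, merge_keys):
        self.merge_keys = list(merge_keys)
        self.nodes = []
        self._seen = set()  # tuples of merge key values already added

    def _key(self, properties):
        # A missing merge key is treated as None instead of raising.
        return tuple(properties.get(k) for k in self.merge_keys)

    def add(self, properties, force=False):
        """Add a node; return False if it was skipped as a duplicate."""
        key = self._key(properties)
        if key in self._seen and not force:
            return False
        self._seen.add(key)
        self.nodes.append(dict(properties))
        return True


index = DedupIndex(["email"])
added = [
    index.add({"name": "Alice", "email": "alice@example.com"}),
    index.add({"name": "Alice Updated", "email": "alice@example.com"}),
    index.add({"name": "Alice Forced", "email": "alice@example.com"}, force=True),
]
```

Set membership tests take constant time on average, so the check does not slow down as more nodes are added. With several merge keys the tuple simply has more elements.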