NodeSet API Reference
The NodeSet class is used for bulk loading nodes into Neo4j. It provides methods for adding nodes, creating indexes, and loading data efficiently.
Class Definition
NodeSet
NodeSet(labels: list[str] = None, merge_keys: list[str] = None, default_props: dict = None, preserve: list[str] = None, append_props: list[str] = None, additional_labels: list[str] = None, indexed: bool = False, deduplicate: bool = False)
Container for a set of Nodes with the same labels and the same properties that define uniqueness.
Initialize a NodeSet for bulk loading nodes to Neo4j.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| labels | list[str] | Node labels to apply to all nodes in this set | None |
| merge_keys | list[str] | Properties that define node uniqueness for MERGE operations | None |
| default_props | dict | Properties applied to all nodes by default | None |
| preserve | list[str] | Properties that should not be overwritten during MERGE operations | None |
| append_props | list[str] | Properties that should be appended as arrays during MERGE | None |
| additional_labels | list[str] | Additional labels to apply during CREATE/MERGE operations | None |
| indexed | bool | Whether indexes have been created for this NodeSet | False |
| deduplicate | bool | If True, automatically prevents duplicate nodes based on merge_keys | False |
add_node
Create a node in this NodeSet.
When deduplicate=True, nodes with duplicate merge_keys are skipped unless force=True.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| properties | dict or NodeModel instance | Node properties as dict or OGM instance. | required |
| force | bool | Force adding the node even if deduplicate is True and a duplicate exists. | False |
add
Add a node to this NodeSet (alias for add_node).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| properties | dict or NodeModel instance | Node properties as dict or OGM instance. | required |
| force | bool | Force adding the node even if deduplicate is True and a duplicate exists. | False |
create
Create all nodes from NodeSet.
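Bulk loaders for Neo4j commonly send nodes in batches through a parameterized `UNWIND` statement. The following is a minimal sketch of that general pattern, not the library's actual implementation: the helper names `batches` and `create_query` are hypothetical, and the query NodeSet really issues may differ.

```python
from itertools import islice


def batches(rows, batch_size):
    """Yield successive lists holding at most batch_size rows each."""
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def create_query(labels):
    """Build an illustrative UNWIND-based CREATE statement (hypothetical helper)."""
    label_string = ":".join(labels)
    return f"UNWIND $rows AS row CREATE (n:{label_string}) SET n = row"


# Five property dicts split into batches of two.
rows = [{"name": f"person-{i}"} for i in range(5)]
query = create_query(["Person"])
batch_sizes = [len(batch) for batch in batches(rows, 2)]
```

Each batch would be passed as the `$rows` parameter of one query execution, which keeps individual transactions small.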
merge
merge(graph, merge_properties=None, batch_size=None, preserve=None, append_props=None, database=None)
Merge nodes from NodeSet on merge properties.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| merge_properties | | The merge properties. | None |
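To make the interplay of merge keys, `preserve`, and `append_props` concrete, here is a small pure-Python model of what a matched node's properties could look like after a merge. It is a sketch based only on the parameter descriptions in this reference: `merge_node` is a hypothetical function, and the real behavior is determined by the Cypher the library generates.

```python
def merge_node(existing, incoming, merge_keys, preserve=(), append_props=()):
    """Return the properties a matched node would hold after merging (illustrative model)."""
    result = dict(existing)
    for key, value in incoming.items():
        if key in merge_keys:
            continue  # merge keys identify the node, so they stay unchanged
        if key in preserve and key in existing:
            continue  # preserved properties keep their existing value
        if key in append_props:
            current = result.get(key, [])
            if not isinstance(current, list):
                current = [current]
            result[key] = current + [value]  # appended as an array element
        else:
            result[key] = value  # everything else is overwritten
    return result


existing = {"email": "alice@example.com", "name": "Alice",
            "created": "2020", "sources": ["crm"]}
incoming = {"email": "alice@example.com", "name": "Alice B.",
            "created": "2024", "sources": "web"}

merged = merge_node(existing, incoming, merge_keys=["email"],
                    preserve=["created"], append_props=["sources"])
```

In this model `name` is overwritten, `created` keeps its original value, and `sources` grows from `["crm"]` to `["crm", "web"]`.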
create_index
Create indexes for all label/merge key combinations, plus a composite index if multiple merge keys exist.
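As an illustration of "all label/merge key combinations plus a composite index", the sketch below enumerates the index statements such a call might produce. The function `index_statements` is hypothetical, and the `CREATE INDEX IF NOT EXISTS FOR ... ON ...` form assumes a Neo4j version that supports that syntax; the statements actually issued by `create_index` may differ.

```python
def index_statements(labels, merge_keys):
    """List one index per label/key pair, plus a composite index per label (illustrative)."""
    statements = []
    for label in labels:
        for key in merge_keys:
            statements.append(
                f"CREATE INDEX IF NOT EXISTS FOR (n:{label}) ON (n.{key})"
            )
        if len(merge_keys) > 1:
            # A composite index covers all merge keys together.
            props = ", ".join(f"n.{key}" for key in merge_keys)
            statements.append(
                f"CREATE INDEX IF NOT EXISTS FOR (n:{label}) ON ({props})"
            )
    return statements


single = index_statements(["Person"], ["email"])
composite = index_statements(["Person"], ["first_name", "last_name"])
```

With one merge key only the single-property index is listed; with two keys, each label gets two single-property indexes and one composite index.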
Deduplication
Starting with the latest version, NodeSet supports built-in deduplication to prevent duplicate nodes based on merge keys:
```python
# Enable deduplication
nodeset = NodeSet(['Person'], merge_keys=['email'], deduplicate=True)

# Add nodes - duplicates are automatically skipped
nodeset.add({'name': 'Alice', 'email': 'alice@example.com'})
nodeset.add({'name': 'Alice Updated', 'email': 'alice@example.com'})  # Skipped

# Override deduplication for specific cases
nodeset.add({'name': 'Alice Forced', 'email': 'alice@example.com'}, force=True)  # Added
```
The deduplication feature:
- Uses an efficient internal index for O(1) duplicate detection
- Works with single or multiple merge keys
- Can be overridden on a per-node basis using force=True
- Applies to add_node(), add(), and add_nodes() methods
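The behavior listed above can be modeled with a set of merge-key tuples, which gives constant-time lookups on average. This is a sketch of the idea only: `DedupBuffer` is a hypothetical stand-in and not the internal index NodeSet actually uses, and it assumes merge-key values are hashable.

```python
class DedupBuffer:
    """Toy container that skips nodes whose merge-key values were already seen."""

    def __init__(self, merge_keys):
        self.merge_keys = merge_keys
        self.nodes = []
        self._seen = set()  # tuples of merge-key values, for fast membership tests

    def add(self, properties, force=False):
        """Store the node unless it is a duplicate; return True if it was stored."""
        key = tuple(properties.get(k) for k in self.merge_keys)
        if key in self._seen and not force:
            return False
        self._seen.add(key)
        self.nodes.append(properties)
        return True


buffer = DedupBuffer(merge_keys=["email"])
first = buffer.add({"name": "Alice", "email": "alice@example.com"})
second = buffer.add({"name": "Alice Updated", "email": "alice@example.com"})
forced = buffer.add({"name": "Alice Forced", "email": "alice@example.com"}, force=True)

# With several merge keys, only nodes matching on every key count as duplicates.
multi = DedupBuffer(merge_keys=["first_name", "last_name"])
multi.add({"first_name": "Alice", "last_name": "Smith"})
different = multi.add({"first_name": "Alice", "last_name": "Jones"})
```

Using a tuple of all merge-key values as the set entry is what lets the same mechanism handle single and multiple merge keys.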