OBSOLETE -> latest version

Class NeoAccess

    VERSION 3.9.2  (OBSOLETE ; latest version)

    High-level class to interface with the Neo4j graph database from Python.
    Mostly tested on version 4.3 of Neo4j Community Edition, but should work with other 4.x versions, too.

    Conceptually, there are two parts to NeoAccess:
        1) A thin wrapper around the Neo4j python connectivity library "Neo4j Python Driver"
          that is documented at: https://neo4j.com/docs/api/python-driver/current/api.html

        2) A layer above, providing higher-level functionality for common database operations,
           such as lookup, creation, deletion, modification, import, indices, etc.

    SECTIONS IN THIS CLASS:
        * INIT
        * RUNNING GENERIC QUERIES
        * RETRIEVE DATA
        * FOLLOW LINKS
        * CREATE NODES
        * DELETE NODES
        * MODIFY FIELDS
        * RELATIONSHIPS
        * LABELS
        * INDEXES
        * CONSTRAINTS
        * READ IN DATA from PANDAS
        * JSON IMPORT/EXPORT
        * DEBUGGING SUPPORT

    Plus separate class "CypherUtils"

    ----------------------------------------------------------------------------------
    HISTORY and AUTHORS:
        - NeoAccess (this library) is a fork of NeoInterface;
                NeoAccess was created, and is being maintained, by Julian West,
                primarily in the context of the BrainAnnex.org open-source project.
                It started out in late 2021; for change log thru 2022,
                see the "LIBRARIES" entries in https://brainannex.org/viewer.php?ac=2&cat=14

        - NeoInterface was co-authored by Alexey Kuznetsov and Julian West in 2021,
                and is maintained by GSK pharmaceuticals
                with an Apache License 2.0 (https://github.com/GSK-Biostatistics/neointerface).
                NeoInterface is in part based on the earlier library Neo4jLiaison,
                as well as a library developed by Alexey Kuznetsov.

        - Neo4jLiaison, now deprecated, was authored by Julian West in 2020
                (https://github.com/BrainAnnex/neo4j-liaison)

name	arguments	returns
__init__	self, host=os.environ.get("NEO4J_HOST"), credentials=(os.environ.get("NEO4J_USER"), os.environ.get("NEO4J_PASSWORD")), apoc=False, debug=False, autoconnect=True
        If unable to create a Neo4j driver object, raise an Exception
        reminding the user to check whether the Neo4j database is running

        :param host:        URL with which to connect to the database.
                                EXAMPLES: bolt://123.456.0.29:7687  ,  bolt://your_domain.com:7687  ,  neo4j://localhost:7687
                                DEFAULT: read from the NEO4J_HOST environment variable
        :param credentials: Pair of strings (tuple or list) containing, respectively, the database username and password
                                DEFAULT: read from the NEO4J_USER and NEO4J_PASSWORD environment variables
        :param apoc:        Flag indicating whether the APOC library is used on the Neo4j database being connected to
                                Notes: APOC, if used, must also be enabled on the database.
                                The only method currently requiring APOC is export_dbase_json()
        :param debug:       Flag indicating whether a debug mode is to be used by all methods of this class
        :param autoconnect: Flag indicating whether the class should establish a connection to the database at initialization

        TODO: try os.getenv() in lieu of os.environ.get()
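
As the signature is written, the os.environ.get() calls sit in the default arguments, and Python evaluates default arguments once, when the function is defined. The sketch below shows one possible alternative that reads the same three environment variables at call time, using os.getenv() as the TODO suggests. The helper name resolve_connection_settings is hypothetical and not part of NeoAccess.

```python
import os


def resolve_connection_settings(host=None, credentials=None):
    """Hypothetical helper (NOT part of NeoAccess): fill in any missing setting
    from the NEO4J_HOST, NEO4J_USER and NEO4J_PASSWORD environment variables,
    looked up at call time rather than at function-definition time."""
    if host is None:
        host = os.getenv("NEO4J_HOST")
    if credentials is None:
        credentials = (os.getenv("NEO4J_USER"), os.getenv("NEO4J_PASSWORD"))
    if host is None:
        raise ValueError("No host was passed, and NEO4J_HOST is not set")
    # Accept a tuple or a list, as the docstring allows; always return a tuple
    return host, tuple(credentials)
```

Explicit arguments take precedence over the environment here; that ordering is a design assumption of the sketch.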

name	arguments	returns
connect	self	None
Attempt to establish a connection to the Neo4j database, using the credentials stored in the object. In the process, create and save a driver object.

name	arguments	returns
test_dbase_connection	self	None
        Attempt to perform a trivial Neo4j query, for the purpose of validating
        whether a connection to the database is possible.
        A failure at start time is typically indicative of invalid credentials

        :return:    None

name	arguments	returns
version	self	str
        Return the version of the Neo4j driver being used.  EXAMPLE: "4.3.9"

        :return:    A string with the version number

name	arguments	returns
close	self	None
        Terminate the database connection.
        Note: this method is automatically invoked after the last operation of a "with" statement

        :return:    None
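
For readers unfamiliar with how a "with" statement can trigger close(), here is a minimal sketch of Python's context-manager protocol. ConnectionSketch is a stand-in class invented for this illustration; whether NeoAccess implements the behavior in exactly this way is an assumption.

```python
class ConnectionSketch:
    """Stand-in for a database wrapper; NOT the NeoAccess implementation."""

    def __init__(self):
        self.is_open = True

    def close(self):
        # Release the (pretend) connection
        self.is_open = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()
        return False    # Do not suppress exceptions raised inside the block


with ConnectionSketch() as db:
    inside = db.is_open     # Still open while inside the block

after = db.is_open          # close() ran when the block exited
```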

name	arguments	returns
assert_valid_neo_id	self, neo_id: int	None
        Raise an Exception if the argument is not a valid Neo4j ID

        :param neo_id:
        :return:        None
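
The description does not spell out what counts as "valid". The sketch below assumes that a valid ID is a non-negative integer; both that rule and the function name assert_valid_neo_id_sketch are assumptions made for illustration.

```python
def assert_valid_neo_id_sketch(neo_id) -> None:
    """Hypothetical check, ASSUMING a valid ID is a non-negative integer."""
    # bool is a subclass of int in Python, so it is excluded explicitly
    if isinstance(neo_id, bool) or not isinstance(neo_id, int):
        raise Exception(f"Neo4j ID must be an integer; got {type(neo_id).__name__}")
    if neo_id < 0:
        raise Exception(f"Neo4j ID cannot be negative; got {neo_id}")
```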

name	arguments	returns
query	self, q: str, data_binding=None, single_row=False, single_cell="", single_column=""
        Run a Cypher query.  Best suited for Cypher queries that return individual values,
        but may also be used with queries that return nodes or relationships or paths - or nothing.

        Execute the query and fetch the returned values as a list of dictionaries.
        In cases of no results, return an empty list.
        A new session to the database driver is started, and then immediately terminated after running the query.

        ALTERNATIVES:
            * if the Cypher query returns nodes, and one wants to extract the internal Neo4j IDs or labels
              (in addition to all the properties and their values) then use query_extended() instead.

            * in case of queries that alter the database (and may or may not return values),
              use update_query() instead, in order to retrieve information about the effects of the operation

        :param q:       A Cypher query
        :param data_binding:  An optional Cypher dictionary
                        EXAMPLE, assuming that the Cypher string contains the substring "$node_id":
                                {'node_id': 20}
        :param single_row:      Return a dictionary with just the first (0-th) result row, if present - or {} in case of no results
                                TODO: maybe this should be None

        :param single_cell:     Meant in situations where only 1 node (record) is expected, and one wants only 1 specific field of that record.
                                If single_cell is specified, return the value of the field by that name in the first returned record
                                Note: this will be None if there are no results, or if the first (0-th) result row lacks a key with this name
                                TODO: test and give examples.  single_cell="name" will return result[0].get("name")

        :param single_column:   Name of the column of interest.  Return a list of the values of that particular column across all records.

        :return:        If any of single_row, single_cell or single_column is specified, see info under their entries.
                        If none of those arguments is specified, return a (possibly empty) list of dictionaries.
                        Each dictionary in the list will depend on the nature of the Cypher query.
                        EXAMPLES:
                            Cypher returns nodes (after finding or creating them): RETURN n1, n2
                                    -> list item such as {'n1': {'gender': 'M', 'patient_id': 123},
                                                          'n2': {'gender': 'F', 'patient_id': 444}}
                            Cypher returns attribute values that get renamed: RETURN n.gender AS client_gender, n.pid AS client_id
                                    -> list items such as {'client_gender': 'M', 'client_id': 123}
                            Cypher returns attribute values without renaming: RETURN n.gender, n.pid
                                    -> list items such as {'n.gender': 'M', 'n.pid': 123}
                            Cypher returns a single computed value
                                    -> a single list item such as {"count(n)": 100}
                            Cypher returns a single relationship, with or without attributes: MERGE (c)-[r:PAID_BY]->(p)
                                    -> a single-item list such as [{ 'r': ({}, 'PAID_BY', {}) }]
                            Cypher returns a path:   MATCH p= .......   RETURN p
                                    -> list item such as {'p': [ {'name': 'Eve'}, 'LOVES', {'name': 'Adam'} ] }
                            Cypher creates nodes (without returning them)
                                    -> empty list
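
The effect of single_row, single_cell and single_column can be pictured as post-processing of the list of dictionaries. The following is a sketch only: shape_result is a hypothetical function, and the order in which the three options are checked is an assumption not stated in the documentation.

```python
def shape_result(rows, single_row=False, single_cell="", single_column=""):
    """Hypothetical post-processing of a list of result dictionaries.
    The precedence among the three options is an ASSUMPTION."""
    if single_row:
        # First (0-th) row, or {} if there are no results
        return rows[0] if rows else {}
    if single_cell:
        # Named field of the first row; None if no rows or no such key
        return rows[0].get(single_cell) if rows else None
    if single_column:
        # Values of one column across all rows (None where the key is missing)
        return [row.get(single_column) for row in rows]
    return rows


rows = [{"client_gender": "M", "client_id": 123},
        {"client_gender": "F", "client_id": 444}]

first_row = shape_result(rows, single_row=True)
one_cell = shape_result(rows, single_cell="client_id")
one_column = shape_result(rows, single_column="client_gender")
```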

name	arguments	returns
query_extended	self, q: str, params = None, flatten = False, fields_to_exclude = None	[dict]
        Extended version of query(), meant to extract additional info for queries that return Graph Data Types,
        i.e. nodes, relationships or paths,
        such as "MATCH (n) RETURN n", or "MATCH (n1)-[r]->(n2) RETURN r"

        For example, useful in scenarios where nodes were returned, and their Neo4j internal IDs and/or labels are desired
        (in addition to all the properties and their values)

        Unless the flatten flag is True, individual records are kept as separate lists.
            For example, "MATCH (b:boat), (c:car) RETURN b, c"
            will return a structure such as [ [b1, c1] , [b2, c2] ]  if flatten is False,
            vs.  [b1, c1, b2, c2]  if  flatten is True.  (Note: each b1, c1, etc, is a dictionary.)

        TODO:  Scenario to test:
            if b1 == b2, would that still be [b1, c1, b1(b2), c2] or [b1, c1, c2] - i.e. would we remove the duplicates?
            Try running with flatten=True "MATCH (b:boat), (c:car) RETURN b, c" on data like "CREATE (b:boat), (c1:car1), (c2:car2)"

        :param q:       A Cypher query
        :param params:  An optional Cypher dictionary
                            EXAMPLE, assuming that the cypher string contains the substring "$age":
                                        {'age': 20}
        :param flatten: Flag indicating whether the Graph Data Types need to remain clustered by record,
                        or all placed in a single flattened list
        :param fields_to_exclude:   Optional list of strings with the names of fields (in the database or special ones added by this function)
                                    that one wishes to drop.  No harm in listing fields that aren't present

        :return:        A (possibly empty) list of dictionaries, if flatten is True,
                        or a list of lists, if flatten is False.
                        Each item in the lists is a dictionary, with details that will depend on which Graph Data Types
                                    were returned in the Cypher query.
                                    EXAMPLE of individual items - for a returned NODE
                                        {'gender': 'M', 'age': 20, 'neo4j_id': 123, 'neo4j_labels': ['patient']}
                                    EXAMPLE of individual items - for a returned RELATIONSHIP
                                        {'price': 7500, 'neo4j_id': 2,
                                         'neo4j_start_node': ,
                                         'neo4j_end_node': ,
                                         'neo4j_type': 'bought_by'}]
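
To make the difference between the clustered and the flattened layouts concrete, here is a small sketch operating on plain dictionaries. arrange_records is a hypothetical helper, not the library's code, and it keeps duplicates when flattening, which is an assumption (the TODO above leaves that question open).

```python
def arrange_records(records, flatten=False, fields_to_exclude=None):
    """Hypothetical helper: drop unwanted fields, then optionally flatten."""
    excluded = set(fields_to_exclude or [])
    cleaned = [[{key: value for key, value in item.items() if key not in excluded}
                for item in record]
               for record in records]
    if flatten:
        # One flat list; duplicates are kept (an ASSUMPTION of this sketch)
        return [item for record in cleaned for item in record]
    return cleaned


b1 = {"name": "Sloop", "neo4j_id": 1, "neo4j_labels": ["boat"]}
c1 = {"name": "Sedan", "neo4j_id": 2, "neo4j_labels": ["car"]}
b2 = {"name": "Yacht", "neo4j_id": 3, "neo4j_labels": ["boat"]}
c2 = {"name": "Coupe", "neo4j_id": 4, "neo4j_labels": ["car"]}
records = [[b1, c1], [b2, c2]]

clustered = arrange_records(records)                 # [[b1, c1], [b2, c2]]
flat = arrange_records(records, flatten=True)        # [b1, c1, b2, c2]
```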

name	arguments	returns
update_query	self, cypher: str, data_binding=None	dict
        Run a Cypher query and return statistics about its actions (such as the number of nodes created, etc.)
        Typical use is for queries that update the database.
        If the query returns any values, a list of them is also made available, as the value of the key 'returned_data'.

        Note: if the query creates nodes and one wishes to obtain their Neo4j internal IDs,
              one can include Cypher code such as "RETURN id(n) AS neo4j_id" (where n is the dummy name of the newly-created node)

        EXAMPLE:  result = update_query("CREATE(n :CITY {name: 'San Francisco'}) RETURN id(n) AS neo4j_id")

                  result will be {'nodes_created': 1, 'properties_set': 1, 'labels_added': 1,
                                  'returned_data': [{'neo4j_id': 123}]
                                 } , assuming 123 is the Neo4j internal ID of the newly-created node

        :param cypher:      Any Cypher query, but typically one that doesn't return anything
        :param data_binding: Data-binding dictionary for the Cypher query
        :return:            A dictionary of statistics (counters) about the query just run
                            EXAMPLES -
                                {}      The query had no effect
                                {'nodes_deleted': 3}    The query resulted in the deletion of 3 nodes
                                {'properties_set': 2}   The query had the effect of setting 2 properties
                                {'relationships_created': 1}    One new relationship got created
                                {'returned_data': [{'neo4j_id': 123}]}  'returned_data' contains the results of the query,
                                                                        if it returns anything, as a list of dictionaries
                                                                        - akin to the value returned by query()
                                {'returned_data': []}  Gets returned by SET QUERIES with no return statement
                            OTHER KEYS include:
                                nodes_created, nodes_deleted, relationships_created, relationships_deleted,
                                properties_set, labels_added, labels_removed,
                                indexes_added, indexes_removed, constraints_added, constraints_removed
                                More info:  https://neo4j.com/docs/api/python-driver/current/api.html#neo4j.SummaryCounters
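
The examples above suggest that a counter appears in the dictionary only when it is non-zero. Under that assumption, reading counters with dict.get() and a default of 0 avoids a KeyError. The dictionary below is copied from the documented example rather than produced by a live database call.

```python
# Result dictionary as shown in the documented example (no database involved)
result = {"nodes_created": 1, "properties_set": 1, "labels_added": 1,
          "returned_data": [{"neo4j_id": 123}]}

nodes_created = result.get("nodes_created", 0)
nodes_deleted = result.get("nodes_deleted", 0)    # Key absent, so falls back to 0

# Collect the IDs of the newly-created nodes, if the query returned any
new_ids = [row["neo4j_id"] for row in result.get("returned_data", [])]
```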

name	arguments	returns
get_single_field	self, match: Union[int, dict], field_name: str, order_by=None, limit=None	list
        For situations where one is fetching just 1 field,
        and one desires a list of the values of that field, rather than a dictionary of records.
        In other respects, similar to the more general get_nodes()

        :param match:       EITHER an integer with a Neo4j node id,
                                OR a dictionary of data to identify a node, or set of nodes, as returned by find()
        :param field_name:  A string with the name of the desired field (attribute)
        :param order_by:    see get_nodes()
        :param limit:       see get_nodes()

        :return:  A list of the values of the field_name attribute in the nodes that match the specified conditions

name	arguments	returns
get_record_by_primary_key	self, labels: str, primary_key_name: str, primary_key_value, return_nodeid=False	Union[dict, None]
        Return the first (and it ought to be only one) record with the given primary key, and the optional label(s),
        as a dictionary of all its attributes.

        If more than one record is found, an Exception is raised.
        If no record is found, return None.

        :param labels:              A string or list/tuple of strings.  Use None if not to be included in search
        :param primary_key_name:    The name of the primary key by which to look the record up
        :param primary_key_value:   The desired value of the primary key
        :param return_nodeid:       If True, an extra entry is present in the dictionary, with the key "neo4j_id"

        :return:                    A dictionary, if a unique record was found; or None if not found
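
The three documented outcomes (no record, exactly one record, more than one record) can be sketched as follows. pick_unique_record is a hypothetical function that works on an already-fetched list of matches; it is not how the library necessarily does it.

```python
def pick_unique_record(matches, primary_key_name, primary_key_value):
    """Hypothetical sketch of the uniqueness rule described above."""
    if len(matches) == 0:
        return None                 # No record found
    if len(matches) > 1:
        raise Exception(f"{len(matches)} records found with "
                        f"{primary_key_name}={primary_key_value!r}; expected at most 1")
    return matches[0]               # The unique record


found = pick_unique_record([{"patient_id": 123, "gender": "M"}], "patient_id", 123)
missing = pick_unique_record([], "patient_id", 999)
```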

name	arguments	returns
exists_by_key	self, labels: str, key_name: str, key_value	bool
        Return True if a node with the given labels and key_name/key_value exists, or False otherwise

        :param labels:
        :param key_name:
        :param key_value:
        :return:            True if such a node exists, or False otherwise

name	arguments	returns
exists_by_neo_id	self, neo_id	bool
        Return True if a node with the given internal Neo4j ID exists, or False otherwise

        :param neo_id:
        :return:        True if a node with the given internal Neo4j ID exists, or False otherwise

name	arguments	returns
get_nodes	self, match: Union[int, dict], return_neo_id=False, return_labels=False, order_by=None, limit=None, single_row=False, single_cell=""
        RETURN a list of the records (as dictionaries of ALL the key/value node properties)
        corresponding to all the Neo4j nodes specified by the given match data.

        :param match:           EITHER an integer with a Neo4j node id,
                                OR a dictionary of data to identify a node, or set of nodes, as returned by find()

        :param return_neo_id:   Flag indicating whether to also include the Neo4j internal node ID in the returned data
                                    (using "neo4j_id" as its key in the returned dictionary)    TODO: change to "neo_id"
        :param return_labels:   Flag indicating whether to also include the Neo4j label names in the returned data
                                    (using "neo4j_labels" as its key in the returned dictionary)

        :param order_by:        Optional string with the key (field) name to order by, in ascending order
                                    Note: lower and uppercase names are treated differently in the sort order
        :param limit:           Optional integer to specify the maximum number of nodes returned

        :param single_row:      Meant in situations where only 1 node (record) is expected, or perhaps one wants to sample the 1st one;
                                    if not found, None will be returned [to distinguish it from a found record with no fields!]

        :param single_cell:     Meant in situations where only 1 node (record) is expected, and one wants only 1 specific field of that record.
                                If single_cell is specified, return the value of the field by that name in the first node
                                Note: this will be None if there are no results, or if the first (0-th) result row lacks a key with this name
                                TODO: test and give examples.  single_cell="name" will return result[0].get("name")

        :return:                If single_cell is specified, return the value of the field by that name in the first node.
                                If single_row is True, return a dictionary with the information of the first record (or None if no record exists)
                                Otherwise, return a list whose entries are dictionaries with each record's information
                                    (the node's attribute names are the keys)
                                    EXAMPLE: [  {"gender": "M", "age": 42, "condition_id": 3},
                                                {"gender": "M", "age": 76, "location": "Berkeley"}
                                             ]
                                    Note that ALL the attributes of each node are returned - and that they may vary across records.
                                    If the flag return_neo_id is set to True, then an extra key/value pair is included in the dictionaries,
                                            of the form     "neo4j_id": some integer with the Neo4j internal node ID
                                    If the flag return_labels is set to True, then an extra key/value pair is included in the dictionaries,
                                            of the form     "neo4j_labels": [list of Neo4j label(s) attached to that node]
                                    EXAMPLE using both of the above flags:
                                        [  {"neo4j_id": 145, "neo4j_labels": ["person", "client"], "gender": "M", "condition_id": 3},
                                           {"neo4j_id": 222, "neo4j_labels": ["person"], "gender": "M", "location": "Berkeley"}
                                        ]
        # TODO: provide an option to specify the desired fields
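
The return shapes described above can be illustrated with a small stand-alone sketch. It is not the library's implementation: `shape_result` is a hypothetical helper, and `records` stands in for the list of dictionaries already retrieved from the database.

```python
def shape_result(records, single_row=False, single_cell=None):
    """Hypothetical helper mimicking the documented return shapes (no database access)."""
    if single_cell:
        # Value of the named field in the first record;
        # None if there are no records, or if the first record lacks that key
        return records[0].get(single_cell) if records else None
    if single_row:
        # The first record as a dict, or None if nothing was found
        return records[0] if records else None
    return records      # Default: the full list of dictionaries


records = [{"gender": "M", "age": 42, "condition_id": 3},
           {"gender": "M", "age": 76, "location": "Berkeley"}]

print(shape_result(records, single_cell="age"))         # 42
print(shape_result(records, single_cell="location"))    # None (the first record lacks this key)
print(shape_result([], single_row=True))                # None
```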

name	arguments	returns
get_df	self, match: Union[int, dict], order_by=None, limit=None	pd.DataFrame

        Similar to get_nodes(), but with fewer arguments - and the result is returned as a Pandas dataframe

        [See get_nodes() for more information about the arguments]
        :param match:       EITHER an integer with a Neo4j node id,
                                OR a dictionary of data to identify a node, or set of nodes, as returned by find()
        :param order_by:
        :param limit:
        :return:            A Pandas dataframe

name	arguments	returns
find	self, labels=None, neo_id=None, key_name=None, key_value=None, properties=None, subquery=None, dummy_node_name="n"	dict

        Register a set of conditions that must be matched to identify a node or nodes of interest,
        and return a dictionary suitable to be passed as argument to various other functions in this library.
        No arguments at all means "match everything in the database".
        TODO:   maybe rename to "identify()"
                maybe save all arguments, in case the dummy_node_name later needs changing

        IMPORTANT:  if neo_id is provided, all other conditions are DISREGARDED;
                    otherwise, an implicit AND applies to all the specified conditions.

        Note:   NO database operation is actually performed by this function.
                It merely turns the set of specifications into the MATCH part, and (if applicable) the WHERE part,
                of a Cypher query (using the specified dummy variable name),
                together with its data-binding dictionary - all "packaged" into a dict that can be passed around.

                The calling functions typically will make use of the returned dictionary to assemble a Cypher query,
                to MATCH all the Neo4j nodes satisfying the specified conditions,
                and then do whatever else they need to do (such as deleting, or setting properties on, the located nodes.)


        EXAMPLE 1 - first identify a group of nodes, and then delete them:

            match = find(labels="person", properties={"gender": "F"},
                         subquery=("n.income > $min_income", {"min_income": 50000})
                         )
            delete_nodes(match)

            In the above example, the value of match is:

                {"node": "(n :`person` {`gender`: $n_par_1})",
                "where": "n.income > $min_income",
                "data_binding": {"n_par_1": "F", "min_income": 50000},
                "dummy_node_name": "n"
                }

        EXAMPLE 2 - by specifying the name of the dummy node, it's also possible to do operations such as:

            # Add the new relationship
            match_from = db.find(labels="car",          key_name="vehicle_id", key_value=45678,
                                 dummy_node_name="from")
            match_to =   db.find(labels="manufacturer", key_name="company", key_value="Toyota",
                                 dummy_node_name="to")

            db.add_edges(match_from, match_to, rel_name="MADE_BY")

        TODO? - possible alternative names for this function include "define_match()", "match()", "locate()", "choose()" or "identify()"

        ALL THE ARGUMENTS ARE OPTIONAL (no arguments at all means "match everything in the database")
        :param labels:      A string (or list/tuple of strings) specifying one or more Neo4j labels.
                                (Note: blank spaces ARE allowed in the strings)
                                EXAMPLES:  "cars"
                                            ("cars", "powered vehicles")
                            Note that if multiple labels are given, then only nodes with ALL of them will be matched;
                            at present, there's no way to request an "OR" operation

        :param neo_id:      An integer with the node's internal ID.
                                If specified, it OVER-RIDES all the remaining arguments, except for the labels

        :param key_name:    A string with the name of a node attribute; if provided, key_value must be present, too
        :param key_value:   The required value for the above key; if provided, key_name must be present, too
                                Note: no requirement for the key to be primary

        :param properties:  A (possibly-empty) dictionary of property key/value pairs, indicating a condition to match.
                                EXAMPLE: {"gender": "F", "age": 22}

        :param subquery:    Either None, or a (possibly empty) string containing a Cypher subquery,
                            or a pair/list (string, dict) containing a Cypher subquery and the data-binding dictionary for it.
                            The Cypher subquery should refer to the node using the assigned dummy_node_name (by default, "n")
                                IMPORTANT:  in the dictionary, don't use keys of the form "n_par_i",
                                            where n is the dummy node name and i is an integer,
                                            or an Exception will be raised - those names are for internal use only
                                EXAMPLES:   "n.age < 25 AND n.income > 100000"
                                            ("n.weight < $max_weight", {"max_weight": 100})

        :param dummy_node_name: A string with a name by which to refer to the node (by default, "n")

        :return:            A dictionary of data storing the parameters of the match.
                            For details, see the "class Matches"
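
To make the structure of the returned dictionary concrete, here is a minimal stand-alone sketch that rebuilds the value shown in EXAMPLE 1 from the same arguments. It is an illustration under assumptions, not the library's code: `sketch_find` is a hypothetical name, it ignores neo_id, key_name and key_value, and it only detects data-binding clashes with names actually in use.

```python
def sketch_find(labels=None, properties=None, subquery=None, dummy_node_name="n"):
    """Hypothetical re-creation of the match dictionary (no database access)."""
    if labels is None:
        labels = []
    elif isinstance(labels, str):
        labels = [labels]

    parts = [dummy_node_name]
    if labels:
        # Backticks allow blank spaces inside label names
        parts.append("".join(f":`{lbl}`" for lbl in labels))

    data_binding = {}
    clauses = []
    for i, (key, value) in enumerate((properties or {}).items(), start=1):
        token = f"{dummy_node_name}_par_{i}"     # Names of this form are for internal use
        clauses.append(f"`{key}`: ${token}")
        data_binding[token] = value
    if clauses:
        parts.append("{" + ", ".join(clauses) + "}")

    where = ""
    if isinstance(subquery, str):
        where = subquery
    elif subquery is not None:
        where, extra_binding = subquery
        clash = set(extra_binding) & set(data_binding)
        if clash:
            raise Exception(f"Data-binding keys reserved for internal use: {clash}")
        data_binding.update(extra_binding)

    return {"node": "(" + " ".join(parts) + ")",
            "where": where,
            "data_binding": data_binding,
            "dummy_node_name": dummy_node_name}


match = sketch_find(labels="person", properties={"gender": "F"},
                    subquery=("n.income > $min_income", {"min_income": 50000}))
print(match["node"])      # (n :`person` {`gender`: $n_par_1})
```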

name	arguments	returns
get_node_labels	self, neo4j_id: int	[str]

        Return a list whose elements are the label(s) of the node specified by its Neo4j internal ID

        TODO: maybe also accept a "match" structure as argument

        :param neo4j_id:    An integer with a Neo4j node id
        :return:            A list of strings with the label(s) of the specified node

name	arguments	returns
follow_links	self, match: Union[int, dict], rel_name: str, rel_dir ="OUT", neighbor_labels = None	[dict]

        From the given starting node(s), follow all the relationships of the given name to and/or from it,
        into/from neighbor nodes (optionally having the given labels),
        and return all the properties of those neighbor nodes.

        :param match:           EITHER an integer with a Neo4j node id,
                                    OR a dictionary of data to identify a node, or set of nodes, as returned by find()
        :param rel_name:        A string with the name of the relationship to follow.  (Note: any other relationships are ignored)
        :param rel_dir:         Either "OUT"(default), "IN" or "BOTH".  Direction(s) of the relationship to follow
        :param neighbor_labels: Optional label(s) required on the neighbors.  If present, either a string or list of strings

        :return:                A list of dictionaries with all the properties of the neighbor nodes
                                TODO: maybe add the option to just return a subset of fields
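
The effect of rel_dir and neighbor_labels can be pictured as a choice among three Cypher path patterns. The sketch below is only an illustration: `link_pattern` and the variable names `n` and `neighbor` are hypothetical, and the query text that the library actually generates is not shown in this documentation.

```python
def link_pattern(rel_name, rel_dir="OUT", neighbor_labels=None):
    """Hypothetical helper: Cypher path pattern from a start node `n` to a node `neighbor`."""
    if neighbor_labels is None:
        neighbor_labels = []
    elif isinstance(neighbor_labels, str):
        neighbor_labels = [neighbor_labels]

    labels = "".join(f":`{lbl}`" for lbl in neighbor_labels)
    neighbor = f"(neighbor {labels})" if labels else "(neighbor)"

    if rel_dir == "OUT":
        return f"(n)-[:`{rel_name}`]->{neighbor}"
    if rel_dir == "IN":
        return f"(n)<-[:`{rel_name}`]-{neighbor}"
    if rel_dir == "BOTH":
        return f"(n)-[:`{rel_name}`]-{neighbor}"      # Undirected: matches either direction
    raise ValueError(f'rel_dir must be "OUT", "IN" or "BOTH"; got {rel_dir!r}')


print(link_pattern("OWNS"))                      # (n)-[:`OWNS`]->(neighbor)
print(link_pattern("OWNS", "IN", "person"))      # (n)<-[:`OWNS`]-(neighbor :`person`)
```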

name	arguments	returns
count_links	self, match: Union[int, dict], rel_name: str, rel_dir: str, neighbor_labels = None	int

        From the given starting node(s), count all the relationships OF THE GIVEN NAME to and/or from it,
        into/from neighbor nodes (optionally having the given labels)

        :param match:           EITHER an integer with a Neo4j node id,
                                    OR a dictionary of data to identify a node, or set of nodes, as returned by find()
        :param rel_name:        A string with the name of the relationship to follow.  (Note: any other relationships are ignored)
        :param rel_dir:         Either "OUT", "IN" or "BOTH".  Direction(s) of the relationship to follow (no default in the signature)
        :param neighbor_labels: Optional label(s) required on the neighbors.  If present, either a string or list of strings

        :return:                The total number of inbound and/or outbound relationships to the given node(s)

name	arguments	returns
get_parents_and_children	self, node_id: int	()

        Fetch all the nodes connected to the given one by INbound relationships to it (its "parents"),
        as well as by OUTbound relationships from it (its "children")

        :param node_id: An integer with a Neo4j internal node ID
        :return:        A dictionary with 2 keys: 'parent_list' and 'child_list'
                        The values are lists of dictionaries with 3 keys: "id", "labels", "rel"
                            EXAMPLE of individual items in either parent_list or child_list:
                            {'id': 163, 'labels': ['Subject'], 'rel': 'HAS_TREATMENT'}
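
The shape of the returned dictionary can be reproduced with a small in-memory sketch, using the key names from the EXAMPLE above. Everything here is hypothetical (the function, the edge triples, and the "Treatment" and "Result" labels); the actual method queries the database instead.

```python
def parents_and_children(node_id, edges, labels_by_id):
    """
    In-memory illustration only.
    edges:          list of (from_id, rel_name, to_id) triples
    labels_by_id:   dict mapping each node id to its list of labels
    """
    # Parents: nodes with a relationship pointing INTO the given node
    parent_list = [{"id": src, "labels": labels_by_id[src], "rel": rel}
                   for (src, rel, dst) in edges if dst == node_id]
    # Children: nodes reached by a relationship going OUT of the given node
    child_list = [{"id": dst, "labels": labels_by_id[dst], "rel": rel}
                  for (src, rel, dst) in edges if src == node_id]
    return {"parent_list": parent_list, "child_list": child_list}


edges = [(163, "HAS_TREATMENT", 10), (10, "HAS_RESULT", 20)]
labels_by_id = {163: ["Subject"], 10: ["Treatment"], 20: ["Result"]}

result = parents_and_children(10, edges, labels_by_id)
print(result["parent_list"])    # [{'id': 163, 'labels': ['Subject'], 'rel': 'HAS_TREATMENT'}]
```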

name	arguments	returns
create_node	self, labels, properties=None	int

        Create a new node with the given label(s) and with the attributes/values specified in the properties dictionary.
        Return the Neo4j internal ID of the node just created.

        :param labels:      A string, or list/tuple of strings, specifying Neo4j labels (ok to have blank spaces)
        :param properties:  An optional (possibly empty or None) dictionary of properties to set for the new node.
                                EXAMPLE: {'age': 22, 'gender': 'F'}

        :return:            An integer with the Neo4j internal ID of the node just created

name	arguments	returns
merge_node	self, labels, properties=None	dict

name	arguments	returns
create_node_with_relationships	self, labels, properties=None, connections=None	int

        TODO: this method may no longer be needed, given the new method create_node_with_links()
              Maybe ditch, or extract the Neo4j IDs from the connections,
              and call create_node_with_links()

        Create a new node with relationships to zero or more PRE-EXISTING nodes
        (identified by their labels and key/value pairs).

        If the specified pre-existing nodes aren't found, then no new node is created,
        and an Exception is raised.

        On success, return the Neo4j internal ID of the new node just created.

        Note: if all connections are outbound, and to nodes with known Neo4j internal IDs, then
              the simpler method create_node_with_children() may be used instead

        EXAMPLE:
            create_node_with_relationships(
                                            labels="PERSON",
                                            properties={"name": "Julian", "city": "Berkeley"},
                                            connections=[
                                                        {"labels": "DEPARTMENT",
                                                         "key": "dept_name", "value": "IT",
                                                         "rel_name": "EMPLOYS", "rel_dir": "IN"},

                                                        {"labels": ["CAR", "INVENTORY"],
                                                         "key": "vehicle_id", "value": 12345,
                                                         "rel_name": "OWNS", "rel_attrs": {"since": 2021} }
                                            ]
            )

        :param labels:      A string, or list of strings, with label(s) to assign to the new node
        :param properties:  A dictionary of properties to assign to the new node
        :param connections: A (possibly empty) list of dictionaries with the following keys
                            (all optional unless otherwise specified):
                                --- Keys to locate an existing node ---
                                    "labels"        RECOMMENDED
                                    "key"           REQUIRED
                                    "value"         REQUIRED
                                --- Keys to define a relationship to it ---
                                    "rel_name"      REQUIRED.  The name to give to the new relationship
                                    "rel_dir"       Either "OUT" or "IN", relative to the new node (by default, "OUT")
                                    "rel_attrs"     A dictionary of relationship attributes

        :return:            If successful, an integer with the Neo4j internal ID of the node just created;
                                otherwise, an Exception is raised
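
The rules for the entries of `connections` (required keys, default direction) can be summarized in a short validation sketch. `normalize_connections` is a hypothetical helper written for illustration; the checks actually performed by the library are not shown in this documentation.

```python
def normalize_connections(connections):
    """Hypothetical validation of the `connections` argument, following the key list above."""
    normalized = []
    for i, conn in enumerate(connections or []):
        for required in ("key", "value", "rel_name"):
            if required not in conn:
                raise Exception(f'connections[{i}] lacks the required key "{required}"')

        rel_dir = conn.get("rel_dir", "OUT")      # Documented default: "OUT"
        if rel_dir not in ("OUT", "IN"):
            raise Exception(f'connections[{i}]: "rel_dir" must be either "OUT" or "IN"')

        normalized.append({"labels": conn.get("labels"),
                           "key": conn["key"], "value": conn["value"],
                           "rel_name": conn["rel_name"], "rel_dir": rel_dir,
                           "rel_attrs": conn.get("rel_attrs")})
    return normalized


connections = [{"labels": "DEPARTMENT", "key": "dept_name", "value": "IT",
                "rel_name": "EMPLOYS", "rel_dir": "IN"},
               {"labels": ["CAR", "INVENTORY"], "key": "vehicle_id", "value": 12345,
                "rel_name": "OWNS", "rel_attrs": {"since": 2021}}]

normalized = normalize_connections(connections)
print(normalized[1]["rel_dir"])     # OUT
```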

name	arguments	returns
create_node_with_links	self, labels, properties = None, links = None, merge=False	int

        Create a new node, with the given labels and optional properties,
        and make it a parent of all the EXISTING nodes that are specified
        in the (possibly empty) list of children nodes, identified by their Neo4j ID.

        The list of children nodes also contains the names to give to each link,
        as well as their directions (by default OUTbound from the newly-created node)
        and, optionally, properties on the links.

        If any of the requested link nodes isn't found,
        then no new node is created, and an Exception is raised.

        Note: the new node may be created even in situations where Exceptions are raised;
              for example, if attempting to create two identical relationships to the same existing node.

        EXAMPLE (assuming the nodes with the specified Neo4j IDs already exist):
            create_node_with_links(
                                labels="PERSON",
                                properties={"name": "Julian", "city": "Berkeley"},
                                links=[ {"neo_id": 123, "rel_name": "LIVES IN"},
                                        {"neo_id": 456, "rel_name": "EMPLOYS", "rel_dir": "IN"},
                                        {"neo_id": 789, "rel_name": "OWNS", "rel_attrs": {"since": 2022}}
                                      ]
            )

        :param labels:      Labels to assign to the newly-created node (optional but recommended):
                                a string or list/tuple of strings; blanks allowed inside strings
        :param properties:  A dictionary of optional properties to assign to the newly-created node
        :param links:       Optional list of dicts identifying existing nodes,
                                and specifying the name, direction and optional properties
                                to give to the links connecting to them;
                                use None, or an empty list, to indicate that there aren't any
                                Each dict contains the following keys:
                                    "neo_id"        REQUIRED - to identify an existing node
                                    "rel_name"      REQUIRED - the name to give to the link
                                    "rel_dir"       OPTIONAL (default "OUT") - either "IN" or "OUT" from the new node
                                    "rel_attrs"     OPTIONAL - A dictionary of relationship attributes
        :param merge:       If True, a node gets created only if there's no other node
                                with the same properties and labels     TODO: test

        :return:            An integer with the Neo4j ID of the newly-created node

name	arguments	returns
_assemble_query_for_linking	self, links: list	tuple

        Helper function for create_node_with_links(), and perhaps future methods.

        Given a list of existing nodes, and info on links to create to/from them,
        define the portions of the Cypher query to locate the existing nodes,
        and to link up to them.
        No query is actually run.

        :param links:   A list: SEE explanation in create_node_with_links()
        :return:        A 4-tuple with the parts of the query, as well as the needed data binding
                            1) q_MATCH
                            2) q_WHERE
                            3) q_MERGE
                            4) data_binding
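
One way to picture the 4-tuple is the sketch below, which builds plausible MATCH, WHERE and MERGE fragments from a `links` list. This is an assumption-laden illustration: the node variables (`n`, `ex_0`, ...), the parameter names (`neo_id_0`, `rel_2_0`, ...) and the exact query text are made up, since the fragments produced by the real helper are not shown here.

```python
def assemble_query_for_linking_sketch(links):
    """Hypothetical sketch: no query is run; `n` denotes the node being created."""
    match_parts, where_parts, merge_parts = [], [], []
    data_binding = {}

    for i, link in enumerate(links):
        node = f"ex_{i}"                          # Variable for the i-th existing node
        match_parts.append(f"({node})")
        where_parts.append(f"id({node}) = $neo_id_{i}")
        data_binding[f"neo_id_{i}"] = link["neo_id"]

        # Optional relationship attributes, passed through the data binding
        attr_clauses = []
        for j, (key, value) in enumerate((link.get("rel_attrs") or {}).items()):
            token = f"rel_{i}_{j}"
            attr_clauses.append(f"`{key}`: ${token}")
            data_binding[token] = value

        rel = f":`{link['rel_name']}`"
        if attr_clauses:
            rel += " {" + ", ".join(attr_clauses) + "}"

        if link.get("rel_dir", "OUT") == "OUT":
            merge_parts.append(f"MERGE (n)-[{rel}]->({node})")
        else:
            merge_parts.append(f"MERGE (n)<-[{rel}]-({node})")

    q_match = "MATCH " + ", ".join(match_parts) if match_parts else ""
    q_where = "WHERE " + " AND ".join(where_parts) if where_parts else ""
    q_merge = "\n".join(merge_parts)
    return (q_match, q_where, q_merge, data_binding)


links = [{"neo_id": 123, "rel_name": "LIVES IN"},
         {"neo_id": 456, "rel_name": "EMPLOYS", "rel_dir": "IN"},
         {"neo_id": 789, "rel_name": "OWNS", "rel_attrs": {"since": 2022}}]

q_match, q_where, q_merge, data_binding = assemble_query_for_linking_sketch(links)
print(q_match)      # MATCH (ex_0), (ex_1), (ex_2)
```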

name	arguments	returns
create_node_with_children	self, labels, properties = None, children_list = None	int

        Create a new node, with the given labels and optional specified properties,
        and make it a parent of all the EXISTING nodes
        specified in the list of children nodes (possibly empty),
        using the relationship names specified inside that list.
        All the relationships are understood to be OUTbound from the newly-created node.

        Note: this is a simpler version of create_node_with_links()

        TODO: re-implement, making use of create_node_with_links()

        EXAMPLE:
            create_node_with_children(
                                        labels="PERSON",
                                        properties={"name": "Julian", "city": "Berkeley"},
                                        children_list=[ (123, "EMPLOYS") , (456, "OWNS") ]
            )

        :param labels:          Labels to assign to the newly-created node (a string, possibly empty, or list of strings)
        :param children_list:   Optional list of pairs of the form (Neo4j ID, relationship name);
                                    use None, or an empty list, to indicate that there aren't any
        :param properties:      A dictionary of optional properties to assign to the newly-created node

        :return:                An integer with the Neo4j ID of the newly-created node
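
Since this method is described as a simpler version of create_node_with_links(), its children_list can be translated into the `links` format documented for that method. The helper below is a hypothetical sketch of such a translation, not existing library code.

```python
def children_to_links(children_list):
    """Convert (Neo4j ID, relationship name) pairs into the `links` format of create_node_with_links()."""
    # All relationships are OUTbound from the newly-created node
    return [{"neo_id": neo_id, "rel_name": rel_name, "rel_dir": "OUT"}
            for (neo_id, rel_name) in (children_list or [])]


links = children_to_links([(123, "EMPLOYS"), (456, "OWNS")])
print(links[0])     # {'neo_id': 123, 'rel_name': 'EMPLOYS', 'rel_dir': 'OUT'}
```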

name	arguments	returns
delete_nodes	self, match: Union[int, dict]	int

        Delete the node or nodes specified by the match argument.  Return the number of nodes deleted.

        :param match:   EITHER an integer with a Neo4j node id,
                                OR a dictionary of data to identify a node, or set of nodes, as returned by find()
        :return:        The number of nodes deleted (possibly zero)
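
As an illustration of how a match dictionary returned by find() might be turned into a deletion query, consider the sketch below. It is a guess at the general approach, not the library's code: the function name, the use of DETACH DELETE, and the handling of the integer case are all assumptions.

```python
def delete_query_from_match(match):
    """Hypothetical sketch: return (Cypher query, data binding) without running anything."""
    if isinstance(match, int):
        # Assumed handling of the "integer with a Neo4j node id" case
        return ("MATCH (n) WHERE id(n) = $neo_id DETACH DELETE n", {"neo_id": match})

    node_name = match["dummy_node_name"]
    where = f" WHERE {match['where']}" if match["where"] else ""
    query = f"MATCH {match['node']}{where} DETACH DELETE {node_name}"
    return (query, match["data_binding"])


# The match dictionary shown in EXAMPLE 1 of find()
match = {"node": "(n :`person` {`gender`: $n_par_1})",
         "where": "n.income > $min_income",
         "data_binding": {"n_par_1": "F", "min_income": 50000},
         "dummy_node_name": "n"}

query, data_binding = delete_query_from_match(match)
print(query)
# MATCH (n :`person` {`gender`: $n_par_1}) WHERE n.income > $min_income DETACH DELETE n
```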

name	arguments	returns
delete_nodes_by_label	self, delete_labels=None, keep_labels=None	None

        Empty out (by default completely) the Neo4j database.
        Optionally, only delete nodes with the specified labels, or only keep nodes with the given labels.
        Note: the keep_labels list has higher priority; if a label occurs in both lists, it will be kept.
        IMPORTANT: it does NOT clear indexes; "ghost" labels may remain!
        TODO: return the number of nodes deleted

        :param delete_labels:   An optional string, or list of strings, indicating specific labels to DELETE
        :param keep_labels:     An optional string or list of strings, indicating specific labels to KEEP
                                    (keep_labels has higher priority over delete_labels)
        :return:                None
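
The priority rule between the two lists can be illustrated with a small standalone sketch. It is not NeoAccess code: the helper name labels_to_clear, and the idea of starting from the full list of labels (such as get_labels() would return), are assumptions made purely for illustration.

```python
from typing import Iterable, List, Optional, Union

LabelArg = Optional[Union[str, Iterable[str]]]


def _as_set(labels: LabelArg) -> set:
    """Normalize None, a single string, or a list of strings to a set."""
    if labels is None:
        return set()
    if isinstance(labels, str):
        return {labels}
    return set(labels)


def labels_to_clear(all_labels: List[str],
                    delete_labels: LabelArg = None,
                    keep_labels: LabelArg = None) -> List[str]:
    """Hypothetical helper: the labels whose nodes would end up deleted."""
    # With no delete_labels, every label is a candidate (the "empty out completely" default)
    candidates = _as_set(delete_labels) or set(all_labels)
    # keep_labels wins whenever a label appears in both lists
    return sorted((candidates & set(all_labels)) - _as_set(keep_labels))
```

For instance, with labels "A", "B" and "C" in the database, asking to delete ["A", "B"] while keeping "B" leaves only "A" to be cleared.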

name	arguments	returns
bulk_delete_by_label	self, label: str

        IMPORTANT: APOC required (starting with v. 4.4 of Neo4j, this can be done without APOC)

        Meant for large databases, where the straightforward deletion operations may involve
        a very large number of nodes, and take a long time (or possibly fail)

        "If you need to delete some large number of objects from the graph,
        one needs to be mindful of the not building up such a large single transaction
        such that a Java OUT OF HEAP Error will be encountered."
        See:  https://neo4j.com/developer/kb/large-delete-transaction-best-practices-in-neo4j/

        TODO: generalize to bulk-deletion not just by label

        :param label:   A string with the label of the nodes to delete (blank spaces in name are ok)
        :return:        A dict with the keys "batches" and "total"
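
For orientation, batched deletion of the kind discussed in the linked Neo4j article can be expressed with APOC's apoc.periodic.iterate procedure, which reports "batches" and "total" among its results. The sketch below only builds the Cypher text; the helper name, the batch_size parameter and its default are assumptions, and this is not necessarily the query that NeoAccess issues.

```python
def bulk_delete_query(label: str, batch_size: int = 10000) -> str:
    """Build a Cypher query that deletes all nodes with the given label, in batches."""
    if '"' in label:
        # The inner statements are passed to APOC as double-quoted strings,
        # so this simple sketch refuses labels that contain a double quote
        raise ValueError("double quotes are not supported in this sketch")
    # Backticks allow blank spaces in the label; a literal backtick is doubled
    quoted = "`" + label.replace("`", "``") + "`"
    return (
        "CALL apoc.periodic.iterate("
        f'"MATCH (n:{quoted}) RETURN n", '
        '"DETACH DELETE n", '
        f"{{batchSize: {batch_size}}}) "
        "YIELD batches, total RETURN batches, total"
    )


query = bulk_delete_query("my label", batch_size=500)
```

Each batch runs in its own transaction, which is what keeps any single transaction from growing too large.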

name	arguments	returns
empty_dbase	self, keep_labels=None, drop_indexes=True, drop_constraints=True	None

        Use this to get rid of everything in the database,
        including all the indexes and constraints (unless otherwise specified).
        Optionally, keep nodes with a given label, or keep the indexes, or keep the constraints

        :param keep_labels:     An optional list of strings, indicating specific labels to KEEP
        :param drop_indexes:    Flag indicating whether to also ditch all indexes (by default, True)
        :param drop_constraints: Flag indicating whether to also ditch all constraints (by default, True)

        :return:                None

name	arguments	returns
set_fields	self, match: Union[int, dict], set_dict: dict	int

        EXAMPLE - locate the "car" with vehicle id 123 and set its color to white and price to 7000
            match = find(labels = "car", properties = {"vehicle id": 123})
            set_fields(match=match, set_dict = {"color": "white", "price": 7000})

        NOTE: other fields are left un-disturbed

        Return the number of properties set.

        TODO: if any field is blank, offer the option to drop it altogether from the node,
              with a "REMOVE n.field" statement in Cypher; doing SET n.field = "" doesn't drop it

        :param match:       EITHER an integer with a Neo4j node id,
                                OR a dictionary of data to identify a node, or set of nodes, as returned by find()
        :param set_dict:    A dictionary of field name/values to create/update the node's attributes
                            (note: blanks ARE allowed in the keys)

        :return:            The number of properties set
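
A minimal sketch of how a set_dict could be turned into a parameterized Cypher SET clause, with backticks protecting keys that contain blanks. The function name, the parameter-naming scheme and the dummy node name "n" are illustrative assumptions, not the library's actual internals.

```python
from typing import Any, Dict, Tuple


def build_set_clause(set_dict: Dict[str, Any], node: str = "n") -> Tuple[str, Dict[str, Any]]:
    """Return a Cypher SET clause, plus the data binding for its parameters."""
    if not set_dict:
        raise ValueError("set_dict must contain at least one field")
    assignments = []
    binding = {}
    for i, (key, value) in enumerate(set_dict.items()):
        param = f"par_{i}"      # Generated name, since parameter names can't contain blanks
        quoted_key = "`" + key.replace("`", "``") + "`"
        assignments.append(f"{node}.{quoted_key} = ${param}")
        binding[param] = value
    return "SET " + ", ".join(assignments), binding


clause, binding = build_set_clause({"color": "white", "vehicle price": 7000})
# clause:   SET n.`color` = $par_0, n.`vehicle price` = $par_1
# binding:  {"par_0": "white", "par_1": 7000}
```

Passing the values through a data binding, rather than pasting them into the query text, avoids quoting problems with the values themselves.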

name	arguments	returns
get_relationship_types	self	[str]

        Extract and return a list of all the Neo4j relationship names (i.e. types of relationships)
        present in the database, in no particular order.

        :return:    A list of strings

name	arguments	returns
add_edges	self, match_from: Union[int, dict], match_to: Union[int, dict], rel_name:str, rel_props = None	int

        Add one or more edges (relationships, with the specified rel_name),
        originating in any of the nodes specified by the match_from specifications,
        and terminating in any of the nodes specified by the match_to specifications

        Return the number of edges added; if none were added, or in case of error, raise an Exception.

        Notes:  - if a relationship with the same name already exists, nothing gets created (and an Exception is raised)
                - more than 1 node could be present in either of the matches

        :param match_from:  EITHER an integer with a Neo4j node id,
                                OR a dictionary of data to identify a node, or set of nodes, as returned by find()
        :param match_to:    EITHER an integer with a Neo4j node id,
                                OR a dictionary of data to identify a node, or set of nodes, as returned by find()
                            IMPORTANT: match_from and match_to, if created by calls to find(), MUST use different node dummy names;
                                       e.g., make sure that for match_from, find() used the option: dummy_node_name="from"
                                                        and for match_to,   find() used the option: dummy_node_name="to"

        :param rel_name:    The name to give to the new relationship between the 2 specified nodes.  Blanks allowed.
        :param rel_props:   TODO: not currently used.  To implement!
                                  Unclear what multiple calls would do in this case: update the props or create a new relationship???

        :return:            The number of edges added.  If none got added, or in case of error, an Exception is raised
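
The requirement for distinct dummy node names is easier to see once both matches are placed in a single Cypher query: if the two patterns shared a name, they would refer to one and the same node variable. The following sketch is a simplification that matches by label only; the helper name is hypothetical, and it uses MERGE, which silently skips an existing relationship instead of raising an Exception as add_edges is documented to do.

```python
def add_edges_query(from_label: str, to_label: str, rel_name: str,
                    from_name: str = "from", to_name: str = "to") -> str:
    """Build a Cypher query linking all nodes of one label to all nodes of another."""
    if from_name == to_name:
        raise ValueError("match_from and match_to need different dummy node names")

    def quote(s: str) -> str:
        # Backticks allow blanks in labels and relationship names
        return "`" + s.replace("`", "``") + "`"

    return (f"MATCH ({from_name}:{quote(from_label)}), ({to_name}:{quote(to_label)}) "
            f"MERGE ({from_name})-[:{quote(rel_name)}]->({to_name})")


query = add_edges_query("car", "person", "OWNED BY")
# MATCH (from:`car`), (to:`person`) MERGE (from)-[:`OWNED BY`]->(to)
```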

name	arguments	returns
remove_edges	self, match_from: Union[int, dict], match_to: Union[int, dict], rel_name	int

        Remove one or more edges (relationships)
        originating in any of the nodes specified by the match_from specifications,
        and terminating in any of the nodes specified by the match_to specifications,
        optionally matching the given relationship name (will remove all edges if the name is blank or None)

        Return the number of edges removed; if none found, or in case of error, raise an Exception.

        Notes: - the nodes themselves are left untouched
               - more than 1 node could be present in either of the matches
               - the number of relationships deleted could be more than 1 even with a single "from" node and a single "to" node;
                        Neo4j allows multiple relationships with the same name between the same two nodes,
                        as long as the relationships differ in their properties

        :param match_from:  EITHER an integer with a Neo4j node id,
                                OR a dictionary of data to identify a node, or set of nodes, as returned by find()
        :param match_to:    EITHER an integer with a Neo4j node id,
                                OR a dictionary of data to identify a node, or set of nodes, as returned by find()
                            IMPORTANT: match_from and match_to, if created by calls to find(), MUST use different node dummy names;
                                       e.g., make sure that for match_from, find() used the option: dummy_node_name="from"
                                                        and for match_to,   find() used the option: dummy_node_name="to"

        :param rel_name:    (OPTIONAL) The name of the relationship to delete between the 2 specified nodes;
                                if None or a blank string, all relationships between those 2 nodes will get deleted.
                                Blanks allowed.

        :return:            The number of edges removed.  If none got deleted, or in case of error, an Exception is raised
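
As an illustration of the optional rel_name, the sketch below builds a Cypher query for the simplest case, where both matches are Neo4j IDs. The helper name and the parameter names $from_id and $to_id are hypothetical; NeoAccess may assemble its query differently.

```python
from typing import Optional


def remove_edges_query(rel_name: Optional[str] = None) -> str:
    """Build a Cypher query deleting edges from node $from_id to node $to_id."""
    if rel_name:
        # Backticks allow blanks in the relationship name
        rel = "[r:`" + rel_name.replace("`", "``") + "`]"
    else:
        rel = "[r]"     # None or "": match relationships of any name
    return (f"MATCH (from)-{rel}->(to) "
            "WHERE id(from) = $from_id AND id(to) = $to_id "
            "DELETE r")
```

Only the relationship variable r is deleted, which is why the nodes themselves are left untouched.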

name	arguments	returns
edges_exist	self, match_from: Union[int, dict], match_to: Union[int, dict], rel_name: str	bool

        Return True if one or more edges (relationships) with the specified name exist in the direction
        from and to the nodes (individual nodes or set of nodes) specified in the first two arguments.
        Typically used to find whether 2 given nodes have a direct link between them.

        :param match_from:  EITHER an integer with a Neo4j node id,
                                OR a dictionary of data to identify a node, or set of nodes, as returned by find()
        :param match_to:    EITHER an integer with a Neo4j node id,
                                OR a dictionary of data to identify a node, or set of nodes, as returned by find()
                            IMPORTANT: match_from and match_to, if created by calls to find(), MUST use different node dummy names;
                                       e.g., make sure that for match_from, find() used the option: dummy_node_name="from"
                                                        and for match_to,   find() used the option: dummy_node_name="to"

        :param rel_name:    The name of the relationship to look for between the 2 specified nodes.
                                Blanks are allowed

        :return:            True if one or more relationships were found, or False if not

name	arguments	returns
number_of_edges	self, match_from: Union[int, dict], match_to: Union[int, dict], rel_name: str	int

        TODO: add pytest

        Return the number of edges (relationships) with the specified name that exist in the direction
        from and to the nodes (individual nodes or set of nodes) specified in the first two arguments.

        :param match_from:  EITHER an integer with a Neo4j node id,
                                OR a dictionary of data to identify a node, or set of nodes, as returned by find()
        :param match_to:    EITHER an integer with a Neo4j node id,
                                OR a dictionary of data to identify a node, or set of nodes, as returned by find()
                            IMPORTANT: match_from and match_to, if created by calls to find(), MUST use different node dummy names;
                                       e.g., make sure that for match_from, find() used the option: dummy_node_name="from"
                                                        and for match_to,   find() used the option: dummy_node_name="to"

        :param rel_name:    The name of the relationship to look for between the 2 specified nodes.
                                Blanks are allowed

        :return:            The number of edges (relationships) found
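
The relation between counting edges and testing for their existence can be shown with a toy in-memory model, in which an edge is a (from id, relationship name, to id) triple. This is only a sketch of the semantics described above, not how the library queries Neo4j; the standalone functions below merely borrow the method names.

```python
from typing import List, Set, Tuple

Edge = Tuple[int, str, int]     # (from node id, relationship name, to node id)


def number_of_edges(edges: List[Edge], from_ids: Set[int], to_ids: Set[int], rel_name: str) -> int:
    """Count the edges with the given name going from any "from" node to any "to" node."""
    return sum(1 for (src, name, dst) in edges
               if src in from_ids and name == rel_name and dst in to_ids)


def edges_exist(edges: List[Edge], from_ids: Set[int], to_ids: Set[int], rel_name: str) -> bool:
    """True if at least one such edge is present."""
    return number_of_edges(edges, from_ids, to_ids, rel_name) > 0


edges = [(1, "OWNS", 2), (1, "OWNS", 2), (2, "OWNS", 1), (1, "LIKES", 2)]
```

Direction matters: in this sample there is a "LIKES" edge from node 1 to node 2, but none from node 2 to node 1.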

name	arguments	returns
reattach_node	self, node, old_attachment, new_attachment, rel_name:str

        Sever the relationship with the given name from the given node to the old_attachment node,
        and re-create it from the given node to the new_attachment node

        :param node:            A "match" structure, as returned by find().  Use dummy_node_name "node"
        :param old_attachment:  A "match" structure, as returned by find().  Use dummy_node_name "old"
        :param new_attachment:  A "match" structure, as returned by find().  Use dummy_node_name "new"
        :param rel_name:        The name of the relationship to sever and re-create
        :return:                True if the process was successful, or False otherwise

name	arguments	returns
link_nodes_by_ids	self, node_id1:int, node_id2:int, rel:str, rel_props = None	None

        Locate the pair of Neo4j nodes with the given Neo4j internal ID's.
        If they are found, add a relationship - with the name specified in the rel argument,
        and with the specified optional properties - from the 1st to 2nd node - unless already present.

        EXAMPLE:    link_nodes_by_ids(123, 88, "AVAILABLE_FROM", {'cost': 1000})

        TODO: maybe return a status, or the Neo4j ID of the relationship just created

        :param node_id1:    An integer with the Neo4j internal ID to locate the 1st node
        :param node_id2:    An integer with the Neo4j internal ID to locate the 2nd node
        :param rel:         A string specifying a Neo4j relationship name
        :param rel_props:   Optional dictionary with the relationship properties.  EXAMPLE: {'since': 2003, 'code': 'xyz'}
        :return:            None

name	arguments	returns
link_nodes_on_matching_property	self, label1:str, label2:str, property1:str, rel:str, property2=None	None

        Locate any pair of Neo4j nodes where all of the following hold:
                            1) the first one has label1
                            2) the second one has label2
                            3) the two nodes agree in the value of property1 (if property2 is None),
                                        or in the values of property1 in the 1st node and property2 in the 2nd node
        For any such pair found, add a relationship - with the name specified in the rel argument - from the 1st to 2nd node,
        unless already present.

        This operation is akin to a "JOIN" in a relational database; in pseudo-code:
                "WHERE label1.value(property1) = label2.value(property1)"       # if property2 is None
                    or
                "WHERE label1.value(property1) = label2.value(property2)"

        :param label1:      A string against which the label of the 1st node must match
        :param label2:      A string against which the label of the 2nd node must match
        :param property1:   Name of property that must be present in the 1st node (and also in 2nd node, if property2 is None)
        :param property2:   Name of property that must be present in the 2nd node (may be None)
        :param rel:         Name to give to all relationships that get created
        :return:            None
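
The JOIN-like matching rule can be sketched in plain Python over a toy list of nodes. The node layout used here (a dict with "id", "labels" and "props" entries) and the helper name are assumptions for illustration only; the actual method performs the matching inside Neo4j.

```python
from typing import Any, Dict, List, Optional, Tuple

Node = Dict[str, Any]   # Toy node: {"id": int, "labels": [str], "props": {name: value}}


def matching_pairs(nodes: List[Node], label1: str, label2: str,
                   property1: str, property2: Optional[str] = None) -> List[Tuple[int, int]]:
    """Return the (1st node id, 2nd node id) pairs that would get linked."""
    property2 = property2 or property1      # Same property name on both sides by default
    pairs = []
    for a in nodes:
        if label1 not in a["labels"] or property1 not in a["props"]:
            continue
        for b in nodes:
            if label2 not in b["labels"] or property2 not in b["props"]:
                continue
            if a["props"][property1] == b["props"][property2]:
                pairs.append((a["id"], b["id"]))
    return pairs


nodes = [
    {"id": 1, "labels": ["patient"], "props": {"doctor_code": 7}},
    {"id": 2, "labels": ["doctor"], "props": {"staff_code": 7}},
    {"id": 3, "labels": ["doctor"], "props": {"staff_code": 8}},
]
pairs = matching_pairs(nodes, "patient", "doctor", "doctor_code", "staff_code")
```

Here only the patient with id 1 and the doctor with id 2 agree on the compared values, so a single relationship would be created, from node 1 to node 2.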

name	arguments	returns
get_labels	self	[str]

        Extract and return a list of all the Neo4j labels present in the database.
        No particular order should be expected.

        TODO: test when there are nodes that have multiple labels

        :return:    A list of strings

name	arguments	returns
get_label_properties	self, label:str	list

        Extract and return all the property (key) names used in nodes with the given label,
        sorted alphabetically

        :param label:   A string with the name of a node label
        :return:        A list of property names, sorted alphabetically

name	arguments	returns
get_indexes	self	pd.DataFrame

        Return all the database indexes, and some of their attributes,
        as a Pandas dataframe.

        EXAMPLE:
               labelsOrTypes              name          properties    type  uniqueness
             0    ["my_label"] "index_23b59623"    ["my_property"]   BTREE   NONUNIQUE
             1    ["L"]          "L.client_id"       ["client_id"]   BTREE      UNIQUE

        :return:        A (possibly-empty) Pandas dataframe

name	arguments	returns
create_index	self, label: str, key: str	bool

        Create a new database index, unless it already exists,
        to be applied to the specified label and key (property).
        The standard name given to the new index is of the form label.key
        EXAMPLE - to index nodes labeled "car" by their key "color":
                        create_index("car", "color")
                  This new index - if not already in existence - will be named "car.color"
        If an existing index entry contains a list of labels (or types) such as ["l1", "l2"],
        and a list of properties such as ["p1", "p2"],
        then the given pair (label, key) is checked against ("l1_l2", "p1_p2"), to decide whether it already exists.

        :param label:   A string with the node label to which the index is to be applied
        :param key:     A string with the key (property) name to which the index is to be applied
        :return:        True if a new index was created, or False otherwise
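
The naming convention and the existence check described above can be sketched as follows. Both helper names are hypothetical, and the existing indexes are represented simply as (labelsOrTypes, properties) pairs, such as could be read off the dataframe returned by get_indexes().

```python
from typing import List, Sequence, Tuple

IndexEntry = Tuple[Sequence[str], Sequence[str]]    # (labelsOrTypes, properties)


def standard_index_name(label: str, key: str) -> str:
    """The standard name, of the form label.key"""
    return f"{label}.{key}"


def index_already_exists(existing: List[IndexEntry], label: str, key: str) -> bool:
    """Compare (label, key) against the underscore-joined lists of each existing entry."""
    for labels, properties in existing:
        if ("_".join(labels), "_".join(properties)) == (label, key):
            return True
    return False


existing = [(["my_label"], ["my_property"]), (["l1", "l2"], ["p1", "p2"])]
```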

name	arguments	returns
drop_index	self, name: str	bool

        Get rid of the index with the given name

        :param name:    Name of the index to jettison
        :return:        True if successful or False otherwise (for example, if the index doesn't exist)

name	arguments	returns
drop_all_indexes	self, including_constraints=True	None

        Eliminate all the indexes in the database and, optionally, also get rid of all constraints

        :param including_constraints:   Flag indicating whether to also ditch all the constraints
        :return:                        None

name	arguments	returns
get_constraints	self	pd.DataFrame

        Return all the database constraints, and some of their attributes,
        as a Pandas dataframe with 3 columns:
            name        EXAMPLE: "my_constraint"
            description EXAMPLE: "CONSTRAINT ON ( patient:patient ) ASSERT (patient.patient_id) IS UNIQUE"
            details     EXAMPLE: "Constraint( id=3, name='my_constraint', type='UNIQUENESS',
                                  schema=(:patient {patient_id}), ownedIndex=12 )"
        :return:  A (possibly-empty) Pandas dataframe

name	arguments	returns
create_constraint	self, label: str, key: str, type="UNIQUE", name=None	bool

        Create a uniqueness constraint for a node property in the graph,
        unless a constraint with the standard name of the form `{label}.{key}.{type}` is already present
        Note: it also creates an index, and cannot be applied if an index already exists.
        EXAMPLE: create_constraint("patient", "patient_id")
        :param label:   A string with the node label to which the constraint is to be applied
        :param key:     A string with the key (property) name to which the constraint is to be applied
        :param type:    For now, the default "UNIQUE" is the only allowed option
        :param name:    Optional name to give to the new constraint; if not provided, a
                            standard name of the form `{label}.{key}.{type}` is used.  EXAMPLE: "patient.patient_id.UNIQUE"
        :return:        True if a new constraint was created, or False otherwise
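
A sketch of the standard naming scheme, together with a uniqueness-constraint statement in the Neo4j 4.x "ON ... ASSERT" syntax that appears in the get_constraints() description example. The helper names are hypothetical (constraint_type stands in for the type argument, to avoid shadowing Python's built-in), and the exact Cypher issued by NeoAccess may differ.

```python
from typing import Optional


def constraint_name(label: str, key: str, constraint_type: str = "UNIQUE",
                    name: Optional[str] = None) -> str:
    """Use the given name, or fall back to the standard {label}.{key}.{type} form."""
    return name if name else f"{label}.{key}.{constraint_type}"


def create_constraint_query(label: str, key: str, name: Optional[str] = None) -> str:
    """Build a Neo4j 4.x statement creating a uniqueness constraint."""
    cname = constraint_name(label, key, name=name)
    # Backticks are needed because the standard name contains dots
    return f"CREATE CONSTRAINT `{cname}` ON (n:`{label}`) ASSERT n.`{key}` IS UNIQUE"


query = create_constraint_query("patient", "patient_id")
```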

name	arguments	returns
drop_constraint	self, name: str	bool

        Eliminate the constraint with the specified name.

        :param name:    Name of the constraint to eliminate
        :return:        True if successful or False otherwise (for example, if the constraint doesn't exist)

name	arguments	returns
drop_all_constraints	self	None

        Eliminate all the constraints in the database

        :return:    None

name	arguments	returns
load_pandas	self, df:pd.DataFrame, label:str, rename=None, max_chunk_size = 10000	[int]

        Load a Pandas data frame (or Series) into Neo4j.
        Each row is loaded as a separate node.
        NOTE: no attempt is made to check if an identical (or at least matching in some primary key) node already exists.

        TODO: maybe save the Pandas data frame's row number as an attribute of the Neo4j nodes, to ALWAYS have a primary key

        :param df:              A Pandas data frame to import into Neo4j
        :param label:           String with a Neo4j label to use on the newly-created nodes
        :param rename:          Optional dictionary mapping the current column names of the Pandas data frame to the desired names
                                    EXAMPLE {"current_name": "name_we_want"}
        :param max_chunk_size:  To limit the number of rows loaded at one time
        :return:                A (possibly-empty) list of the Neo4j internal ID's of the created nodes
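
The effect of `rename` and `max_chunk_size` can be pictured with the sketch below. It is only an illustration under stated assumptions: a plain list of dictionaries stands in for the data frame's rows, and the helper `rename_and_chunk` is hypothetical, not the library's code.

```python
def rename_and_chunk(records, rename=None, max_chunk_size=10000):
    """Hypothetical helper: rename the keys of each row, then split the rows into chunks"""
    rename = rename or {}
    # Keys absent from the rename dictionary are kept unchanged
    renamed = [{rename.get(key, key): value for key, value in row.items()}
               for row in records]
    # Each chunk holds at most max_chunk_size rows
    return [renamed[i:i + max_chunk_size]
            for i in range(0, len(renamed), max_chunk_size)]


rows = [{"current_name": "Adam", "age": 32},
        {"current_name": "Eve", "age": 18},
        {"current_name": "Seth", "age": 5}]

chunks = rename_and_chunk(rows, rename={"current_name": "name_we_want"}, max_chunk_size=2)
print(len(chunks))
```

Each chunk would then be sent to the database in a separate operation, one node per row.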

name

arguments

returns

export_dbase_json

self

{}

        Export the entire Neo4j database as a JSON string.
        TODO: offer an option to automatically include today's date in name of exported file

        IMPORTANT: APOC must be activated in the database, to use this function.
                   Otherwise it'll raise an Exception

        EXAMPLE:
        { 'nodes': 2,
          'relationships': 1,
          'properties': 6,
          'data': '[{"type":"node","id":"3","labels":["User"],"properties":{"name":"Adam","age":32,"male":true}},\n
                    {"type":"node","id":"4","labels":["User"],"properties":{"name":"Eve","age":18}},\n
                    {"id":"1","type":"relationship","label":"KNOWS","properties":{"since":2003},"start":{"id":"3","labels":["User"]},"end":{"id":"4","labels":["User"]}}\n
                   ]'
        }

        SIDE NOTE: the Neo4j Browser uses a slightly different format for NODES:
                {
                  "identity": 4,
                  "labels": [
                    "User"
                  ],
                  "properties": {
                    "name": "Eve",
                    "age": 18
                  }
                }
              and a more substantially different format for RELATIONSHIPS:
                {
                  "identity": 1,
                  "start": 3,
                  "end": 4,
                  "type": "KNOWS",
                  "properties": {
                    "since": 2003
                  }
                }

        :return:    A dictionary specifying the number of nodes exported ("nodes"),
                    the number of relationships ("relationships"),
                    and the number of properties ("properties"),
                    as well as a "data" field with the actual export as a JSON string
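
The returned dictionary can be post-processed with the standard `json` module. The sketch below hard-codes the example shown above instead of calling the database, so it runs without Neo4j or APOC; the variable names are illustrative only.

```python
import json

# Same structure as the documented EXAMPLE (normally returned by export_dbase_json())
export = {
    "nodes": 2,
    "relationships": 1,
    "properties": 6,
    "data": '[{"type":"node","id":"3","labels":["User"],"properties":{"name":"Adam","age":32,"male":true}},\n'
            ' {"type":"node","id":"4","labels":["User"],"properties":{"name":"Eve","age":18}},\n'
            ' {"id":"1","type":"relationship","label":"KNOWS","properties":{"since":2003},'
            '"start":{"id":"3","labels":["User"]},"end":{"id":"4","labels":["User"]}}\n]'
}

items = json.loads(export["data"])      # The "data" field is itself a JSON string
nodes = [item for item in items if item["type"] == "node"]
rels = [item for item in items if item["type"] == "relationship"]
property_count = sum(len(item["properties"]) for item in items)

print(len(nodes), len(rels), property_count)
```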

name

arguments

returns

export_nodes_rels_json

self, nodes_query="", rels_query=""

{}

        Export the specified nodes, plus the specified relationships, as a JSON string.
        The default empty strings are taken to mean (respectively) ALL nodes/relationships.

        For details on the formats, see export_dbase_json()

        IMPORTANT:  APOC must be activated in the database for this function.
                    Otherwise it'll raise an Exception

        :param nodes_query: A Cypher query to identify the desired nodes (exclusive of RETURN statements)
                                    The dummy variable for the nodes must be "n"
                                    Use "" to request all nodes
                                    EXAMPLE: "MATCH (n) WHERE (n:CLASS OR n:PROPERTY)"
        :param rels_query:   A Cypher query to identify the desired relationships (exclusive of RETURN statements)
                                    The dummy variable for the relationships must be "r"
                                    Use "" to request all relationships (whether or not their end nodes are also exported)
                                    EXAMPLE: "MATCH ()-[r:HAS_PROPERTY]->()"

        :return:    A dictionary specifying the number of nodes exported,
                    the number of relationships, and the number of properties,
                    as well as a "data" field with the actual export as a JSON string

name	arguments	returns
is_literal	self, value	bool
Return True if the given value represents a literal (in terms of database storage) :param value: :return:

name

arguments

returns

import_json

self, json_str: str, root_labels="import_root_label", parse_only=False, provenance=None

List[int]

        Import the data specified by a JSON string into the database.

        CAUTION: A "postorder" approach is followed: create subtrees first (with recursive calls), then create root last;
        as a consequence, in case of failure mid-import, there's no top root, and there could be several fragments.
        A partial import might need to be manually deleted.
        TODO: maintain a list of all created nodes - so as to be able to delete them all in case of failure.

        :param json_str:    A JSON string representing the data to import
        :param root_labels: String, or list of strings, to be used as Neo4j labels for the root node(s)
        :param parse_only:  If True, the parsed data will NOT be added to the database
        :param provenance:  Optional string to store in a "source" attribute in the root node
                                (only used if the top-level JSON structure is an object, i.e. if there's a single root node)

        :return:            List of integer ID's (possibly empty), of the root node(s) created

name

arguments

returns

create_nodes_from_python_data

self, python_data, root_labels: Union[str, List[str]], level=1

List[int]

        Recursive function to add data from a JSON structure to the database, to create a tree:
        either a single node, or a root node with children.
        A "postorder" approach is followed: create subtrees first (with recursive calls), then create root last.

        If the data is a literal, first turn it into a dictionary using a key named "value".

        Return the Neo4j ID's of the root node(s)

        :param python_data: Python data to import
        :param root_labels: String, or list of strings, to be used as Neo4j labels for the root node(s)
        :param level:       Recursion level (also used for debugging, to make the indentation more readable)
        :return:            List of integer Neo4j internal ID's (possibly empty), of the root node(s) created
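
The "postorder" strategy can be sketched without a database. In the illustration below, an in-memory list stands in for Neo4j and a node's list index plays the role of its ID. All names (`created`, `looks_literal`, `create_node`, `create_tree`) are hypothetical. Two further choices are assumptions rather than documented behavior: each dictionary key is used as the label of the child nodes, and children are recorded in a "children" entry rather than as relationships.

```python
from typing import List

created = []    # In-memory stand-in for the database


def looks_literal(value) -> bool:
    # Hypothetical test for "literal" values; the library's is_literal() may differ
    return value is None or isinstance(value, (bool, int, float, str))


def create_node(labels, properties: dict) -> int:
    created.append({"labels": labels, "properties": properties, "children": {}})
    return len(created) - 1


def create_tree(python_data, root_labels, level=1) -> List[int]:
    if looks_literal(python_data):
        python_data = {"value": python_data}    # Wrap literals in a dictionary

    if isinstance(python_data, dict):
        literals, children = {}, {}
        for key, value in python_data.items():
            if looks_literal(value):
                literals[key] = value
            else:
                # Subtrees first (recursive call)...
                children[key] = create_tree(value, root_labels=key, level=level + 1)
        root_id = create_node(root_labels, literals)    # ...then the root last
        created[root_id]["children"] = children
        return [root_id]

    if isinstance(python_data, list):
        root_ids = []
        for item in python_data:    # One root per list element
            root_ids += create_tree(item, root_labels, level + 1)
        return root_ids

    raise TypeError(f"Unsupported data type: {type(python_data)}")


roots = create_tree({"name": "Adam", "address": {"city": "Berkeley"}}, "person")
print(roots)
```

Because the child node is created before its parent, a failure partway through would leave the child behind without a root, which is the situation the CAUTION under import_json() warns about.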

name

arguments

returns

dict_importer

self, d: dict, labels, level: int

int

        Import data from a Python dictionary.  It uses a recursive call to create_nodes_from_python_data()

        :param d:       A Python dictionary with data to import
        :param labels:  String, or list of strings, to be used as Neo4j labels for the node
        :param level:   Integer with recursion level (used to format debugging output)
        :return:        Integer with the Neo4j node id of the newly-created node

name

arguments

returns

list_importer

self, l: list, labels, level

[int]

        Import data from a list.  It uses a recursive call to create_nodes_from_python_data()

        :param l:       A list with data to import
        :param labels:  String, or list of strings, to be used as Neo4j labels for the node
        :param level:   Integer with recursion level (used to format debugging output)
        :return:        List (possibly empty) of integers with Neo4j node id's of the newly-created nodes

name

arguments

returns

import_json_dump

self, json_str: str

str

        Used to import data from a database dump done with export_dbase_json() or export_nodes_rels_json()
        Import nodes and/or relationships into the database, as directed by the given data dump in JSON form.
        Note: the id's of the nodes need to be shifted,
              because one cannot force the Neo4j internal id's to be any particular value...
              and, besides (if one is importing into an existing database), particular id's may already be taken.
        :param json_str:    A JSON string with the format specified under export_dbase_json()
        :return:            A status message with import details if successful, or raise an Exception if not
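
The ID shifting mentioned in the note can be illustrated as follows. This is a sketch under assumptions, not the library's implementation: the helper `remap_dump_ids` is hypothetical, and sequential integers starting at `first_free_id` stand in for whatever IDs the database would actually assign to the new nodes.

```python
import json


def remap_dump_ids(json_str: str, first_free_id: int):
    """Hypothetical helper: map the node IDs in a dump to new IDs, and re-point relationships"""
    items = json.loads(json_str)

    id_map = {}     # Old ID (a string in the dump) -> new ID
    next_id = first_free_id
    for item in items:
        if item["type"] == "node":
            id_map[item["id"]] = next_id
            next_id += 1

    # Relationships refer to their end nodes by the OLD IDs, so they must be translated.
    # A relationship whose end nodes are not in the dump would raise a KeyError here.
    rels = [(id_map[item["start"]["id"]], item["label"], id_map[item["end"]["id"]])
            for item in items if item["type"] == "relationship"]
    return id_map, rels


dump = ('[{"type":"node","id":"3","labels":["User"],"properties":{"name":"Adam"}},'
        ' {"type":"node","id":"4","labels":["User"],"properties":{"name":"Eve"}},'
        ' {"id":"1","type":"relationship","label":"KNOWS","properties":{"since":2003},'
        '"start":{"id":"3","labels":["User"]},"end":{"id":"4","labels":["User"]}}]')

id_map, rels = remap_dump_ids(dump, first_free_id=100)
print(id_map, rels)
```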

name

arguments

returns

debug_query_print

self, q: str, data_binding=None, method=None, force_output=False

None

        Print out some info on the given Cypher query (and, optionally, on the passed data binding and/or method name),
        BUT only if self.debug is True, or if force_output is True

        :param q:               String with Cypher query
        :param data_binding:    OPTIONAL dictionary
        :param method:          OPTIONAL name of the calling method
        :param force_output:    If True, print out regardless of the self.debug property
        :return:                None

name	arguments	returns
debug_print	self, info: str, trim=False	None
If the class' property "debug" is set to True, print out the passed info string, optionally trimming it, if too long :param info: :param trim: :return: None

name

arguments

returns

debug_trim

self, data, max_len = 150

str

        Abridge the given data (first turning it into a string if needed), if excessively long,
        using ellipses " ..." for the omitted data.
        Return the abridged data.

        :param data:    Data to possibly abridge
        :param max_len:
        :return:        The (possibly) abridged text
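
A minimal sketch of this kind of abridging is shown below. The function name `trim_text` is hypothetical, and the exact cut-off point used by the library is an assumption; only the " ..." suffix is taken from the description above.

```python
def trim_text(data, max_len=150) -> str:
    """Hypothetical helper: abridge long data, marking the omission with " ..." """
    text = str(data)                    # Turn into a string if needed
    if len(text) > max_len:
        return text[:max_len] + " ..."
    return text


print(trim_text("abcdefghij", max_len=4))
print(trim_text(12345))
```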

name	arguments	returns
debug_trim_print	self, data, max_len = 150	None
Abridge the given data (first turning it into a string if needed), if it is excessively long; then print it :param data: Data to possibly abridge, and then print :param max_len: :return: None

name	arguments	returns
indent_chooser	self, level: int	str
Create an indent based on a "level": handy for debugging recursive functions :param level: :return:

Class CypherUtils

    Helper class.  Most of it is used for matters involving node matching and the "match structure".
    Meant as a private class for NeoAccess; not indicated for the end user.

    A "match" structure is a Python dictionary with the following 4 keys:
            1) "node": a string, defining a node in a Cypher query, *excluding* the "MATCH" keyword
            2) "where": a string, defining the "WHERE" part of the subquery (*excluding* the "WHERE"), if applicable;
                        otherwise, a blank
            3) "data_binding": a (possibly empty) data-binding dictionary
            4) "dummy_node_name": a string used for the node name inside the Cypher query (by default, "n");
                                  potentially relevant to the "node" and "where" values
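
To make the role of the 4 keys concrete, here is a small sketch of how a "match" structure could be turned into a complete Cypher query. The function `match_to_query` is hypothetical (it is not a CypherUtils method), and the `RETURN` clause is just one possible way to finish the query.

```python
def match_to_query(match: dict) -> str:
    """Hypothetical helper: assemble a full Cypher query from a "match" structure"""
    where = match["where"].strip()
    where_clause = f" WHERE ({where})" if where else ""     # Omitted when blank
    return f"MATCH {match['node']}{where_clause} RETURN {match['dummy_node_name']}"


match = {"node": "(n :`person` {`gender`: $n_par_1, `age`: $n_par_2})",
         "where": "n.income > $min_income",
         "data_binding": {"n_par_1": "F", "n_par_2": 22, "min_income": 90000},
         "dummy_node_name": "n"}

q = match_to_query(match)
print(q)
print(match["data_binding"])    # To be passed alongside the query when it is run
```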

        TODO: explore the possibility of storing in the structure all the args passed to define_match -
                so that, in case of later conflicts in "dummy_node_name", the "dummy_node_name" can be
                automatically changed, and the structure re-constructed

        EXAMPLES:
            *   {"node": "(n  )" , "where": "" , "data_binding": {}, "dummy_node_name": "n"}
            *   {"node": "(p :`person` )" , "where": "" , "data_binding": {}, "dummy_node_name": "p"}
            *   {"node": "(n  )" , "where": "id(n) = 123" , "data_binding": {}, "dummy_node_name": "n"}
            *   {"node": "(n :`car`:`surplus inventory` )" ,
                 "where": "" ,
                 "data_binding": {},
                 "dummy_node_name": "n"}
            *   {"node": "(n :`person` {`gender`: $n_par_1, `age`: $n_par_2})",
                 "where": "",
                 "data_binding": {"n_par_1": "F", "n_par_2": 22},
                 "dummy_node_name": "n"}
            *   {"node": "(n :`person` {`gender`: $n_par_1, `age`: $n_par_2})",
                 "where": "n.income > 90000 OR n.state = 'CA'",
                 "data_binding": {"n_par_1": "F", "n_par_2": 22},
                 "dummy_node_name": "n"}
            *   {"node": "(n :`person` {`gender`: $n_par_1, `age`: $n_par_2})",
                 "where": "n.income > $min_income",
                 "data_binding": {"n_par_1": "F", "n_par_2": 22, "min_income": 90000},
                 "dummy_node_name": "n"}

name

arguments

returns

define_match

cls, labels=None, neo_id=None, key_name=None, key_value=None, properties=None, subquery=None, dummy_node_name="n"

dict

        Turn the set of specifications into the MATCH part, and (if applicable) the WHERE part,
        of a Cypher query (using the specified dummy variable name),
        together with its data-binding dictionary.

        The keywords "MATCH" and "WHERE" are *not* returned, to facilitate the assembly of larger Cypher queries
        that involve multiple matches.

        ALL THE ARGUMENTS ARE OPTIONAL (no arguments at all means "match everything in the database")
        :param labels:      A string (or list/tuple of strings) specifying one or more Neo4j labels.
                                (Note: blank spaces ARE allowed in the strings)
                                EXAMPLES:  "cars"
                                            ("cars", "vehicles")

        :param neo_id:      An integer with the node's internal ID.
                                If specified, it OVER-RIDES all the remaining arguments, except for the labels

        :param key_name:    A string with the name of a node attribute; if provided, key_value must be present, too
        :param key_value:   The required value for the above key; if provided, key_name must be present, too
                                Note: no requirement for the key to be primary

        :param properties:  A (possibly-empty) dictionary of property key/values pairs, indicating a condition to match.
                                EXAMPLE: {"gender": "F", "age": 22}

        :param subquery:    Either None, or a (possibly empty) string containing a Cypher subquery,
                            or a pair/list (string, dict) containing a Cypher subquery and the data-binding dictionary for it.
                            The Cypher subquery should refer to the node using the assigned dummy_node_name (by default, "n")
                                IMPORTANT:  in the dictionary, don't use keys of the form "n_par_i",
                                            where n is the dummy node name and i is an integer,
                                            or an Exception will be raised - those names are for internal use only
                                EXAMPLES:   "n.age < 25 AND n.income > 100000"
                                            ("n.weight < $max_weight", {"max_weight": 100})

        :param dummy_node_name: A string with a name by which to refer to the node (by default, "n")

        :return:            A dictionary of data storing the parameters of the match.
                            For details, see the info stored in the comments for this Class
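
The two accepted shapes of the `subquery` argument, and the restriction on reserved data-binding keys, can be sketched as below. The helper `normalize_subquery` is hypothetical, and stripping surrounding blanks is an assumption; only the accepted shapes and the `n_par_i` restriction come from the description above.

```python
import re


def normalize_subquery(subquery, dummy_node_name="n"):
    """Hypothetical helper: reduce None / string / (string, dict) to a (where, data_binding) pair"""
    if subquery is None:
        return ("", {})
    if isinstance(subquery, str):
        return (subquery.strip(), {})

    where, data_binding = subquery      # Pair or 2-element list
    # Keys such as "n_par_1" are reserved for internal use
    reserved = re.compile(rf"^{re.escape(dummy_node_name)}_par_\d+$")
    for key in data_binding:
        if reserved.match(key):
            raise Exception(f"The data-binding key `{key}` is reserved for internal use")
    return (where.strip(), dict(data_binding))


print(normalize_subquery("n.age < 25 AND n.income > 100000"))
print(normalize_subquery(("n.weight < $max_weight", {"max_weight": 100})))
```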

name

arguments

returns

assert_valid_match_structure

cls, match: dict

None

        Verify that an alleged "match" dictionary is a valid one; if not, raise an Exception
        TODO: tighten up the checks

        :param match:   A dictionary of data to identify a node, or set of nodes, as returned by find()
        :return:        None

name

arguments

returns

validate_and_standardize

cls, match, dummy_node_name="n"

dict

        If match is a non-negative integer, it's assumed to be a Neo4j ID, and a match dictionary is created and returned.
        Otherwise, verify that an alleged "match" dictionary is a valid one:
        if yes, return it back; if not, raise an Exception

        TIP:
              Calling methods that accept "match" arguments can have a line such as:
                    match = CypherUtils.validate_and_standardize(match)
              and, at that point, they will be automatically also accepting Neo4j IDs as "matches"

        TODO: also, accept as argument a list/tuple - and, in addition to the above ops, carry out checks for compatibilities

        :param match:           Either a valid Neo4j internal ID, or a "match" dictionary (TODO: or a list/tuple of those)
        :param dummy_node_name: A string with a name by which to refer to the node (by default, "n")

        :return:                A valid "match" structure, i.e. a dictionary of data to identify a node, or set of nodes
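
The behavior described above can be sketched as follows. The function name is hypothetical, the validity check is deliberately simplistic, and embedding the ID directly in the "where" string follows the `id(n) = 123` example in the class comments; the library may build the structure differently.

```python
def sketch_validate_and_standardize(match, dummy_node_name="n") -> dict:
    """Hypothetical stand-in: accept either a Neo4j ID or a "match" dictionary"""
    # Booleans are excluded explicitly, because bool is a subclass of int in Python
    if isinstance(match, int) and not isinstance(match, bool) and match >= 0:
        return {"node": f"({dummy_node_name}  )",
                "where": f"id({dummy_node_name}) = {match}",
                "data_binding": {},
                "dummy_node_name": dummy_node_name}

    required_keys = {"node", "where", "data_binding", "dummy_node_name"}
    if not isinstance(match, dict) or set(match) != required_keys:
        raise Exception("Not a valid `match` structure")
    return match


standardized = sketch_validate_and_standardize(123)
print(standardized)
```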

name	arguments	returns
extract_node	cls, match: dict	str
Return the node information from the given "match" data structure :param match: A dictionary, as created by define_match() :return:

name

arguments

returns

unpack_match

cls, match: dict, include_dummy=True

list

        Turn the passed "match" dictionary structure into a list containing:
        [node, where, data_binding, dummy_node_name]
        or
        [node, where, data_binding]
        depending on the include_dummy flag

        TODO:   gradually phase out, as more advanced util methods make the unpacking of all the "match" internal structure unnecessary
                Maybe switch default value for include_dummy to False...

        :param match:           A dictionary, as created by define_match()
        :param include_dummy:   Flag indicating whether to also include the "dummy_node_name" value, as a 4th element in the returned list
        :return:
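
A brief sketch of the unpacking, to make the order of the returned elements explicit (the function name is hypothetical):

```python
def sketch_unpack_match(match: dict, include_dummy=True) -> list:
    """Hypothetical stand-in: list the parts of a "match" structure in a fixed order"""
    parts = [match["node"], match["where"], match["data_binding"]]
    if include_dummy:
        parts.append(match["dummy_node_name"])
    return parts


match = {"node": "(n  )", "where": "id(n) = 123", "data_binding": {}, "dummy_node_name": "n"}

(node, where, data_binding, dummy_node_name) = sketch_unpack_match(match)
(node, where, data_binding) = sketch_unpack_match(match, include_dummy=False)
```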

name	arguments	returns
check_match_compatibility	cls, match1, match2	None
If the two given match structures are incompatible (in terms of bringing them together in the same Cypher query), raise an Exception. :param match1: :param match2: :return:

name

arguments

returns

prepare_labels

cls, labels

str

        Turn the given string, or list/tuple of strings - representing Neo4j labels - into a string
        suitable for inclusion in a Cypher query.
        Blanks ARE allowed in the names.
        EXAMPLES:
            "" or None          give rise to    ""
            "client"            gives rise to   ":`client`"
            "my label"          gives rise to   ":`my label`"
            ["car", "vehicle"]  gives rise to   ":`car`:`vehicle`"

        :param labels:  A string, or list/tuple of strings, representing one or multiple Neo4j labels
        :return:        A string suitable for inclusion in the node part of a Cypher query
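
The examples above can be reproduced with a few lines of Python. This is a sketch with a hypothetical name, not the library's code:

```python
def sketch_prepare_labels(labels) -> str:
    """Hypothetical stand-in: turn one or more labels into a Cypher label string"""
    if not labels:                  # Covers both "" and None
        return ""
    if isinstance(labels, str):
        labels = [labels]
    # Backticks allow blanks in the label names
    return "".join(f":`{label}`" for label in labels)


print(sketch_prepare_labels(["car", "vehicle"]))
```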

name

arguments

returns

combined_where

cls, match_list: list

str

        Given a list of "match" structures, return the combined version of all their WHERE statements.
        For details, see prepare_where()
        TODO: Make sure there's no conflict in the dummy node names

        :param match_list:  A list of "match" structures
        :return:            A string with the combined WHERE statement,
                            suitable for inclusion into a Cypher query (empty if there were no subclauses)

name

arguments

returns

prepare_where

cls, where_list: Union[str, list]

str

        Given a WHERE clause, or a list/tuple of them, combine them all into one -
        and also prefix to the result (if appropriate) the WHERE keyword.
        The combined clauses of the WHERE statement are parentheses-enclosed, to protect against code injection

        EXAMPLES:   "" or "      " or [] or ("  ", "") all result in  ""
                    "n.name = 'Julian'" returns "WHERE (n.name = 'Julian')"
                        Likewise for ["n.name = 'Julian'"]
                    ("p.key1 = 123", "   ",  "p.key2 = 456") returns "WHERE (p.key1 = 123 AND p.key2 = 456)"

        :param where_list:  A string with a subclause, or list or tuple of subclauses,
                            suitable for insertion in a WHERE statement

        :return:            A string with the combined WHERE statement,
                            suitable for inclusion into a Cypher query (empty if there were no subclauses)
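
A sketch that reproduces the EXAMPLES listed above (the function name is hypothetical, and the library's own implementation may differ in details):

```python
def sketch_prepare_where(where_list) -> str:
    """Hypothetical stand-in: combine subclauses into a single WHERE statement"""
    if isinstance(where_list, str):
        where_list = [where_list]
    # Drop blank subclauses, and strip stray whitespace from the others
    clauses = [clause.strip() for clause in where_list if clause.strip()]
    if not clauses:
        return ""
    return "WHERE (" + " AND ".join(clauses) + ")"


print(sketch_prepare_where(("p.key1 = 123", "   ", "p.key2 = 456")))
```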

name

arguments

returns

combined_data_binding

cls, match_list: list

dict

        Given a list of "match" structures, return the combined version of all their data binding dictionaries.
        TODO: Make sure there's no conflicts
        TODO: Since this also works with a 1-element list, it can be used to simply unpack the data binding from the match structure
              (i.e. ought to drop the "combined" from the name)
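
A possible shape for such a merge is sketched below. The function name is hypothetical. The conflict check is an addition made for illustration, since the TODO above indicates that the library does not yet perform one.

```python
def sketch_combined_data_binding(match_list: list) -> dict:
    """Hypothetical stand-in: merge the data-binding dictionaries of several "match" structures"""
    combined = {}
    for match in match_list:
        for key, value in match["data_binding"].items():
            # Illustrative extra: refuse to silently overwrite a different value
            if key in combined and combined[key] != value:
                raise ValueError(f"Conflicting values for the data-binding key `{key}`")
            combined[key] = value
    return combined


match_1 = {"node": "(n :`person` {`gender`: $n_par_1})", "where": "",
           "data_binding": {"n_par_1": "F"}, "dummy_node_name": "n"}
match_2 = {"node": "(p  )", "where": "p.income > $min_income",
           "data_binding": {"min_income": 90000}, "dummy_node_name": "p"}

binding = sketch_combined_data_binding([match_1, match_2])
print(binding)
```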

name

arguments

returns

dict_to_cypher

cls, data_dict: {}, prefix="par_"

(str, {})

        Turn a Python dictionary (meant for specifying node or relationship attributes)
        into a string suitable for Cypher queries,
        plus its corresponding data-binding dictionary.

        EXAMPLE :
                {'cost': 65.99, 'item description': 'the "red" button'}

                will lead to the pair:
                    (
                        '{`cost`: $par_1, `item description`: $par_2}',
                        {'par_1': 65.99, 'par_2': 'the "red" button'}
                    )

        Note that backticks are used in the Cypher string to allow blanks in the key names.
        Consecutively-named dummy variables ($par_1, $par_2, etc) are used,
        instead of names based on the keys of the data dictionary (such as $cost),
        because the keys might contain blanks.

        SAMPLE USAGE:
            (cypher_properties, data_binding) = dict_to_cypher(data_dict)

        :param data_dict:   A Python dictionary
        :param prefix:      Optional prefix string for the data-binding dummy names (parameter tokens); handy to prevent conflict;
                                by default, "par_"

        :return:            A pair consisting of a string suitable for Cypher queries,
                                and a corresponding data-binding dictionary.
                            If the passed dictionary is empty or None,
                                the pair returned is ("", {})
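
The conversion described above can be sketched in a few lines. The function name is hypothetical and this is not the library's code, but it reproduces the documented EXAMPLE:

```python
def sketch_dict_to_cypher(data_dict, prefix="par_"):
    """Hypothetical stand-in: build a Cypher property string plus its data-binding dictionary"""
    if not data_dict:               # Covers both {} and None
        return ("", {})

    parts = []
    data_binding = {}
    for i, (key, value) in enumerate(data_dict.items(), start=1):
        token = f"{prefix}{i}"      # Consecutively-named tokens: par_1, par_2, ...
        parts.append(f"`{key}`: ${token}")      # Backticks allow blanks in key names
        data_binding[token] = value

    return ("{" + ", ".join(parts) + "}", data_binding)


(cypher_properties, data_binding) = sketch_dict_to_cypher(
    {'cost': 65.99, 'item description': 'the "red" button'})
print(cypher_properties)
print(data_binding)
```

Keeping the values in the data-binding dictionary, rather than pasting them into the query string, leaves quoting and escaping of the values to the database driver.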