NeoSchema Reference Guide

This guide is for version Beta 36


Source code

Background Information: Using Schema in Graph Databases such as Neo4j

User Guide

Tutorial 1 : basic Schema operations (Classes, Properties, Data Nodes)
Tutorial 2 : set up a simple Schema (Classes, Properties) and perform a data import (Data Nodes and relationships among them)


Class NeoSchema


    A layer above the class NeoAccess (or, in principle, another library providing a compatible interface),
    to provide an optional schema to the underlying database.

    Schemas may be used either to:
        1) acknowledge the existence of typical patterns in the data
        OR
        2) enforce a mold for the data to conform to

    MOTIVATION

        Relational databases are suffocatingly strict for the real world.
        Neo4j by itself may be too anarchic.
        A schema (whether "lenient/lax/loose" or "strict") in conjunction with Neo4j may be the needed compromise.

    GOALS

        - Data integrity
        - Assist the User Interface
        - Bring to Neo4j functionality that some people turn to RDF for,
          while carving out a new path rather than attempting to emulate RDF!



    OVERVIEW

        - "Class" nodes capture the abstraction of entities that share similarities.
          Example: "car", "star", "protein", "patient"

          In RDFS lingo, a "Class" node is the counterpart of a resource (entity)
                whose "rdf:type" property has the value "rdfs:Class"

        - The "Property" nodes linked to a given "Class" node, represent the attributes of the data nodes of that class

        - Data nodes are linked to their respective classes by a "SCHEMA" relationship.

        - Some classes contain an attribute named "schema_code" that identifies the UI code to display/edit them,
          as well as their descendants under the "INSTANCE_OF" relationships.
          Conceptually, the "schema_code" is a relationship to an entity consisting of software code.

        - A Class can be of the "S" (Strict) or "L" (Lenient) type.
            A "lenient" Class will accept data nodes with any properties, whether declared in the Class Schema or not;
            by contrast, a "strict" Class will reject data nodes that contain properties not declared in the Schema

            (COMING AT A LATER DATE:  "required properties" and "property data types")
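
The difference between the two Class types can be illustrated with a small, self-contained sketch. It does not use NeoSchema itself: the function name `check_data_node` and the in-memory representation of Properties are assumptions made purely for illustration.

```python
def check_data_node(declared_props, node_props, strict):
    """
    Toy illustration (NOT the NeoSchema implementation) of the "S"/"L" rule:
    a strict Class rejects properties not declared in the Schema,
    while a lenient Class accepts any properties.
    """
    undeclared = sorted(set(node_props) - set(declared_props))
    if strict and undeclared:
        raise ValueError(f"Properties not declared in the Schema: {undeclared}")
    return True


declared = ["make", "color"]                # Properties declared for a hypothetical "car" Class
node = {"make": "Toyota", "year": 2015}     # "year" is not declared in the Schema

# A lenient Class accepts the node as it is
lenient_ok = check_data_node(declared, node, strict=False)

# A strict Class refuses it, because of the undeclared "year"
try:
    check_data_node(declared, node, strict=True)
    strict_ok = True
except ValueError:
    strict_ok = False
```

Here `lenient_ok` ends up True and `strict_ok` ends up False, since only the strict check objects to the undeclared property.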


    IMPLEMENTATION DETAILS

        - Every node used by this class, as well as the data nodes it manages,
          has a unique attribute "uri" (formerly "schema_id" and "item_id", respectively);
          note that this is actually a "token", i.e. a part of a URI - not a full URI.
          The uri's of schema nodes have the form "schema-n", where n is a unique number.
          Data nodes can have any unique uri's, with optional prefixes and suffixes chosen by the higher layers.
          The Schema layer manages the auto-increments for any desired set of namespaces (and itself makes use
          of the "schema_node" namespace)

        - The names of the Classes and Properties are stored in node attributes called "name".
          We avoid calling them "label", as done in RDFS, because in Labeled Graph Databases
          like Neo4j, the term "label" has a very specific meaning, and is pervasively used.

        - For convenience, data nodes contain a label equal to their Class name,
          and a redundant attribute (that might be phased out) named "schema_code"
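
The per-namespace auto-increments behind the "uri" tokens described above can be pictured with the following sketch. How NeoSchema actually stores its counters is not described in this guide; the class `NamespaceCounter`, its method `next_uri` and the "photos" namespace are hypothetical names, and a plain dictionary stands in for the database.

```python
from collections import defaultdict


class NamespaceCounter:
    """Toy in-memory stand-in (hypothetical) for per-namespace auto-increments."""

    def __init__(self):
        self._last = defaultdict(int)       # namespace -> last number handed out

    def next_uri(self, namespace, prefix="", suffix=""):
        """Advance the counter of the given namespace, and build a uri token from it."""
        self._last[namespace] += 1
        return f"{prefix}{self._last[namespace]}{suffix}"


counter = NamespaceCounter()

# Schema nodes draw from the "schema_node" namespace, giving uri's of the form "schema-n"
first_schema_uri = counter.next_uri("schema_node", prefix="schema-")
second_schema_uri = counter.next_uri("schema_node", prefix="schema-")

# Data nodes may use any other namespace, with a prefix chosen by the higher layers
first_photo_uri = counter.next_uri("photos", prefix="photo-")
```

Each namespace keeps its own numbering, so the first uri issued in "photos" does not depend on how many schema nodes exist.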


    AUTHOR:
        Julian West



    ----------------------------------------------------------------------------------
	MIT License

        Copyright (c) 2021-2024 Julian A. West and the BrainAnnex.org project

        This file is part of the "Brain Annex" project (https://BrainAnnex.org)

        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:

        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.

        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
	----------------------------------------------------------------------------------
    
name | arguments | returns
set_database | cls, db :NeoAccess | None
        IMPORTANT: this method MUST be called before using this class!

        :param db:  Database-interface object, created with the NeoAccess library
        :return:    None
        

Schema CLASSES

name | arguments | returns
assert_valid_class_name | cls, class_name: str | None
        Raise an Exception if the passed argument is not a valid Class name

        :param class_name:  A string with the putative name of a Schema Class
        :return:            None
        
name | arguments | returns
is_valid_class_name | cls, class_name: str | bool
        Return True if the passed argument is a valid Class name, or False otherwise

        :param class_name:  A string with the putative name of a Schema Class
        :return:            True if the name is valid, or False otherwise
        
name | arguments | returns
assert_valid_class_identifier | cls, class_node: Union[int, str] | None
        Raise an Exception if the argument is not a valid "identifier" for a Class node,
        meaning either a valid name or a valid internal database ID

        :param class_node:  Either an integer with the internal database ID of an existing Class node,
                                or a string with its name
        :return:            None (an Exception is raised if the validation fails)
        
name | arguments | returns
create_class | cls, name :str, code = None, strict = False, no_datanodes = False | (int, str)
        Create a new Class node with the given name and type of schema,
        provided that the name isn't already in use for another Class.

        Return a pair with the internal database ID of the new Class node,
        and the auto-incremented unique uri assigned to the new Class.
        Raise an Exception if a class by that name already exists.

        NOTE: if you want to add Properties at the same time that you create a new Class,
              use the function create_class_with_properties() instead.

        :param name:        Name to give to the new Class
        :param code:        Optional string indicative of the software handler for this Class and its subclasses
        :param strict:      If True, the Class will be of the "S" (Strict) type;
                                otherwise, it'll be of the "L" (Lenient) type
                            Explained under the comments for the NeoSchema class

        :param no_datanodes: If True, this Class does not allow data nodes to have a "SCHEMA" relationship to it;
                                typically used by Classes having an intermediate role in the context of other Classes

        :return:            An (int, str) pair with the internal database ID and the unique uri assigned to the node just created,
                                if it was created;
                                an Exception is raised if a class by that name already exists
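
Because create_class() raises an Exception on duplicate names, callers may want a "get or create" helper built from the methods documented in this guide. The sketch below is one hedged way to write it. The helper `get_or_create_class` and the `StubSchema` stand-in are hypothetical and exist only so that the example runs without a database; with the real library one would presumably pass the NeoSchema class itself after calling set_database(), which is untested here.

```python
def get_or_create_class(schema, name, strict=False):
    """
    Return the (internal ID, uri) pair of the Class with the given name, creating it if needed.
    `schema` is expected to expose the methods documented in this guide.
    """
    if schema.class_name_exists(name):
        return (schema.get_class_internal_id(name), schema.get_class_uri(name))
    return schema.create_class(name=name, strict=strict)


class StubSchema:
    """Hypothetical in-memory stand-in, mimicking the documented signatures."""
    _classes = {}       # Class name -> (internal ID, uri)

    @classmethod
    def class_name_exists(cls, class_name):
        return class_name in cls._classes

    @classmethod
    def get_class_internal_id(cls, class_name):
        return cls._classes[class_name][0]

    @classmethod
    def get_class_uri(cls, class_name):
        return cls._classes[class_name][1]

    @classmethod
    def create_class(cls, name, code=None, strict=False, no_datanodes=False):
        if name in cls._classes:
            raise Exception(f"A Class named `{name}` already exists")
        n = len(cls._classes) + 1
        cls._classes[name] = (100 + n, f"schema-{n}")   # Made-up ID and uri values
        return cls._classes[name]


first = get_or_create_class(StubSchema, "patient", strict=True)    # Creates the Class
again = get_or_create_class(StubSchema, "patient")                 # Finds the existing one
```

The second call returns the same pair as the first, rather than raising an Exception.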
        
name | arguments | returns
get_class_internal_id | cls, class_name :str | int
        Returns the internal database ID of the Class node with the given name,
        or raise an Exception if not found, or if more than one is found.
        Note: unique Class names are assumed.

        :param class_name:  The name of the desired class
        :return:            The internal database ID of the specified Class
        
name | arguments | returns
get_class_uri | cls, class_name :str | str
        Returns the Schema uri of the Class with the given name;
        raise an Exception if not found

        :param class_name:  The name of the desired class
        :return:            The Schema uri of the specified Class
        
name | arguments | returns
get_class_uri_by_internal_id | cls, internal_class_id: int | int
        Returns the Schema uri of the Class with the given internal database ID.

        :param internal_class_id:   An integer with the internal database ID of the desired Class
        :return:            The Schema uri of the specified Class; raise an Exception if not found
        
name | arguments | returns
class_neo_id_exists | cls, neo_id: int | bool
        Return True if a Class by the given internal database ID already exists, or False otherwise

        :param neo_id:  Integer with internal database ID
        :return:        A boolean indicating whether the specified Class exists
        
name | arguments | returns
class_uri_exists | cls, schema_uri :str | bool
        Return True if a Class by the given uri already exists, or False otherwise

        :param schema_uri:  The uri of the Class node of interest
        :return:            True if the Class already exists, or False otherwise
        
name | arguments | returns
class_name_exists | cls, class_name: str | bool
        Return True if a Class by the given name already exists, or False otherwise

        :param class_name:  The name of the class of interest
        :return:            True if the Class already exists, or False otherwise
        
name | arguments | returns
get_class_name_by_schema_uri | cls, schema_uri :str | str
        Returns the name of the class with the given Schema uri;
        raise an Exception if not found

        :param schema_uri:  A string uniquely identifying the desired Class
        :return:            The name of the Class with the given Schema uri
        
name | arguments | returns
get_class_name | cls, internal_id: int | str
        Returns the name of the class with the given internal database ID,
        or raise an Exception if not found

        :param internal_id: An integer with the internal database ID
                                of the desired class
        :return:            The name of the class with the given internal database ID;
                                raise an Exception if not found
        
name | arguments | returns
get_class_attributes | cls, class_internal_id: int | dict
        Returns all the attributes (incl. the name) of the Class node with the given internal database ID,
        or raise an Exception if the Class is not found.
        If no "name" attribute is found, an Exception is raised.

        :param class_internal_id:   An integer with the internal database ID of the desired class
        :return:                    A dictionary of attributes of the class with the given internal database ID;
                                        an Exception is raised if not found
                                        EXAMPLE:  {'name': 'MY CLASS', 'uri': '123', 'strict': False}
        
name | arguments | returns
get_all_classes | cls, only_names=True | [str]
        Fetch and return a list of all the existing Schema classes; by default (only_names=True), just their names, sorted alphabetically

        :return:    A list of all the existing Class names
        
name | arguments | returns
delete_class | cls, name: str, safe_delete=True | None
        Delete the given Class AND all its attached Properties.
        If safe_delete is True (recommended) delete ONLY if there are no data nodes of that Class
        (i.e., linked to it by way of "SCHEMA" relationships.)

        :param name:        Name of the Class to delete
        :param safe_delete: Flag indicating whether the deletion is to be restricted to
                            situations where no data node would be left "orphaned".
                            CAUTION: if safe_delete is False,
                                     then data nodes may be left without a Schema
        :return:            None.  In case of no node deletion, an Exception is raised
        
name | arguments | returns
is_strict_class | cls, class_internal_id: int, schema_cache=None | bool
        Return True if the given Class is of "Strict" type,
        or False otherwise (or if the information is missing)

        :param class_internal_id:   The internal ID of a Schema Class node
        :param schema_cache:        (OPTIONAL) "SchemaCache" object
        :return:                    True if the Class is "strict" or False if not (i.e., if it's "lax")
        
name | arguments | returns
allows_data_nodes | cls, class_name = None, class_internal_id = None, schema_cache=None | bool
        Determine if the given Class allows data nodes directly linked to it

        :param class_name:      Name of the Class
        :param class_internal_id: (OPTIONAL) Alternate way to specify the class; if both specified, this one prevails
        :param schema_cache:    (OPTIONAL) "SchemaCache" object
        :return:                True if allowed, or False if not
                                    If the Class doesn't exist, raise an Exception
        

RELATIONSHIPS AMONG CLASSES

name | arguments | returns
assert_valid_relationship_name | cls, rel_name :str | None
        Raise an Exception if the passed argument is not a valid name for a database relationship

        :param rel_name:    A string with the putative name of a database relationship
        :return:        None
        
name | arguments | returns
create_class_relationship | cls, from_class: Union[int, str], to_class: Union[int, str], rel_name="INSTANCE_OF", use_link_node=False | None
        Create a relationship (provided that it doesn't already exist) with the specified name
        between the 2 existing Class nodes (identified by names or by their internal database IDs),
        in the ( from -> to ) direction.

        In case of error, an Exception is raised.

        Note: multiple relationships by the same name between the same nodes are allowed by Neo4j,
              as long as the relationships differ in their attributes
              (but this method doesn't allow setting properties on the new relationship)

        :param from_class:  Either an integer with the internal database ID of an existing Class node,
                                or a string with its name.
                                Used to identify the node from which the new relationship originates.
        :param to_class:    Either an integer with the internal database ID of an existing Class node,
                                or a string with its name.
                                Used to identify the node at which the new relationship terminates.
        :param rel_name:    Name of the relationship to create, in the from -> to direction
                                (blanks allowed)
        :param use_link_node: EXPERIMENTAL feature - if True, insert an intermediate "LINK" node in the newly-created
                                relationship; otherwise, simply create a direct link.
                                If rel_name has the special value "INSTANCE_OF", it must be False
        :return:            None
        
name | arguments | returns
rename_class_rel | cls, from_class: int, to_class: int, new_rel_name | bool
        Rename the old relationship between the specified classes

        :param from_class:
        :param to_class:
        :param new_rel_name:
        :return:            True if a relationship was found, and successfully renamed;
                            otherwise, False
        
name | arguments | returns
delete_class_relationship | cls, from_class: str, to_class: str, rel_name | int
        Delete the relationship(s) with the specified name
        between the 2 existing Class nodes (identified by their respective names),
        going in the from -> to direction.
        In case of error or if no relationship was found, an Exception is raised

        Note: there might be more than one - relationships with the same name between the same nodes
              are allowed, provided that they have different properties.
              If more than one is found, they will all be deleted.
              The number of relationships deleted will be returned

        :param from_class:  Name of one existing Class node (blanks allowed in name)
        :param to_class:    Name of another existing Class node (blanks allowed in name)
        :param rel_name:    Name of the relationship(s) to delete,
                                if found in the from -> to direction (blanks allowed in name)

        :return:            The number of relationships deleted.
                            In case of error, or if no relationship was found, an Exception is raised
        
name | arguments | returns
unlink_classes | cls, class1 :Union[int, str], class2 :Union[int, str] | int
        Remove ALL relationships (in any direction) between the specified Classes

        :param class1:  Either the integer internal database ID, or name, to identify the first Class
        :param class2:  Either the integer internal database ID, or name, to identify the second Class
        :return:        The number of relationships deleted (possibly zero)
        
name | arguments | returns
class_relationship_exists | cls, from_class: str, to_class: str, rel_name | bool
        Return True if a relationship with the specified name exists between the two given Classes,
        in the specified direction.
        The Schema allows several scenarios:
            - A direct relationship from one Class node to the other
            - A relationship that goes thru an intermediary "LINK" node
            - Either of the 2 above scenarios, but between "ancestors" of the two nodes;
              "ancestors" are defined by means of following
              any number of "INSTANCE_OF" hops to other Class nodes

        :param from_class:  Name of an existing Class node (blanks allowed in name)
        :param to_class:    Name of another existing Class node (blanks allowed in name)
        :param rel_name:    Name of the relationship to look for
                                in the from -> to direction (blanks allowed in name)
        :return:            True if the Class relationship exists, or False otherwise
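
The "ancestors" scenario listed above can be sketched on a toy in-memory graph. This is not how NeoSchema queries the database; it only illustrates the idea of following "INSTANCE_OF" hops on both sides. The function names, the data layout and the sample Class names are assumptions, a Class counts as its own ancestor at zero hops, and the intermediary "LINK" node scenario is left out for brevity.

```python
def ancestors_and_self(class_name, instance_of):
    """Return the Class itself plus every Class reachable by following "INSTANCE_OF" hops."""
    seen, stack = set(), [class_name]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(instance_of.get(current, []))
    return seen


def relationship_exists(from_class, to_class, rel_name, links, instance_of):
    """True if the named link runs from (an ancestor of) from_class to (an ancestor of) to_class."""
    from_set = ancestors_and_self(from_class, instance_of)
    to_set = ancestors_and_self(to_class, instance_of)
    return any((f, rel_name, t) in links for f in from_set for t in to_set)


# Class name -> list of the Classes that it is an "INSTANCE_OF"
instance_of = {"french vocabulary": ["vocabulary"], "german vocabulary": ["vocabulary"]}

# Direct links between Classes, as (from, relationship name, to) triplets
links = {("vocabulary", "BELONGS_TO", "course")}

# The link is declared on "vocabulary", and inherited by its instance "french vocabulary"
inherited = relationship_exists("french vocabulary", "course", "BELONGS_TO", links, instance_of)

# Direction matters: nothing goes from "course" to "french vocabulary"
reverse = relationship_exists("course", "french vocabulary", "BELONGS_TO", links, instance_of)
```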
        
name | arguments | returns
get_class_instances | cls, class_name: str, leaf_only=False | [str]
        Get the names of all Classes that are, directly or indirectly, instances of the given Class,
        i.e. pointing to that node thru a series of 1 or more "INSTANCE_OF" relationships;
        if leaf_only is True, then only as long as they are leaf nodes (with no other Class
        that is an instance of them.)

        :param class_name:  Name of the Class for which we want to find
                            other Classes that are an instance of it
        :param leaf_only:   If True, only return the leaf nodes (those that
                            don't have other Classes that are instances of them)
        :return:            A list of Class names
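
The effect of leaf_only can be seen in a small in-memory sketch of the traversal. The function `class_instances`, the dictionary layout and the sample Class names are assumptions for illustration; the real method runs against the database, and this guide does not specify the order of its results (they are sorted here only to make the output deterministic).

```python
def class_instances(class_name, instance_of, leaf_only=False):
    """
    instance_of maps each Class name to the list of Classes that it is an "INSTANCE_OF".
    Return the names of all direct and indirect instances of class_name.
    """
    # Invert the mapping, to go from each Class to its direct instances
    children = {}
    for child, parents in instance_of.items():
        for parent in parents:
            children.setdefault(parent, []).append(child)

    # Walk down from the given Class, collecting everything encountered
    found, stack = set(), list(children.get(class_name, []))
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(children.get(current, []))

    if leaf_only:
        # Keep only the Classes that have no instances of their own
        found = {c for c in found if not children.get(c)}
    return sorted(found)


instance_of = {"car": ["vehicle"], "truck": ["vehicle"], "sports car": ["car"]}

all_instances = class_instances("vehicle", instance_of)
leaf_instances = class_instances("vehicle", instance_of, leaf_only=True)
```

In this example "car" appears in `all_instances` but not in `leaf_instances`, because "sports car" is an instance of it.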
        
name | arguments | returns
get_linked_class_names | cls, class_name: str, rel_name: str, enforce_unique=False | Union[str, List[str]]
        Given a Class, specified by its name, locate and return the name(s) of the other Class(es)
        that it's linked to by means of the relationship with the specified name.
        Typically, the result will contain no more than 1 name, but it could be more;
        it's probably a bad design to use the same relationship name to connect a class to multiple other classes
        (though currently allowed.)
        Relationships are followed in the OUTbound direction only.

        :param class_name:      Name of a Class in the schema
        :param rel_name:        Name of relationship to follow (in the OUTbound direction) from the above Class
        :param enforce_unique:  If True, it raises an Exception if the number of results isn't exactly one

        :return:                If enforce_unique is True, return a string with the class name;
                                otherwise, return a list of names (typically just one)
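
Since the return type depends on enforce_unique, a short sketch may help. It mimics the documented behavior on a plain list of (from, relationship name, to) triplets; the function `linked_class_names` and the sample data are hypothetical, not part of the library.

```python
def linked_class_names(class_name, rel_name, links, enforce_unique=False):
    """Follow the named relationship in the OUTbound direction from the given Class."""
    names = [to for (frm, rel, to) in links if frm == class_name and rel == rel_name]
    if enforce_unique:
        if len(names) != 1:
            raise Exception(f"Expected exactly 1 linked Class, but found {len(names)}")
        return names[0]     # A single string
    return names            # A list (possibly empty)


links = [("patient", "IS_ATTENDED_BY", "doctor"),
         ("patient", "HAS_RESULT", "result"),
         ("doctor", "WORKS_AT", "clinic")]

as_string = linked_class_names("patient", "IS_ATTENDED_BY", links, enforce_unique=True)
as_list = linked_class_names("patient", "IS_ATTENDED_BY", links)
no_match = linked_class_names("patient", "WORKS_AT", links)     # Outbound links only
```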
        
name | arguments | returns
get_class_relationships | cls, class_name :str, link_dir="BOTH", omit_instance=False | Union[dict, list]
        Fetch and return the names of all the relationships (both inbound and outbound)
        attached to the given Class.
        Treat separately the inbound and the outbound ones.
        If the Class doesn't exist, empty lists are returned.

        :param class_name:      The name of the desired Class
        :param link_dir:        Desired direction(s) of the relationships; one of "BOTH" (default), "IN" or "OUT"
        :param omit_instance:   If True, the common outbound relationship "INSTANCE_OF" is omitted

        :return:                If link_dir is "BOTH", return a dictionary of the form
                                    {"in": list of inbound-relationship names,
                                     "out": list of outbound-relationship names}
                                Otherwise, just return the inbound or outbound list, based on the value of link_dir
        
name | arguments | returns
get_class_outbound_data | cls, class_neo_id :int, omit_instance=False | dict
        Efficient all-at-once query to fetch and return the names of all the outbound relationships
        attached to the given Class, as well as the names of the other Classes on the other side of those links.

        IMPORTANT: it's probably a bad design to use the same relationship name to connect a class
        to multiple other classes.  Though currently allowed in the Schema, this particular method
        assumes - and enforces - uniqueness

        :param class_neo_id:    An integer to identify the desired Class
        :param omit_instance:   If True, the common outbound relationship "INSTANCE_OF" is omitted

        :return:                A (possibly empty) dictionary,
                                    where the keys are the names of outbound relationships,
                                    and the values are the names of the Class nodes on the other side of those links.
                                    An Exception will be raised if link names are not unique [though currently allowed by the Schema]
                                    EXAMPLE: {'IS_ATTENDED_BY': 'doctor', 'HAS_RESULT': 'result'}
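
The uniqueness requirement can be illustrated with an in-memory sketch that builds the same kind of dictionary. Unlike the real method, which takes an internal database ID, this hypothetical `class_outbound_data` identifies the Class by name and reads from a plain list of (from, relationship name, to) triplets.

```python
def class_outbound_data(class_name, links, omit_instance=False):
    """Map each outbound relationship name to the Class on the other side of the link."""
    result = {}
    for frm, rel, to in links:
        if frm != class_name:
            continue                    # Not an outbound link of this Class
        if omit_instance and rel == "INSTANCE_OF":
            continue
        if rel in result:
            raise Exception(f"Relationship name `{rel}` is not unique for Class `{class_name}`")
        result[rel] = to
    return result


links = [("patient", "IS_ATTENDED_BY", "doctor"),
         ("patient", "HAS_RESULT", "result"),
         ("patient", "INSTANCE_OF", "person")]

outbound = class_outbound_data("patient", links, omit_instance=True)

# Re-using "HAS_RESULT" for a second target Class violates the uniqueness assumption
try:
    class_outbound_data("patient", links + [("patient", "HAS_RESULT", "lab report")])
    duplicate_rejected = False
except Exception:
    duplicate_rejected = True
```

With omit_instance=True, `outbound` matches the dictionary shown in the EXAMPLE above.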
        

CLASS PROPERTIES

name | arguments | returns
get_class_properties | cls, class_node: Union[int, str], include_ancestors=False, sort_by_path_len=None, exclude_system=False | [str]
        Return the list of all the names of the Properties associated with the given Class
        (including those inherited thru ancestor nodes by means of "INSTANCE_OF" relationships,
        if include_ancestors is True),
        sorted by the schema-specified position (or, optionally, by path length)

        :param class_node:          Either an integer with the internal database ID of an existing Class node,
                                        or a string with its name
        :param include_ancestors:   If True, also include the Properties attached to Classes that are ancestral
                                    to the given one by means of a chain of outbound "INSTANCE_OF" relationships
                                    Note: the sorting by relationship index won't mean much if ancestral nodes are included,
                                          with their own indexing of relationships; if order matters in those cases, use the
                                          "sort_by_path_len" argument, below
        :param sort_by_path_len:    Only applicable if include_ancestors is True.
                                    If provided, it must be either "ASC" or "DESC", and it will sort the results by path length
                                    (either ascending or descending), before sorting by the schema-specified position for each Class.
                                    Note: with "ASC", the immediate Properties of the given Class will be listed first
        :param exclude_system:      (OPTIONAL) If True, Property nodes with the attribute "system" set to True will be excluded

        :return:                    A list of the Properties of the specified Class (including indirectly, if include_ancestors is True)
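
The interplay between include_ancestors and sort_by_path_len is easier to see on a toy example. The sketch below is not the library's query: the function `class_properties`, the dictionaries and the sample Classes are assumptions, and Property names repeated across Classes are not de-duplicated.

```python
def class_properties(class_name, props, instance_of, include_ancestors=False, sort_by_path_len=None):
    """
    props:       Class name -> list of its Property names, in schema-specified order
    instance_of: Class name -> list of the Classes that it is an "INSTANCE_OF"
    """
    # Breadth-first walk along "INSTANCE_OF" links, recording the hop count to each Class
    path_len = {class_name: 0}
    queue = [class_name]
    while include_ancestors and queue:
        current = queue.pop(0)
        for parent in instance_of.get(current, []):
            if parent not in path_len:
                path_len[parent] = path_len[current] + 1
                queue.append(parent)

    # One row per Property: (hops to its Class, position within its Class, name)
    rows = [(hops, pos, name)
            for cls_name, hops in path_len.items()
            for pos, name in enumerate(props.get(cls_name, []), start=1)]

    if sort_by_path_len == "ASC":
        rows.sort(key=lambda r: (r[0], r[1]))
    elif sort_by_path_len == "DESC":
        rows.sort(key=lambda r: (-r[0], r[1]))
    else:
        rows.sort(key=lambda r: r[1])       # Position only: Classes get interleaved
    return [name for _, _, name in rows]


props = {"vehicle": ["vin", "weight"], "car": ["doors", "color"]}
instance_of = {"car": ["vehicle"]}

own = class_properties("car", props, instance_of)
asc = class_properties("car", props, instance_of, include_ancestors=True, sort_by_path_len="ASC")
desc = class_properties("car", props, instance_of, include_ancestors=True, sort_by_path_len="DESC")
mixed = class_properties("car", props, instance_of, include_ancestors=True)
```

With "ASC" the Properties of "car" itself come first. Without sort_by_path_len, the positions from the two Classes interleave, which is the situation the note on include_ancestors warns about.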
        
name | arguments | returns
add_properties_to_class | cls, class_node = None, class_uri = None, property_list = None | int
        Add a list of Properties to the specified (ALREADY-existing) Class.
        The properties are given an inherent order (an attribute named "index", starting at 1),
        based on the order they appear in the list.
        If other Properties already exist, the existing numbering gets extended.

        NOTE: if the Class doesn't already exist, use create_class_with_properties() instead;
              attempting to add properties to a non-existing Class will result in an Exception

        :param class_node:      An integer with the internal database ID of an existing Class node,
        :param class_uri:       (OPTIONAL) String with the schema_uri of the Class to which to attach the given Properties
                                Deprecated!

        :param property_list:   A list of strings with the names of the properties, in the desired order.
                                    Whitespace in any of the names gets stripped out.
                                    If any name is a blank string, an Exception is raised
                                    If the list is empty, an Exception is raised
        :return:                The number of Properties added
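
The way the "index" numbering gets extended can be sketched with a dictionary standing in for the Property nodes of one Class. The function `add_properties` is hypothetical; it reads "whitespace gets stripped out" as leading and trailing whitespace (an assumption), and it does not try to reproduce whatever the library does with names that are already present.

```python
def add_properties(existing, property_list):
    """
    existing: dict of Property name -> index, for the Properties already on the Class
              (updated in place)
    Return the number of Properties added.
    """
    if not property_list:
        raise Exception("The list of Properties is empty")

    cleaned = [name.strip() for name in property_list]
    if any(name == "" for name in cleaned):
        raise Exception("Blank Property names are not allowed")

    # Continue from the highest existing index; start at 1 if there are none
    next_index = max(existing.values(), default=0) + 1
    for offset, name in enumerate(cleaned):
        existing[name] = next_index + offset
    return len(cleaned)


car_props = {}
added_first = add_properties(car_props, ["make", " color "])
added_later = add_properties(car_props, ["year"])      # Numbering resumes at 3
```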
        
name | arguments | returns
set_property_attribute | cls, class_name :str, prop_name :str, attribute_name :str, attribute_value | None
        Set an attribute on an existing "PROPERTY" node of the specified Class

        EXAMPLE:    set_property_attribute(class_name="Content Item", prop_name="uri",
                                           attribute_name="system", attribute_value=True)
        :param class_name:
        :param prop_name:
        :param attribute_name:
        :param attribute_value:
        :return:                None
        
name | arguments | returns
create_class_with_properties | cls, name :str, property_list: [str], code=None, strict=False, class_to_link_to=None, link_name="INSTANCE_OF", link_dir="OUT" | (int, int)
        Create a new Class node, with the specified name, and also create the specified Properties nodes,
        and link them together with "HAS_PROPERTY" relationships.

        Return the internal database ID and the auto-incremented unique ID ("schema ID") assigned to the new Class.
        Each Property node is also assigned a unique "schema ID";
        the "HAS_PROPERTY" relationships are assigned an auto-increment index,
        representing the default order of the Properties.

        If a class_to_link_to name is specified, link the newly-created Class node to that existing Class node,
        using an outbound relationship with the specified name.  Typically used to create "INSTANCE_OF"
        relationships from new Classes.

        If a Class with the given name already exists, nothing is done,
        and an Exception is raised.

        NOTE: if the Class already exists, use add_properties_to_class() instead

        :param name:            String with name to assign to the new class
        :param property_list:   List of strings with the names of the Properties, in their default order (if that matters)
        :param code:            Optional string indicative of the software handler for this Class and its subclasses.
                                    Deprecated!

        :param strict:          If True, the Class will be of the "S" (Strict) type;
                                    otherwise, it'll be of the "L" (Lenient) type

        :param class_to_link_to: If this name is specified, and a link_name (below) is also specified,
                                    then create an OUTBOUND relationship from the newly-created Class
                                    to this existing Class
        :param link_name:       Name to use for the above relationship, if requested.  Default is "INSTANCE_OF"
        :param link_dir:        Desired direction(s) of the relationships: either "OUT" (default) or "IN"

        :return:                If successful, the pair (internal ID, integer "schema_uri" assigned to the new Class);
                                otherwise, raise an Exception
        
name | arguments | returns
remove_property_from_class | cls, class_uri :str, property_uri :str | None
        Take out the specified (single) Property from the given Class.
        If the Class or Property was not found, or if the Property could not be removed, an Exception is raised

        :param class_uri:   The uri of the Class node
        :param property_uri:The uri of the Property node
        :return:            None
        
name | arguments | returns
get_schema_code | cls, class_name: str | str
        Obtain the "schema code" of a Class, specified by its name.
        The "schema code" is an optional but convenient text code,
        stored either on a Class node, or on any of its ancestors by way of "INSTANCE_OF" relationships

        :return:    A string with the Schema code (empty string if not found)
                    EXAMPLE: "i"
        
name | arguments | returns
get_schema_uri | cls, schema_code :str | str
        Get the Schema uri most directly associated to the given Schema Code

        :return:    A string with the Schema uri (or "" if not present)
        

DATA NODES

name | arguments | returns
all_properties | cls, label :str, primary_key_name :str, primary_key_value | [str]
        Return the list of the *names* of all the Properties associated with the given DATA node,
        based on the Schema it is associated with, sorted by their schema-specified position.
        The desired node is identified by specifying which one of its attributes is a primary key,
        and providing a value for it.

        IMPORTANT : this function returns the NAMES of the Properties; not their values

        :param label:
        :param primary_key_name:
        :param primary_key_value:
        :return:                    A list of the names of the Properties associated
                                        with the given DATA node
        
name | arguments | returns
get_data_node_internal_id | cls, uri :str, label=None | int
        Returns the internal database ID of the given Data Node,
        specified by the value of its uri attribute
        (and optionally by a label)

        :param uri:     A string to identify a Data Node by the value of its "uri" attribute
        :param label:   (OPTIONAL) String to require the Data Node to have (redundant,
                            since "uri" already uniquely specifies a Data Node - but
                            could be used for speed or data integrity)

        :return:        The internal database ID of the specified Data Node;
                            if none (or more than one) found, an Exception is raised
        
name | arguments | returns
get_data_node_id | cls, key_value :str, key_name="uri" | int
        Get the internal database ID of a Data Node, given some other primary key

        :param key_value:
        :param key_name:
        :return:            The internal database ID of the specified Data Node
        
name | arguments | returns
data_node_exists | cls, data_node: Union[int, str] | bool
        Return True if the specified Data Node exists, or False otherwise.

        :param data_node:   Either an integer (representing an internal database ID),
                                or a string (representing the value of the "uri" field)
        :return:            True if the specified Data Node exists, or False otherwise
        
nameargumentsreturns
fetch_data_nodecls, uri = None, internal_id = None, labels=None, properties=NoneUnion[dict, None]
        Return a dictionary with all the key/value pairs of the attributes of the given data node

        See also locate_node()

        :param uri:         The "uri" field to uniquely identify the data node
        :param internal_id: OPTIONAL alternate way to specify the data node;
                                if present, it takes priority
        :param labels:      OPTIONAL (generally, redundant) ways to locate the data node
        :param properties:  OPTIONAL (generally, redundant) ways to locate the data node

        :return:            A dictionary with all the key/value pairs, if found; or None if not
        
name | arguments | returns
locate_node | cls, node_id: Union[int, str], id_type=None, labels=None, dummy_node_name="n" | CypherMatch
        EXPERIMENTAL - a generalization of fetch_data_node()

        Return the "match" structure to later use to locate a node identified
        either by its internal database ID (default), or by a primary key (with optional label.)

        NOTE: No database operation is actually performed.

        :param node_id: This is understood to be the Neo4j ID, unless an id_type is specified
        :param id_type: For example, "uri";
                            if not specified, the node ID is assumed to be a Neo4j ID
        :param labels:  (OPTIONAL) Labels - a string or list/tuple of strings - for the node
        :param dummy_node_name: (OPTIONAL) A string with a name by which to refer to the node (by default, "n")

        :return:        A "CypherMatch" object
        
name | arguments | returns
class_of_data_node | cls, node_id: int, id_type=None, labels=None | str
        Return the name of the Class of the given data node: identified
        either by its internal database ID (default), or by a primary key (with optional label)

        :param node_id:     Either an internal database ID or a primary key value
        :param id_type:     OPTIONAL - name of a primary key used to identify the data node;
                                leave blank to use the internal database ID
        :param labels:      Optional string, or list/tuple of strings, with Neo4j labels
                                (DEPRECATED)

        :return:            A string with the name of the Class of the given data node
        
name | arguments | returns
data_nodes_of_class | cls, class_name :str | [int]
        Return the uri's of all the Data Nodes of the given Class

        :param class_name:  Name of a Schema Class
        :return:            A list of the uri's of all the Data Nodes of the given Class
        
name | arguments | returns
count_data_nodes_of_class | cls, class_id: Union[int, str] | [int]
        Return the count of all the Data Nodes attached to the given Class

        :param class_id:    Either an integer with the internal database ID of an existing Class node,
                                or a string with its name
        :return:            The count of all the Data Nodes attached to the given Class
        
name | arguments | returns
data_nodes_lacking_schema | cls, label :str | [dict]
        Locate and return all nodes with the given label
        that aren't associated to any Schema Class

        :param label:   A string with a graph-database label
        :return:    A list containing a single dictionary, with key 'n';
                        the value is a dict with all the properties of the located nodes
        
name | arguments | returns
allowable_props | cls, class_internal_id: int, requested_props: dict, silently_drop: bool, schema_cache=None | dict
        If any of the properties in the requested list of properties is not a declared (and thus allowed) Schema property,
        then:
            1) if silently_drop is True, drop that property from the returned pared-down list
            2) if silently_drop is False, raise an Exception

        :param class_internal_id:   The internal database ID of a Schema Class node
        :param requested_props:     A dictionary of properties one wishes to assign to a new data node, if the Schema allows
        :param silently_drop:       If True, any requested properties not allowed by the Schema are simply dropped;
                                        otherwise, an Exception is raised if any property isn't allowed
        :param schema_cache:        (OPTIONAL) "SchemaCache" object

        :return:                    A possibly pared-down version of the requested_props dictionary
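
The drop-or-raise rule described above can be sketched in plain Python.  This is only an illustration, not the NeoSchema implementation: the helper name filter_allowed_props() and the list of declared Property names are hypothetical, and no database is involved.

```python
def filter_allowed_props(declared_props, requested_props, silently_drop):
    """
    Hypothetical stand-in (NOT NeoSchema code): keep only the requested properties
    whose names are declared; undeclared ones are either dropped or reported.
    """
    disallowed = [name for name in requested_props if name not in declared_props]
    if disallowed and not silently_drop:
        raise Exception(f"Properties not declared in the Schema: {disallowed}")

    # Pared-down copy of the requested properties
    return {name: value for name, value in requested_props.items()
            if name in declared_props}


declared = ["make", "color"]        # Assumed Properties of a hypothetical "Cars" Class
requested = {"make": "Toyota", "color": "white", "year": 2015}

pared_down = filter_allowed_props(declared, requested, silently_drop=True)
print(pared_down)
```

With silently_drop=False, the same call would raise an Exception naming the undeclared "year" property.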
        
name | arguments | returns
create_data_node | cls, class_node :Union[int, str], properties = None, extra_labels = None, new_uri=None, silently_drop=False | int
        Create a new data node, of the type indicated by specified Class,
        with the given (possibly none) properties and extra label(s);
        the name of the Class is always used as a label.

        The new data node, if successfully created, will optionally be assigned
        a passed URI value, or a unique auto-gen value, for its field uri.

        If the requested Class doesn't exist, an Exception is raised

        If the data node needs to be created with links to other existing data nodes,
        use add_data_node_with_links() instead

        Note: the responsibility for picking a URI belongs to the calling function
              (which will typically make use of a namespace)

        Note: if creating multiple data nodes at once, one might use import_pandas_nodes()

        :param class_node:  Either an integer with the internal database ID of an existing Class node,
                                or a string with its name
        :param properties:  (OPTIONAL) Dictionary with the properties of the new data node.
                                EXAMPLE: {"make": "Toyota", "color": "white"}
        :param extra_labels:(OPTIONAL) String, or list/tuple of strings, with label(s) to assign to the new data node,
                                IN ADDITION TO the Class name (which is always used as label)
        :param new_uri:     If new_uri is provided, then a field called "uri"
                                is set to that value;
                                also, an extra attribute named "schema_code" gets set

        :param silently_drop: If True, any requested properties not allowed by the Schema are simply dropped;
                                otherwise, an Exception is raised if any property isn't allowed
                                Note: only applicable for "Strict" schema - with a "Lenient" schema anything goes

        :return:            The internal database ID of the new data node just created
        
name | arguments | returns
import_pandas_nodes | cls, df :pd.DataFrame, class_node :Union[int, str], datetime_cols=None, int_cols=None, extra_labels=None, schema_code=None, report_frequency=100 | [int]
        Import a group of entities, from the rows of a Pandas dataframe, as data nodes in the database.

        NaN's and empty strings are dropped - and never make it into the database

        :param df:          A Pandas Data Frame with the data to import;
                                each row represents a record - to be turned into a graph-database node.
                                Each column represents a Property of the data node, and it must have been
                                previously declared in the Schema
        :param class_node:  Either an integer with the internal database ID of an existing Class node,
                                or a string with its name
        :param datetime_cols: (OPTIONAL) String, or list/tuple of strings, of column name(s)
                                that contain datetime strings such as '2015-08-15 01:02:03'
                                (compatible with the python "datetime" format)
        :param int_cols:    (OPTIONAL) String, or list/tuple of strings, of column name(s)
                                that contain integers, or that are to be converted to integers
                                (typically necessary because numeric Pandas columns with NaN's
                                 are automatically turned into floats)
        :param extra_labels:(OPTIONAL) String, or list/tuple of strings, with label(s) to assign to the new data node,
                                IN ADDITION TO the Class name (which is always used as label)
        :param schema_code: (OPTIONAL) Legacy element, deprecated.  Extra string to add as value
                                to a "schema_code" property for each new data node created
        :param report_frequency: (OPTIONAL) How often to print out the status of the import-in-progress

        :return:            A list of the internal database ID's of the newly-created data nodes
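
As a rough illustration of the clean-up described above (NaN's and empty strings dropped, selected columns turned into integers), here is a sketch that works on plain dictionaries, one per dataframe row.  The helper clean_record() is hypothetical and is not part of NeoSchema; it also assumes that int_cols is passed as a list or tuple.

```python
import math


def clean_record(record, int_cols=()):
    """
    Hypothetical helper (NOT NeoSchema code): drop NaN's and empty strings
    from one row of data, and convert the values of the listed columns to integers.
    """
    cleaned = {}
    for name, value in record.items():
        if isinstance(value, float) and math.isnan(value):
            continue            # NaN: never makes it into the database
        if value == "":
            continue            # Empty string: likewise dropped
        if name in int_cols:
            value = int(value)  # A float such as 4.0 becomes the integer 4
        cleaned[name] = value
    return cleaned


# Two rows, as they might look after a numeric column with a missing value was read in as floats
rows = [{"name": "Julian", "age": 4.0, "city": ""},
        {"name": "Val", "age": float("nan"), "city": "Berkeley"}]

cleaned_rows = [clean_record(r, int_cols=["age"]) for r in rows]
print(cleaned_rows)
```

The second row ends up with no "age" entry at all, which mirrors the statement that missing values are never stored.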
        
name | arguments | returns
import_pandas_links | cls, df :pd.DataFrame, col_from :str, col_to :str, link_name :str, col_link_props=None, name_map=None, skip_errors = False, report_frequency=100 | [int]
        Import a group of relationships between existing database Data Nodes,
        from the rows of a Pandas dataframe, as database links between the existing Data Nodes.

        :param df:          A Pandas Data Frame with the data RELATIONSHIP to import
        :param col_from:    Name of the Data Frame column identifying the data nodes from which the relationship starts
                                (the values are expected to be foreign keys)
        :param col_to:      Name of the Data Frame column identifying the data nodes at which the relationship ends
                                (the values are expected to be foreign keys)
        :param link_name:   Name of the new relationship being created
        :param col_link_props: (OPTIONAL) Name of a property to assign to the relationships,
                                as well as name of the Data Frame column containing the values.
                                Any NaN values are ignored (no property set on that relationship.)
        :param name_map:    (OPTIONAL) Dict with mapping from Pandas column names
                                to Property names in the data nodes in the database
        :param skip_errors: (OPTIONAL) If True, the import continues even in the presence of errors;
                                default is False
        :param report_frequency: (OPTIONAL) How often to print out the status of the import-in-progress

        :return:            A list of the internal database ID's of the created links
        
name | arguments | returns
add_data_node_merge | cls, class_name :str, properties :dict | (int, bool)
        A new Data Node gets created only if
        there's no other Data Node with the same properties,
        and attached to the given Class.

        An Exception is raised if any of the requested properties is not registered with the given Schema Class,
        or if that Class doesn't accept Data Nodes.

        :param class_name:  The Class node for the Data Node to locate, or create if not found
        :param properties:  A dictionary with the properties to look up the Data Node by,
                                or to give to a new one if an existing one wasn't found.
                                EXAMPLE: {"make": "Toyota", "color": "white"}

        :return:            A pair with:
                                1) The internal database ID of either an existing Data Node or of a new one just created
                                2) True if a new Data Node was created, or False if not (i.e. an existing one was found)
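
The "locate, or create if not found" behavior and its (ID, flag) return pair can be pictured with an in-memory stand-in.  Everything here is hypothetical: merge_data_node() is not a NeoSchema function, a Python list replaces the database, and "same properties" is assumed to mean exact equality of the property dictionaries.

```python
def merge_data_node(store, class_name, properties):
    """
    Hypothetical in-memory stand-in (NOT NeoSchema code).
    store is a list of (class_name, properties) pairs;
    the position in the list plays the role of the internal database ID.
    """
    for node_id, (node_class, node_props) in enumerate(store):
        if node_class == class_name and node_props == properties:
            return (node_id, False)     # An existing node was found: nothing is created

    store.append((class_name, dict(properties)))
    return (len(store) - 1, True)       # A new node was created


store = []
first = merge_data_node(store, "Cars", {"make": "Toyota", "color": "white"})
second = merge_data_node(store, "Cars", {"make": "Toyota", "color": "white"})
third = merge_data_node(store, "Cars", {"make": "Fiat", "color": "red"})
print(first, second, third)
```

The second call returns the same ID as the first one, with the flag set to False, because an identical node already exists.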
        
name | arguments | returns
add_data_column_merge | cls, class_name :str, property_name: str, value_list: list | dict
        Add a data column (i.e. a set of single-property data nodes).
        Individual nodes are created only if there's no other data node with the same property/value

        :param class_name:      The Class node for the Data Node to locate, or create if not found
        :param property_name:   The name of the data column (i.e. the name of the data field)
        :param value_list:      A list of values that make up the data column
        :return:                A dictionary with 2 keys - "new_nodes" and "old_nodes";
                                    their values are the respective numbers of nodes (created vs. found)
        
name | arguments | returns
add_data_node_with_links | cls, class_name = None, class_internal_id = None, properties = None, labels = None, links = None, assign_uri=False, new_uri=None | int
        This is NeoSchema's counterpart of NeoAccess.create_node_with_links()

        Add a new data node, of the Class specified by its name,
        with the given (possibly none) attributes and label(s),
        optionally linked to other, already existing, DATA nodes.

        If the specified Class doesn't exist, or doesn't allow for Data Nodes, an Exception is raised.

        The new data node, if successfully created:
            1) will be given the Class name as a label, unless labels are specified
            2) will optionally be assigned a unique "uri" value
               that is either automatically assigned or passed.

        EXAMPLES:   add_data_node_with_links(class_name="Cars",
                                              properties={"make": "Toyota", "color": "white"},
                                              links=[{"internal_id": 123, "rel_name": "OWNED_BY", "rel_dir": "IN"}])

        :param class_name:  The name of the Class that this new data node is an instance of.
                                Also used to set a label on the new node, if labels isn't specified
        :param class_internal_id: OPTIONAL alternative to class_name.  If both specified,
                                class_internal_id prevails
        :param properties:  An optional dictionary with the properties of the new data node.
                                EXAMPLE: {"make": "Toyota", "color": "white"}
        :param labels:      OPTIONAL string, or list of strings, with label(s) to assign to the new data node;
                                if not specified, use the Class name.  
        :param links:       OPTIONAL list of dicts identifying existing nodes,
                                and specifying the name, direction and optional properties
                                to give to the links connecting to them;
                                use None, or an empty list, to indicate that there aren't any.
                                Each dict contains the following keys:
                                    "internal_id"   REQUIRED - to identify an existing node
                                    "rel_name"      REQUIRED - the name to give to the link
                                    "rel_dir"       OPTIONAL (default "OUT") - either "IN" or "OUT" from the new node
                                    "rel_attrs"     OPTIONAL - A dictionary of relationship attributes

        :param assign_uri:  If True, the new node is given an extra attribute named "uri",
                                    with a unique auto-increment value, as well as an extra attribute named "schema_code".
                                    Default is False

        :param new_uri:     Normally, the uri is auto-generated, but it can also be provided (Note: MUST be unique)
                                    If new_uri is provided, then assign_uri is automatically made True

        :return:                If successful, an integer with the internal database ID of the node just created;
                                    otherwise, an Exception is raised
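
To make the shape of the links argument concrete, the sketch below checks the REQUIRED keys and fills in the documented default for "rel_dir".  The helper normalize_links() is hypothetical (it is not part of NeoSchema), and storing None for a missing "rel_attrs" is an assumption made only for this illustration.

```python
def normalize_links(links):
    """
    Hypothetical helper (NOT NeoSchema code): validate a list of link dictionaries
    and fill in the default direction "OUT" where "rel_dir" is missing.
    """
    if not links:           # None, or an empty list, both mean "no links"
        return []

    normalized = []
    for link in links:
        for key in ("internal_id", "rel_name"):
            if key not in link:
                raise Exception(f"Missing required key `{key}` in link: {link}")

        rel_dir = link.get("rel_dir", "OUT")
        if rel_dir not in ("IN", "OUT"):
            raise Exception(f"rel_dir must be either 'IN' or 'OUT', not {rel_dir!r}")

        normalized.append({"internal_id": link["internal_id"],
                           "rel_name": link["rel_name"],
                           "rel_dir": rel_dir,
                           "rel_attrs": link.get("rel_attrs")})     # None if absent (assumption)
    return normalized


links = [{"internal_id": 123, "rel_name": "OWNED_BY", "rel_dir": "IN"},
         {"internal_id": 456, "rel_name": "PARKED_AT"}]     # IDs and names are made up

normalized = normalize_links(links)
print(normalized)
```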
        
name | arguments | returns
update_data_node | cls, data_node :Union[int, str], set_dict :dict, drop_blanks = True | int
        Update, possibly adding and/or dropping fields, the properties of an existing Data Node

        :param data_node:   Either an integer with the internal database ID, or a string with a URI value
        :param set_dict:    A dictionary of field name/values to create/update the node's attributes
                                (note: blanks ARE allowed within the keys)
        :param drop_blanks: If True, then any blank field is interpreted as a request to drop that property
                                (as opposed to setting its value to "")
        :return:            The number of properties set or removed;
                                if the record wasn't found, or an empty set_dict was passed, return 0
                                Important: a property is counted as "set" even if the new value is
                                           identical to the old value!
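
The effect of drop_blanks can be sketched by splitting set_dict into the fields to set and the fields to remove.  The helper split_updates() is hypothetical (not NeoSchema code), and it assumes that "blank" means an empty or whitespace-only string; the actual criterion may differ.

```python
def split_updates(set_dict, drop_blanks=True):
    """
    Hypothetical helper (NOT NeoSchema code): separate the fields to set
    from the fields to drop.  A "blank" is ASSUMED to be an empty
    or whitespace-only string.
    """
    to_set = {}
    to_remove = []
    for name, value in set_dict.items():
        if drop_blanks and isinstance(value, str) and value.strip() == "":
            to_remove.append(name)      # Interpreted as a request to drop the property
        else:
            to_set[name] = value        # With drop_blanks=False, "" is stored as a value
    return to_set, to_remove


to_set, to_remove = split_updates({"color": "red", "notes": "", "year": 2015})
print(to_set, to_remove)

# Number of properties set or removed, in the spirit of the return value described above
n_changed = len(to_set) + len(to_remove)
```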
        
name | arguments | returns
delete_data_node | cls, node_id=None, uri=None, class_node=None, labels=None | None
        Delete the given data node.
        If no node gets deleted, or if more than 1 get deleted, an Exception is raised

        :param node_id:     An integer with the internal database ID of an existing data node
        :param uri:         NOT IN CURRENT USE.  An alternate way to refer to the node.
        :param class_node:  NOT IN CURRENT USE.  Specify the Class to which this node belongs
        :param labels:      (OPTIONAL) String or list of strings.
                                If passed, each label must be present in the node, for a match to occur
                                (no problem if the node also includes other labels not listed here.)
                                Generally, redundant, as a precaution against deleting wrong node
        :return:            None
        
name | arguments | returns
delete_data_point | cls, uri: str, labels=None | int
        Delete the given data point.  DEPRECATED in favor of delete_data_node()

        :param uri:
        :param labels:      OPTIONAL (generally, redundant)
        :return:            The number of nodes deleted (possibly zero)
        
name | arguments | returns
register_existing_data_node | cls, class_name="", schema_uri=None, existing_neo_id=None, new_uri=None | int
        Register (declare to the Schema) an existing data node with the Schema Class specified by its name or ID.
        A uri is generated for the data node and stored on it; likewise for a schema_code (if applicable).
        Return the newly-assigned uri

        EXAMPLES:   register_existing_data_node(class_name="Chemicals", existing_neo_id=123)
                    register_existing_data_node(schema_uri="schema-19", existing_neo_id=456)


        :param class_name:      The name of the Class that this new data node is an instance of
        :param schema_uri:      Alternate way to specify the Class; if both present, class_name prevails

        :param existing_neo_id: Internal ID to identify the node to register with the above Class.
        :param new_uri:         OPTIONAL. Normally, the uri is auto-generated,
                                    but it can also be provided (Note: MUST be unique)

        :return:                If successful, an integer with the auto-increment "uri" value of the node just created;
                                otherwise, an Exception is raised
        
name | arguments | returns
add_data_relationship_hub | cls, center_id :int, periphery_ids :[int], periphery_class :str, rel_name :str, rel_dir = "OUT" | int
        Add a group of relationships between a single Data Node ("center")
        and each of the Data Nodes in the given list ("periphery"),
        with the specified relationship name and direction.

        All Data Nodes must already exist.
        All the "periphery" Data Nodes must belong to the same Class
            (whose name is passed by periphery_class)

        :param center_id:       Internal database ID of an existing Data Node
                                    that we wish to connect
                                    to all other Data Nodes specified in the next argument
        :param periphery_ids:   List of internal database IDs of existing Data Nodes,
                                    all belonging to the Class passed by the next argument
        :param periphery_class: The name of the common Class to which all the Data Nodes
                                    specified in periphery_ids belong
        :param rel_name:        A string with the name to give to all the newly-created relationships
        :param rel_dir:         Either "IN" (towards the "center" node)
                                    or "OUT" (away from it, towards the "periphery" nodes)

        :return:                The number of relationships created
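
The meaning of rel_dir in a "hub" can be shown by listing the (from, to) pairs of the relationships that would be requested.  This is a plain-Python sketch with a hypothetical helper, hub_links(); it does not touch the database and does not check the Schema.

```python
def hub_links(center_id, periphery_ids, rel_dir="OUT"):
    """
    Hypothetical helper (NOT NeoSchema code): return the (from_id, to_id) pairs
    for the relationships between a "center" node and its "periphery" nodes.
    """
    if rel_dir == "OUT":    # Away from the center, towards the periphery
        return [(center_id, p) for p in periphery_ids]
    if rel_dir == "IN":     # Towards the center
        return [(p, center_id) for p in periphery_ids]
    raise Exception(f"rel_dir must be either 'IN' or 'OUT', not {rel_dir!r}")


outbound = hub_links(10, [21, 22, 23])              # The IDs are made up
inbound = hub_links(10, [21, 22, 23], rel_dir="IN")
print(outbound)
print(inbound)
```

In both cases, 3 relationships are involved, matching the count that the function is described as returning.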
        
name | arguments | returns
add_data_relationship | cls, from_id :int, to_id :int, rel_name :str, rel_props = None | None
        Simpler (and possibly faster) version of add_data_relationship_OLD()

        Add a new relationship with the given name, from one to the other of the 2 given data nodes,
        identified by their Neo4j ID's.

        The requested new relationship MUST be present in the Schema, or an Exception will be raised.

        Note that if a relationship with the same name already exists between the data nodes,
        nothing gets created (and an Exception is raised)

        :param from_id: The internal database ID of the data node at which the new relationship is to originate
        :param to_id:   The internal database ID of the data node at which the new relationship is to end
        :param rel_name:    The name to give to the new relationship between the 2 specified data nodes
                                IMPORTANT: it MUST match an existing relationship in the Schema,
                                           between the respective Classes of the 2 data nodes
        :param rel_props:   NOT YET IMPLEMENTED

        :return:            None.  If the specified relationship didn't get created (for example,
                                in case the new relationship doesn't exist in the Schema), raise an Exception
        
name | arguments | returns
remove_data_relationship | cls, from_uri :str, to_uri :str, rel_name :str, labels=None | None
        Drop the relationship with the given name, from one to the other of the 2 given DATA nodes.
        Note: the data nodes are left untouched.
        If the specified relationship didn't get deleted, raise an Exception

        :param from_uri:    String with the "uri" value of the data node at which the relationship originates
        :param to_uri:      String with the "uri" value of the data node at which the relationship ends
        :param rel_name:    The name of the relationship to delete
        :param labels:      OPTIONAL (generally, redundant).  Labels required to be on both nodes

        :return:            None.  If the specified relationship didn't get deleted, raise an Exception
        
name | arguments | returns
remove_multiple_data_relationships | cls, node_id: Union[int, str], rel_name: str, rel_dir: str, labels=None | None
        Drop all the relationships with the given name, from or to the given data node.
        Note: the data node is left untouched.

        IMPORTANT: this function cannot be used to remove relationships involving any Schema node

        :param node_id:     The internal database ID (integer) or name (string) of the data node of interest
        :param rel_name:    The name of the relationship(s) to delete
        :param rel_dir:     Either 'IN', 'OUT', or 'BOTH'
        :param labels:      [OPTIONAL]
        :return:            None
        

DATA IMPORT

name | arguments | returns
import_json_data | cls, json_str: str, class_name: str, parse_only=False, provenance=None | Union[None, int, List[int]]
        Import the data specified by a JSON string into the database -
        but only the data that is described in the existing Schema;
        anything else is silently ignored.

        CAUTION: A "postorder" approach is followed: create subtrees first (with recursive calls), then create the root last;
        as a consequence, in case of failure mid-import, there's no top root, and there could be several fragments.
        A partial import might need to be manually deleted.

        :param json_str:    A JSON string representing (at the top level) an object or a list to import
        :param class_name:  Name of Schema class to use for the top-level element(s)
        :param parse_only:  Flag indicating whether to stop after the parsing (i.e. no database import)
        :param provenance:  Metadata (such as a file name) to store in the "source" attribute
                                of a special extra node ("Import Data")

        :return:
        
name | arguments | returns
create_data_nodes_from_python_data | cls, data, class_name: str, provenance=None | [int]
        Import the data specified by the "data" python structure into the database -
        but only the data that is described in the existing Schema;
        anything else is silently ignored.
        For additional notes, see import_json_data()

        :param data:        A python dictionary or list, with the data to import
        :param class_name:  The name of the Schema Class for the root node(s) of the imported data
        :param provenance:  Optional string to be stored in a "source" attribute
                                in a special "Import Data" node for metadata about the import

        :return:            List (possibly empty) of Neo4j ID's of the root node(s) created
        
name | arguments | returns
create_tree_from_dict | cls, d: dict, class_name: str, level=1, cache=None | Union[int, None]
        Add a new data node (which may turn into a tree root) of the specified Class,
        with data from the given dictionary:
            1) literal values in the dictionary are stored as attributes of the node, using the keys as names
            2) other values (such as dictionaries or lists) are recursively turned into subtrees,
               linked from the new data node through outbound relationships using the dictionary keys as names

        Return the Neo4j ID of the newly created root node,
        or None if nothing is created (this typically arises in recursive calls that "skip subtrees")

        IMPORTANT:  any part of the data that doesn't match the Schema,
                    gets silently dropped. 

        EXAMPLES:
        (1) {"state": "California", "city": "Berkeley"}
            results in the creation of a new node, with 2 attributes, named "state" and "city"

        (2) {"name": "Julian", "address": {"state": "California", "city": "Berkeley"}}
            results in the creation of 2 nodes, namely the tree root (with a single attribute "name"), with
            an outbound link named "address" to another node (the subtree) that has the "state" and "city" attributes

        (3) {"headquarter_state": [{"state": "CA"}, {"state": "NY"}, {"state": "FL"}]}
            results in the creation of a node (the tree root), with no attributes, and 3 links named "headquarter_state" to,
            respectively, 3 nodes - each of which contains a "state" attribute

        (4) {"headquarter_state": ["CA", "NY", "FL"]}
            similar to (3), above, but the children nodes will use the default attribute name "value"

        :param d:           A dictionary with data from which to create a tree in the database
        :param class_name:  The name of the Schema Class for the root node(s) of the imported data
        :param level:       The level of the recursive call (used for debug printing)
        :return:            The Neo4j ID of the newly created node,
                                or None if nothing is created (this typically arises in recursive calls that "skip subtrees")
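
The recursive, "postorder" construction described above can be followed on a small in-memory model.  This is only a sketch under simplifying assumptions: tree_from_dict() is a hypothetical stand-in (not the NeoSchema code), nodes and links are kept in Python lists instead of the database, and the Schema check (which silently drops non-matching data) is left out entirely.

```python
def tree_from_dict(d, nodes, links):
    """
    Hypothetical in-memory stand-in (NOT NeoSchema code; no Schema filtering).
    nodes is a list of attribute dictionaries (the list index acts as the node ID);
    links is a list of (from_id, rel_name, to_id) triplets.
    Return the ID of the newly created root node.
    """
    attributes = {}
    children = []       # Pairs (rel_name, child_id)
    for key, value in d.items():
        if isinstance(value, dict):
            children.append((key, tree_from_dict(value, nodes, links)))
        elif isinstance(value, list):
            for element in value:
                if not isinstance(element, dict):
                    element = {"value": element}    # Default attribute name for literals
                children.append((key, tree_from_dict(element, nodes, links)))
        else:
            attributes[key] = value     # Literal values become attributes of the node

    nodes.append(attributes)            # Postorder: the root is created AFTER its subtrees
    root_id = len(nodes) - 1
    for rel_name, child_id in children:
        links.append((root_id, rel_name, child_id))     # Outbound links from the root
    return root_id


nodes, links = [], []
root = tree_from_dict({"name": "Julian",
                       "address": {"state": "California", "city": "Berkeley"}},
                      nodes, links)
print(root, nodes, links)
```

This corresponds to EXAMPLE (2): the subtree node with "state" and "city" is created first, then the root with the single attribute "name", and finally the outbound link named "address".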
        
name | arguments | returns
create_trees_from_list | cls, l: list, class_name: str, level=1, cache=None | [int]
        Add a set of new data nodes (the roots of the trees), all of the specified Class,
        with data from the given list.
        Each list element MUST be a literal, a dictionary, or a list:
            - if a literal, it first gets turned into a dictionary of the form {"value": literal_element};
            - if a dictionary, it gets processed by create_tree_from_dict()
            - if a list, it generates a recursive call

        Return a list of the Neo4j ID's of the newly created nodes.

        IMPORTANT:  any part of the data that doesn't match the Schema,
                    gets silently dropped.

        EXAMPLE:
            If the Class is named "address" and has 2 properties, "state" and "city",
            then the data:
                    [{"state": "California", "city": "Berkeley"},
                     {"state": "Texas", "city": "Dallas"}]
            will give rise to 2 new data nodes with label "address", and each of them having a "SCHEMA"
            link to the shared Class node.

        :param l:           A list of data from which to create a set of trees in the database
        :param class_name:  The name of the Schema Class for the root node(s) of the imported data
        :param level:       The level of the recursive call (used for debug printing)

        :return:            A list of the Neo4j ID's of the newly created nodes (each of which
                                might be a root of a tree)
        

EXPORT SCHEMA

name | arguments | returns
export_schema | cls | {}
        Export all the Schema nodes and relationships as a JSON string.

        IMPORTANT:  APOC must be activated in the database, to use this function.
                    Otherwise it'll raise an Exception

        :return:    A dictionary specifying the number of nodes exported,
                    the number of relationships, and the number of properties,
                    as well as a "data" field with the actual export as a JSON string
        

URI

name | arguments | returns
is_valid_uri | cls, uri :str | bool
        Check the validity of the passed uri.
        If the uri belongs to a Schema node, a tighter check can be performed with is_valid_schema_uri()

        :param uri: A string with a value that is expected to be a uri of a node
        :return:    True if the passed uri has a valid value, or False otherwise
        
name | arguments | returns
advance_autoincrement | cls, namespace :str, advance=1 | int
        Utilize an ATOMIC database operation to both read AND advance the autoincrement counter,
        based on a (single) node that contains the label `Schema Autoincrement`
        as well as an attribute indicating the desired namespace (group);
        if no such node exists (for example, after a new installation), it gets created, and 1 is returned.

        Note that the returned number (or the last of an implied sequence of numbers, if advance > 1)
        is de-facto "permanently reserved" on behalf of the calling function,
        and can't be used by any other competing thread, thus avoiding concurrency problems (race conditions)

        :param namespace:   A string used to maintain completely separate groups of auto-increment values;
                                leading/trailing blanks are ignored
        :param advance:     Normally, auto-increment advances by 1 unit, but a different positive integer
                                may be used to "reserve" a group of numbers in the above namespace

        :return:            An integer that is a unique auto-increment for the specified namespace
                                (starting with 1); it's ready-to-use and "reserved", i.e. could be used
                                at any future time
        
name | arguments | returns
reserve_next_uri | cls, namespace="data_node", prefix="", suffix="" | str
        Generate and reserve a URI (or fragment thereof, aka "token"),
        using the given prefix and/or suffix;
        the middle part is a unique auto-increment value
        (separately maintained in various groups, or "namespaces")

        EXAMPLES:   reserve_next_uri("Documents", "doc.", ".new") might produce "doc.3.new"
                    reserve_next_uri("Images", prefix="i-") might produce "i-123"

        Prefixes and suffixes only need to be passed when first using a new namespace;
        if they're passed in later calls, they over-ride their stored counterparts.

        An ATOMIC database operation is utilized to both read AND advance the autoincrement counter,
        based on a (single) node with label `Schema Autoincrement`
        as well as an attribute indicating the desired namespace (group);
        if no such node exists (for example, after a new installation), it gets created,
        and 1 is used as the reserved autoincrement count.

        Note that the returned uri is de-facto "permanently reserved" on behalf of the calling function,
        and can't be used by any other competing thread, thus avoiding concurrency problems (race conditions)

        :param namespace:   A string used to maintain completely separate groups of auto-increment values;
                                leading/trailing blanks are ignored
        :param prefix:      (OPTIONAL) String to prefix to the auto-increment number.
                                If it's the 1st call for the given namespace, store it in the database;
                                otherwise, if a value is passed, use it to over-ride the stored one
        :param suffix:      (OPTIONAL) String to suffix to the auto-increment number
                                If it's the 1st call for the given namespace, store it in the database;
                                otherwise, if a value is passed, use it to over-ride the stored one

        :return:            A string with the newly reserved uri (or token): the prefix and suffix, if any,
                                around an auto-increment number unique within the specified namespace
                                (starting with 1); it's ready-to-use and "reserved", i.e. could be used
                                at any future time
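
The interplay of namespaces, stored prefixes/suffixes and the auto-increment counter can be sketched with an in-memory stand-in.  The function reserve_uri_sketch() below is hypothetical and is not the NeoSchema implementation: a Python dictionary replaces the `Schema Autoincrement` nodes, a lock plays the role of the atomic database operation, and it is assumed that a prefix or suffix passed in a later call also replaces the stored one.

```python
import threading

_lock = threading.Lock()
_namespaces = {}    # namespace -> {"next": int, "prefix": str, "suffix": str}


def reserve_uri_sketch(namespace="data_node", prefix="", suffix=""):
    """
    Hypothetical in-memory stand-in (NOT NeoSchema code):
    read AND advance a per-namespace counter in a single locked step.
    """
    namespace = namespace.strip()       # Leading/trailing blanks are ignored
    with _lock:                         # Stands in for the atomic database operation
        entry = _namespaces.setdefault(namespace,
                                       {"next": 1, "prefix": prefix, "suffix": suffix})
        if prefix:
            entry["prefix"] = prefix    # A passed value over-rides the stored one
        if suffix:
            entry["suffix"] = suffix
        number = entry["next"]
        entry["next"] += 1              # The number just read is now reserved
        return f"{entry['prefix']}{number}{entry['suffix']}"


first = reserve_uri_sketch("Documents", "doc.", ".new")
second = reserve_uri_sketch("Documents")            # Stored prefix and suffix are re-used
image = reserve_uri_sketch(" Images ", prefix="i-")
print(first, second, image)
```

Each namespace keeps its own count, so the first "Images" uri starts again from 1 regardless of how many "Documents" uri's were reserved.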
        

UTILITIES

name | arguments | returns
debug_print | cls, info: str, trim=False | None
        If the class' property "debug" is set to True,
        print out the passed info string,
        optionally trimming it, if too long

        :param info:
        :param trim:    (OPTIONAL) Flag indicating whether to only print a shortened version
        :return:        None