NeoSchema Reference Guide

This guide is for version 5.0.0-beta.48+


Source code

Background Information: Using Schema in Graph Databases such as Neo4j

User Guide

Tutorial 1 : basic Schema operations (Classes, Properties, Data Nodes)

Tutorial 2 : set up a simple Schema (Classes, Properties) and perform a data import (Data Nodes and relationships among them)


Class NeoSchema


    A layer above the class NeoAccess (or, in principle, another library providing a compatible interface),
    to provide an optional schema to the underlying database.

    Schemas may be used to either:
        1) acknowledge the existence of typical patterns in the data
        OR
        2) enforce a mold for the data to conform to

    MOTIVATION

        Relational databases are suffocatingly strict for the real world.
        Neo4j by itself may be too anarchic.
        A schema (whether "lenient/lax/loose" or "strict") in conjunction with Neo4j may be the needed compromise.

    GOALS

        - Data integrity
        - Data filtering upon import
        - Assist the User Interface
        - Self-documentation of the database
        - Graft onto graph databases some of the semantic functionality that some people turn to RDF for,
            while carving out a new path rather than attempting to emulate RDF.



    OVERVIEW

        - "Class" nodes capture the abstraction of entities that share similarities.
          Example: "car", "star", "protein", "patient"

          In RDFS lingo, a "Class" node is the counterpart of a resource (entity)
                whose "rdf:type" property has the value "rdfs:Class"

        - The "Property" nodes linked to a given "Class" node, represent the attributes of the data nodes of that class

        - Data nodes are linked to their respective classes by a "SCHEMA" relationship.

        - Some classes contain an attribute named "code" that identifies the UI code to display/edit them [this might change!],
          as well as their descendants under the "INSTANCE_OF" relationships.
          Conceptually, the "code" is a relationship to an entity consisting of software code.

        - A Class can be of the "S" (Strict) or "L" (Lenient) type.
            A "lenient" Class will accept data nodes with any properties, whether declared in the Class Schema or not;
            by contrast, a "strict" Class will reject data nodes that contain properties not declared in the Schema




    IMPLEMENTATION DETAILS

        - Every node used by this class, as well as the data nodes it manages,
          has a unique attribute "uri" (formerly "schema_id" and "item_id", respectively);
          note that this is actually a "token", i.e. a part of a URI - not a full URI.
          The uri's of schema nodes have the form "schema-n", where n is a unique number.
          Data nodes can have any unique uri's, with optional prefixes and suffixes chosen by the higher layers.
          The Schema layer manages the auto-increments for any desired set of namespaces (and itself makes use
          of the "schema_node" namespace)

        - The names of the Classes and Properties are stored in node attributes called "name".
          We avoid calling them "label", as done in RDFS, because in Labeled Graph Databases
          like Neo4j, the term "label" has a very specific meaning, and is pervasively used.

        - For convenience, data nodes contain a label equal to their Class name


    AUTHOR:
        Julian West



    ----------------------------------------------------------------------------------
    MIT License

        Copyright (c) 2021-2024 Julian A. West and the BrainAnnex.org project

        This file is part of the "Brain Annex" project (https://BrainAnnex.org)

        Permission is hereby granted, free of charge, to any person obtaining a copy
        of this software and associated documentation files (the "Software"), to deal
        in the Software without restriction, including without limitation the rights
        to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
        copies of the Software, and to permit persons to whom the Software is
        furnished to do so, subject to the following conditions:

        The above copyright notice and this permission notice shall be included in all
        copies or substantial portions of the Software.

        THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
        IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
        FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
        AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
        LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
        OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
        SOFTWARE.
    ----------------------------------------------------------------------------------
    
name | arguments | returns
set_database | db :NeoAccess | None
        IMPORTANT: this method MUST be called before using this class!!

        :param db:  Database-interface object, created with the NeoAccess library
        :return:    None
        

Schema CLASSES

name | arguments | returns
assert_valid_class_name | class_name: str | None
        Raise an Exception if the passed argument is not a valid Class name

        :param class_name:  A string with the putative name of a Schema Class
        :return:            None
        
name | arguments | returns
is_valid_class_name | class_name: str | bool
        Return True if the passed argument is a valid Class name, or False otherwise

        :param class_name:  A string with the putative name of a Schema Class
        :return:            True if the name is valid, or False otherwise
        
name | arguments | returns
assert_valid_class_identifier | class_node :Union[int, str] | None
        Raise an Exception if the argument is not a valid "identifier" for a Class node,
        meaning either a valid name or a valid internal database ID

        :param class_node:  Either an integer with the internal database ID of an existing Class node,
                                or a string with its name
        :return:            None (an Exception is raised if the validation fails)
        
name | arguments | returns
create_class | name :str, code = None, strict = False, no_datanodes = False | (int, str)
        Create a new Class node with the given name and type of schema,
        provided that the name isn't already in use for another Class.

        Return a pair with internal database ID,
        and the auto-incremented uri, assigned to the new Class.
        Raise an Exception if a class by that name already exists.

        NOTE: if you want to add Properties at the same time that you create a new Class,
              use the function create_class_with_properties() instead.

        :param name:        Name to give to the new Class
        :param code:        Optional string indicative of the software handler for this Class and its subclasses
        :param strict:      If True, the Class will be of the "S" (Strict) type;
                                otherwise, it'll be of the "L" (Lenient) type
                            Explained under the comments for the NeoSchema class

        :param no_datanodes: If True, this Class does not allow data nodes to have a "SCHEMA" relationship to it;
                                typically used by Classes having an intermediate role in the context of other Classes

        :return:            An (int, str) pair with the internal database ID and the unique uri
                                assigned to the node just created, if it was created;
                                an Exception is raised if a class by that name already exists
        
name | arguments | returns
get_class_internal_id | class_name :str | int
        Returns the internal database ID of the Class node with the given name,
        or raise an Exception if not found, or if more than one is found.
        Note: unique Class names are assumed.

        :param class_name:  The name of the desired class
        :return:            The internal database ID of the specified Class
        
name | arguments | returns
get_class_uri | class_name :str | str
        Returns the Schema uri of the Class with the given name;
        raise an Exception if not found

        :param class_name:  The name of the desired class
        :return:            The Schema uri of the specified Class
        
name | arguments | returns
get_class_uri_by_internal_id | internal_class_id: int | int
        Returns the Schema uri of the Class with the given internal database ID.

        :param internal_class_id:   An integer with the internal database ID of the desired Class
        :return:            The Schema uri of the specified Class; raise an Exception if not found
        
name | arguments | returns
class_neo_id_exists | neo_id: int | bool
        Return True if a Class by the given internal database ID already exists, or False otherwise

        :param neo_id:  Integer with internal database ID
        :return:        A boolean indicating whether the specified Class exists
        
name | arguments | returns
class_uri_exists | schema_uri :str | bool
        Return True if a Class by the given uri already exists, or False otherwise

        :param schema_uri:  The uri of the Class node of interest
        :return:            True if the Class already exists, or False otherwise
        
name | arguments | returns
class_name_exists | class_name: str | bool
        Return True if a Class by the given name already exists, or False otherwise

        :param class_name:  The name of the class of interest
        :return:            True if the Class already exists, or False otherwise
        
name | arguments | returns
get_class_name_by_schema_uri | schema_uri :str | str
        Returns the name of the class with the given Schema URI;
        raise an Exception if not found

        :param schema_uri:  A string uniquely identifying the desired Class
        :return:            The name of the Class with the given Schema uri
        
name | arguments | returns
get_class_name | internal_id: int | str
        Returns the name of the class with the given internal database ID,
        or raise an Exception if not found

        :param internal_id: An integer with the internal database ID
                                of the desired class
        :return:            The name of the class with the given internal database ID;
                                raise an Exception if not found
        
name | arguments | returns
get_class_attributes | class_internal_id: int | dict
        Returns all the attributes (incl. the name) of the Class node with the given internal database ID,
        or raise an Exception if the Class is not found.
        If no "name" attribute is found, an Exception is raised.

        :param class_internal_id:   An integer with the Neo4j ID of the desired class
        :return:                    A dictionary of attributes of the class with the given internal database ID;
                                        an Exception is raised if not found
                                        EXAMPLE:  {'name': 'MY CLASS', 'uri': '123', 'strict': False}
        
name | arguments | returns
get_all_classes | only_names=True | [str]
        Fetch and return a list of all the existing Schema classes - either just their names (sorted alphabetically)
        (TODO: or a fuller listing - not yet implemented)

        TODO: disregard capitalization in sorting

        :return:    A list of all the existing Class names
        
name | arguments | returns
rename_class | old_name :str, new_name :str, rename_data_fields=True | None
        Rename the specified Class.
        If the Class is not found, an Exception is raised

        :param old_name:            The current name (to be changed) of the Class of interest
        :param new_name:            The new name to give to the above Class
        :param rename_data_fields:  If True (default), the corresponding label in the data nodes of that Class
                                        is renamed as well
        :return:                    None
        
name | arguments | returns
delete_class | name: str, safe_delete=True | None
        Delete the given Class AND all its attached Properties.
        If safe_delete is True (highly recommended), then delete ONLY if there are no data nodes of that Class
        (i.e., linked to it by way of "SCHEMA" relationships.)

        :param name:        Name of the Class to delete
        :param safe_delete: Flag indicating whether the deletion is to be restricted to
                            situations where no data node would be left "orphaned".
                            CAUTION: if safe_delete is False,
                                     then data nodes may be left without a Schema
        :return:            None.  In case of no node deletion, an Exception is raised
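The effect of the safe_delete flag can be illustrated with a toy in-memory model. This is only a sketch of the documented behavior, not the library's implementation: `ToySchema` and all of its attributes are hypothetical names, and a plain counter stands in for the data nodes linked by "SCHEMA" relationships.

```python
class ToySchema:
    """Hypothetical in-memory model: Class names, their Properties, and data-node counts."""

    def __init__(self):
        self.class_properties = {}     # Class name -> list of Property names
        self.data_node_counts = {}     # Class name -> number of data nodes linked by "SCHEMA"
        self.orphaned_data_nodes = 0   # Data nodes left without a Schema by unsafe deletions

    def add_class(self, name, properties, n_data_nodes=0):
        self.class_properties[name] = list(properties)
        self.data_node_counts[name] = n_data_nodes

    def delete_class(self, name, safe_delete=True):
        if name not in self.class_properties:
            raise Exception(f"No Class named `{name}` exists")
        if safe_delete and self.data_node_counts[name] > 0:
            raise Exception(f"Class `{name}` still has data nodes; nothing was deleted")
        # The Class and all its attached Properties are removed together
        del self.class_properties[name]
        self.orphaned_data_nodes += self.data_node_counts.pop(name)


toy = ToySchema()
toy.add_class("car", ["color", "make"], n_data_nodes=2)

try:
    toy.delete_class("car")                 # safe_delete defaults to True
    safe_delete_blocked = False
except Exception:
    safe_delete_blocked = True

toy.delete_class("car", safe_delete=False)  # Forced deletion orphans the 2 data nodes
```

The forced deletion succeeds, but the two data nodes are left without a Schema, which is the situation the CAUTION note warns about.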
        
name | arguments | returns
is_strict_class | name :str | bool
        Return True if the given Class is of "Strict" type,
        or False otherwise (or if the information is missing).
        If no Class by that name exists, an Exception is raised

        :param name:    The name of a Schema Class node
        :return:        True if the Class is "strict" or False if not (i.e., if it's "lax")
        
name | arguments | returns
is_strict_class_fast | class_internal_id: int, schema_cache=None | bool
        Return True if the given Class is of "Strict" type,
        or False otherwise (or if the information is missing)

        :param class_internal_id:   The internal ID of a Schema Class node
        :param schema_cache:        (OPTIONAL) "SchemaCache" object
        :return:                    True if the Class is "strict" or False if not (i.e., if it's "lax")
        
name | arguments | returns
allows_data_nodes | class_name = None, class_internal_id = None, schema_cache=None | bool
        Determine if the given Class allows data nodes directly linked to it

        :param class_name:      Name of the Class
        :param class_internal_id: (OPTIONAL) Alternate way to specify the class; if both specified, this one prevails
        :param schema_cache:    (OPTIONAL) "SchemaCache" object
        :return:                True if allowed, or False if not
                                    If the Class doesn't exist, raise an Exception
        

RELATIONSHIPS AMONG CLASSES

name | arguments | returns
assert_valid_relationship_name | rel_name :str | None
        Raise an Exception if the passed argument is not a valid name for a database relationship

        :param rel_name: A string with the relationship (link) name whose validity we want to check
        :return:        None
        
name | arguments | returns
create_class_relationship | from_class: Union[int, str], to_class: Union[int, str], rel_name="INSTANCE_OF", use_link_node=False, link_properties=None | None
        Create a relationship (provided that it doesn't already exist) with the specified name
        between the 2 existing Class nodes (identified by names or by their internal database IDs),
        in the ( from -> to ) direction.

        Note: multiple relationships by the same name between the same nodes are allowed by Neo4j,
              as long as the relationships differ in their attributes
              (but this method doesn't allow setting properties on the new relationship)

        :param from_class:  Either an integer with the internal database ID of an existing Class node,
                                or a string with its name.
                                Used to identify the node from which the new relationship originates.
        :param to_class:    Either an integer with the internal database ID of an existing Class node,
                                or a string with its name.
                                Used to identify the node at which the new relationship terminates.
        :param rel_name:    Name of the relationship to create, in the from -> to direction
                                (blanks allowed)
        :param use_link_node: If True, insert an intermediate "LINK" node in the newly-created
                                relationship; otherwise, simply create a direct link.
                                Note: if rel_name has the special value "INSTANCE_OF",
                                      this argument must be False
        :param link_properties: [OPTIONAL] List of Property names to attach to the newly-created link.
                                    Note: if link_properties is specified, then use_link_node is automatically True
        :return:            None
        
name | arguments | returns
rename_class_rel | from_class: int, to_class: int, new_rel_name | bool
        #### TODO: NOT IN CURRENT USE
        Rename the old relationship between the specified classes
        TODO: if more than 1 relationship exists between the given Classes,
              then they will all be replaced??  TO FIX!  (the old name ought be provided)

        :param from_class:
        :param to_class:
        :param new_rel_name:
        :return:            True if another relationship was found, and successfully renamed;
                            otherwise, False
        
name | arguments | returns
delete_class_relationship | from_class: str, to_class: str, rel_name | int
        Delete the relationship(s) with the specified name
        between the 2 existing Class nodes (identified by their respective names),
        going in the from -> to direction.
        In case of error or if no relationship was found, an Exception is raised

        Note: there might be more than one - relationships with the same name between the same nodes
              are allowed, provided that they have different properties.
              If more than one is found, they will all be deleted.
              The number of relationships deleted will be returned

        :param from_class:  Name of one existing Class node (blanks allowed in name)
        :param to_class:    Name of another existing Class node (blanks allowed in name)
        :param rel_name:    Name of the relationship(s) to delete,
                                if found in the from -> to direction (blanks allowed in name)

        :return:            The number of relationships deleted.
                            In case of error, or if no relationship was found, an Exception is raised
        
name | arguments | returns
unlink_classes | class1 :Union[int, str], class2 :Union[int, str] | int
        Remove ALL relationships (in any direction) between the specified Classes

        :param class1:  Either the integer internal database ID, or name, to identify the first Class
        :param class2:  Either the integer internal database ID, or name, to identify the second Class
        :return:        The number of relationships deleted (possibly zero)
        
name | arguments | returns
class_relationship_exists | from_class: str, to_class: str, rel_name | bool
        Return True if a relationship with the specified name exists between the two given Classes,
        in the specified direction.
        The Schema allows several scenarios:
            - A direct relationship from one Class node to the other
            - A relationship that goes thru an intermediary "LINK" node
            - Either of the 2 above scenarios, but between "ancestors" of the two nodes;
              "ancestors" are defined by means of following
              any number of "INSTANCE_OF" hops to other Class nodes

        SEE ALSO:  is_link_allowed()

        :param from_class:  Name of an existing Class node (blanks allowed in name)
        :param to_class:    Name of another existing Class node (blanks allowed in name)
        :param rel_name:    Name of the relationship to look for,
                                in the from -> to direction (blanks allowed in name)
        :return:            True if the Class relationship exists, or False otherwise
        
name | arguments | returns
get_class_instances | class_name: str, leaf_only=False | [str]
        Get the names of all Classes that are, directly or indirectly, instances of the given Class,
        i.e. pointing to that node thru a series of 1 or more "INSTANCE_OF" relationships;
        if leaf_only is True, then only as long as they are leaf nodes (with no other Class
        that is an instance of them.)

        :param class_name:  Name of the Class for which we want to find
                            other Classes that are an instance of it
        :param leaf_only:   If True, only return the leaf nodes (those that
                            don't have other Classes that are instances of them)
        :return:            A list of Class names
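The traversal described here can be sketched on a plain list of "INSTANCE_OF" links, without a database. The function name `find_class_instances` and the sample Class names are hypothetical; the sketch also sorts its output for predictability, which is an assumption and not something the documentation above promises.

```python
from collections import defaultdict, deque


def find_class_instances(instance_of_links, class_name, leaf_only=False):
    """
    instance_of_links: list of (child, parent) pairs, each standing for
                       an "INSTANCE_OF" relationship from child to parent.
    Return the names of all Classes that reach class_name in 1 or more hops.
    """
    # Map each Class to the Classes that point to it with "INSTANCE_OF"
    children = defaultdict(list)
    for child, parent in instance_of_links:
        children[parent].append(child)

    found = []
    seen = {class_name}
    queue = deque([class_name])
    while queue:                      # Breadth-first walk against the link direction
        current = queue.popleft()
        for child in children[current]:
            if child not in seen:
                seen.add(child)
                found.append(child)
                queue.append(child)

    if leaf_only:
        # Keep only the Classes that have no instances of their own
        found = [name for name in found if not children[name]]
    return sorted(found)


links = [("sedan", "car"), ("car", "vehicle"), ("truck", "vehicle")]
all_instances = find_class_instances(links, "vehicle")
leaf_instances = find_class_instances(links, "vehicle", leaf_only=True)
```

Here "car" is dropped from the leaf-only result because "sedan" is an instance of it.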
        
name | arguments | returns
get_linked_class_names | class_name: str, rel_name: str, enforce_unique=False | Union[str, List[str]]
        Given a Class, specified by its name, locate and return the name(s) of the other Class(es)
        that it's linked to by means of the relationship with the specified name.
        Typically, the result will contain no more than 1 name, but it could be more;
        it's probably a bad design to use the same relationship name to connect a class to multiple other classes
        (though currently allowed.)
        Relationships are followed in the OUTbound direction only.

        :param class_name:      Name of a Class in the schema
        :param rel_name:        Name of relationship to follow (in the OUTbound direction) from the above Class
        :param enforce_unique:  If True, it raises an Exception if the number of results isn't exactly one

        :return:                If enforce_unique is True, return a string with the class name;
                                otherwise, return a list of names (typically just one)
        
name | arguments | returns
get_class_relationships | class_name :str, link_dir="BOTH", omit_instance=False | Union[dict, list]
        Fetch and return the names of all the relationships (both inbound and outbound)
        attached to the given Class.
        Treat separately the inbound and the outbound ones.
        If the Class doesn't exist, empty lists are returned.

        :param class_name:      The name of the desired Class
        :param link_dir:        Desired direction(s) of the relationships; one of "BOTH" (default), "IN" or "OUT"
        :param omit_instance:   If True, the common outbound relationship "INSTANCE_OF" is omitted

        :return:                If link_dir is "BOTH", return a dictionary of the form
                                    {"in": list of inbound-relationship names,
                                     "out": list of outbound-relationship names}
                                Otherwise, just return the inbound or outbound list, based on the value of link_dir
        
name | arguments | returns
get_class_outbound_data | class_neo_id :int, omit_instance=False | dict
        Efficient all-at-once query to fetch and return the names of all the outbound relationships
        attached to the given Class, as well as the names of the other Classes on the other side of those links.

        IMPORTANT: it's probably a bad design to use the same relationship name to connect a class
        to multiple other classes.  Though currently allowed in the Schema, this particular method
        assumes - and enforces - uniqueness

        :param class_neo_id:    An integer to identify the desired Class
        :param omit_instance:   If True, the common outbound relationship "INSTANCE_OF" is omitted

        :return:                A (possibly empty) dictionary,
                                    where the keys are the names of outbound relationships,
                                    and the values are the names of the Class nodes on the other side of those links.
                                    An Exception will be raised if link names are not unique [though currently allowed by the Schema]
                                    EXAMPLE: {'IS_ATTENDED_BY': 'doctor', 'HAS_RESULT': 'result'}
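How the uniqueness assumption shapes the returned dictionary can be shown with a small sketch that works on a list of (relationship name, target Class name) pairs. The function `outbound_data` and the "record" Class are hypothetical, made up for this illustration; the actual method obtains its data with a database query.

```python
def outbound_data(links, omit_instance=False):
    """
    links: list of (rel_name, target_class_name) pairs, all leaving the same Class.
    Return a dict mapping each relationship name to the Class on the other side.
    """
    result = {}
    for rel_name, target in links:
        if omit_instance and rel_name == "INSTANCE_OF":
            continue
        if rel_name in result:
            # A dict can hold only one target per name, hence the enforced uniqueness
            raise Exception(f"Relationship name `{rel_name}` is used more than once")
        result[rel_name] = target
    return result


links = [("IS_ATTENDED_BY", "doctor"), ("HAS_RESULT", "result"), ("INSTANCE_OF", "record")]
without_instance = outbound_data(links, omit_instance=True)

try:
    outbound_data(links + [("HAS_RESULT", "lab report")])
    duplicate_rejected = False
except Exception:
    duplicate_rejected = True
```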
        

CLASS PROPERTIES

name | arguments | returns
get_class_properties | class_node: Union[int, str], include_ancestors=False, sort_by_path_len="ASC", exclude_system=False | [str]
        Return the list of all the names of the Properties associated with the given Class
        (including those inherited thru ancestor nodes by means of "INSTANCE_OF" relationships,
        if include_ancestors is True),
        sorted by the schema-specified position (or, optionally, by path length)

        EXAMPLES:
            get_class_properties(class_node="Quote", include_ancestors=False)
                    => ['quote', 'attribution', 'notes']

            NeoSchema.get_class_properties(class_node="Quote", include_ancestors=True, exclude_system=False)
                    => ['quote', 'attribution', 'notes', 'uri']

            NeoSchema.get_class_properties(class_node="Quote", include_ancestors=True, sort_by_path_len="DESC", exclude_system=False)
                    => ['uri', 'quote', 'attribution', 'notes']

            NeoSchema.get_class_properties(class_node="Quote", include_ancestors=True, exclude_system=True)
                    => ['quote', 'attribution', 'notes']


        :param class_node:          Either an integer with the internal database ID of an existing Class node,
                                        or a string with its name
        :param include_ancestors:   If True, also include the Properties attached to Classes that are ancestral
                                    to the given one by means of a chain of outbound "INSTANCE_OF" relationships
                                    Note: the sorting by relationship index won't mean much if ancestral nodes are included,
                                          with their own indexing of relationships; if order matters in those cases, use the
                                          "sort_by_path_len" argument, below
        :param sort_by_path_len:    Only applicable if include_ancestors is True.
                                        If provided, it must be either "ASC" or "DESC", and it will sort the results by path length
                                        (either ascending or descending), before sorting by the schema-specified position for each Class.
                                        Note: with "ASC", the immediate Properties of the given Class will be listed first
        :param exclude_system:      [OPTIONAL] If True, Property nodes with the attribute "system" set to True will be excluded;
                                        default is False

        :return:                    A list of the Properties of the specified Class
                                        (including indirect Properties, if include_ancestors is True)
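The two-level ordering (path length first, then schema-specified position) can be reproduced with an ordinary sort. The sketch below is an assumption-laden illustration: the function `order_properties` and the record fields "path_len", "index" and "system" are hypothetical stand-ins for what the database query would provide. The sample data mirrors the "Quote" examples above, with "uri" placed on an ancestor Class and flagged as a system Property.

```python
def order_properties(records, sort_by_path_len="ASC", exclude_system=False):
    """
    records: list of dicts with keys "name", "path_len" (hops from the given Class),
             "index" (schema-specified position) and "system" (bool).
    Return the Property names in the requested order.
    """
    if sort_by_path_len not in ("ASC", "DESC"):
        raise ValueError('sort_by_path_len must be either "ASC" or "DESC"')

    if exclude_system:
        records = [r for r in records if not r["system"]]

    descending = (sort_by_path_len == "DESC")
    # Primary key: path length (negated when descending); secondary key: position
    ordered = sorted(records,
                     key=lambda r: (-r["path_len"] if descending else r["path_len"],
                                    r["index"]))
    return [r["name"] for r in ordered]


records = [
    {"name": "uri",         "path_len": 2, "index": 1, "system": True},
    {"name": "notes",       "path_len": 1, "index": 3, "system": False},
    {"name": "quote",       "path_len": 1, "index": 1, "system": False},
    {"name": "attribution", "path_len": 1, "index": 2, "system": False},
]
ascending = order_properties(records, sort_by_path_len="ASC")
descending = order_properties(records, sort_by_path_len="DESC")
no_system = order_properties(records, exclude_system=True)
```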
        
name | arguments | returns
add_properties_to_class | class_node = None, class_uri = None, property_list = None | int
        Add a list of Properties to the specified (ALREADY-existing) Class.
        The properties are given an inherent order (an attribute named "index", starting at 1),
        based on the order they appear in the list.
        If other Properties already exist, the existing numbering gets extended.

        NOTE: if the Class doesn't already exist, use create_class_with_properties() instead;
              attempting to add properties to a non-existing Class will result in an Exception

        :param class_node:      An integer with the internal database ID of an existing Class node,
                                    or a string with its name
        :param class_uri:       (OPTIONAL) String with the schema_uri of the Class to which to attach the given Properties
                                TODO: remove

        :param property_list:   A list of strings with the names of the properties, in the desired order.
                                    Whitespace in any of the names gets stripped out.
                                    If any name is a blank string, an Exception is raised
                                    If the list is empty, an Exception is raised
        :return:                The number of Properties added
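The numbering rule (positions start at 1, and new Properties continue from the highest existing position) can be sketched with a dictionary in place of the database. The function `extend_property_index` is a hypothetical name, and the sketch assumes that "stripped" means removal of leading and trailing whitespace; it does not attempt to model what happens when a name duplicates an existing Property.

```python
def extend_property_index(existing, property_list):
    """
    existing:      dict mapping the names of already-present Properties to their "index"
    property_list: list of new Property names, in the desired order
    Update `existing` in place and return the number of Properties added.
    """
    if not property_list:
        raise Exception("The list of Properties is empty")

    # Validate everything first, so that a bad name leaves `existing` untouched
    cleaned = []
    for name in property_list:
        stripped = name.strip()
        if stripped == "":
            raise Exception("Property names cannot be blank")
        cleaned.append(stripped)

    # Continue from the highest position in use (or start at 1 if there are none)
    next_index = max(existing.values(), default=0) + 1
    for name in cleaned:
        existing[name] = next_index
        next_index += 1

    return len(cleaned)


quote_properties = {"quote": 1, "attribution": 2}
number_added = extend_property_index(quote_properties, ["  notes ", "source"])
```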
        
name | arguments | returns
set_property_attribute | class_name :str, prop_name :str, attribute_name :str, attribute_value | None
        Set an attribute on an existing "PROPERTY" node of the specified Class

        EXAMPLES:   set_property_attribute(class_name="Content Item", prop_name="uri",
                                           attribute_name="system", attribute_value=True)

                    set_property_attribute(class_name="User", prop_name="admin",
                                           attribute_name="dtype", attribute_value="boolean")
                    set_property_attribute(class_name="User", prop_name="user_id",
                                           attribute_name="dtype", attribute_value="integer")

                    set_property_attribute(class_name="User", prop_name="username",
                                           attribute_name="required", attribute_value=True)

        :param class_name:      The name of an existing CLASS node
        :param prop_name:       The name of an existing PROPERTY node
        :param attribute_name:  The name of an attribute (field) of the PROPERTY node
        :param attribute_value: The value to give to the above attribute (field) of the PROPERTY node;
                                    if a value was already set, it will be over-written
        :return:                None
        
name | arguments | returns
create_class_with_properties | name :str, properties :[str], code=None, strict=False, class_to_link_to=None, link_name="INSTANCE_OF", link_dir="OUT" | (int, str)
        Create a new Class node, with the specified name, and also create the specified Properties nodes,
        and link them together with "HAS_PROPERTY" relationships.

        Return the internal database ID and the auto-incremented unique ID ("schema ID") assigned to the new Class.
        Each Property node is also assigned a unique "schema ID";
        the "HAS_PROPERTY" relationships are assigned an auto-increment index,
        representing the default order of the Properties.

        If a class_to_link_to name is specified, link the newly-created Class node to that existing Class node,
        using an outbound relationship with the specified name.  Typically used to create "INSTANCE_OF"
        relationships from new Classes.

        If a Class with the given name already exists, nothing is done,
        and an Exception is raised.

        NOTE: if the Class already exists, use add_properties_to_class() instead

        :param name:            String with name to assign to the new class
        :param properties:      List of strings with the names of the Properties, in their default order (if that matters)
        :param code:            Optional string indicative of the software handler for this Class and its subclasses.
        :param strict:          If True, the Class will be of the "Strict" type;
                                    otherwise, it'll be of the "Lenient" type

        :param class_to_link_to: If this name is specified, and a link_name (below) is also specified,
                                    then create an OUTBOUND relationship from the newly-created Class
                                    to this existing Class
        :param link_name:       Name to use for the above relationship, if requested.  Default is "INSTANCE_OF"
        :param link_dir:        Desired direction(s) of the relationships: either "OUT" (default) or "IN"

        :return:                If successful, the pair (internal database ID, string "schema_uri" assigned to the new Class);
                                otherwise, raise an Exception
        
name | arguments | returns
remove_property_from_class | class_uri :str, property_uri :str | None
        Take out the specified (single) Property from the given Class.
        If the Class or Property was not found, an Exception is raised

        :param class_uri:   The uri of the Class node
        :param property_uri:The uri of the Property node
        :return:            None
        
name | arguments | returns
rename_property | old_name :str, new_name :str, class_name :str, rename_data_fields=True | None
        Rename the specified (single) Property from the given Class.
        If the Class or Property is not found, an Exception is raised

        :param old_name:            The current name (to be changed) of the Property of interest
        :param new_name:            The new name to give to the above Property
        :param class_name:          The name of the Class node to which the Property is attached
        :param rename_data_fields:  If True (default), the field names in the data nodes of that Class
                                        are renamed as well (NOT YET IMPLEMENTED)
        :return:                    None
        
name | arguments | returns
is_property_allowed | property_name :str, class_name :str | bool
        Return True if the given Property is allowed by the specified Class,
        or False otherwise.

        For a Property to be allowed, at least one of the following must hold:

            A) the Class isn't strict (i.e. every property is allowed)
        OR
            B) the Property has been registered with the Schema, for that Class
        OR
            C) the Property has been registered with the Schema, for an ancestral Class - reachable
               from our given Class thru a chain of "INSTANCE_OF" relationships

        It's permissible for the specified Class not to exist; in that case, False will be returned
        (TODO: may be better to raise an Exception in such cases!)

        :param property_name:   Name of a Property (i.e. a field name) whose permissibility
                                    we want to check
        :param class_name:      Name of a Class in the Schema
        :return:                True if the given Property is allowed by the specified Class,
                                    or False otherwise
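
Rules A, B and C above can be illustrated with a small in-memory sketch. This is not the NeoSchema implementation (which queries the graph database); the `SCHEMA` dictionary and the function name `is_property_allowed_sketch` are hypothetical, introduced only for illustration.

```python
# Toy in-memory schema (hypothetical data; NOT how NeoSchema stores its Schema)
SCHEMA = {
    "Vehicle": {"strict": True, "properties": {"make", "year"}, "instance_of": None},
    "Car": {"strict": True, "properties": {"doors"}, "instance_of": "Vehicle"},
    "Note": {"strict": False, "properties": set(), "instance_of": None},
}


def is_property_allowed_sketch(property_name: str, class_name: str, schema: dict = SCHEMA) -> bool:
    """Apply rules A, B and C to the toy schema."""
    if class_name not in schema:
        return False                    # unknown Class: False, as documented above
    if not schema[class_name]["strict"]:
        return True                     # rule A: a non-strict Class allows every property
    current = class_name
    visited = set()                     # guards against accidental cycles
    while current is not None and current in schema and current not in visited:
        visited.add(current)
        if property_name in schema[current]["properties"]:
            return True                 # rule B (the Class itself) or rule C (an ancestor)
        current = schema[current]["instance_of"]
    return False
```

In this toy data, "make" is accepted for "Car" only because it is registered on the ancestor "Vehicle" (rule C).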
        
name | arguments | returns
is_link_allowed | link_name :str, from_class :str, to_class :str | bool
        Return True if the given Link is allowed between the specified Classes (in the given direction),
        or False otherwise.

        For a Link to be allowed, at least one of the following must hold:

            A) BOTH of the Classes aren't strict (in which case any arbitrary link is allowed!)
        OR
            B) the Link has been registered with the Schema, for those Classes (possibly going thru intermediate "INSTANCE_OF" hops)

        Note: links being allowed is inherited from other Classes
              that are ancestors of the given Class thru "INSTANCE_OF" relationships

        If either of the specified Classes doesn't exist, an Exception is raised

        :param link_name:   Name of a Link (i.e. relationship) whose permissibility
                                we want to check
        :param from_class:  Name of a Class that we want to check whether the given Link can originate from
        :param to_class:    Name of a Class that we want to check whether the given Link can terminate into
        :return:            True if the given Link is allowed by the specified Classes,
                                or False otherwise
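
The link rules above can be sketched on toy data as follows. The names `CLASSES`, `LINKS`, `ancestry` and `is_link_allowed_sketch` are hypothetical. The sketch also assumes that inheritance through "INSTANCE_OF" applies to both ends of the link, which is one possible reading of rule B rather than a confirmed detail of the library.

```python
# Hypothetical toy data: Classes with a strictness flag and an optional parent Class
CLASSES = {
    "Person": {"strict": True, "instance_of": None},
    "Employee": {"strict": True, "instance_of": "Person"},
    "Company": {"strict": True, "instance_of": None},
    "Memo": {"strict": False, "instance_of": None},
    "Tag": {"strict": False, "instance_of": None},
}
# Registered links, as (link name, from Class, to Class)
LINKS = {("WORKS_FOR", "Person", "Company")}


def ancestry(class_name: str, classes: dict) -> list:
    """Return the Class followed by its ancestors along "instance_of"."""
    chain = []
    current = class_name
    while current is not None and current not in chain:
        chain.append(current)
        current = classes[current]["instance_of"]
    return chain


def is_link_allowed_sketch(link_name, from_class, to_class, classes=CLASSES, links=LINKS) -> bool:
    for name in (from_class, to_class):
        if name not in classes:
            raise ValueError(f"Class `{name}` not found")
    # Rule A: both Classes are non-strict
    if not classes[from_class]["strict"] and not classes[to_class]["strict"]:
        return True
    # Rule B: the link is registered for the Classes or for their ancestors (assumption)
    return any((link_name, a, b) in links
               for a in ancestry(from_class, classes)
               for b in ancestry(to_class, classes))
```

Direction matters in this sketch: "WORKS_FOR" is allowed from "Employee" to "Company", but not the other way around.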
        
name | arguments | returns
allowable_props | class_internal_id: int, requested_props: dict, silently_drop: bool, schema_cache=None | dict
        If any of the properties in the requested list of properties is not a declared (and thus allowed) Schema property,
        then:
            1) if silently_drop is True, drop that property from the returned pared-down list
            2) if silently_drop is False, raise an Exception

        :param class_internal_id:   The internal database ID of a Schema Class node
        :param requested_props:     A dictionary of properties one wishes to assign to a new data node, if the Schema allows
        :param silently_drop:       If True, any requested properties not allowed by the Schema are simply dropped;
                                        otherwise, an Exception is raised if any property isn't allowed
        :param schema_cache:        (OPTIONAL) "SchemaCache" object

        :return:                    A possibly pared-down version of the requested_props dictionary
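
A minimal sketch of the silently_drop behavior described above. It assumes the set of declared Properties is already known and passed in as `declared_props`; the real method looks them up from the Class node instead. The function name and the exception type are illustrative assumptions.

```python
def allowable_props_sketch(declared_props: set, requested_props: dict, silently_drop: bool) -> dict:
    """Return the requested properties that the (toy) Schema allows."""
    disallowed = [key for key in requested_props if key not in declared_props]
    if disallowed and not silently_drop:
        # Exception type is an assumption; the source only says "an Exception"
        raise ValueError(f"Properties not allowed by the Schema: {disallowed}")
    # Keep only the declared properties (a possibly pared-down copy)
    return {key: value for key, value in requested_props.items() if key in declared_props}
```

For example, with `declared_props={"make", "color"}` and a request that also contains "price", the extra field is dropped when silently_drop is True, and the call fails when it is False.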
        

SCHEMA CODE

name | arguments | returns
get_schema_code | class_name: str | str
        Obtain the "schema code" of a Class, specified by its name.
        The "schema code" is an optional but convenient text code,
        stored either on a Class node, or on any of its ancestors by way of "INSTANCE_OF" relationships

        :return:    A string with the Schema code (empty string if not found)
                    EXAMPLE: "i"
        
name | arguments | returns
get_schema_uri | schema_code :str | str
        Get the Schema URI most directly associated to the given Schema Code

        :return:    A string with the Schema uri (or "" if not present)
        

DATA NODES READING

name | arguments | returns
all_properties | label :str, primary_key_name :str, primary_key_value | [str]
        Return the list of the *names* of all the Properties associated with the specified DATA node,
        based on the Schema it is associated with, sorted by their schema-specified position.
        The desired node is identified by specifying which one of its attributes is a primary key,
        and providing a value for it.

        IMPORTANT : this function returns the NAMES of the Properties; not their values

        :param label:
        :param primary_key_name:    A field name used to identify our desired Data Node
        :param primary_key_value:   The corresponding field value to identify our desired Data Node
        :return:                    A list of the names of the Properties associated
                                        with the given DATA node
        
name | arguments | returns
get_data_node_internal_id | uri :str, label=None | int
        Returns the internal database ID of the given Data Node,
        specified by the value of its uri attribute
        (and optionally by a label)

        :param uri:     A string to identify a Data Node by the value of its "uri" attribute
        :param label:   (OPTIONAL) String to require the Data Node to have (redundant,
                            since "uri" already uniquely specifies a Data Node - but
                            could be used for speed or data integrity)

        :return:        The internal database ID of the specified Data Node;
                            if none (or more than one) found, an Exception is raised
        
name | arguments | returns
get_data_node_id | key_value :str, key_name="uri" | int
        Get the internal database ID of a Data Node, given some other primary key

        :param key_value:   The value of a primary key to use for the node lookup
        :param key_name:    The name of the above primary key (default "uri")
        :return:            The internal database ID of the specified Data Node
        
name | arguments | returns
data_node_exists | node_id: Union[int, str], id_key=None, class_name=None | bool
        Return True if the specified Data Node exists, or False otherwise.

        :param node_id:     Either an internal database ID or a primary key value
        :param id_key:      [OPTIONAL] Name of a primary key used to identify the data node; for example, "uri".
                                Leave blank to use the internal database ID
        :param class_name:  [OPTIONAL] Used for a stricter check
        :return:            True if the specified Data Node exists, or False otherwise
        
name | arguments | returns
data_link_exists | node_1_id, node_2_id, rel_name :str, id_key=None | bool
        Return True if the specified Data Link exists, or False otherwise.

        :return:            True if the specified Data Link exists, or False otherwise
        
name | arguments | returns
get_data_node | class_name :str, node_id, id_key=None | Union[dict, None]
        Locate a Data Node from its Class name, and a unique identifier

        :param class_name:  The name of the Schema Class that this Data Node is associated to
        :param node_id:     Either an internal database ID or a primary key value
        :param id_key:      OPTIONAL - name of a primary key used to identify the data node; for example, "uri".
                                Leave blank to use the internal database ID
        :return:
        
name | arguments | returns
search_data_node | uri = None, internal_id = None, labels=None, properties=None | Union[dict, None]
        Return a dictionary with all the key/value pairs of the attributes of the given data node

        See also locate_node()

        :param uri:         The "uri" field to uniquely identify the data node
        :param internal_id: OPTIONAL alternate way to specify the data node;
                                if present, it takes priority
        :param labels:      OPTIONAL (generally redundant) ways to locate the data node
        :param properties:  OPTIONAL (generally redundant) ways to locate the data node

        :return:            A dictionary with all the key/value pairs, if node is found; or None if not
        
name | arguments | returns
locate_node | node_id: Union[int, str], id_type=None, labels=None, dummy_node_name="n" | CypherMatch
        EXPERIMENTAL - a generalization of get_data_node()

        Return the "match" structure to later use to locate a node identified
        either by its internal database ID (default), or by a primary key (with optional label.)

        NOTE: No database operation is actually performed.

        :param node_id: This is understood to be the Neo4j ID, unless an id_type is specified
        :param id_type: For example, "uri";
                            if not specified, the node ID is assumed to be a Neo4j ID
        :param labels:  (OPTIONAL) Labels - a string or list/tuple of strings - for the node
        :param dummy_node_name: (OPTIONAL) A string with a name by which to refer to the node (by default, "n")

        :return:        A "CypherMatch" object
        
name | arguments | returns
get_all_data_nodes_of_class | class_name :str | list[dict]
        Return all the values stored at all the Data Nodes in the specified Class.
        The values comprise all node fields, the internal database ID and the node labels.

        EXAMPLE: [{'year': 2023, 'make': 'Ford', 'internal_id': 123, 'neo4j_labels': ['Motor Vehicle']},
                  {'year': 2013, 'make': 'Toyota', 'internal_id': 4, 'neo4j_labels': ['Motor Vehicle']}
                 ]

        :param class_name:  The name of a Class in the Schema
        :return:            A list of dicts; each list item contains data from a node
        
name | arguments | returns
class_of_data_node | node_id, id_key=None, labels=None | str
        Return the name of the Class of the given data node, identified
        either by its internal database ID (default), or by a primary key (such as "uri"),
        with optional label

        :param node_id:     Either an internal database ID or a primary key value
        :param id_key:      OPTIONAL - name of a primary key used to identify the data node; for example, "uri".
                                Leave blank to use the internal database ID
        :param labels:      Optional string, or list/tuple of strings, with internal database labels
                                (DEPRECATED)

        :return:            A string with the name of the Class of the given data node
        
name | arguments | returns
data_nodes_of_class | class_name :str, return_option="uri" | Union[List[str], List[int]]
        Return the uri's, or alternatively the internal database ID's,
        of all the Data Nodes of the given Class

        :param class_name:      Name of a Schema Class
        :param return_option:   Either "uri" or "internal_id"
        :return:                Return the uri's or internal database ID's
                                        of all the Data Nodes of the given Class
        
name | arguments | returns
count_data_nodes_of_class | class_name: str | [int]
        Return the count of all the Data Nodes attached to the given Class.
        If the Class doesn't exist, an Exception is raised

        :param class_name:  The name of the Schema Class of interest
        :return:            The count of all the Data Nodes attached to the given Class
        
name | arguments | returns
data_nodes_lacking_schema | label :str | [dict]
        Locate and return all nodes with the given label
        that aren't associated to any Schema Class

        :param label:   A string with a graph-database label
        :return:    A list containing a single dictionary, with key 'n';
                        the value is a dict with all the properties of the located nodes
        
name | arguments | returns
follow_links | class_name :str, node_id, link_name :str, id_key=None, properties=None, labels=None | List
        From the given starting node, follow all the relationships that have the specified name,
        from/into neighbor nodes (optionally having the given labels),
        and return some of the properties of those found nodes.

        :param class_name:  String with the name of the Class of the given data node
        :param node_id:     Either an internal database ID or a primary key value
        :param link_name:   A string with the name of the link(s) to follow
        :param id_key:      [OPTIONAL] Name of a primary key used to identify the data node; for example, "uri";
                                use None to refer to the internal database ID
        :param properties:  [OPTIONAL] String, or list/tuple of strings, with the name(s)
                                of the properties to return on the found nodes;
                                if not specified, ALL properties are returned
        :param labels:      [OPTIONAL] string, or list/tuple of strings,
                                with node labels required to be present on the neighbor nodes
                                TODO: not currently in use

        :return:            A (possibly empty) list of values, if properties only contains a single element;
                                otherwise, a list of dictionaries
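
The two shapes of the return value can be illustrated with a small, database-free sketch. The function `shape_results_sketch` is hypothetical, and so is the way it handles a property missing from a node (it yields None); the source does not specify that case.

```python
def shape_results_sketch(found_nodes: list, properties=None) -> list:
    """Shape a list of node dictionaries the way the :return: note describes."""
    if properties is None:
        return [dict(node) for node in found_nodes]      # ALL properties, as dicts
    if isinstance(properties, str):
        properties = [properties]                        # a lone string counts as one property
    if len(properties) == 1:
        key = properties[0]
        return [node.get(key) for node in found_nodes]   # flat list of values
    return [{key: node.get(key) for key in properties} for node in found_nodes]
```

With a single requested property, callers receive plain values such as `["Ann", "Bob"]` rather than a list of one-key dictionaries.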
        

DATA NODES CREATE MODIFY

name | arguments | returns
create_data_node | class_node :Union[int, str], properties = None, extra_labels = None, new_uri=None, silently_drop=False | Union[int, None]
        Create a single new data node, of the type indicated by specified Class,
        with the given (possibly None) properties, and optional extra label(s);
        the name of the Class is always used as a label.

        If the requested Class doesn't exist, an Exception is raised.

        CAUTION: no check is made whether another data node with identical fields already exists;
                 if that should be prevented, use add_data_node_merge() instead.

        The new data node, if successfully created, will optionally be assigned
        the passed URI value (new_uri) for its field `uri`.

        Note: the responsibility for picking a URI belongs to the calling function,
              which will typically make use of a namespace, and make use of reserve_next_uri()

        Alternatives:
            - If the data node needs to be created with links to other existing data nodes,
              use add_data_node_with_links() instead.
            - If creating multiple data nodes at once, consider using import_pandas_nodes()

        :param class_node:  Either an integer with the internal database ID of an existing Class node,
                                or a string with its name
        :param properties:  (OPTIONAL) Dictionary with the properties of the new data node.
                                EXAMPLE: {"make": "Toyota", "color": "white"}
        :param extra_labels:(OPTIONAL) String, or list/tuple of strings, with label(s) to assign to the new data node,
                                IN ADDITION TO the Class name (which is always used as label)
        :param new_uri:     (OPTIONAL)  If new_uri is provided, then a field called "uri"
                                is set to that value;
                                also, an extra attribute named "schema_code" gets set
                                (based on the Class to use for this Data Node);
                                this extra attribute might eventually get obsoleted
        :param silently_drop: If True, any requested properties not allowed by the Schema are simply dropped;
                                otherwise, an Exception is raised if any property isn't allowed
                                Note: only applicable for "Strict" schema - with a "Lenient" schema anything goes

        :return:            The internal database ID of the new data node just created, if created;
                                or None if not created
        
name | arguments | returns
_prepare_data_node_labels | class_name :str, extra_labels=None | [str]
        Return a list of labels to use on a Data Node,
        given its Schema Class (whose name is always used as one of the labels)
        and an optional list of extra labels.

        The given Class name must be valid, but the Class does not need to exist yet.

        Any leading/trailing blanks in the extra labels are removed.  Duplicate names are ignored.

        :param class_name:      The name of a Schema Class
        :param extra_labels:    [OPTIONAL] Either a string, list/tuple of strings
        :return:
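
A hedged sketch of the label preparation described above: the Class name is always included, extra labels are stripped of leading/trailing blanks, and duplicates are ignored. The function name, the position of the Class name in the list, and the simple validity check are assumptions made for illustration.

```python
def prepare_labels_sketch(class_name: str, extra_labels=None) -> list:
    """Return the labels for a Data Node: the Class name plus any cleaned-up extras."""
    # Assumed validity check: a non-empty string
    if not isinstance(class_name, str) or class_name.strip() == "":
        raise ValueError("class_name must be a non-empty string")

    if extra_labels is None:
        extra = []
    elif isinstance(extra_labels, str):
        extra = [extra_labels]          # a lone string is treated as a 1-element list
    else:
        extra = list(extra_labels)

    labels = [class_name]               # the Class name is always one of the labels
    for label in extra:
        cleaned = label.strip()         # remove leading/trailing blanks
        if cleaned and cleaned not in labels:
            labels.append(cleaned)      # duplicates are ignored
    return labels
```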
        
name | arguments | returns
_create_data_node_helper | class_internal_id :int, labels=None, properties_to_set=None, uri_namespace=None, primary_key=None, duplicate_option=None | Union[int, None]
        Helper function, to (possibly) create a new data node, of the type indicated by specified Class,
        with the given label(s) and properties.

        IMPORTANT: all validations/schema checks are assumed to have been performed by the caller functions;
                   this is a private method not meant for the end user!

        :param class_internal_id:   The internal database ID of an existing Class node in the Schema
        :param labels:              String, or list/tuple of strings, with label(s)
                                        to assign to the new Data node,
                                        (note: the Class name is expected to be among the labels)
        :param properties_to_set:   [OPTIONAL] Dictionary with the properties of the new data node.
                                        EXAMPLE: {"make": "Toyota", "color": "white"}
        :param uri_namespace:       [OPTIONAL] String with a namespace to use to auto-assign a uri value on the new data node;
                                        if not passed, no uri value gets set on the new node
        :param primary_key:         [OPTIONAL] Name of a field that is to be regarded as a primary key
        :param duplicate_option:    Only applicable if primary_key is specified;
                                        if provided, must be "merge" or "replace"

        :return:                    If a new Data node gets created, return its internal database ID;
                                        otherwise (in case of a duplicate node already present) return None
        
name | arguments | returns
add_data_node_merge | class_name :str, properties :dict | (int, bool)
        A new Data Node gets created ONLY IF there's no other Data Node
        containing the same specified properties (and possibly unspecified others),
        and attached to the given Class.

        An Exception is raised if any of the requested properties
        is not registered with the given Schema Class,
        or if that Class doesn't accept Data Nodes.

        :param class_name:  The Class node for the Data Node to locate, or create if not found
        :param properties:  A dictionary with the properties to look up the Data Node by,
                                or to give to a new one if an existing one wasn't found.
                                EXAMPLE: {"make": "Toyota", "color": "white"}

        :return:            A pair with:
                                1) The internal database ID of either an existing Data Node or of a new one just created
                                2) True if a new Data Node was created, or False if not (i.e. an existing one was found)
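
The merge semantics above (reuse a node that already contains the specified properties, otherwise create one) can be sketched with an in-memory stand-in for the database. The `ToyStore` class is hypothetical and performs no Schema validation; it only mirrors the shape of the returned pair.

```python
import itertools


class ToyStore:
    """Hypothetical in-memory stand-in for the database (illustration only)."""

    def __init__(self):
        self.nodes = {}                   # node ID -> (class name, properties)
        self._ids = itertools.count(1)    # fake auto-increment internal IDs

    def add_data_node_merge(self, class_name: str, properties: dict):
        for node_id, (node_class, node_props) in self.nodes.items():
            same_class = node_class == class_name
            # The existing node may hold other, unspecified, properties as well
            contains_all = all(key in node_props and node_props[key] == value
                               for key, value in properties.items())
            if same_class and contains_all:
                return node_id, False     # an existing node was found
        new_id = next(self._ids)
        self.nodes[new_id] = (class_name, dict(properties))
        return new_id, True               # a new node was created


store = ToyStore()
first = store.add_data_node_merge("Cars", {"make": "Toyota", "color": "white"})
second = store.add_data_node_merge("Cars", {"make": "Toyota"})
```

In this toy run, `second` refers to the same node as `first`, because the existing node already contains `{"make": "Toyota"}` along with an unspecified extra property.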
        
name | arguments | returns
add_data_column_merge | class_name :str, property_name: str, value_list: list | dict
        Add a data column (i.e. a set of single-property data nodes).
        Individual nodes are created only if there's no other data node with the same property/value

        :param class_name:      The Class node for the Data Node to locate, or create if not found
        :param property_name:   The name of the data column (i.e. the name of the data field)
        :param value_list:      A list of values that make up the data column
        :return:                A dictionary with 2 keys - "new_nodes" and "old_nodes";
                                    their values are the respective numbers of nodes (created vs. found)
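
A small sketch of the bookkeeping behind the returned dictionary. It assumes the values already present in the database are available as a Python set (`existing_values`), and that a value repeated within value_list counts as "old" from its second occurrence onward; the source does not state how such repeats are counted.

```python
def add_data_column_merge_sketch(existing_values: set, value_list: list) -> dict:
    """Count how many values would create new nodes vs. match existing ones."""
    new_nodes = 0
    old_nodes = 0
    for value in value_list:
        if value in existing_values:
            old_nodes += 1                  # a node with this value already exists
        else:
            existing_values.add(value)      # "create" the single-property node
            new_nodes += 1
    return {"new_nodes": new_nodes, "old_nodes": old_nodes}
```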
        
name | arguments | returns
add_data_node_with_links | class_name = None, class_internal_id = None, properties = None, labels = None, links = None, assign_uri=False, new_uri=None | int
        # TODO: eventually absorb into create_data_node()
        This is NeoSchema's counterpart of NeoAccess.create_node_with_links()

        Add a new data node, of the Class specified by its name,
        with the given (possibly none) attributes and label(s),
        optionally linked to other, already existing, DATA nodes.

        If the specified Class doesn't exist, or doesn't allow for Data Nodes, an Exception is raised.

        The new data node, if successfully created:
            1) will be given the Class name as a label, unless labels are specified
            2) will optionally be assigned an "uri" unique value
               that is either automatically assigned or passed.

        EXAMPLES:   add_data_node_with_links(class_name="Cars",
                                              properties={"make": "Toyota", "color": "white"},
                                              links=[{"internal_id": 123, "rel_name": "OWNED_BY", "rel_dir": "IN"}])

        TODO: verify that all the passed attributes are indeed properties of the class (if the schema is Strict)
        TODO: verify that required attributes are present
        TODO: verify that all the requested links conform to the Schema
        TODO: invoke special plugin-code, if applicable???
        TODO: maybe rename to add_data_node()

        :param class_name:  The name of the Class that this new data node is an instance of.
                                Also used to set a label on the new node, if labels isn't specified
        :param class_internal_id: OPTIONAL alternative to class_name.  If both specified,
                                class_internal_id prevails
                            TODO: merge class_name and class_internal_id into class_node, as done
                                  for create_data_node()
        :param properties:  An optional dictionary with the properties of the new data node.
                                EXAMPLE: {"make": "Toyota", "color": "white"}
        :param labels:      OPTIONAL string, or list of strings, with label(s) to assign to the new data node;
                                if not specified, use the Class name.  TODO: ALWAYS include the Class name, as done in create_data_node()
        :param links:       OPTIONAL list of dicts identifying existing nodes,
                                and specifying the name, direction and optional properties
                                to give to the links connecting to them;
                                use None, or an empty list, to indicate that there aren't any
                                Each dict contains the following keys:
                                    "internal_id"   REQUIRED - to identify an existing node
                                    "rel_name"      REQUIRED - the name to give to the link
                                    "rel_dir"       OPTIONAL (default "OUT") - either "IN" or "OUT" from the new node
                                    "rel_attrs"     OPTIONAL - A dictionary of relationship attributes

        :param assign_uri:  If True, the new node is given an extra attribute named "uri",
                                    with a unique auto-increment value.
                                    Default is False
                                    OBSOLETED

        :param new_uri:     Normally, the Item ID is auto-generated, but it can also be provided (Note: MUST be unique)
                                    If new_uri is provided, then assign_uri is automatically made True

        :return:                If successful, an integer with the internal database ID of the node just created;
                                    otherwise, an Exception is raised
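
The structure of the links argument can be made concrete with a small validation sketch. The helper `normalize_links_sketch` is hypothetical and is not part of NeoSchema; it only checks the required keys listed above and fills in the documented default for "rel_dir". Using None for a missing "rel_attrs" is an assumption.

```python
def normalize_links_sketch(links) -> list:
    """Validate a list of link dicts and fill in the documented defaults."""
    if not links:
        return []                           # None or an empty list: no links requested
    normalized = []
    for link in links:
        for key in ("internal_id", "rel_name"):
            if key not in link:             # both keys are documented as REQUIRED
                raise KeyError(f"Missing required key `{key}` in link {link}")
        rel_dir = link.get("rel_dir", "OUT")    # documented default is "OUT"
        if rel_dir not in ("IN", "OUT"):
            raise ValueError(f"rel_dir must be 'IN' or 'OUT', not {rel_dir!r}")
        normalized.append({"internal_id": link["internal_id"],
                           "rel_name": link["rel_name"],
                           "rel_dir": rel_dir,
                           "rel_attrs": link.get("rel_attrs")})  # None if absent (assumption)
    return normalized
```

For instance, `[{"internal_id": 5, "rel_name": "HAS"}]` is a valid links value, and in this sketch it is treated as an outbound link with no attributes.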
        
name | arguments | returns
update_data_node | data_node :Union[int, str], set_dict :dict, drop_blanks = True, class_name=None | int
        Update, possibly adding and/or dropping fields, the properties of an existing Data Node

        :param data_node:   Either an integer with the internal database ID, or a string with a URI value
        :param set_dict:    A dictionary of field name/values to create/update the node's attributes
                                (note: blanks ARE allowed within the keys)
                                Blanks at the start/end of string values are zapped
        :param drop_blanks: If True, then any blank field is interpreted as a request to drop that property
                                (as opposed to setting its value to "")
        :param class_name:  [OPTIONAL] The name of the Class to which the given Data Node belongs;
                                if provided, it gets enforced
        :return:            The number of properties set or removed;
                                if the record wasn't found, or an empty set_dict was passed, return 0
                                Important: a property is counted as "set" even if the new value is
                                           identical to the old value!
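
The effect of drop_blanks, and the way the returned count is computed, can be sketched on a plain dictionary standing in for a Data Node. The function `update_node_sketch` is hypothetical; in particular, it is an assumption that a blank value for a field the node doesn't have is not counted.

```python
def update_node_sketch(node: dict, set_dict: dict, drop_blanks: bool = True) -> int:
    """Update `node` in place and return the number of properties set or removed."""
    changed = 0
    for field, value in set_dict.items():
        if isinstance(value, str):
            value = value.strip()       # blanks at the start/end of string values are zapped
        if drop_blanks and value == "":
            if field in node:           # a blank is a request to drop the property
                del node[field]
                changed += 1
        else:
            node[field] = value         # counted as "set" even if the value is unchanged
            changed += 1
    return changed
```

For a node `{"make": "Toyota", "color": "white"}`, passing `{"make": "Toyota", "color": "  ", "year": 2020}` counts 3 changes in this sketch: "make" is re-set to the same value, "color" is dropped, and "year" is added.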
        
name | arguments | returns
delete_data_nodes | class_name :str | int
        Delete all the Data Nodes of the given Schema Class

        :param class_name:  The name of a Schema Class
        :return:            The number of deleted Data Nodes
        
name | arguments | returns
delete_data_point | uri: str, labels=None | int
        Delete the given data point.  TODO: obsolete in favor of delete_data_nodes()

        :param uri:
        :param labels:      OPTIONAL (generally, redundant)
        :return:            The number of nodes deleted (possibly zero)
        
name | arguments | returns
register_existing_data_node | class_name="", schema_uri=None, existing_neo_id=None, new_uri=None | int
        Register (declare to the Schema) an existing data node with the Schema Class specified by its name or ID.
        An uri is generated for the data node and stored on it.
        Return the newly-assigned uri

        EXAMPLES:   register_existing_data_node(class_name="Chemicals", existing_neo_id=123)
                    register_existing_data_node(schema_uri="schema-19", existing_neo_id=456)

        TODO: verify that all the passed attributes are indeed properties of the class (if the schema is Strict)
        TODO: verify that required attributes are present
        TODO: invoke special plugin-code, if applicable

        :param class_name:      The name of the Class that this new data node is an instance of
        :param schema_uri:      Alternate way to specify the Class; if both present, class_name prevails

        :param existing_neo_id: Internal ID to identify the node to register with the above Class.
                                TODO: expand to use the match() structure
        :param new_uri:     OPTIONAL. Normally, the Item ID is auto-generated,
                                but it can also be provided (Note: MUST be unique)

        :return:                If successful, an integer with the auto-increment "uri" value of the node just created;
                                otherwise, an Exception is raised
        
name | arguments | returns
add_data_relationship_hub | center_id :int, periphery_ids :[int], periphery_class :str, rel_name :str, rel_dir = "OUT" | int
        Add a group of relationships between a single Data Node ("center")
        and each of the Data Nodes in the given list ("periphery"),
        with the specified relationship name and direction.

        All Data Nodes must already exist.
        All the "periphery" Data Nodes must belong to the same Class
            (whose name is passed by periphery_class)

        :param center_id:       Internal database ID of an existing Data Node
                                    that we wish to connect
                                    to all other Data Nodes specified in the next argument
        :param periphery_ids:   List of internal database IDs of existing Data Nodes,
                                    all belonging to the Class passed by the next argument
        :param periphery_class: The name of the common Class to which all the Data Nodes
                                    specified in periphery_ids belong
        :param rel_name:        A string with the name to give to all the newly-created relationships
        :param rel_dir:         Either "IN" (towards the "center" node)
                                    or "OUT" (away from it, towards the "periphery" nodes)

        :return:                The number of relationships created
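
The meaning of rel_dir for a "hub" can be sketched by listing the relationships that would be created, as (start node, relationship name, end node) triples. The function `hub_relationships_sketch` is hypothetical and does not touch the database or check the Schema.

```python
def hub_relationships_sketch(center_id: int, periphery_ids: list, rel_name: str, rel_dir: str = "OUT") -> list:
    """List the (start, name, end) triples linking the center to each periphery node."""
    if rel_dir not in ("IN", "OUT"):
        raise ValueError(f"rel_dir must be 'IN' or 'OUT', not {rel_dir!r}")
    if rel_dir == "OUT":
        # Away from the "center" node, towards the "periphery" nodes
        return [(center_id, rel_name, periphery_id) for periphery_id in periphery_ids]
    # "IN": from each "periphery" node towards the "center" node
    return [(periphery_id, rel_name, center_id) for periphery_id in periphery_ids]
```

The length of the returned list corresponds to the "number of relationships created" in the description above.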
        
name | arguments | returns
add_data_relationship | from_id, to_id, rel_name :str, rel_props = None, id_type=None | None
        Add a new relationship with the given name, from one to the other of the 2 given data nodes,
        identified by their Neo4j ID's.

        The requested new relationship MUST be present in the Schema, or an Exception will be raised.


        Note that if a relationship with the same name already exists between the data nodes,
        nothing gets created (and an Exception is raised)

        :param from_id: Either an internal database ID or a primary key value
                            of the data node at which the new relationship is to originate
        :param to_id:   Either an internal database ID or a primary key value
                            of the data node at which the new relationship is to end
        :param rel_name:The name to give to the new relationship between the 2 specified data nodes
                            IMPORTANT: it MUST be allowed by the Schema
        :param rel_props:TODO: not currently used.  Unclear what multiple calls would do in this case

        :param id_type: OPTIONAL - name of a primary key used to identify the data nodes; for example, "uri".
                            Leave blank to use the internal database ID's instead

        :return:            None.  If the specified relationship didn't get created (for example,
                                in case the new relationship doesn't exist in the Schema), raise an Exception
        
name | arguments | returns
remove_data_relationship | from_id :str, to_id :str, rel_name :str, id_type="uri", labels=None | None
        Drop the relationship with the given name, from one to the other of the 2 given DATA nodes.
        Note: the data nodes are left untouched.
        If the specified relationship didn't get deleted, raise an Exception

        :param from_id:     String with the "uri" value of the data node at which the relationship originates
        :param to_id:       String with the "uri" value of the data node at which the relationship ends
        :param rel_name:    The name of the relationship to delete
        :param id_type:     For now, only "uri" (default) is implemented
        :param labels:      OPTIONAL (generally, redundant).  Labels required to be on both nodes

        :return:            None.  If the specified relationship didn't get deleted, raise an Exception
        
name | arguments | returns
remove_multiple_data_relationships | node_id: Union[int, str], rel_name: str, rel_dir: str, labels=None | None
        Drop all the relationships with the given name, from or to the given data node.
        Note: the data node is left untouched.

        IMPORTANT: this function cannot be used to remove relationships involving any Schema node

        :param node_id:     The internal database ID (integer) or name (string) of the data node of interest
        :param rel_name:    The name of the relationship(s) to delete
        :param rel_dir:     Either 'IN', 'OUT', or 'BOTH'
        :param labels:      [OPTIONAL]
        :return:            None
        

BULK DATA IMPORT

name | arguments | returns
import_pandas_nodes_NO_BATCH | df :pd.DataFrame, class_name: str, class_node=None, select=None, drop=None, rename=None, primary_key=None, duplicate_option="merge", datetime_cols=None, int_cols=None, extra_labels=None, uri_namespace=None, report_frequency=100 | [int]
        OLD VERSION of the much-faster import_pandas_nodes(), largely obsoleted by it!

        Import a group of entities (records), from the rows of a Pandas dataframe,
        as Data Nodes in the database.

        Dataframe cells with NaN's and empty strings are dropped - and never make it into the database.

        Note: if you have a CSV file whose first row contains the field names, you can first do imports such as
                    df = pd.read_csv("C:/Users/me/some_name.csv", encoding = "ISO-8859-1")

        :param df:          A Pandas Data Frame with the data to import;
                                each row represents a record - to be turned into a graph-database node.
                                Each column represents a Property of the data node, and it must have been
                                previously declared in the Schema
        :param class_name:  The name of a Class node already present in the Schema
        :param class_node:  OBSOLETED

        :param select:      [OPTIONAL] Name of the field, or list of names, to import; all others will be ignored
                                (Note: original name prior to any rename, if applicable)
        :param drop:        [OPTIONAL] Name of a field, or list of names, to ignore during import
                                (Note: original name prior to any rename, if applicable)
                                If both arguments "select" and "drop" are passed, an Exception gets raised
        :param rename:      [OPTIONAL] dictionary to rename the Pandas dataframe's columns to
                                EXAMPLE {"current_name": "name_we_want"}

        :param primary_key: [OPTIONAL] Name of a field that is to be regarded as a primary key;
                                            any import of a record that is a duplicate in that field,
                                            will result in the modification of the existing record, rather than the creation of a new one;
                                            the details of the modification are based on the argument "duplicate_option"
        :param duplicate_option:    Only applicable if primary_key is specified;
                                    if provided, must be "merge" (default) or "replace".
                                    Any field present in both the original (old) and the new (being imported) record will get over-written with the new value;
                                    any field present in the original record but not the new one
                                    will EITHER be left standing ("merge" option)
                                    or ditched ("replace" option)
                                    EXAMPLE: if the database contains the record  {'vehicle ID': 'c2', 'make': 'Toyota', 'year': 2013}
                                             then the import of                   {'vehicle ID': 'c2', 'make': 'BMW',    'color': 'white'}
                                             with a primary_key of 'vehicle ID', will result in NO new record addition;
                                             the existing record will transform into either
                                             (if duplicate_option is "merge"):
                                                    {'vehicle ID': 'c2', 'make': 'BMW', 'color': 'white', 'year':2013}
                                             (if duplicate_option is "replace"):
                                                    {'vehicle ID': 'c2', 'make': 'BMW', 'color': 'white'}
                                            Notice that the only difference between the 2 options
                                            concerns fields present in the original record but not in the imported one.

        :param datetime_cols:[OPTIONAL] String, or list/tuple of strings, of column name(s)
                                that contain datetime strings such as '2015-08-15 01:02:03'
                                (compatible with the python "datetime" format)
        :param int_cols:    [OPTIONAL] String, or list/tuple of strings, of column name(s)
                                that contain integers, or that are to be converted to integers
                                (typically necessary because numeric Pandas columns with NaN's
                                 are automatically turned into floats;
                                 this argument will cast them to int's, and drop the NaN's)
        :param extra_labels:[OPTIONAL] String, or list/tuple of strings, with label(s) to assign to the new Data nodes,
                                IN ADDITION TO the Class name (which is always used as label)
        :param uri_namespace:[OPTIONAL] String with a namespace to use to auto-assign uri values on the new Data nodes;
                                if that namespace hasn't previously been created with create_namespace() or with reserve_next_uri(),
                                a new one will be created with no prefix or suffix (i.e. all uri's will be numeric strings.)
                                If not passed, no uri values will get set on the new nodes
        :param report_frequency: [OPTIONAL] How often to print the status of the import-in-progress (default 100)

        :return:            A list of the internal database ID's of the newly-created Data nodes
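
        The difference between the "merge" and "replace" values of duplicate_option can be
        illustrated with plain Python dictionaries.  This is only a sketch of the documented
        outcome, not the library's code: the helper name apply_duplicate_option is hypothetical,
        and no database is involved.

```python
def apply_duplicate_option(existing: dict, incoming: dict, duplicate_option: str = "merge") -> dict:
    """Return the record that results from importing `incoming` over `existing`
    (hypothetical helper, for illustration only)."""
    if duplicate_option == "merge":
        # Fields present only in the old record are left standing;
        # fields present in both get over-written with the new value
        return {**existing, **incoming}
    if duplicate_option == "replace":
        # Fields present only in the old record are ditched
        return dict(incoming)
    raise ValueError('duplicate_option must be "merge" or "replace"')


old = {"vehicle ID": "c2", "make": "Toyota", "year": 2013}
new = {"vehicle ID": "c2", "make": "BMW", "color": "white"}

merged = apply_duplicate_option(old, new, "merge")
replaced = apply_duplicate_option(old, new, "replace")
print(merged)
print(replaced)
```

        With the vehicle records from the EXAMPLE above, "merge" keeps the 'year' field of the
        old record, while "replace" drops it.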
        
name | arguments | returns
import_pandas_nodes | df :pd.DataFrame, class_name: str, select=None, drop=None, rename=None, primary_key=None, duplicate_option="merge", datetime_cols=None, int_cols=None, extra_labels=None, report=True, report_frequency=1, max_batch_size=1000 | dict
        Import a group of entities (records), from the rows of a Pandas dataframe,
        as Data Nodes in the database.

        Dataframe cells with NaN's and empty strings are dropped - and never make it into the database.

        Note: if you have a CSV file whose first row contains the field names, you can first do imports such as
                    df = pd.read_csv("C:/Users/me/some_name.csv", encoding = "ISO-8859-1")

        :param df:          A Pandas Data Frame with the data to import;
                                each row represents a record - to be turned into a graph-database node.
                                Each column represents a Property of the data node, and it must have been
                                previously declared in the Schema
        :param class_name:  The name of a Class node already present in the Schema

        :param select:      [OPTIONAL] Name of the Pandas field, or list of names, to import; all others will be ignored
                                (Note: original name prior to any rename, if applicable)
        :param drop:        [OPTIONAL] Name of a Pandas field, or list of names, to ignore during import
                                (Note: original name prior to any rename, if applicable)
                                If both arguments "select" and "drop" are passed, an Exception gets raised
        :param rename:      [OPTIONAL] dictionary to rename the Pandas dataframe's column names to
                                EXAMPLE {"current_name": "name_we_want"}

        :param primary_key: [OPTIONAL] Name of a Pandas field that is to be regarded as a primary key;
                                            any import of a record that is a duplicate in that field,
                                            will result in the modification of the existing record, rather than the creation of a new one;
                                            the details of the modification are based on the argument "duplicate_option"
                                            (Note: original name prior to any rename, if applicable)

        :param duplicate_option:    Only applicable if primary_key is specified;
                                    if provided, must be "merge" (default) or "replace".
                                    Any field present in both the original (old) and the new (being imported) record will get over-written with the new value;
                                    any field present in the original record but not the new one
                                    will EITHER be left standing ("merge" option)
                                    or ditched ("replace" option)
                                    EXAMPLE: if the database contains the record  {'vehicle ID': 'c2', 'make': 'Toyota', 'year': 2013}
                                             then the import of                   {'vehicle ID': 'c2', 'make': 'BMW',    'color': 'white'}
                                             with a primary_key of 'vehicle ID', will result in NO new record addition;
                                             the existing record will transform into either
                                             (if duplicate_option is "merge"):
                                                    {'vehicle ID': 'c2', 'make': 'BMW', 'color': 'white', 'year':2013}
                                             (if duplicate_option is "replace"):
                                                    {'vehicle ID': 'c2', 'make': 'BMW', 'color': 'white'}
                                            Notice that the only difference between the 2 options
                                            concerns fields present in the original record but not in the imported one.

        :param datetime_cols:   [OPTIONAL] String, or list/tuple of strings, of column name(s)
                                    that contain datetime strings such as '2015-08-15 01:02:03'
                                    (compatible with the python "datetime" format)
        :param int_cols:        [OPTIONAL] String, or list/tuple of strings, of column name(s)
                                    that contain integers, or that are to be converted to integers
                                    (typically necessary because numeric Pandas columns with NaN's
                                     are automatically turned into floats;
                                     this argument will cast them to int's, and drop the NaN's)
        :param extra_labels:    [OPTIONAL] String, or list/tuple of strings, with label(s) to assign to the new Data nodes,
                                    IN ADDITION TO the Class name (which is always used as label)
        :param report:          [OPTIONAL] If True (default), print the status of the import-in-progress
                                    at the end of each batch round
        :param report_frequency: [OPTIONAL] Only applicable if report is True

        :param max_batch_size:  To limit the number of Pandas rows loaded into the database at one time

        :return:                A dict with 2 keys:
                                    'number_nodes_created': the number of newly-created nodes
                                    'affected_nodes_ids': list of the internal database ID's of the nodes that were created or updated,
                                                            in the import order ("updated" doesn't necessarily mean changed).
                                                            Note that ID's might occur more than once when the "primary_key" arg
                                                            is specified, because imports might then refer to existing,
                                                            or previously-created, nodes.
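
        As a rough illustration of what max_batch_size implies, the sketch below splits a list
        of records into consecutive batches of at most that size.  The function name
        split_into_batches is hypothetical, and how the library actually forms and submits its
        batches is not shown here.

```python
def split_into_batches(records: list, max_batch_size: int = 1000) -> list:
    """Split `records` into consecutive chunks of at most `max_batch_size` elements
    (hypothetical helper, for illustration only)."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be a positive integer")
    return [records[i: i + max_batch_size]
            for i in range(0, len(records), max_batch_size)]


rows = [{"id": n} for n in range(2500)]      # stand-in for the rows of a dataframe
batches = split_into_batches(rows, max_batch_size=1000)
batch_sizes = [len(b) for b in batches]
print(batch_sizes)
```

        With 2,500 rows and the default batch size, the last batch is smaller than the others.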
        
name | arguments | returns
import_pandas_links | df :pd.DataFrame, class_from :str, class_to :str, col_from :str, col_to :str, link_name :str, col_link_props=None, name_map=None, skip_errors = False, report_frequency=100 | [int]
        Import a group of relationships between existing database Data Nodes,
        from the rows of a Pandas dataframe, as database links between the existing Data Nodes.

        :param df:          A Pandas Data Frame with the RELATIONSHIP data to import
        :param class_from:  Name of the Class of the data nodes that the relationship originates from
        :param class_to:    Name of the Class of the data nodes that the relationship ends at
        :param col_from:    Name of the Data Frame column identifying the data nodes from which the relationship starts
                                (the values are expected to be foreign keys)
        :param col_to:      Name of the Data Frame column identifying the data nodes at which the relationship ends
                                (the values are expected to be foreign keys)
        :param link_name:   Name of the new relationship being created
        :param col_link_props: [OPTIONAL] Name of a property to assign to the relationships,
                                as well as name of the Data Frame column containing the values.
                                Any NaN values are ignored (no property set on that relationship.)
        :param name_map:    [OPTIONAL] Dict with mapping from Pandas column names
                                to Property names in the data nodes in the database
        :param skip_errors: [OPTIONAL] If True, the import continues even in the presence of errors;
                                default is False
        :param report_frequency: [OPTIONAL] How often to print out the status of the import-in-progress
                                    (in terms of number of imported links)

        :return:            A list of the internal database ID's of the created links
        
name | arguments | returns
scrub_dict | d :dict | dict
        Helper function to clean up data during imports.

        Given a dictionary, assemble and return a new dict where string values are trimmed of
        any leading or trailing blanks.
        Entries whose values are blank or NaN get omitted from the new dictionary being returned.

        EXAMPLE:    {"a": 1, "b": 3.5, "c": float("nan"), "d": "some value", "e": "   needs  cleaning!    ",
                     "f": "", "g": "            "}
                gets simplified to:
                    {"a": 1, "b": 3.5, "d": "some value", "e": "needs  cleaning!"  }

        :param d:   A python dictionary with data to "clean up"
        :return:    A python dictionary with the cleaned-up data
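
        The clean-up rules described above can be sketched in a few lines of standard Python.
        This is an independent re-statement of the documented behavior, under the assumption
        that "NaN" means a float NaN; the name scrub_dict_sketch is hypothetical, and the
        library's own implementation may differ.

```python
import math


def scrub_dict_sketch(d: dict) -> dict:
    """Return a new dict with string values trimmed, and with blank or NaN values omitted
    (illustrative sketch, not the library function)."""
    cleaned = {}
    for key, value in d.items():
        if isinstance(value, str):
            value = value.strip()       # Trim leading/trailing blanks
            if value == "":
                continue                # Omit blank strings
        elif isinstance(value, float) and math.isnan(value):
            continue                    # Omit NaN's
        cleaned[key] = value
    return cleaned


sample = {"a": 1, "b": 3.5, "c": float("nan"), "d": "some value",
          "e": "   needs  cleaning!    ", "f": "", "g": "            "}
result = scrub_dict_sketch(sample)
print(result)
```

        Note that only leading and trailing blanks are removed; the double space inside
        "needs  cleaning!" is preserved.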
        
name | arguments | returns
import_triplestore | df :pd.DataFrame, class_node :Union[int, str], col_names = None, uri_prefix = None, datetime_cols=None, int_cols=None, extra_labels=None, report_frequency=100 | [int]
        Import "triplestore" data from a Pandas dataframe that contains 3 columns called:
                subject , predicate , object

        The values of the "subject" column are used for identifying entities, and then turned into URI's.
        The values of the "predicate" column are taken to be the names of the Properties (possibly mapped
            by means of the dictionary "col_names")
        The values of the "object" column are taken to be the values (literals) of the Properties

        Note: the "subject" and "predicate" values are typically integers or strings

        EXAMPLE -
            Pandas dataframe:
             	subject 	 predicate 	  object
            0 	    57 	            1 	  Advanced Graph Databases
            1 	    57 	            2 	  New York University
            2 	    57 	            3 	  Fall 2024

            col_names = {1: "Course Title", 2: "School", 3: "Semester"}
            uri_prefix = "r-"

            The above will result in the import of a node with the following properties:

                {"uri": "r-57",
                "Course Title": "Advanced Graph Databases",
                "School": "New York University",
                "Semester": "Fall 2024"}

        :param df:              A Pandas dataframe that contains 3 columns called:
                                    subject , predicate , object
        :param class_node:      Either an integer with the internal database ID of an existing Class node,
                                    or a string with its name
        :param col_names:       [OPTIONAL] Dict with mapping from values in the "predicate" column of the data frame
                                           to the names of the new nodes' Properties
        :param uri_prefix:      [OPTIONAL] String to prefix to the values in the "subject" column
        :param datetime_cols:   [SEE import_pandas_nodes()]
        :param int_cols:        [SEE import_pandas_nodes()]
        :param extra_labels:    [SEE import_pandas_nodes()]
        :param report_frequency:[SEE import_pandas_nodes()]

        :return:                A list of the internal database ID's of the newly-created Data nodes
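
        The grouping of triples into one record per subject, as in the EXAMPLE above, can be
        sketched with plain Python tuples standing in for the dataframe rows.  The function
        name triples_to_records is hypothetical; the fallback of using an unmapped predicate
        as the Property name is an assumption of this sketch, and no database is involved.

```python
def triples_to_records(triples, col_names=None, uri_prefix=""):
    """Group (subject, predicate, object) triples into one property dict per subject
    (illustrative sketch, not the library function)."""
    col_names = col_names or {}
    records = {}        # Keyed by subject; dicts preserve insertion order
    for subject, predicate, obj in triples:
        # The first time a subject is seen, start a record with its URI
        record = records.setdefault(subject, {"uri": f"{uri_prefix}{subject}"})
        # ASSUMPTION: predicates missing from the mapping are used as-is
        prop_name = col_names.get(predicate, predicate)
        record[prop_name] = obj
    return list(records.values())


triples = [(57, 1, "Advanced Graph Databases"),
           (57, 2, "New York University"),
           (57, 3, "Fall 2024")]

records = triples_to_records(triples,
                             col_names={1: "Course Title", 2: "School", 3: "Semester"},
                             uri_prefix="r-")
print(records)
```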
        
name | arguments | returns
import_json_data | json_str: str, class_name: str, parse_only=False, provenance=None | Union[None, int, List[int]]
        Import the data specified by a JSON string into the database -
        but only the data that is described in the existing Schema;
        anything else is silently ignored.

        CAUTION: A "postorder" approach is followed: create subtrees first (with recursive calls), then create the root last;
        as a consequence, in case of failure mid-import, there's no top root, and there could be several fragments.
        A partial import might need to be manually deleted.
        TODO: maintain a list of all created nodes - so as to be able to delete them all in case of failure.

        :param json_str:    A JSON string representing (at the top level) an object or a list to import
        :param class_name:  Name of Schema class to use for the top-level element(s)
        :param parse_only:  Flag indicating whether to stop after the parsing (i.e. no database import)
        :param provenance:  Metadata (such as a file name) to store in the "source" attribute
                                of a special extra node ("Import Data")

        :return:
        
name | arguments | returns
create_data_nodes_from_python_data | data, class_name: str, provenance=None | [int]
        Import the data specified by the "data" python structure into the database -
        but only the data that is described in the existing Schema;
        anything else is silently ignored.
        For additional notes, see import_json_data()

        :param data:        A python dictionary or list, with the data to import
        :param class_name:  The name of the Schema Class for the root node(s) of the imported data
        :param provenance:  Optional string to be stored in a "source" attribute
                                in a special "Import Data" node for metadata about the import

        :return:            List (possibly empty) of internal database ID's of the root node(s) created

        TODO:   * The "Import Data" Class must already be in the Schema; should automatically add it, if not already present
                * DIRECTION OF RELATIONSHIP (cannot be specified by Python dict/JSON)
                * LACK OF "Import Data" node (ought to be automatically created if needed)
                * LACK OF "BA" (or "DATA"?) labels being set
                * INABILITY TO LINK TO EXISTING NODES IN DBASE (try using: "uri": some_int  as the only property in nodes to merge)
                * OFFER AN OPTION TO IGNORE BLANK STRINGS IN ATTRIBUTES
                * INTERCEPT AND BLOCK IMPORTS FROM FILES ALREADY IMPORTED
                * issue some report about any part of the data that doesn't match the Schema, and got silently dropped
        
name | arguments | returns
create_tree_from_dict | d: dict, class_name: str, level=1, cache=None | Union[int, None]
        Add a new data node (which may turn into a tree root) of the specified Class,
        with data from the given dictionary:
            1) literal values in the dictionary are stored as attributes of the node, using the keys as names
            2) other values (such as dictionaries or lists) are recursively turned into subtrees,
               linked from the new data node through outbound relationships using the dictionary keys as names

        Return the Neo4j ID of the newly created root node,
        or None if nothing is created (this typically arises in recursive calls that "skip subtrees")

        IMPORTANT:  any part of the data that doesn't match the Schema,
                    gets silently dropped.  TODO: issue some report about anything that gets dropped

        EXAMPLES:
        (1) {"state": "California", "city": "Berkeley"}
            results in the creation of a new node, with 2 attributes, named "state" and "city"

        (2) {"name": "Julian", "address": {"state": "California", "city": "Berkeley"}}
            results in the creation of 2 nodes, namely the tree root (with a single attribute "name"), with
            an outbound link named "address" to another node (the subtree) that has the "state" and "city" attributes

        (3) {"headquarter_state": [{"state": "CA"}, {"state": "NY"}, {"state": "FL"}]}
            results in the creation of a node (the tree root), with no attributes, and 3 links named "headquarter_state" to,
            respectively, 3 nodes - each of which contains a "state" attribute

        (4) {"headquarter_state": ["CA", "NY", "FL"]}
            similar to (3), above, but the children nodes will use the default attribute name "value"

        :param d:           A dictionary with data from which to create a tree in the database
        :param class_name:  The name of the Schema Class for the root node(s) of the imported data
        :param level:       The level of the recursive call (used for debug printing)
        :return:            The Neo4j ID of the newly created node,
                                or None if nothing is created (this typically arises in recursive calls that "skip subtrees")
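
        The recursion described above (literals become attributes; dictionaries and lists
        become subtrees reached through links named after the key) can be sketched with
        in-memory lists standing in for the database.  The name build_tree is hypothetical,
        and the sketch makes simplifying assumptions: no Schema filtering is applied, and
        node "ID's" are just positions in a Python list.

```python
def build_tree(d: dict, nodes: list, links: list) -> int:
    """Store `d` as a node, recursively creating subtrees first (postorder);
    return the position of the new node in `nodes` (illustrative sketch only)."""
    attributes = {}
    pending = []        # (link name, child node position) pairs, to attach to the root
    for key, value in d.items():
        if isinstance(value, dict):
            pending.append((key, build_tree(value, nodes, links)))
        elif isinstance(value, list):
            for element in value:
                # Literals in a list get the default attribute name "value"
                child = element if isinstance(element, dict) else {"value": element}
                pending.append((key, build_tree(child, nodes, links)))
        else:
            attributes[key] = value     # Literal: stored as an attribute of this node
    nodes.append(attributes)            # The root gets created last
    root = len(nodes) - 1
    for name, child in pending:
        links.append((root, name, child))   # Outbound link named after the key
    return root


nodes, links = [], []
root = build_tree({"name": "Julian",
                   "address": {"state": "California", "city": "Berkeley"}},
                  nodes, links)
print(root, nodes, links)
```

        For EXAMPLE (2), this yields 2 nodes and a single "address" link from the root to the
        node holding the "state" and "city" attributes.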
        
name | arguments | returns
create_trees_from_list | l: list, class_name: str, level=1, cache=None | [int]
        Add a set of new data nodes (the roots of the trees), all of the specified Class,
        with data from the given list.
        Each list element MUST be a literal, a dictionary, or a list:
            - if a literal, it first gets turned into a dictionary of the form {"value": literal_element};
            - if a dictionary, it gets processed by create_tree_from_dict()
            - if a list, it generates a recursive call

        Return a list of the Neo4j ID's of the newly created nodes.

        IMPORTANT:  any part of the data that doesn't match the Schema,
                    gets silently dropped.  TODO: issue some report about that

        EXAMPLE:
            If the Class is named "address" and has 2 properties, "state" and "city",
            then the data:
                    [{"state": "California", "city": "Berkeley"},
                     {"state": "Texas", "city": "Dallas"}]
            will give rise to 2 new data nodes with label "address", and each of them having a "SCHEMA"
            link to the shared Class node.

        :param l:           A list of data from which to create a set of trees in the database
        :param class_name:  The name of the Schema Class for the root node(s) of the imported data
        :param level:       The level of the recursive call (used for debug printing)

        :return:            A list of the Neo4j ID's of the newly created nodes (each of which
                                might be a root of a tree)
        

IMPORT EXPORT SCHEMA

name | arguments | returns
create_schema_from_sample_data | match |
        Create a Schema from sample data nodes, for example as created with the Arrow app
        TODO: NOT YET COMPLETED.  NOT FOR PRODUCTION

        :param match:   # Maybe allow a label, or range of ID's, instead
        :return:
        
name | arguments | returns
export_schema | cls | {}
        Export all the Schema nodes and relationships as a JSON string.

        IMPORTANT:  APOC must be activated in the database, to use this function.
                    Otherwise it'll raise an Exception

        :return:    A dictionary specifying the number of nodes exported,
                    the number of relationships, and the number of properties,
                    as well as a "data" field with the actual export as a JSON string
        

URI

name | arguments | returns
is_valid_uri | uri :str | bool
        Check the validity of the passed uri.
        If the uri belongs to a Schema node, a tighter check can be performed with is_valid_schema_uri()

        :param uri: A string with a value that is expected to be a uri of a node
        :return:    True if the passed uri has a valid value, or False otherwise
        
name | arguments | returns
is_valid_schema_uri | schema_uri :str | bool
        Check the validity of the passed Schema uri.
        It should be of the form "schema-n" for some integer n
        To check the validity of the uri of a Data node rather than a Schema node,
        use is_valid_uri() instead

        :param schema_uri:  A string with a value that is expected to be a uri of a Schema node
        :return:            True if the passed uri has a valid value, or False otherwise
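
        A check for the "schema-n" form mentioned above could look like the following sketch.
        It assumes that n is a non-negative integer written with plain decimal digits; the
        name looks_like_schema_uri is hypothetical, and the library's actual validation rules
        may be stricter or looser.

```python
import re

# ASSUMPTION: "n" consists only of one or more decimal digits
SCHEMA_URI_PATTERN = re.compile(r"schema-[0-9]+")


def looks_like_schema_uri(schema_uri) -> bool:
    """Return True if the argument is a string of the form "schema-n"
    (illustrative sketch, not the library function)."""
    return isinstance(schema_uri, str) and SCHEMA_URI_PATTERN.fullmatch(schema_uri) is not None


checks = {candidate: looks_like_schema_uri(candidate)
          for candidate in ["schema-12", "schema-", "schema-x", "r-57"]}
print(checks)
```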
        
name | arguments | returns
assign_uri | internal_id :int, namespace="data_node" | str
        Given an existing Data Node that lacks a URI value, assign one to it (and save it in the database.)
        If a URI value already exists on the node, an Exception is raised

        :param internal_id: Internal database ID to identify a Data Node that currently lacks a URI value
        :param namespace:   A string used to maintain completely separate groups of auto-increment values;
                                leading/trailing blanks are ignored
        :return:            A string with the newly-assigned URI value
        
name | arguments | returns
create_namespace | name :str, prefix="", suffix="" | None
        Set up a new namespace for URI's.

        :param name:    A string used to maintain completely separate groups of auto-increment values;
                            leading/trailing blanks are ignored
        :param prefix:  (OPTIONAL) String to prefix to the auto-increment number;
                            it will be stored in the database
        :param suffix:  (OPTIONAL) String to suffix to the auto-increment number;
                            it will be stored in the database
        :return:        None
        
name | arguments | returns
namespace_exists | name :str | bool
        Return True if the specified namespace already exists, or False otherwise

        :param name:
        :return:
        
name | arguments | returns
reserve_next_uri | namespace="data_node", prefix="", suffix="" | str
        Generate and reserve a URI (or fragment thereof, aka "token"),
        using the given namespace and, optionally, the given prefix and/or suffix.

        The middle part of the generated URI is a unique auto-increment value
        (separately maintained for various groups, or "namespaces").

        If the requested namespace is not the default one, make sure to first create it
        with create_namespace()

        If no prefix or suffix is specified, use the values provided when the namespace
        was first created.

        EXAMPLES:   reserve_next_uri("Document", "doc.", ".new") might produce "doc.3.new"
                    reserve_next_uri("Image", prefix="i-") might produce "i-123"

        IMPORTANT: Prefixes and suffixes only need to be passed when first creating a new namespace;
                   if they're passed in here, they over-ride their stored counterparts.

        Note that the returned uri is de-facto "permanently reserved" on behalf of the calling function,
        and can't be used by any other competing thread, thus avoiding concurrency problems (race conditions)

        :param namespace:   A string used to maintain completely separate groups of auto-increment values;
                                leading/trailing blanks are ignored.
                                It must exist, unless the default value is accepted (in which case,
                                it gets created as needed)
        :param prefix:      (OPTIONAL) String to prefix to the auto-increment number.
                                If it's the 1st call for the given namespace, store it in the database;
                                otherwise, if a value is passed, use it to over-ride the stored one
        :param suffix:      (OPTIONAL) String to suffix to the auto-increment number
                                If it's the 1st call for the given namespace, store it in the database;
                                otherwise, if a value is passed, use it to over-ride the stored one

        :return:            A string (with the prefix and suffix from above) that contains an integer
                                that is a unique auto-increment for the specified namespace
                                (starting with 1); it's ready-to-use and "reserved", i.e. could be used
                                at any future time
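
        The interplay of namespaces, stored prefixes/suffixes and per-call over-rides can be
        sketched with an in-memory dictionary standing in for the database.  The class name
        NamespaceRegistry is hypothetical.  The sketch assumes that an over-ride applies only
        to the call in which it is passed, and it does not model concurrency or persistence.

```python
class NamespaceRegistry:
    """In-memory stand-in for the namespace data kept in the database (illustration only)."""

    def __init__(self):
        self._namespaces = {}   # name -> {"prefix": ..., "suffix": ..., "next_count": ...}

    def create_namespace(self, name: str, prefix: str = "", suffix: str = "") -> None:
        # Leading/trailing blanks in the name are ignored
        self._namespaces[name.strip()] = {"prefix": prefix, "suffix": suffix, "next_count": 1}

    def reserve_next_uri(self, namespace: str = "data_node", prefix: str = "", suffix: str = "") -> str:
        namespace = namespace.strip()
        if namespace not in self._namespaces:
            if namespace != "data_node":
                raise KeyError(f"Namespace `{namespace}` must first be created")
            # The default namespace gets created as needed
            self.create_namespace(namespace, prefix, suffix)
        entry = self._namespaces[namespace]
        number = entry["next_count"]
        entry["next_count"] += 1        # The number is now reserved
        # ASSUMPTION: a passed prefix/suffix over-rides the stored one for this call only
        return f"{prefix or entry['prefix']}{number}{suffix or entry['suffix']}"


registry = NamespaceRegistry()
registry.create_namespace("Document", prefix="doc.", suffix=".new")

uri_1 = registry.reserve_next_uri("Document")                # Uses the stored prefix and suffix
uri_2 = registry.reserve_next_uri("Document", prefix="d-")   # Over-rides the stored prefix
uri_3 = registry.reserve_next_uri()                          # Default namespace, no prefix/suffix
print(uri_1, uri_2, uri_3)
```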
        
name | arguments | returns
advance_autoincrement | namespace :str, advance=1 | (int, str, str)
        Utilize an ATOMIC database operation to both read AND advance the autoincrement counter,
        based on a (single) node that:
            1) contains the label `Schema Autoincrement`
            2) and also contains, as an attribute, the desired namespace (group);
        if no such node exists (for example, after a new installation), an Exception is raised.

        Note that the returned number (or the last of an implied sequence of numbers, if advance > 1)
        is de-facto "permanently reserved" on behalf of the calling function,
        and can't be used by any other competing thread, thus avoiding concurrency problems (race conditions)

        :param namespace:   A string used to maintain completely separate groups of auto-increment values;
                                leading/trailing blanks are ignored
        :param advance:     Normally, auto-increment advances by 1 unit, but a different positive integer
                                may be used to "reserve" a group of numbers in the above namespace

        :return:            An integer that is a unique auto-increment for the specified namespace
                                (starting with 1); it's ready-to-use and "reserved", i.e. could be used
                                at any future time.
                                If advance > 1, the first of the reserved numbers is returned
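
        The idea of reading AND advancing a counter in a single atomic step can be illustrated
        in plain Python, with a threading.Lock playing the role of the atomic database
        operation.  The class name AutoincrementCounter is hypothetical; the sketch covers a
        single namespace and returns only the first reserved number, following the :return:
        description above.

```python
import threading


class AutoincrementCounter:
    """Single-namespace counter whose read-and-advance step is atomic (illustration only)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._next_value = 1            # Auto-increment values start with 1

    def advance(self, advance: int = 1) -> int:
        if advance < 1:
            raise ValueError("advance must be a positive integer")
        with self._lock:                # Read and advance happen as one indivisible step
            first_reserved = self._next_value
            self._next_value += advance
        return first_reserved


counter = AutoincrementCounter()
reserved = []


def worker():
    for _ in range(100):
        reserved.append(counter.advance())


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

block_start = counter.advance(5)    # Reserves a group of 5 consecutive numbers
after_block = counter.advance()     # The next caller starts past the reserved group
print(len(set(reserved)), block_start, after_block)
```

        Because no two callers can ever receive the same number, each returned value is
        effectively reserved for its caller.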
        
name | arguments | returns
_next_available_schema_uri | cls | str
        Return the next available uri for nodes managed by this class.
        For unique uri's to use on Data Nodes, use reserve_next_uri() instead

        :return:     A string based on unique auto-increment values, used for Schema nodes
        
name | arguments | returns
assign_namespace_to_class | class_name :str, namespace :str | None
        Link up a Class node to the node of a namespace to be used for data nodes of that Class

        :param class_name:
        :param namespace:
        :return:            None
        
name | arguments | returns
lookup_class_namespace | class_name :str | Union[str, None]
        Look up the namespace, if any, assigned to the given Class,
        by means of a standard "HAS_URI_GENERATOR" relationship.
        If not found, return None

        :param class_name:  Name of a Schema Class
        :return:
        
name | arguments | returns
generate_uri | class_name :str | str
        Use, as appropriate for the given Class,
        a specific namespace - or the general data node namespace - to generate a URI
        to use on a newly-created Data Node

        :param class_name:  Name of a Schema Class
        :return:
        

UTILITIES

name | arguments | returns
debug_print | info: str, trim=False | None
        If the class' property "debug" is set to True,
        print out the passed info string,
        optionally trimming it, if too long

        :param info:
        :param trim:    (OPTIONAL) Flag indicating whether to only print a shortened version
        :return:        None