API Reference
Database Classes
A hierarchical soft-clustering database for thematic search.
A hierarchical soft-clustering structure that stores inclusion strengths as uint8 sparse matrices (values 0-255; divide by 255 to recover floats).
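The uint8 encoding described above can be illustrated with a small sketch. The source only states that dividing by 255 recovers the float; the rounding rule used for encoding here, and the helper names encode_strength and decode_strength, are assumptions made for illustration and are not part of thematic_search.

```python
def encode_strength(x: float) -> int:
    """Quantize a strength in [0, 1] to an integer in 0..255 (assumed rounding rule)."""
    if not 0.0 <= x <= 1.0:
        raise ValueError("strength must lie in [0, 1]")
    return round(x * 255)


def decode_strength(b: int) -> float:
    """Recover an approximate float strength, as the description states: divide by 255."""
    return b / 255


strengths = [0.0, 0.25, 0.5, 1.0]
encoded = [encode_strength(s) for s in strengths]   # [0, 64, 128, 255]
decoded = [decode_strength(b) for b in encoded]

# Under this rounding rule the round-trip error is at most half a step, 0.5 / 255.
max_error = max(abs(a - b) for a, b in zip(strengths, decoded))
print(encoded, max_error)
```

One byte per stored entry keeps the sparse matrices compact, at the cost of a resolution of roughly 0.004 per strength value.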
Query Classes
Entry point for all queries on a TopicDatabase.
A query object carrying a set of topic indices.
A query object carrying a set of document indices.
A query object operating on the entire soft cluster matrix at once.
Utility Functions
- thematic_search.utilities.cluster_layers_from_leaf_matrix(cluster_tree: dict[tuple, list[tuple]], matrix: ndarray) list[ndarray]
Given a cluster_tree and a matrix of inclusion strengths for the 0th layer, compute a set of cluster layers by summing over the children of each node.
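The summation over children can be sketched in plain Python as follows. This is not the library's implementation: it assumes the matrix is laid out as documents by leaf clusters, that node keys are (layer, cluster_number) tuples, and that every child sits in a lower layer than its parent. The name layers_from_leaves is hypothetical.

```python
def layers_from_leaves(cluster_tree, leaf_matrix):
    """Return one documents-by-clusters matrix per layer; layer 0 is the leaf matrix."""
    n_docs = len(leaf_matrix)
    n_leaves = len(leaf_matrix[0]) if n_docs else 0

    # columns[(layer, k)] holds the per-document strengths of that node.
    columns = {(0, k): [row[k] for row in leaf_matrix] for k in range(n_leaves)}

    # Sorting tuples orders nodes by layer first, so children are ready before parents.
    for node in sorted(cluster_tree):
        if node[0] == 0:
            continue  # leaves come straight from the input matrix
        children = cluster_tree[node]
        columns[node] = [sum(columns[c][d] for c in children) for d in range(n_docs)]

    n_layers = max((layer for layer, _ in columns), default=-1) + 1
    layers = []
    for layer in range(n_layers):
        nodes = sorted(n for n in columns if n[0] == layer)
        layers.append([[columns[n][d] for n in nodes] for d in range(n_docs)])
    return layers


tree = {(1, 0): [(0, 0), (0, 1)], (1, 1): [(0, 2)], (2, 0): [(1, 0), (1, 1)]}
leaf = [[0.5, 0.25, 0.0],
        [0.0, 0.25, 0.75]]
layers = layers_from_leaves(tree, leaf)
print(layers[1])  # [[0.75, 0.0], [0.25, 0.75]]
print(layers[2])  # [[0.75], [1.0]]
```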
- thematic_search.utilities.convert_tree(tree: dict, layers: dict[any, int] = {}) dict[tuple, list[tuple]]
Given a tree in the form of a dictionary of vertex: [list of children] pairs, convert it to a cluster_tree.
- Parameters:
tree (dict) – The tree to convert. Must consist of vertex: [list of children] pairs.
layers (dict) – Custom layer assignment dictionary of the form vertex: layer. If not specified, leaves are assigned layer=0 and every other node is assigned layer=max_layer_of_children+1.
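The default layer rule described above (leaves at layer 0, every other node one above its highest child) can be sketched as below. The helper assign_layers is hypothetical and only computes the vertex: layer mapping; how convert_tree goes on to number clusters within a layer is not shown in this reference and is not reproduced here.

```python
def assign_layers(tree):
    """Map each vertex to its layer: 0 for leaves, otherwise 1 + max layer of its children."""
    layers = {}

    def layer_of(vertex):
        if vertex not in layers:
            # A vertex missing from the dict, or with an empty child list, counts as a leaf.
            children = tree.get(vertex, [])
            if children:
                layers[vertex] = 1 + max(layer_of(c) for c in children)
            else:
                layers[vertex] = 0
        return layers[vertex]

    for vertex in tree:
        layer_of(vertex)
    return layers


tree = {"root": ["a", "b"], "a": ["x", "y"], "b": []}
layer_map = assign_layers(tree)
print(layer_map)
```

In this example "b" is a leaf directly under "root", so it lands in layer 0 while its sibling "a" lands in layer 1; the root is placed one above its highest child, at layer 2.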
- thematic_search.utilities.print_subtree(node: tuple, cluster_tree: dict[tuple, list[tuple]], cluster_labels: dict[tuple, str] = {}, depth: int = 0)
Print the subtree of a node in a cluster_tree.
- Parameters:
node (tuple) – A key in cluster_tree whose subtree is printed.
cluster_tree (dict[tuple, list[tuple]]) – The cluster tree to print.
cluster_labels (dict[tuple, str], optional, default={}) – A dictionary containing display names for the
- thematic_search.utilities.print_tree(cluster_tree: dict[tuple, list[tuple]], cluster_labels={})
Print the cluster tree to the console.
- Parameters:
cluster_tree (dict[tuple, list[tuple]]) – The cluster tree to print.
cluster_labels (dict[tuple, str], optional, default={}) – A dictionary containing display names for the
- thematic_search.utilities.topic_uid(tup: tuple) str
Given a tuple (layer, cluster_number), return its UID string.
- thematic_search.utilities.uid_to_ints(s: str) tuple
Given a UID s, return the tuple (layer, cluster_number).
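The two UID helpers are described as inverses of each other. The sketch below illustrates that round-trip property with stand-in functions; the actual UID string format is not given in this reference, so the "layer_cluster" format and the names example_topic_uid and example_uid_to_ints are purely assumptions for illustration.

```python
def example_topic_uid(tup):
    """Stand-in encoder: join layer and cluster number with an underscore (assumed format)."""
    layer, cluster_number = tup
    return f"{layer}_{cluster_number}"


def example_uid_to_ints(s):
    """Stand-in decoder: split the assumed format back into (layer, cluster_number)."""
    layer, cluster_number = s.split("_")
    return int(layer), int(cluster_number)


node = (2, 17)
uid = example_topic_uid(node)
round_trip = example_uid_to_ints(uid)
print(uid, round_trip)
```

Whatever the real format is, the property worth relying on is that uid_to_ints(topic_uid(t)) gives back t for any valid (layer, cluster_number) tuple.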