jsonvectorizer.Schema
class jsonvectorizer.Schema
Class for learning a schema from JSON documents.
Parameters:
- schema : dict, optional (default={})
A valid JSON schema for initializing the object.
- path : tuple of str, optional (default=('root',))
Path from the top-most node (including the root) to this node.
- tuple_items : bool, optional (default=False)
If True, JSON arrays are regarded as tuples with a different schema for each index. Otherwise, all items are assumed to conform to the same schema.
Attributes:
- path : tuple of str
Path from the top-most node (including the root) to this node.
- tuple_items : bool
If True, JSON arrays are regarded as tuples with a different schema for each index. Otherwise, all items are assumed to conform to the same schema.
- type : set
Valid data types for documents conforming to the current schema.
- required : set of str
Set of required properties for JSON objects (Python dictionaries).
- properties : dict
Mapping from property names to the Schema instances corresponding to those properties in JSON objects (Python dictionaries).
- items : list
Schema instances corresponding to different items in JSON arrays.
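The effect of tuple_items on arrays can be illustrated with a small, self-contained sketch. The helper array_item_types is hypothetical and is not part of jsonvectorizer. It only mimics the distinction described above by collecting Python type names, either per index or shared across all items. Returning a one-element list in the shared case is a choice made for this sketch, not a statement about how the items attribute is laid out.

```python
def array_item_types(arrays, tuple_items=False):
    """Collect the Python type names seen in a collection of JSON arrays.

    Hypothetical helper for illustration only, not part of jsonvectorizer.
    """
    if tuple_items:
        # One type set per index: position 0, position 1, and so on
        per_index = []
        for array in arrays:
            for index, item in enumerate(array):
                if index == len(per_index):
                    per_index.append(set())
                per_index[index].add(type(item).__name__)
        return per_index
    # A single type set shared by every item in every array
    shared = set()
    for array in arrays:
        for item in array:
            shared.add(type(item).__name__)
    return [shared]


arrays = [["alice", 30], ["bob", 25]]
per_index = array_item_types(arrays, tuple_items=True)
shared = array_item_types(arrays, tuple_items=False)
```

With tuple_items=True, the sketch keeps the string position and the integer position apart. With tuple_items=False, both types end up in one shared set.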
Methods
extend() : Extend the schema to conform to the provided documents.
find_nodes() : Find nodes that match any of the provided regular expressions.
extend()
Extend the schema to conform to the provided documents.
Parameters:
- docs : iterable object
Iterable containing JSON documents.
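To make the idea of extending a schema concrete, here is a minimal sketch written in plain Python. It is not jsonvectorizer's implementation. The function extend_schema and its dict layout are hypothetical, and the rule that a property is required only if every object seen so far contains it is an assumption made for this example.

```python
def extend_schema(schema, docs):
    """Widen `schema` in place so that every document in `docs` conforms.

    Hypothetical sketch, not jsonvectorizer's code. `schema` is a plain
    dict that may hold the keys "type", "required" and "properties".
    """
    for doc in docs:
        # Record the data type of this document
        schema.setdefault("type", set()).add(type(doc).__name__)
        if isinstance(doc, dict):
            keys = set(doc)
            # Assumption: required = properties present in every object
            if "required" in schema:
                schema["required"] &= keys
            else:
                schema["required"] = keys
            properties = schema.setdefault("properties", {})
            for name, value in doc.items():
                # Recurse into the sub-schema for this property
                extend_schema(properties.setdefault(name, {}), [value])
    return schema


docs = [{"id": 1, "name": "a"}, {"id": 2}]
schema = extend_schema({}, docs)
```

After both documents are processed, "id" remains required because it appears in every object, while "name" is kept as a known property but is no longer required.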
find_nodes()
Find nodes that match any of the provided regular expressions.
Parameters:
- patterns : str or list of str
Regular expression, or list of regular expressions, to match nodes against.
Returns:
- paths : list of tuple
List of paths for matching nodes. Each item is a tuple containing the path from the top-most node (including the root) to a matching node.
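The sketch below shows one way that path tuples could be filtered by regular expressions. It is an illustration only. The function match_paths is hypothetical, and joining each path with "/" and matching with re.search are assumptions of this sketch. This page does not state how jsonvectorizer compares patterns with nodes.

```python
import re


def match_paths(paths, patterns):
    """Return the paths that match any of the given regular expressions.

    Hypothetical sketch: each path tuple is joined with "/" (an assumption)
    and tested with re.search against every pattern.
    """
    # Accept a single pattern as well as a list of patterns
    if isinstance(patterns, str):
        patterns = [patterns]
    compiled = [re.compile(pattern) for pattern in patterns]
    return [
        path
        for path in paths
        if any(regex.search("/".join(path)) for regex in compiled)
    ]


paths = [
    ("root",),
    ("root", "user"),
    ("root", "user", "name"),
    ("root", "tags"),
]
matches = match_paths(paths, ["name$", "^root/tags"])
single = match_paths(paths, "user")
```

As in the documented return value, each result is a tuple that starts at the root and ends at the matching node.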