jsonvectorizer.Schema

class jsonvectorizer.Schema

Class for learning a schema from JSON documents

Parameters:
schema : dict, optional (default={})

A valid JSON schema for initializing the object.

path : tuple of str, optional (default=('root',))

Path from the top-most node (including the root) to this node.

tuple_items : bool, optional (default=False)

If True, JSON arrays are treated as tuples, with a separate schema for each index. Otherwise, all items are assumed to conform to the same schema.
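The effect of tuple_items can be illustrated with a small standalone sketch. It does not use jsonvectorizer and is not its implementation; the helper names json_type and item_types are hypothetical, and the type names are assumed to follow the usual JSON schema vocabulary. It only shows how array items could be grouped into one slot per index versus one shared slot.

```python
def json_type(value):
    """Return an assumed JSON schema type name for a Python value (hypothetical helper)."""
    if value is None:
        return "null"
    # bool must be tested before int, because bool is a subclass of int
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError("not a JSON value: %r" % (value,))


def item_types(arrays, tuple_items=False):
    """Collect the observed item types, per slot, over several arrays (hypothetical helper).

    With tuple_items=True there is one slot per array index.
    With tuple_items=False every item feeds the single shared slot 0.
    """
    slots = []
    for array in arrays:
        for index, item in enumerate(array):
            slot = index if tuple_items else 0
            # Grow the slot list on demand
            while len(slots) <= slot:
                slots.append(set())
            slots[slot].add(json_type(item))
    return slots


arrays = [[1, "a"], [2, "b", None]]
per_index = item_types(arrays, tuple_items=True)
shared = item_types(arrays, tuple_items=False)
```

In this sketch, per_index holds three separate type sets (one for each position), while shared holds a single set covering every item.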

Attributes:
path : tuple of str

Path from the top-most node (including the root) to this node.

tuple_items : bool

If True, JSON arrays are treated as tuples, with a separate schema for each index. Otherwise, all items are assumed to conform to the same schema.

type : set

Valid data types for documents conforming to the current schema.

required : set of str

Set of required properties for JSON objects (dictionaries).

properties : dict

Mapping from property names to the Schema instances for the corresponding properties of JSON objects (Python dictionaries).

items : list

Schema instances corresponding to different items in JSON arrays.

Methods

extend() Extend the schema to conform to the provided documents
find_nodes() Find nodes that match any of the provided regular expressions

extend()

Extend the schema to conform to the provided documents

Parameters:
docs : iterable object

Iterable containing JSON documents.
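As a rough illustration of what extending a schema could involve, the standalone sketch below learns the type, required, and properties information for a stream of documents. It is not jsonvectorizer's code: the function names type_name and extend_node are hypothetical, nodes are plain dictionaries rather than Schema instances, arrays are not descended into, and treating required as the intersection of keys seen so far is an assumption.

```python
def type_name(value):
    """Return an assumed JSON schema type name for a Python value (hypothetical helper)."""
    if value is None:
        return "null"
    if isinstance(value, bool):  # checked before int: bool subclasses int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError("not a JSON value: %r" % (value,))


def extend_node(node, docs):
    """Update a plain-dict schema node so that it covers every document in docs.

    Hypothetical sketch: 'type' accumulates observed types, 'required' keeps
    only the keys present in every object seen so far (an assumption), and
    'properties' maps each key to a child node that is extended recursively.
    """
    for doc in docs:
        node.setdefault("type", set()).add(type_name(doc))
        if isinstance(doc, dict):
            keys = set(doc)
            if "required" not in node:
                node["required"] = keys  # first object seen at this node
            else:
                node["required"] &= keys  # drop keys missing from this object
            properties = node.setdefault("properties", {})
            for key, value in doc.items():
                extend_node(properties.setdefault(key, {}), [value])
    return node


node = extend_node({}, [{"a": 1, "b": "x"}, {"a": 2.5}])
```

Because the node is updated in place, the function can be called repeatedly with new batches of documents, which mirrors the incremental style suggested by an extend() method that accepts any iterable.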

find_nodes()

Find nodes that match any of the provided regular expressions

Parameters:
patterns : str or list of str

One or more regular expressions used to select nodes.

Returns:
paths : list of tuple

List of paths for matching nodes. Each item is a tuple, containing the path from the top-most node (including the root) to a matching node.
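The shape of the return value can be shown with a standalone sketch that filters a list of path tuples by regular expression. It does not call jsonvectorizer, and two details are assumptions made purely for illustration: that a path is turned into a string by joining its parts with "_", and that a pattern matches if it is found anywhere in that string. The function name find_matching_paths is hypothetical.

```python
import re


def find_matching_paths(paths, patterns):
    """Return the paths whose joined name matches any pattern (hypothetical helper).

    Assumptions for this sketch: path parts are joined with "_" and
    patterns are applied with re.search.
    """
    if isinstance(patterns, str):
        patterns = [patterns]  # accept a single pattern as well as a list
    compiled = [re.compile(pattern) for pattern in patterns]
    return [
        path
        for path in paths
        if any(regex.search("_".join(path)) for regex in compiled)
    ]


paths = [
    ("root",),
    ("root", "user"),
    ("root", "user", "name"),
    ("root", "tags"),
]
by_suffix = find_matching_paths(paths, "name$")
by_any = find_matching_paths(paths, ["^root_user", "tags"])
```

Each result is a list of tuples, and every tuple starts at the root, which matches the description of the returned paths above.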