kgdata.wikidata.db

Wikidata embedded key-value databases.

Functions

deserialize(cls, data)

get_entity_db(dbfile[, create_if_missing, ...])

get_entity_label_db(dbfile[, ...])

get_entity_redirection_db(dbfile[, ...])

get_entity_wikilinks_db(dbfile[, ...])

get_wdclass_db(dbfile[, create_if_missing, ...])

get_wdprop_db(dbfile[, create_if_missing, ...])

get_wdprop_domain_db(dbfile[, ...])

get_wdprop_range_db(dbfile[, ...])

get_wp2wd_db(dbpath[, create_if_missing, ...])

Mapping from a Wikipedia article's title to its Wikidata ID.

query_wikidata_entities(qnode_ids[, endpoint])

serialize(ent)

Classes

WDProxyDB

WikidataDB(database_dir)

Helper class to make it easier to load all Wikidata databases stored in a directory.

class WDProxyDB

Bases: RocksDBDict, HugeMutableMapping[str, V]

set_extract_ent_from_entity(func: Callable[[WDEntity], V])
Parameters:

func (Callable[[WDEntity], V]) –

get(k[, d])

Return D[k] if k in D, else d. d defaults to None.

does_not_exist_locally(key: str)
Parameters:

key (str) –
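
The method names above suggest a mapping that serves values from a local store and falls back to a remote source for keys it does not have. The following is only a minimal in-memory sketch of that idea, not the library's implementation: `ProxyMapping`, `fetch_remote`, and `fake_remote` are hypothetical names, and a plain dict stands in for the on-disk database.

```python
from typing import Callable, Dict, Generic, List, Optional, Set, TypeVar

V = TypeVar("V")


class ProxyMapping(Generic[V]):
    """Hypothetical stand-in: look up locally first, then ask a remote source."""

    def __init__(self, fetch_remote: Callable[[str], Optional[V]]):
        self._local: Dict[str, V] = {}   # stands in for the on-disk store
        self._missing: Set[str] = set()  # keys the remote source did not have
        self._fetch_remote = fetch_remote

    def does_not_exist_locally(self, key: str) -> bool:
        return key not in self._local

    def get(self, key: str, default: Optional[V] = None) -> Optional[V]:
        if key in self._local:
            return self._local[key]
        if key in self._missing:
            # Already known to be absent remotely, so skip the remote call.
            return default
        value = self._fetch_remote(key)
        if value is None:
            self._missing.add(key)
            return default
        self._local[key] = value  # keep a local copy for next time
        return value


calls: List[str] = []


def fake_remote(key: str) -> Optional[str]:
    """Hypothetical remote lookup that records which keys were requested."""
    calls.append(key)
    return {"Q42": "Douglas Adams"}.get(key)


db = ProxyMapping(fake_remote)
first = db.get("Q42")
second = db.get("Q42")            # served from the local copy
absent = db.get("Q0", "n/a")
absent_again = db.get("Q0", "n/a")  # no second remote call
```

In this sketch each key triggers at most one remote call, whether or not the remote source has it.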

cache()

Return a new mapping that caches results to avoid repeated calls to an external mapping.
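
As an illustration of what such a caching wrapper might look like, here is a minimal sketch built on `collections.abc.Mapping`. `CachedMapping` and `CountingBackend` are hypothetical names used only for this example; the actual object returned by `cache()` may behave differently.

```python
from collections.abc import Mapping


class CachedMapping(Mapping):
    """Hypothetical read-through cache in front of a slower mapping."""

    def __init__(self, backend):
        self._backend = backend
        self._cache = {}

    def __getitem__(self, key):
        if key not in self._cache:
            # A KeyError from the backend propagates and nothing is cached.
            self._cache[key] = self._backend[key]
        return self._cache[key]

    def __iter__(self):
        return iter(self._backend)

    def __len__(self):
        return len(self._backend)


class CountingBackend(dict):
    """Hypothetical backend that counts how often it is read."""

    reads = 0

    def __getitem__(self, key):
        self.reads += 1
        return super().__getitem__(key)


backend = CountingBackend({"P31": "instance of"})
cached = CachedMapping(backend)
a = cached["P31"]
b = cached["P31"]  # second read is answered from the cache
```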

clear() -> None. Remove all items from D.
compact()
deser_value
get_int_property()

Retrieves a RocksDB property's value and casts it to an integer.

The full list of properties that return int values can be found [here](https://github.com/facebook/rocksdb/blob/08809f5e6cd9cc4bc3958dd4d59457ae78c76660/include/rocksdb/db.h#L428-L634).

items() -> a set-like object providing a view on D's items
keys() -> a set-like object providing a view on D's keys
options
pop(k[, d]) -> v, remove specified key and return the corresponding value.

If key is not found, d is returned if given, otherwise KeyError is raised.

popitem() -> (k, v), remove and return some (key, value) pair as a 2-tuple; but raise KeyError if D is empty.

seek_items()

Seek to the first key that matches the entire prefix. From there, the iterator will continue to read pairs as long as the prefix extracted from the key matches the prefix extracted from prefix.

Note: for this function to always iterate over keys that match the entire prefix, set options.prefix_extractor to the length of the prefix.

seek_keys()

Seek to the first key that matches the entire prefix. From there, the iterator will continue to read pairs as long as the prefix extracted from the key matches the prefix extracted from prefix.

Note: for this function to always iterate over keys that match the entire prefix, set options.prefix_extractor to the length of the prefix.
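
The seek-then-scan behaviour described for seek_items() and seek_keys() can be modelled in memory with a sorted list and `bisect`. This is a sketch of the full-prefix case only; it does not use RocksDB, and the function name and sample keys are chosen purely for illustration.

```python
from bisect import bisect_left
from typing import Iterator, List, Tuple


def seek_items(sorted_items: List[Tuple[str, str]], prefix: str) -> Iterator[Tuple[str, str]]:
    """Yield (key, value) pairs whose key starts with prefix, in key order."""
    keys = [key for key, _ in sorted_items]
    start = bisect_left(keys, prefix)  # first key that is >= prefix
    for key, value in sorted_items[start:]:
        if not key.startswith(prefix):
            break  # keys are sorted, so no later key can match
        yield key, value


# Keys must be sorted, as they would be in an ordered key-value store.
items = sorted([("Q1", "v1"), ("Q10", "v2"), ("Q100", "v3"), ("Q2", "v4"), ("P31", "v0")])
matches = list(seek_items(items, "Q1"))
no_matches = list(seek_items(items, "Q3"))
```

Because the keys are sorted, the scan can stop at the first key that no longer starts with the prefix.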

ser_value
setdefault(k[, d]) -> D.get(k,d), also set D[k]=d if k not in D
try_catch_up_with_primary()
update([E, ]**F) -> None. Update D from mapping/iterable E and F.

If E is present and has a .keys() method, does: for k in E: D[k] = E[k]
If E is present and lacks a .keys() method, does: for (k, v) in E: D[k] = v
In either case, this is followed by: for k, v in F.items(): D[k] = v

update_cache()
values() -> an object providing a view on D's values
query_wikidata_entities(qnode_ids: Union[Set[str], List[str]], endpoint: str = 'https://www.wikidata.org/w/api.php') -> Dict[str, WDEntity]
Parameters:

qnode_ids (Union[Set[str], List[str]]) –

endpoint (str) –

Return type:

Dict[str, WDEntity]
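
The default endpoint is the MediaWiki Action API, whose `wbgetentities` action accepts several entity IDs separated by `|`. The sketch below shows how a set of IDs could be turned into batched request URLs without touching the network. It is an assumption that the function works this way: `build_request_urls`, `chunked`, and the batch size of 50 are hypothetical choices made for this example.

```python
from typing import Iterable, Iterator, List
from urllib.parse import urlencode


def chunked(ids: List[str], size: int) -> Iterator[List[str]]:
    """Split ids into consecutive batches of at most size elements."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def build_request_urls(
    qnode_ids: Iterable[str],
    endpoint: str = "https://www.wikidata.org/w/api.php",
    batch_size: int = 50,  # assumed batch size, not taken from the library
) -> List[str]:
    """Build one wbgetentities URL per batch of de-duplicated, sorted IDs."""
    ordered = sorted(set(qnode_ids))
    urls = []
    for batch in chunked(ordered, batch_size):
        query = urlencode({"action": "wbgetentities", "ids": "|".join(batch), "format": "json"})
        urls.append(endpoint + "?" + query)
    return urls


urls = build_request_urls({"Q42", "Q1"})
many = build_request_urls(["Q%d" % i for i in range(1, 121)])
```

Note that `urlencode` percent-encodes the `|` separator as `%7C`.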

serialize(ent: V) -> bytes
Parameters:

ent (V) –

Return type:

bytes

deserialize(cls: Type[V], data: Union[str, bytes]) -> V
Parameters:

cls (Type[V]) –

data (Union[str, bytes]) –

Return type:

V
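
To make the shape of this serialize/deserialize pair concrete, here is a minimal round-trip sketch using the standard `json` module and a small dataclass. The encoding actually used by the library is not stated here, so JSON is an assumption, and `Label` with its `to_dict`/`from_dict` methods is a hypothetical value type.

```python
import json
from dataclasses import asdict, dataclass
from typing import Type, TypeVar, Union


@dataclass
class Label:
    """Hypothetical value type that can convert itself to and from a dict."""

    id: str
    label: str

    def to_dict(self) -> dict:
        return asdict(self)

    @classmethod
    def from_dict(cls, obj: dict) -> "Label":
        return cls(**obj)


V = TypeVar("V", bound=Label)


def serialize(ent: V) -> bytes:
    # Encode as UTF-8 JSON bytes (an assumed format for this sketch).
    return json.dumps(ent.to_dict()).encode("utf-8")


def deserialize(cls: Type[V], data: Union[str, bytes]) -> V:
    # json.loads accepts both str and bytes input.
    return cls.from_dict(json.loads(data))


original = Label(id="Q42", label="Douglas Adams")
payload = serialize(original)
restored = deserialize(Label, payload)
restored_from_str = deserialize(Label, payload.decode("utf-8"))
```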

get_wdclass_db(dbfile: Union[Path, str], create_if_missing=True, read_only=False, proxy: bool = False) -> HugeMutableMapping[str, WDClass]
Parameters:

dbfile (Union[Path, str]) –

proxy (bool) –

Return type:

HugeMutableMapping[str, WDClass]

get_wdprop_db(dbfile: Union[Path, str], create_if_missing=True, read_only=False, proxy: bool = False) -> HugeMutableMapping[str, WDProperty]
Parameters:

dbfile (Union[Path, str]) –

proxy (bool) –

Return type:

HugeMutableMapping[str, WDProperty]

get_entity_db(dbfile: Union[Path, str], create_if_missing=True, read_only=False, proxy: bool = False) -> HugeMutableMapping[str, WDEntity]
Parameters:

dbfile (Union[Path, str]) –

proxy (bool) –

Return type:

HugeMutableMapping[str, WDEntity]

get_entity_redirection_db(dbfile: Union[Path, str], create_if_missing=False, read_only=True) -> HugeMutableMapping[str, str]
Parameters:

dbfile (Union[Path, str]) –

Return type:

HugeMutableMapping[str, str]

get_entity_label_db(dbfile: Union[Path, str], create_if_missing=False, read_only=True) -> HugeMutableMapping[str, WDEntityLabel]
Parameters:

dbfile (Union[Path, str]) –

Return type:

HugeMutableMapping[str, WDEntityLabel]

get_entity_wikilinks_db(dbfile[, ...])
Parameters:

dbfile (Union[Path, str]) –

Return type:

HugeMutableMapping[str, WDEntityWikiLink]

get_wp2wd_db(dbpath: Union[Path, str], create_if_missing=False, read_only=True) -> HugeMutableMapping[str, str]

Mapping from a Wikipedia article's title to its Wikidata ID.

Parameters:

dbpath (Union[Path, str]) –

Return type:

HugeMutableMapping[str, str]

get_wdprop_range_db(dbfile: Union[Path, str], create_if_missing=False, read_only=True) -> HugeMutableMapping[str, Mapping[str, int]]
Parameters:

dbfile (Union[Path, str]) –

Return type:

HugeMutableMapping[str, Mapping[str, int]]

get_wdprop_domain_db(dbfile: Union[Path, str], create_if_missing=False, read_only=True) -> HugeMutableMapping[str, Mapping[str, int]]
Parameters:

dbfile (Union[Path, str]) –

Return type:

HugeMutableMapping[str, Mapping[str, int]]

class WikidataDB(database_dir: Union[str, Path])

Bases: object

Helper class that makes it easier to load all Wikidata databases stored in a directory. Each database is expected to be stored in that directory under a specific name.

It uses the functools.cached_property decorator to cache each database object as an attribute of the instance after the first access.
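
The lazy-loading pattern described here can be sketched with `functools.cached_property` as follows. `LazyDatabases`, the `wdprops.db` file name, and the dict returned in place of a real database handle are all hypothetical; only the decorator's behaviour is what the example demonstrates.

```python
from functools import cached_property
from pathlib import Path
from typing import List, Union

opened: List[str] = []  # records each simulated "open" for demonstration


class LazyDatabases:
    """Hypothetical helper that opens a database only on first access."""

    def __init__(self, database_dir: Union[str, Path]):
        self.database_dir = Path(database_dir)

    @cached_property
    def wdprops(self) -> dict:
        # The file name below is an assumption made for this sketch.
        path = self.database_dir / "wdprops.db"
        opened.append(path.name)
        return {"path": path}  # stands in for a real database handle


dbs = LazyDatabases("/data/wikidata")
first = dbs.wdprops
second = dbs.wdprops  # returns the cached object, no second "open"
```

After the first access the value is stored in the instance's `__dict__`, so later accesses bypass the property function entirely.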

Parameters:

database_dir (Union[str, Path]) –

instance = None
property wdentities
property wdclasses
property wdprops
property wdprop_domains
property wdprop_ranges
property wdredirections
property wp2wd
property wdpagerank
wdattr(attr: Literal['aliases', 'instanceof']) -> HugeMutableMapping[str, list[str]]
wdattr(attr: Literal['label', 'description']) -> HugeMutableMapping[str, str]
static init(database_dir: Union[str, Path]) -> WikidataDB
Parameters:

database_dir (Union[str, Path]) –

Return type:

WikidataDB

static get_instance() -> WikidataDB
Return type:

WikidataDB
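
The `instance` attribute together with the static `init` and `get_instance` methods suggests a process-wide singleton. A minimal sketch of that pattern is shown below. `Registry` is a hypothetical class, and the choices to reuse an existing instance in `init` and to raise `RuntimeError` in `get_instance` are assumptions, not documented behaviour of WikidataDB.

```python
from pathlib import Path
from typing import Optional, Union


class Registry:
    """Hypothetical class holding one shared instance per process."""

    instance: Optional["Registry"] = None

    def __init__(self, database_dir: Union[str, Path]):
        self.database_dir = Path(database_dir)

    @staticmethod
    def init(database_dir: Union[str, Path]) -> "Registry":
        # Assumed behaviour: create the shared instance once, then reuse it.
        if Registry.instance is None:
            Registry.instance = Registry(database_dir)
        return Registry.instance

    @staticmethod
    def get_instance() -> "Registry":
        if Registry.instance is None:
            raise RuntimeError("call Registry.init(database_dir) first")
        return Registry.instance


try:
    Registry.get_instance()
    raised = False
except RuntimeError:
    raised = True  # nothing has been initialised yet

created = Registry.init("/data/wikidata")
shared = Registry.get_instance()
```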