kgdata.wikipedia.datasets.articles

Functions

articles()

Extract articles from XML dumps

deser_page_article(line)

iter_from_dump(infile)

articles() → Dataset[WikiPageArticle]

Extract articles from XML dumps

Return type:

Dataset[WikiPageArticle]

deser_page_article(line: Union[str, bytes]) → WikiPageArticle
Parameters:

line (Union[str, bytes])

Return type:

WikiPageArticle
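
The signature accepts either `str` or `bytes`, which is common for functions that deserialize one record per line. The sketch below illustrates that pattern only, under loudly stated assumptions: the record class `ArticleRecord`, its fields, the function name `deser_record`, and the JSON encoding are all hypothetical stand-ins, since the fields of `WikiPageArticle` and the actual serialization format are not described on this page.

```python
import json
from dataclasses import dataclass
from typing import Union


@dataclass
class ArticleRecord:
    """Hypothetical stand-in for illustration; not the real WikiPageArticle."""

    id: int
    title: str


def deser_record(line: Union[str, bytes]) -> ArticleRecord:
    """Parse one JSON-encoded line (an assumed format) into a record."""
    # json.loads accepts both str and UTF-8 encoded bytes,
    # so no explicit decoding step is needed here.
    obj = json.loads(line)
    return ArticleRecord(id=obj["id"], title=obj["title"])
```

Relying on `json.loads` to handle both input types keeps the function free of branching on `isinstance`; a different wire format would need its own handling of `bytes`.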

iter_from_dump(infile: Union[BZ2File, GzipFile, BinaryIO])
Parameters:

infile (Union[BZ2File, GzipFile, BinaryIO])
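
The accepted types for `infile` suggest that a dump can be supplied as a bz2-compressed, gzip-compressed, or plain binary stream. A minimal sketch of producing such an object with the standard library follows. The helper `open_dump` and the choice to dispatch on the file extension are assumptions made for illustration and are not part of kgdata.

```python
import bz2
import gzip
from pathlib import Path
from typing import BinaryIO, Union


def open_dump(path: Union[str, Path]) -> Union[bz2.BZ2File, gzip.GzipFile, BinaryIO]:
    """Hypothetical helper: open a dump file in binary mode based on its extension."""
    path = Path(path)
    if path.suffix == ".bz2":
        # bz2.open in binary mode returns a BZ2File
        return bz2.open(path, "rb")
    if path.suffix == ".gz":
        # gzip.open in binary mode returns a GzipFile
        return gzip.open(path, "rb")
    # Anything else is treated as an uncompressed binary file
    return open(path, "rb")
```

All three return values support the context-manager protocol, so the result can be used in a `with` block and passed on as `infile` while the file is open.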