Wikidata’s knowledge graph can be turned into a source of training data for mapping fine-grained entities to types. We apply its instance of relation recursively to determine the set of types for any given entity: for example, any descendant node of the human node has type human. Wikipedia can also provide entity-to-type mapping through its category links.
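The recursive type lookup can be sketched as a graph traversal. This is a minimal illustration, not the actual pipeline: the function name `types_of`, the `PARENTS` mapping, and the toy edges in it are assumptions made for the example and are not taken from real Wikidata dumps.

```python
def types_of(entity, parents):
    """Collect every ancestor of `entity` by following parent edges recursively.

    `parents` maps a node to the nodes it points at through an
    "instance of"-style relation (hypothetical toy format).
    """
    seen = set()
    stack = [entity]
    while stack:
        node = stack.pop()
        for parent in parents.get(node, ()):
            if parent not in seen:  # guards against cycles and repeated work
                seen.add(parent)
                stack.append(parent)
    return seen


# Toy edges for illustration only; real Wikidata relations differ.
PARENTS = {
    "Charles I of England": ["monarch"],
    "monarch": ["human"],
    "Queen (band)": ["rock band"],
    "rock band": ["musical group"],
}

charles_types = types_of("Charles I of England", PARENTS)
band_types = types_of("Queen (band)", PARENTS)
```

In this toy graph, `charles_types` contains `human` because the entity is a descendant of the human node, while `band_types` does not.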

Wikipedia-internal link statistics provide a good estimate of the chance a particular phrase refers to some entity. However, these statistics are noisy, since Wikipedia will often link to a specific instance of a type rather than the type itself (anaphora, e.g. king → Charles I of England) or link from a nickname (metonymy). This results in an explosion of associated entities (e.g. king has 974 associated entities) and distorted link frequencies (e.g. queen links to the band Queen 4920 times, Elizabeth II 1430 times, and monarch only 32 times).
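To make the distortion concrete, here is a small sketch that turns raw link counts into an estimate of how likely a phrase is to refer to each entity. The helper `link_probabilities` is hypothetical, and the calculation uses only the three counts quoted above for queen, so the resulting fractions are relative to those three targets rather than to every entity the phrase links to.

```python
def link_probabilities(counts):
    """Normalize link counts into relative frequencies per target entity."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {entity: n / total for entity, n in counts.items()}


# Counts quoted in the text for the phrase "queen" (three targets only).
QUEEN_LINKS = {
    "Queen (band)": 4920,
    "Elizabeth II": 1430,
    "monarch": 32,
}

probs = link_probabilities(QUEEN_LINKS)
most_likely = max(probs, key=probs.get)
```

Among these three targets, the band receives roughly 77% of the links and the generic meaning, monarch, about 0.5%, so a model that trusts raw link frequencies would almost never pick the generic sense.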

The easiest approach is to prune rare links, but this loses information. We instead use the Wikidata property graph to heuristically map each link to its “generic” meaning, as illustrated below.