Human-Annotated Sense-Disambiguated Word Contexts for Russian

  • Oct 30, 2018

We release a subset of the bts-rnc dataset which was annotated using crowdsourcing. The dataset is available on Zenodo via doi:10.5281/zenodo.1117228.

This dataset contains human-annotated sense identifiers for 2562 contexts of 20 words used in the RUSSE’2018 shared task on Word Sense Induction and Disambiguation for the Russian language. The sense identifiers follow the sense inventory of the Large Explanatory Dictionary of Russian.
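As a rough illustration of how such annotations might be inspected, the sketch below tallies contexts per word and sense. It assumes a tab-separated file with `word` and `gold_sense_id` columns, as in the RUSSE'2018 shared task files; the actual layout of the Zenodo release may differ, and the function name `sense_counts` and the sample rows are hypothetical placeholders, not data from the release.

```python
import csv
import io
from collections import Counter, defaultdict


def sense_counts(tsv_file, word_col="word", sense_col="gold_sense_id"):
    """Count contexts per (word, sense) in a tab-separated file object.

    The column names are assumptions based on the RUSSE'2018 file format.
    """
    counts = defaultdict(Counter)
    # QUOTE_NONE keeps quotation marks inside contexts from confusing the parser.
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        counts[row[word_col]][row[sense_col]] += 1
    return counts


# Placeholder rows for illustration only; they are NOT taken from the dataset.
sample = io.StringIO(
    "context_id\tword\tgold_sense_id\tcontext\n"
    "1\tword_a\t1\tplaceholder context\n"
    "2\tword_a\t2\tplaceholder context\n"
    "3\tword_a\t1\tplaceholder context\n"
    "4\tword_b\t1\tplaceholder context\n"
)

counts = sense_counts(sample)
for word, senses in counts.items():
    # Print the word, its per-sense tallies, and its total number of contexts.
    print(word, dict(senses), sum(senses.values()))
```

To run this on the real data, the `io.StringIO` sample could be replaced with `open(path, encoding="utf-8", newline="")`, where `path` points to the downloaded file.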

The organizers’ paper is published in the proceedings of Dialogue 2018. Please cite it as follows:

@inproceedings{Panchenko:18:dialogue,
  author    = {Panchenko, Alexander and Lopukhina, Anastasia and Ustalov, Dmitry and Lopukhin, Konstantin and Arefyev, Nikolay and Leontyev, Alexey and Loukachevitch, Natalia},
  title     = {{RUSSE'2018: A Shared Task on Word Sense Induction for the Russian Language}},
  booktitle = {Computational Linguistics and Intellectual Technologies: Papers from the Annual International Conference ``Dialogue''},
  year      = {2018},
  pages     = {547--564},
  url       = {http://www.dialog-21.ru/media/4539/panchenkoaplusetal.pdf},
  address   = {Moscow, Russia},
  publisher = {RSUH},
  issn      = {2221-7932},
  language  = {english},
}