Xueling LIN (Sherry, 林雪玲)

AI Framework and Data Technology Lab,
Hong Kong Research Center, Huawei

Email: xlinai [-at-] connect.ust.hk
  • Since September 2021, I have worked as a Researcher in the AI Framework and Data Technology Lab at the Hong Kong Research Center, Huawei.
  • In July 2021, I received my Ph.D. degree from the Department of Computer Science and Engineering (CSE) at the Hong Kong University of Science and Technology (HKUST), where I worked on knowledge base refinement, data fusion, and truth discovery in our Knowledge Base Group, advised by Prof. Lei Chen.
  • Before my Ph.D. journey, I obtained my M.Phil. degree in Computer Science and Engineering from HKUST, and received my Bachelor's degree in Software Engineering from Sun Yat-sen University.
  • Please refer to my GitHub repository for more details about my research topics.
  • Publications


    Teaching Assistant

    Datasets for Truth Discovery (VLDB 2018)

    I have collected two datasets for my research in truth discovery. You can download both datasets and the ground truths via this link. The details of both datasets are listed as follows:

    Datasets for Canonicalization of Open Knowledge Bases (ICDE 2019)

    I use two major datasets for my research in canonicalization of open knowledge bases. The details of both datasets and the side information are listed as follows:

    • ReVerb45K: This is a new Open KB canonicalization dataset proposed by CESI and published by its authors. ReVerb45K is constructed from the ReVerb Open KB, the ClueWeb09 corpus, and Freebase entity linking information.
    • NYTimes2018: We collected this dataset from nytimes.com in 2018. It contains news articles from 5 different domains: sports, arts, business, science and health. We collected 500 articles and applied the Stanford Open IE tool to these articles to produce Open IE triples.
    • Side Information: For both datasets, we obtain the side information for each source text as follows. First, we apply NLTK to recognize the named entity mentions in the source text (typed as PERSON, ORGANIZATION, LOCATION, etc.). We then use Wikidata Integrator to link each named entity mention to a list of candidate entities in Wikidata.
    • More details can be found here.
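
    The two side-information steps described above (named entity recognition, then candidate linking) can be sketched roughly as follows. This is only an illustration of the data flow, not the code used to build the datasets: the tagged tokens stand in for the output of an NER tool such as NLTK, and candidate_index is a hypothetical in-memory lookup standing in for a Wikidata query. The names Mention, group_mentions, link_candidates and the placeholder candidate ids are all assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Mention:
    """A named entity mention with its type and linked candidates."""
    text: str
    ner_type: str  # e.g. PERSON, ORGANIZATION, LOCATION
    candidates: list = field(default_factory=list)


def group_mentions(tagged_tokens):
    """Merge consecutive tokens that share a non-"O" NER type into mentions.

    tagged_tokens is a list of (word, ner_type) pairs; "O" marks tokens
    outside any named entity. This input format is an assumption.
    """
    mentions = []
    current_words, current_type = [], None
    for word, ner_type in tagged_tokens:
        if ner_type != "O" and ner_type == current_type:
            # Same entity type as the previous token: extend the mention.
            current_words.append(word)
            continue
        if current_words:
            # The running mention ended; store it.
            mentions.append(Mention(" ".join(current_words), current_type))
        if ner_type != "O":
            current_words, current_type = [word], ner_type
        else:
            current_words, current_type = [], None
    if current_words:
        mentions.append(Mention(" ".join(current_words), current_type))
    return mentions


def link_candidates(mentions, candidate_index):
    """Attach a list of candidate entity ids to each mention.

    candidate_index is a hypothetical dict from lower-cased surface form
    to candidate ids; a real pipeline would query Wikidata instead.
    """
    for mention in mentions:
        mention.candidates = list(candidate_index.get(mention.text.lower(), []))
    return mentions


# Toy input: tags as an NER tool might produce them (hand-written here).
tagged = [("Lei", "PERSON"), ("Chen", "PERSON"), ("works", "O"),
          ("at", "O"), ("HKUST", "ORGANIZATION")]
# Placeholder ids, not real Wikidata identifiers.
index = {"hkust": ["candidate-id-1", "candidate-id-2"]}

side_info = link_candidates(group_mentions(tagged), index)
```

    Each source text then carries a list of typed mentions, and each mention carries zero or more candidate entities; mentions without a match simply keep an empty candidate list.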

    My Personal Life