OKBQA-7 Tasks: Description

Schedule is here.

OKBQA-7 Hackathon

2018.08.07 - 2018.08.08

Task 1. OKBQA Platform


As artificial intelligence research has gained momentum, increasing attention has gone to structuring human knowledge into knowledge bases that machines can understand, and to putting such knowledge bases to use. These knowledge bases power improved information retrieval systems such as the Google Knowledge Graph and serve as a foundation for AI agent services such as Apple Siri, Amazon Echo, and IBM Watson.

OKBQA (Open Knowledge Base and Question Answering) is a community that aims to build such knowledge bases and question answering systems on top of them, with a particular focus on an architecture and platform that support open sharing of resources, integration, and collaboration. Since 2014, OKBQA has been holding hackathons and taking part in international exchanges at venues such as COLING and SIGIR, in order to share technology with domestic and international experts and to build systems together.

In the 2018 OKBQA Hackathon, Task 1, the Open KB-based QA task, aims to:

    • Practice and share how to publish and use OKBQA frameworks and modules

    • Provide opportunities and methods to contribute to the OKBQA system and evaluate it

Task 2. Dialog Corpus and Evaluation


Along with the growing interest in AI research, research on and development of AI conversation agents such as Amazon Echo, Apple Siri, and Google Assistant are also taking off. These agents focus on task-oriented conversations, that is, conversations that respond to user requests (e.g. "Play me a song.").

However, actual person-to-person conversations do not consist of one-sided requests and responses. Instead, the participants obtain information from each other and ask for clarification when the other's request is incomplete. For example, a student taking a class asks the teacher about what they do not understand. In addition, person-to-person conversations stay consistent on a single topic. When talking about a topic, the participants draw on the knowledge they have, and when one of them lacks some knowledge, they ask the other to supply it.

Task 2, the Information-seeking Dialog Agent task, aims to develop a conversation agent that can achieve the following goals:

    • Conduct consistent conversations based on given knowledge

    • Generate utterances that provide knowledge to the conversation partner based on given knowledge

    • Generate utterances that identify missing knowledge and acquire it from the conversation partner
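
The goals above can be illustrated with a toy agent. This is only a minimal sketch under stated assumptions: the class name `InfoSeekingAgent`, the `(subject, attribute)` knowledge schema, and the fixed response templates are all hypothetical and are not part of the task specification or any OKBQA module.

```python
class InfoSeekingAgent:
    """Toy agent: answers from its knowledge and asks when knowledge is missing."""

    def __init__(self, knowledge):
        # Assumed schema: (subject, attribute) -> value.
        self.knowledge = dict(knowledge)
        # The fact the agent most recently asked its partner about, if any.
        self.pending = None

    def respond(self, subject, attribute):
        key = (subject, attribute)
        if key in self.knowledge:
            # Provide knowledge to the partner.
            return f"The {attribute} of {subject} is {self.knowledge[key]}."
        # Knowledge is insufficient: remember the gap and ask the partner.
        self.pending = key
        return f"I do not know the {attribute} of {subject}. Can you tell me?"

    def learn(self, value):
        """Store the partner's answer to the pending question, if there is one."""
        if self.pending is None:
            return False
        self.knowledge[self.pending] = value
        self.pending = None
        return True


agent = InfoSeekingAgent({("Judy", "occupation"): "teacher"})
print(agent.respond("Judy", "occupation"))  # answers from given knowledge
print(agent.respond("Judy", "hometown"))    # asks the partner
agent.learn("Busan")                        # acquires the missing knowledge
print(agent.respond("Judy", "hometown"))    # can now answer
```

A real agent would generate natural language rather than fill templates, but the control flow (answer, detect a gap, ask, store the answer) mirrors the three goals.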

Task 3. Multimodal Character Identification on Videos

Task Definition

This task aims to link each mention in a dialogue to a character, given the dialogue text and the corresponding video. Here, a mention is a nominal referring to a person (e.g., she, mom, Judy), and an entity is a character in the dialogue.
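
To make the definition concrete, the expected output can be pictured as a mapping from mentions to entities. The following is a minimal sketch, assuming a mention is addressed by utterance and token index; the `Mention` fields, the `group_by_entity` helper, and the entity identifiers are hypothetical and do not reflect the official data format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    utterance: int  # index of the utterance in the dialogue
    token: int      # index of the token within the utterance
    text: str       # surface form, e.g. "she", "mom", "Judy"


def group_by_entity(links):
    """Turn mention -> entity links into entity -> list of mentions."""
    clusters = defaultdict(list)
    for mention, entity in links.items():
        clusters[entity].append(mention)
    return dict(clusters)


# Hypothetical system output for a two-utterance dialogue.
links = {
    Mention(0, 3, "mom"): "character_1",
    Mention(1, 0, "she"): "character_1",
    Mention(1, 4, "Judy"): "character_2",
}
for entity, mentions in group_by_entity(links).items():
    print(entity, [m.text for m in mentions])
```

Grouping the links by entity yields coreference-style clusters, so the same structure works whether or not the set of characters is known in advance.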


Character identification on text has been studied on the Friends dataset and has shown practical performance in identifying the main characters (Chen et al., 2017; Choi & Chen, 2018). However, these studies framed the problem as entity linking against predefined characters. As a result, the modules cannot be applied to scripts other than Friends unless they are retrained on newly constructed data. To be applicable to arbitrary dialogue or video scripts, the task should instead be approached as coreference resolution. One study introduces a coreference resolution based approach for this task (Chen et al., 2017), but coreference resolution is a difficult problem in NLP, so its performance is not yet practical (F1: 57.46% for 9 main characters).
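
The text does not state how the reported F1 is computed or averaged. As a rough sketch only, assuming each mention has one gold and one predicted character label and that scores are macro-averaged over characters, such a metric could be computed as follows. The function names are illustrative and this is not the official scorer.

```python
def f1_per_label(gold, predicted):
    """Return a dict mapping each character label to its F1 score."""
    pairs = list(zip(gold, predicted))
    scores = {}
    for label in set(gold) | set(predicted):
        tp = sum(1 for g, p in pairs if g == label and p == label)
        fp = sum(1 for g, p in pairs if g != label and p == label)
        fn = sum(1 for g, p in pairs if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores[label] = 2 * precision * recall / (precision + recall)
        else:
            scores[label] = 0.0
    return scores


def macro_f1(gold, predicted):
    """Unweighted mean of the per-label F1 scores (0.0 if there are no labels)."""
    scores = f1_per_label(gold, predicted)
    return sum(scores.values()) / len(scores) if scores else 0.0


# Hypothetical labels for four mentions.
gold = ["Ross", "Ross", "Monica", "Monica"]
predicted = ["Ross", "Monica", "Monica", "Monica"]
print(f1_per_label(gold, predicted))
print(macro_f1(gold, predicted))
```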

Therefore, if we extend the task to take not only dialogue text but also video as input, the richer features should help raise performance to a practical level. This task is an extension of SemEval 2018 Task 4. There are two main extensions. First, it adds multimodality by using video as an input. Second, the final module of this task should be applicable to arbitrary dialogue or video scripts.

Task 4. KB Population


A knowledge base is a database that stores the expertise accumulated through intellectual activity and experience in the field where an AI agent is used, together with the facts and rules needed for problem solving. It is an important element affecting the performance of many application systems based on natural language processing. Knowledge bases such as Wikidata, Freebase, WordNet, YAGO, Cyc, and BabelNet are widely used for English.

However, building and maintaining such a massive knowledge base manually is very difficult in practice. Therefore, technology that detects important entities (people, events, places) in everyday web text such as Wikipedia and news, together with relation extraction technology that extracts the relationships between those entities, is very important for automatically expanding a knowledge base (KB population).

You need to develop a model that learns from training data and extracts important entities and relations from a given sentence. In particular, this hackathon addresses the following problems.

    • [SubTask 4.1] Noise Reduction Methods for Distant Supervision

    • [SubTask 4.2] Full Graph Representation to Text

      • By integrating the RDF Graph, Surface Graph, and Frame Graph results, build a TextGraph Controller that generates a TextGraph.
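
For SubTask 4.1, it may help to see where the noise comes from. The following is a minimal sketch of distant supervision labeling, assuming plain substring matching of entity names and a KB given as (subject, relation, object) triples. The `distant_label` function and the example sentences are illustrative assumptions, not the hackathon's data or pipeline.

```python
def distant_label(sentences, triples):
    """Label a sentence with every KB relation whose subject and object both occur in it."""
    labeled = []
    for sentence in sentences:
        for subj, rel, obj in triples:
            if subj in sentence and obj in sentence:
                labeled.append((sentence, subj, rel, obj))
    return labeled


# One KB fact and two sentences that mention both entities.
triples = [("Barack Obama", "birthPlace", "Honolulu")]
sentences = [
    "Barack Obama was born in Honolulu.",        # actually expresses birthPlace
    "Barack Obama visited Honolulu last week.",  # does not express birthPlace
]
for sentence, subj, rel, obj in distant_label(sentences, triples):
    print(rel, "<-", sentence)
```

Both sentences receive the `birthPlace` label even though only the first one expresses that relation. Mislabeled training examples of this kind are what noise reduction methods try to filter out or down-weight.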