AAAI.2016 - NLP and Knowledge Representation

| Total: 16

#1 Single or Multiple? Combining Word Representations Independently Learned from Text and WordNet [PDF] [Copy] [Kimi] [REL]

Authors: Josu Goikoetxea, Eneko Agirre, Aitor Soroa

Text and Knowledge Bases are complementary sources of information. Given the success of distributed word representations learned from text, several techniques to infuse additional information from sources like WordNet into word representations have been proposed. In this paper, we follow an alternative route. We learn word representations from text and WordNet independently, and then explore simple and sophisticated methods to combine them. The combined representations are applied to an extensive set of datasets on word similarity and relatedness. Simple combination methods turn out to perform better than more complex methods like CCA or retrofitting, showing that, in the case of WordNet, learning word representations separately is preferable to learning one single representation space or adding WordNet information directly. A key factor, which we illustrate with examples, is that the WordNet-based representations capture similarity relations encoded in WordNet better than retrofitting. In addition, we show that the average of the similarities from six word representations yields results beyond the state of the art on several datasets, reinforcing the opportunities to explore further combination techniques.
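
A minimal sketch of the simplest combination strategy discussed here, averaging similarity scores across independently learned spaces; the toy vectors and two-space setup are illustrative assumptions, not the paper's data:

```python
# Score word similarity by averaging cosine similarities computed
# independently in a text-based space and a WordNet-based space.
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def combined_similarity(w1, w2, spaces):
    """Average the similarity of (w1, w2) over several embedding spaces."""
    sims = [cosine(space[w1], space[w2]) for space in spaces]
    return sum(sims) / len(sims)

# Toy embeddings standing in for text-based and WordNet-based vectors.
text_space = {"cat": np.array([0.9, 0.1]), "dog": np.array([0.8, 0.3])}
wordnet_space = {"cat": np.array([0.2, 0.9]), "dog": np.array([0.3, 0.8])}
print(combined_similarity("cat", "dog", [text_space, wordnet_space]))
```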


#2 Dependency Tree Representations of Predicate-Argument Structures [PDF] [Copy] [Kimi] [REL]

Authors: Likun Qiu, Yue Zhang, Meishan Zhang

We present a novel annotation framework for representing predicate-argument structures, which uses dependency trees to encode the syntactic and semantic roles of a sentence simultaneously. The main contribution is a semantic role transmission model, which eliminates the structural gap between syntax and shallow semantics, making them compatible. A Chinese semantic treebank was built under the proposed framework, and the first release containing about 14K sentences is made freely available. The proposed framework enables semantic role labeling to be solved as a sequence labeling task, and experiments show that standard sequence labelers can give competitive performance on the new treebank compared with state-of-the-art graph structure models.


#3 Topic Concentration in Query Focused Summarization Datasets [PDF] [Copy] [Kimi] [REL]

Authors: Tal Baumel, Raphael Cohen, Michael Elhadad

Query-Focused Summarization (QFS) summarizes a document cluster in response to a specific input query. QFS algorithms must combine query relevance assessment, central content identification, and redundancy avoidance. Frustratingly, state-of-the-art algorithms designed for QFS do not significantly improve upon generic summarization methods, which ignore query relevance, when evaluated on traditional QFS datasets. We hypothesize this lack of success stems from the nature of these datasets. We define a task-based method to quantify topic concentration in datasets, i.e., the ratio of sentences within the dataset that are relevant to the query, and observe that the DUC 2005, 2006 and 2007 datasets suffer from very high topic concentration. We introduce TD-QFS, a new QFS dataset with controlled levels of topic concentration. We compare competitive baseline algorithms on TD-QFS and report strong improvement in ROUGE performance for algorithms that properly model query relevance as opposed to generic summarizers. We further present three new and simple QFS algorithms (RelSum, ThresholdSum, and TFIDF-KLSum) that outperform state-of-the-art QFS algorithms on the TD-QFS dataset by a large margin.
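
As a rough illustration of the topic-concentration measure (the ratio of dataset sentences relevant to the query), here is a sketch with a naive word-overlap relevance test standing in for the paper's task-based method:

```python
# Fraction of dataset sentences judged relevant to the query; the
# relevance test here is a crude word-overlap stand-in.
def topic_concentration(sentences, query, min_overlap=1):
    query_terms = set(query.lower().split())
    relevant = sum(
        1 for s in sentences
        if len(query_terms & set(s.lower().split())) >= min_overlap
    )
    return relevant / len(sentences)

docs = ["The drug lowers blood pressure.", "Stocks fell sharply today."]
print(topic_concentration(docs, "blood pressure drug"))  # 0.5
```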


#4 Representing Verbs as Argument Concepts [PDF] [Copy] [Kimi] [REL]

Authors: Yu Gong, Kaiqi Zhao, Kenny Zhu

Verbs play an important role in the understanding of natural language text. This paper studies the problem of abstracting the subject and object arguments of a verb into a set of noun concepts, known as the “argument concepts”. This set of concepts, whose size is parameterized, represents the fine-grained semantics of a verb. For example, the object of “enjoy” can be abstracted into concepts such as time, hobby and event. We present a novel framework to automatically infer human-readable and machine-computable action concepts with high accuracy.
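
A hypothetical sketch of the abstraction step: given a verb's observed object nouns and a toy noun-to-concept map (a stand-in for a real isA taxonomy), pick a small covering set of concepts greedily. The greedy set cover is a simplification of the paper's parameterized inference:

```python
# Abstract a verb's observed object nouns into at most k covering concepts.
from collections import defaultdict

noun_to_concepts = {                       # assumed, illustrative mapping
    "movie": {"event", "hobby"},
    "hiking": {"hobby"},
    "weekend": {"time"},
    "party": {"event"},
}

def argument_concepts(objects, k=2):
    concept_cover = defaultdict(set)
    for noun in objects:
        for c in noun_to_concepts.get(noun, ()):
            concept_cover[c].add(noun)
    chosen, covered = [], set()
    for _ in range(k):                     # greedily pick the best concept
        best = max(concept_cover,
                   key=lambda c: len(concept_cover[c] - covered),
                   default=None)
        if best is None:
            break
        chosen.append(best)
        covered |= concept_cover.pop(best)
    return chosen

print(argument_concepts(["movie", "hiking", "weekend", "party"], k=2))
```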


#5 Combining Retrieval, Statistics, and Inference to Answer Elementary Science Questions [PDF] [Copy] [Kimi] [REL]

Authors: Peter Clark, Oren Etzioni, Tushar Khot, Ashish Sabharwal, Oyvind Tafjord, Peter Turney, Daniel Khashabi

What capabilities are required for an AI system to pass standard 4th Grade Science Tests? Previous work has examined the use of Markov Logic Networks (MLNs) to represent the requisite background knowledge and interpret test questions, but did not improve upon an information retrieval (IR) baseline. In this paper, we describe an alternative approach that operates at three levels of representation and reasoning: information retrieval, corpus statistics, and simple inference over a semi-automatically constructed knowledge base, to achieve substantially improved results. We evaluate the methods on six years of unseen, unedited exam questions from the NY Regents Science Exam (using only non-diagram, multiple choice questions), and show that our overall system’s score is 71.3%, an improvement of 23.8% (absolute) over the MLN-based method described in previous work. We conclude with a detailed analysis, illustrating the complementary strengths of each method in the ensemble. Our datasets are being released to enable further research.
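
An illustrative sketch of combining several solvers on a multiple-choice question; the solver internals and weights below are assumptions, not the paper's trained components:

```python
# Each solver assigns a confidence to every answer option; the ensemble
# combines them with a weighted sum and returns the best option.
def ensemble_answer(question, options, solvers, weights):
    totals = {opt: 0.0 for opt in options}
    for solver, w in zip(solvers, weights):
        scores = solver(question, options)   # dict: option -> confidence
        for opt in options:
            totals[opt] += w * scores.get(opt, 0.0)
    return max(totals, key=totals.get)

# Toy solvers standing in for IR, corpus statistics, and rule inference.
ir = lambda q, opts: {"gravity": 0.8, "magnetism": 0.2}
stats = lambda q, opts: {"gravity": 0.6, "magnetism": 0.4}
print(ensemble_answer("What pulls objects toward Earth?",
                      ["gravity", "magnetism"], [ir, stats], [0.6, 0.4]))
```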


#6 Hashtag-Based Sub-Event Discovery Using Mutually Generative LDA in Twitter [PDF] [Copy] [Kimi] [REL]

Authors: Chen Xing, Yuan Wang, Jie Liu, Yalou Huang, Wei-Ying Ma

Sub-event discovery is an effective method for social event analysis in Twitter. It can discover sub-events from large amounts of noisy event-related information in Twitter and represent them semantically. The task is challenging because tweets are short, informal and noisy. To solve this problem, we leverage event-related hashtags, which contain many locations, dates and concise sub-event-related descriptions, to enhance sub-event discovery. To this end, we propose a hashtag-based mutually generative Latent Dirichlet Allocation model (MGe-LDA). In MGe-LDA, hashtags and topics of a tweet are mutually generated by each other. The mutually generative process models the relationship between hashtags and topics of tweets, and highlights the role of hashtags as a semantic representation of the corresponding tweets. Experimental results show that MGe-LDA can significantly outperform state-of-the-art methods for sub-event discovery.


#7 Agreement on Target-Bidirectional LSTMs for Sequence-to-Sequence Learning [PDF] [Copy] [Kimi] [REL]

Authors: Lemao Liu, Andrew Finch, Masao Utiyama, Eiichiro Sumita

Recurrent neural networks, particularly the long short-term memory networks, are extremely appealing for sequence-to-sequence learning tasks. Despite their great success, they typically suffer from a fundamental shortcoming: they are prone to generate unbalanced targets with good prefixes but bad suffixes, and thus performance suffers when dealing with long sequences. We propose a simple yet effective approach to overcome this shortcoming. Our approach relies on the agreement between a pair of target-directional LSTMs, which generates more balanced targets. In addition, we develop two efficient approximate search methods for agreement that are empirically shown to be almost optimal in terms of sequence-level losses. Extensive experiments were performed on two standard sequence-to-sequence transduction tasks: machine transliteration and grapheme-to-phoneme transformation. The results show that the proposed approach achieves consistent and substantial improvements, compared to six state-of-the-art systems. In particular, our approach outperforms the best reported error rates by a margin (up to 9% relative gains) on the grapheme-to-phoneme task.
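
A sketch of the agreement idea under simplifying assumptions: take n-best candidates from a left-to-right and a right-to-left model and rerank by the sum of both directional scores. The placeholder scorers stand in for the two LSTMs:

```python
# Rerank the union of both n-best lists by the sum of both directions'
# log-probabilities, so the winner needs a good prefix (L2R score) and a
# good suffix (R2L score).
def agreement_rerank(l2r_nbest, r2l_nbest, score_l2r, score_r2l):
    candidates = set(map(tuple, l2r_nbest)) | set(map(tuple, r2l_nbest))
    return max(candidates, key=lambda y: score_l2r(y) + score_r2l(y))

# Toy log-probability scorers standing in for the two directional LSTMs.
score_l2r = lambda y: -0.1 * len(y)
score_r2l = lambda y: -0.2 * len(y)
print(agreement_rerank([["f", "o", "o"]], [["f", "o"]],
                       score_l2r, score_r2l))
```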


#8 A Unified Bayesian Model of Scripts, Frames and Language [PDF] [Copy] [Kimi] [REL]

Authors: Francis Ferraro, Benjamin Van Durme

We present the first probabilistic model to capture all levels of the Minsky Frame structure, with the goal of corpus-based induction of scenario definitions. Our model unifies prior efforts in discourse-level modeling with that of Fillmore's related notion of frame, as captured in sentence-level, FrameNet semantic parses; as part of this, we resurrect the coupling among Minsky's frames, Schank's scripts and Fillmore's frames, as originally laid out by those authors. Empirically, our approach yields improved scenario representations, reflected quantitatively in lower surprisal and more coherent latent scenarios.


#9 Representation Learning of Knowledge Graphs with Entity Descriptions [PDF] [Copy] [Kimi] [REL]

Authors: Ruobing Xie, Zhiyuan Liu, Jia Jia, Huanbo Luan, Maosong Sun

Representation learning (RL) of knowledge graphs aims to project both entities and relations into a continuous low-dimensional space. Most methods concentrate on learning representations from knowledge triples indicating relations between entities. In fact, most knowledge graphs also contain concise descriptions of entities, which existing methods cannot exploit well. In this paper, we propose a novel RL method for knowledge graphs that takes advantage of entity descriptions. More specifically, we explore two encoders, a continuous bag-of-words model and a deep convolutional neural model, to encode the semantics of entity descriptions. We further learn knowledge representations from both triples and descriptions. We evaluate our method on two tasks: knowledge graph completion and entity classification. Experimental results on real-world datasets show that our method outperforms other baselines on both tasks, especially under the zero-shot setting, which indicates that our method is capable of building representations for novel entities from their descriptions alone. The source code of this paper can be obtained from https://github.com/xrb92/DKRL.
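
A minimal sketch of the description-based component, assuming a continuous bag-of-words encoder (mean of word vectors) and a TransE-style triple score; all vectors are toy values, not the paper's learned parameters:

```python
# Encode an entity description as the mean of its word vectors, then score
# a triple TransE-style as -||h + r - t|| (higher is more plausible).
import numpy as np

def cbow_encode(description, word_vecs):
    vecs = [word_vecs[w] for w in description.split() if w in word_vecs]
    return np.mean(vecs, axis=0)

def triple_score(h, r, t):
    return -np.linalg.norm(h + r - t)

word_vecs = {"city": np.array([0.5, 0.1]), "capital": np.array([0.6, 0.2])}
h = cbow_encode("capital city", word_vecs)  # zero-shot: built from text only
r = np.array([0.1, 0.1])
t = np.array([0.7, 0.3])
print(triple_score(h, r, t))
```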


#10 ExTaSem! Extending, Taxonomizing and Semantifying Domain Terminologies [PDF] [Copy] [Kimi] [REL]

Authors: Luis Espinosa-Anke, Horacio Saggion, Francesco Ronzano, Roberto Navigli

We introduce ExTaSem!, a novel approach for the automatic learning of lexical taxonomies from domain terminologies. First, we exploit a very large semantic network to collect thousands of in-domain textual definitions. Second, we extract (hyponym, hypernym) pairs from each definition with a CRF-based algorithm trained on manually-validated data. Finally, we introduce a graph induction procedure which constructs a full-fledged taxonomy where each edge is weighted according to its domain pertinence. ExTaSem! achieves state-of-the-art results in the following taxonomy evaluation experiments: (1) Hypernym discovery, (2) Reconstructing gold standard taxonomies, and (3) Taxonomy quality according to structural measures. We release weighted taxonomies for six domains for the use and scrutiny of the community.
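
A sketch of the final graph-induction step, assembling weighted (hyponym, hypernym) pairs into a directed taxonomy with networkx; the CRF extraction stage is abstracted away and the pairs and weights are illustrative:

```python
# Build a directed taxonomy graph from weighted (hyponym, hypernym) pairs,
# with each edge pointing from a term to its hypernym.
import networkx as nx

pairs = [("apple", "fruit", 0.9), ("fruit", "food", 0.8),
         ("pear", "fruit", 0.85)]

taxonomy = nx.DiGraph()
for hypo, hyper, w in pairs:
    taxonomy.add_edge(hypo, hyper, weight=w)

# Walk up from a leaf term to the taxonomy root.
node = "apple"
while taxonomy.out_degree(node):
    node = next(taxonomy.successors(node))
print(node)  # food
```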


#11 Complementing Semantic Roles with Temporally Anchored Spatial Knowledge: Crowdsourced Annotations and Experiments [PDF] [Copy] [Kimi] [REL]

Authors: Alakananda Vempala, Eduardo Blanco

This paper presents a framework to infer spatial knowledge from semantic role representations. We infer whether entities are or are not located somewhere, and temporally anchor this spatial information. A large crowdsourcing effort on top of OntoNotes shows that these temporally-anchored spatial inferences are ubiquitous and intuitive to humans. Experimental results show that inferences can be performed automatically and semantic features bring significant improvement.


#12 Fine-Grained Semantic Conceptualization of FrameNet [PDF] [Copy] [Kimi] [REL]

Authors: Jin-woo Park, Seung-won Hwang, Haixun Wang

Understanding verbs is essential for many natural language tasks. To this end, large-scale lexical resources such as FrameNet have been manually constructed to annotate the semantics of verbs (frames) and their arguments (frame elements or FEs) in example sentences. Our goal is to "semantically conceptualize" example sentences by connecting FEs to knowledge base (KB) concepts. For example, connecting the Employer FE to the company concept in the KB enables the understanding that any (unseen) company can also be an FE example. However, a naive adoption of existing KB conceptualization techniques, which focus on scenarios of conceptualizing a few terms, cannot 1) scale to many FE instances (an average of 29.7 instances per FE) or 2) leverage the interdependence between instances and concepts. We thus propose a scalable k-truss clustering and a Markov Random Field (MRF) model leveraging the interdependence between concept-instance, concept-concept, and instance-instance pairs. Our extensive analysis with real-life data validates that our approach improves not only the quality of the identified concepts for FrameNet, but also that of applications such as selectional preference.
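
As a sketch of the k-truss clustering step, the snippet below uses networkx's built-in k_truss on a toy instance graph; the paper's actual graph construction and MRF stage are not reproduced:

```python
# Keep the subgraph in which every edge participates in at least k-2
# triangles, isolating densely connected groups of instances.
import networkx as nx

G = nx.Graph()
G.add_edges_from([
    ("google", "microsoft"), ("google", "apple"), ("microsoft", "apple"),
    ("google", "ibm"), ("microsoft", "ibm"),       # dense company cluster
    ("apple", "pear"),                             # weak stray edge
])
dense = nx.k_truss(G, 3)       # edges in at least one triangle survive
print(sorted(dense.nodes()))   # the company cluster; 'pear' drops out
```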


#13 Short Text Representation for Detecting Churn in Microblogs [PDF] [Copy] [Kimi] [REL]

Authors: Hadi Amiri, Hal Daume III

Churn happens when a customer leaves a brand or stops using its services. Brands reduce their churn rates by identifying and retaining potential churners through customer retention campaigns. In this paper, we consider the problem of classifying micro-posts as churny or non-churny with respect to a given brand. Motivated by the recent success of recurrent neural networks (RNNs) in word representation, we propose to utilize RNNs to learn micro-post and churn indicator representations. We show that such representations improve the performance of churn detection in microblogs and lead to more accurate ranking of churny contents. Furthermore, in this research we show that state-of-the-art sentiment analysis approaches fail to identify churny contents. Experiments on Twitter data about three telco brands show the utility of our approach for this task.
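
A minimal sketch of the RNN-based setup: embed a micro-post, run it through a recurrent encoder, and classify churny vs. non-churny. The GRU, layer sizes, and two-way softmax are assumptions, not the paper's exact architecture:

```python
# Classify a micro-post from the last hidden state of a recurrent encoder.
import torch
import torch.nn as nn

class ChurnClassifier(nn.Module):
    def __init__(self, vocab_size, embed_dim=50, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.GRU(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, 2)   # churny vs. non-churny

    def forward(self, token_ids):
        _, h = self.rnn(self.embed(token_ids))
        return self.out(h[-1])                # logits from last hidden state

model = ChurnClassifier(vocab_size=1000)
post = torch.randint(0, 1000, (1, 12))        # one toy 12-token micro-post
print(model(post).softmax(dim=-1))
```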


#14 Verb Pattern: A Probabilistic Semantic Representation on Verbs [PDF] [Copy] [Kimi] [REL]

Authors: Wanyun Cui, Xiyou Zhou, Hangyu Lin, Yanghua Xiao, Haixun Wang, Seung-won Hwang, Wei Wang

Verbs are important in the semantic understanding of natural language. Traditional verb representations, such as FrameNet, PropBank, and VerbNet, focus on verbs' roles. These roles are too coarse to represent verbs' semantics. In this paper, we introduce verb patterns to represent verbs' semantics, such that each pattern corresponds to a single sense of the verb. First we analyze the principles for verb patterns: generality and specificity. Then we propose a nonparametric model based on description length. Experimental results demonstrate the high effectiveness of verb patterns. We further apply verb patterns to context-aware conceptualization, to show that verb patterns are helpful in semantic-related tasks.
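
An illustrative sketch of the description-length intuition: a candidate pattern set pays a model cost per pattern plus a data cost per instance, cheap when some pattern covers the instance. The costs and coverage table are toy assumptions, not the paper's formulation:

```python
# Two-part description length: model cost (bits to state the patterns)
# plus data cost (bits per instance, expensive if uncovered).
def description_length(patterns, instances, coverage):
    model_cost = 4.0 * len(patterns)
    data_cost = sum(
        1.0 if any(i in coverage[p] for p in patterns) else 8.0
        for i in instances
    )
    return model_cost + data_cost

coverage = {"watch+film": {"movie", "show"}, "watch+meal": {"lunch"}}
instances = ["movie", "show", "lunch"]
for cand in [["watch+film"], ["watch+film", "watch+meal"]]:
    # Adding the second, more specific pattern lowers the total length.
    print(cand, description_length(cand, instances, coverage))
```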


#15 A Generative Model of Words and Relationships from Multiple Sources [PDF] [Copy] [Kimi] [REL]

Authors: Stephanie Hyland, Theofanis Karaletsos, Gunnar Rätsch

Neural language models are a powerful tool to embed words into semantic vector spaces. However, learning such models generally relies on the availability of abundant and diverse training examples. In highly specialised domains this requirement may not be met due to difficulties in obtaining a large corpus, or the limited range of expression in average use. Such domains may encode prior knowledge about entities in a knowledge base or ontology. We propose a generative model which integrates evidence from diverse data sources, enabling the sharing of semantic information. We achieve this by generalising the concept of co-occurrence from distributional semantics to include other relationships between entities or words, which we model as affine transformations on the embedding space. We demonstrate the effectiveness of this approach by outperforming recent models on a link prediction task and demonstrating its ability to profit from partially or fully unobserved training labels. We further demonstrate the usefulness of learning from different data sources with overlapping vocabularies.
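
A minimal sketch of the core modeling idea, treating a relationship as an affine transformation on the embedding space and scoring a word pair by how well the transformed vector matches; the matrices and vectors are toy values:

```python
# Score a pair (w, v) under relation r: apply the affine map (A_r, b_r) to
# w's vector and measure compatibility with v's vector by dot product.
import numpy as np

def relation_score(v_w, v_v, A_r, b_r):
    transformed = A_r @ v_w + b_r
    return float(transformed @ v_v)

v_w = np.array([1.0, 0.0])
v_v = np.array([0.0, 1.0])
A_r = np.array([[0.0, -1.0], [1.0, 0.0]])     # a 90-degree rotation
b_r = np.zeros(2)
print(relation_score(v_w, v_v, A_r, b_r))     # high: r maps w onto v
```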


#16 PEAK: Pyramid Evaluation via Automated Knowledge Extraction [PDF] [Copy] [Kimi] [REL]

Authors: Qian Yang, Rebecca Passonneau, Gerard de Melo

Evaluating the selection of content in a summary is important both for human-written summaries, which can be a useful pedagogical tool for reading and writing skills, and for machine-generated summaries, which are increasingly being deployed in information management. The pyramid method assesses a summary by aggregating content units from the summaries of a wise crowd (a form of crowdsourcing). It has proven highly reliable but has largely depended on manual annotation. We propose PEAK, the first method to automatically assess summary content using the pyramid method that also generates the pyramid content models. PEAK relies on open information extraction and graph algorithms. The resulting scores correlate well with manually derived pyramid scores on both human and machine summaries, opening up the possibility of widespread use in numerous applications.
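
A sketch of the pyramid scoring idea under simplifying assumptions: content units (pre-extracted open-IE-style triples here) are weighted by how many reference summaries contain them, and a candidate summary is scored against an ideally weighted summary of the same size:

```python
# Weight each content unit by its frequency across reference summaries,
# then normalize a candidate's total weight by the best achievable total.
from collections import Counter

reference_triples = [
    {("quake", "hit", "city"), ("aid", "reached", "survivors")},
    {("quake", "hit", "city")},
    {("quake", "hit", "city"), ("aid", "reached", "survivors")},
]
weights = Counter(t for summary in reference_triples for t in summary)

def pyramid_score(candidate_triples):
    raw = sum(weights[t] for t in candidate_triples if t in weights)
    best = sum(w for _, w in weights.most_common(len(candidate_triples)))
    return raw / best if best else 0.0

print(pyramid_score({("quake", "hit", "city")}))   # 3/3 = 1.0
```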