liu.se: Search publications in DiVA
1 - 50 of 57
  • 1.
    Aronsson, Fredrik
    et al.
    Karolinska Inst, Sweden; Karolinska Univ Hosp, Sweden.
    Kuhlmann, Marco
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten.
    Jelic, Vesna
    Karolinska Inst, Sweden; Karolinska Univ Hosp, Sweden.
    Ostberg, Per
    Karolinska Inst, Sweden; Karolinska Univ Hosp, Sweden.
    Is cognitive impairment associated with reduced syntactic complexity in writing? Evidence from automated text analysis (2021). In: Aphasiology, ISSN 0268-7038, E-ISSN 1464-5041, Vol. 35, no. 7, pp. 900-913. Journal article (Refereed)
    Abstract [en]

    Background: Written language impairments are common in Alzheimer's disease and reduced syntactic complexity in written discourse has been observed decades before the onset of dementia. The validity of average dependency distance (ADD), a measure of syntactic complexity, in cognitive decline needs to be studied further to evaluate its clinical relevance. Aims: The aim of the study was to determine whether ADD is associated with levels of cognitive impairment in memory clinic patients. Methods & procedures: We analyzed written texts collected in clinical practice from 114 participants with subjective cognitive impairment, mild cognitive impairment, and Alzheimer's disease during routine assessment at a memory clinic. ADD was measured using automated analysis methods consisting of a syntactic parser and a part-of-speech tagger. Outcomes & results: Our results show a significant association between ADD and levels of cognitive impairment, using ordinal logistic regression models. Conclusion: These results suggest that ADD is clinically relevant with regard to levels of cognitive impairment and indicate a diagnostic potential for ADD in cognitive assessment.

    Download full text (pdf)
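The average dependency distance (ADD) measure at the center of this study has a simple operational definition: the mean absolute distance, in token positions, between each word and its syntactic head. A minimal sketch (the toy sentence and head indices are illustrative, not from the paper's data):

```python
def average_dependency_distance(heads):
    """Mean absolute distance (in token positions) between each word
    and its syntactic head; the root (head index 0) is excluded."""
    distances = [abs(dep - head) for dep, head in heads.items() if head != 0]
    return sum(distances) / len(distances)

# Toy dependency parse of "the cat sat on the mat"
# (1-based token positions; head indices are illustrative):
heads = {1: 2, 2: 3, 3: 0, 4: 3, 5: 6, 6: 4}
add = average_dependency_distance(heads)  # (1 + 1 + 1 + 1 + 2) / 5 = 1.2
```

As the abstract notes, the head indices in practice come from automated analysis (a syntactic parser and a part-of-speech tagger) rather than gold-standard annotation.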
  • 2.
    Debusmann, Ralph
    et al.
    Saarland University, Saarbrücken, Germany.
    Kuhlmann, Marco
    Uppsala universitet, Institutionen för lingvistik och filologi.
    Dependency Grammar: Classification and Exploration (2010). In: Resource-Adaptive Cognitive Processes / [ed] Matthew W. Crocker, Jörg Siekmann, Springer Berlin/Heidelberg, 2010, pp. 365-388. Book chapter, part of anthology (Other academic)
    Abstract [en]

    Syntactic representations based on word-to-word dependencies have a long tradition in descriptive linguistics [29]. In recent years, they have also become increasingly used in computational tasks, such as information extraction [5], machine translation [43], and parsing [42]. Among the purported advantages of dependency over phrase structure representations are conciseness, intuitive appeal, and closeness to semantic representations such as predicate-argument structures. On the more practical side, dependency representations are attractive due to the increasing availability of large corpora of dependency analyses, such as the Prague Dependency Treebank [19].

  • 3.
    Dienes, Péter
    et al.
    Saarland University, Saarbrücken, Germany.
    Koller, Alexander
    Saarland University, Saarbrücken, Germany.
    Kuhlmann, Marco
    Saarland University, Saarbrücken, Germany.
    Statistical A-Star Dependency Parsing (2003). In: Proceedings of the Workshop on Prospects and Advances in the Syntax/Semantics Interface / [ed] Denys Duchier and Geert-Jan Kruijff, 2003, pp. 85-89. Conference paper (Refereed)
    Abstract [en]

    Extensible Dependency Grammar (XDG; Duchier and Debusmann (2001)) is a recently developed dependency grammar formalism that allows the characterization of linguistic structures along multiple dimensions of description. It can be implemented efficiently using constraint programming (CP; Koller and Niehren 2002). In the CP context, parsing is cast as a search problem: The states of the search are partial parse trees, successful end states are complete and valid parses. In this paper, we propose a probability model for XDG dependency trees and an A-Star search control regime for the XDG parsing algorithm that guarantees the best parse to be found first. Extending XDG with a statistical component has the benefit of bringing the formalism further into the grammatical mainstream; it also enables XDG to efficiently deal with large, corpus-induced grammars that come with a high degree of ambiguity.

  • 4.
    Doostmohammadi, Ehsan
    et al.
    Linköpings universitet, Institutionen för datavetenskap, Artificiell intelligens och integrerade datorsystem. Linköpings universitet, Tekniska fakulteten.
    Kuhlmann, Marco
    Linköpings universitet, Tekniska fakulteten. Linköpings universitet, Institutionen för datavetenskap, Artificiell intelligens och integrerade datorsystem.
    On the Effects of Video Grounding on Language Models (2022). In: Proceedings of the First Workshop on Performance and Interpretability Evaluations of Multimodal, Multipurpose, Massive-Scale Models, 2022. Conference paper (Other academic)
    Abstract [en]

    Transformer-based models trained on text and vision modalities try to improve the performance on multimodal downstream tasks or tackle the problem of lack of grounding, e.g., addressing issues like models’ insufficient commonsense knowledge. While it is more straightforward to evaluate the effects of such models on multimodal tasks, such as visual question answering or image captioning, it is not as well-understood how these tasks affect the model itself, and its internal linguistic representations. In this work, we experiment with language models grounded in videos and measure the models’ performance on predicting masked words chosen based on their imageability. The results show that the smaller model benefits from video grounding in predicting highly imageable words, while the results for the larger model seem harder to interpret.

  • 5.
    Doostmohammadi, Ehsan
    et al.
    Linköpings universitet, Institutionen för datavetenskap, Artificiell intelligens och integrerade datorsystem. Linköpings universitet, Tekniska fakulteten.
    Norlund, Tobias
    Chalmers Univ Technol, Sweden; Recorded Future, Sweden.
    Kuhlmann, Marco
    Linköpings universitet, Institutionen för datavetenskap, Artificiell intelligens och integrerade datorsystem. Linköpings universitet, Tekniska fakulteten.
    Johansson, Richard
    Chalmers Univ Technol, Sweden; Univ Gothenburg, Sweden.
    Surface-Based Retrieval Reduces Perplexity of Retrieval-Augmented Language Models (2023). In: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), Association for Computational Linguistics, 2023, pp. 521-529. Conference paper (Refereed)
    Abstract [en]

    Augmenting language models with a retrieval mechanism has been shown to significantly improve their performance while keeping the number of parameters low. Retrieval-augmented models commonly rely on a semantic retrieval mechanism based on the similarity between dense representations of the query chunk and potential neighbors. In this paper, we study the state-of-the-art Retro model and observe that its performance gain is better explained by surface-level similarities, such as token overlap. Inspired by this, we replace the semantic retrieval in Retro with a surface-level method based on BM25, obtaining a significant reduction in perplexity. As full BM25 retrieval can be computationally costly for large datasets, we also apply it in a re-ranking scenario, gaining part of the perplexity reduction with minimal computational overhead.
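The surface-level retrieval that replaces Retro's dense similarity search in this paper is standard Okapi BM25. A self-contained sketch of the scoring function (a generic implementation, not the one used in the paper):

```python
import math
from collections import Counter

def bm25_scores(query, corpus, k1=1.5, b=0.75):
    """Okapi BM25 scores for `query` (a list of tokens) against every
    document in `corpus` (each a list of tokens)."""
    n_docs = len(corpus)
    avg_len = sum(len(doc) for doc in corpus) / n_docs
    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for doc in corpus:
        df.update(set(doc))
    scores = []
    for doc in corpus:
        tf = Counter(doc)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

corpus = [["the", "cat", "sat"],
          ["dogs", "bark", "loudly"],
          ["the", "cat", "and", "the", "dog"]]
scores = bm25_scores(["cat"], corpus)  # the shorter matching document scores highest
```

In the re-ranking scenario the abstract mentions, scores like these reorder a small candidate set rather than being computed over the full datastore.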

  • 6.
    Drewes, Frank
    et al.
    Umeå University.
    Knight, Kevin
    University of Southern California, Information Sciences Institute.
    Kuhlmann, Marco
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten.
    Formal Models of Graph Transformation in Natural Language Processing (Dagstuhl Seminar 15122) (2015). In: Dagstuhl Reports, ISSN 2192-5283, Vol. 5, no. 3, pp. 143-161. Journal article (Other academic)
    Abstract [en]

    In natural language processing (NLP) there is an increasing interest in formal models for processing graphs rather than more restricted structures such as strings or trees. Such models of graph transformation have previously been studied and applied in various other areas of computer science, including formal language theory, term rewriting, theory and implementation of programming languages, concurrent processes, and software engineering. However, few researchers from NLP are familiar with this work, and at the same time, few researchers from the theory of graph transformation are aware of the specific desiderata, possibilities and challenges that one faces when applying the theory of graph transformation to NLP problems. The Dagstuhl Seminar 15122 “Formal Models of Graph Transformation in Natural Language Processing” brought researchers from the two areas together. It initiated an interdisciplinary exchange about existing work, open problems, and interesting applications.

  • 7.
    Drewes, Frank
    et al.
    Umeå University, Umeå, Sweden.
    Kuhlmann, Marco
    Uppsala universitet, Institutionen för lingvistik och filologi.
    ATANLP 2012 Workshop on Applications of Tree Automata Techniques in Natural Language Processing: Proceedings of the Workshop (2012). Proceedings (editorship) (Other academic)
  • 8.
    Drewes, Frank
    et al.
    Umeå University, Umeå, Sweden.
    Kuhlmann, Marco
    Uppsala universitet, Institutionen för lingvistik och filologi.
    Workshop on Applications of Tree Automata in Natural Language Processing 2010 (ATANLP 2010) (2010). Proceedings (editorship) (Other academic)
  • 9.
    Fallgren, Per
    et al.
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten.
    Segeblad, Jesper
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten.
    Kuhlmann, Marco
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten.
    Towards a Standard Dataset of Swedish Word Vectors (2016). In: Proceedings of the Sixth Swedish Language Technology Conference (SLTC), 2016. Conference paper (Refereed)
    Abstract [en]

    Word vectors, embeddings of words into a low-dimensional space, have been shown to be useful for a large number of natural language processing tasks. Our goal with this paper is to provide a useful dataset of such vectors for Swedish. To this end, we investigate three standard embedding methods: the continuous bag-of-words and the skip-gram model with negative sampling of Mikolov et al. (2013a), and the global vectors of Pennington et al. (2014). We compare these methods using QVEC-CCA (Tsvetkov et al., 2016), an intrinsic evaluation measure that quantifies the correlation of learned word vectors with external linguistic resources. For this purpose we use SALDO, the Swedish Association Lexicon (Borin et al., 2013). Our experiments show that the continuous bag-of-words model produces vectors that are most highly correlated to SALDO, with the skip-gram model very close behind. Our learned vectors will be provided for download at the paper’s website.

    Download full text (pdf)
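Word vectors like those compared in this paper are typically queried by cosine similarity. A minimal sketch with made-up 3-dimensional vectors (real CBOW, skip-gram, or GloVe embeddings are learned and have tens to hundreds of dimensions):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Made-up toy vectors for three Swedish words; real embeddings would
# come from training one of the models evaluated in the paper.
vectors = {
    "katt": [0.9, 0.1, 0.0],
    "hund": [0.8, 0.2, 0.1],
    "bil":  [0.0, 0.1, 0.9],
}
```

With vectors like these, semantically related words ("katt", "hund") score higher than unrelated ones ("katt", "bil").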
  • 10.
    Ferrara Boston, Marisa
    et al.
    Department of Linguistics, Cornell University, Ithaca, NY, USA.
    Hale, John
    Department of Linguistics, Cornell University, Ithaca, NY, USA.
    Kuhlmann, Marco
    Uppsala universitet, Institutionen för lingvistik och filologi.
    Dependency Structures Derived from Minimalist Grammars (2010). In: The Mathematics of Language: 10th and 11th Biennial Conference, MOL 10, Los Angeles, CA, USA, July 28–30, 2007, and MOL 11, Bielefeld, Germany, August 20–21, 2009, Revised Selected Papers, Springer Berlin/Heidelberg, 2010, pp. 1-12. Conference paper (Refereed)
    Abstract [en]

    This paper provides an interpretation of Minimalist Grammars (Stabler, 1997; Stabler & Keenan, 2003) in terms of dependency structures. Under this interpretation, merge operations derive projective dependency structures, and movement operations create both non-projective and illnested structures. This provides a new characterization of the generative capacity of Minimalist Grammar, and makes it possible to discuss the linguistic relevance of non-projectivity and illnestedness based on grammars that derive structures with these properties.

  • 11.
    Gómez-Rodriguez, Carlos
    et al.
    Departamento de Computación, Universidade da Coruña, A Coruña, Spain.
    Kuhlmann, Marco
    Uppsala universitet, Institutionen för lingvistik och filologi.
    Satta, Giorgio
    Department of Information Engineering, University of Padua, Padua, Italy.
    Weir, David
    Department of Informatics, University of Sussex, East Sussex, United Kingdom.
    Optimal Reduction of Rule Length in Linear Context-Free Rewriting Systems (2009). In: Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Stroudsburg, PA, USA: Association for Computational Linguistics, 2009, pp. 539-547. Conference paper (Other academic)
    Abstract [en]

    Linear Context-free Rewriting Systems (LCFRS) is an expressive grammar formalism with applications in syntax-based machine translation. The parsing complexity of an LCFRS is exponential in both the rank of a production, defined as the number of nonterminals on its right-hand side, and a measure for the discontinuity of a phrase, called fan-out. In this paper, we present an algorithm that transforms an LCFRS into a strongly equivalent form in which all productions have rank at most 2, and has minimal fan-out. Our results generalize previous work on Synchronous Context-Free Grammar, and are particularly relevant for machine translation from or to languages that require syntactic analyses with discontinuous constituents.

  • 12.
    Gómez-Rodríguez, Carlos
    et al.
    Departamento de Computación, Universidade da Coruña, A Coruña, Spain.
    Kuhlmann, Marco
    Uppsala universitet, Institutionen för lingvistik och filologi.
    Satta, Giorgio
    Department of Information Engineering, University of Padua, Padua, Italy.
    Efficient Parsing of Well-Nested Linear Context-Free Rewriting Systems (2010). In: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics: Proceedings of the Main Conference, Stroudsburg, PA, USA: Association for Computational Linguistics, 2010, pp. 276-284. Conference paper (Refereed)
    Abstract [en]

    The use of well-nested linear context-free rewriting systems has been empirically motivated for modeling of the syntax of languages with discontinuous constituents or relatively free word order. We present a chart-based parsing algorithm that asymptotically improves the known running time upper bound for this class of rewriting systems. Our result is obtained through a linear space construction of a binary normal form for the grammar at hand.

  • 13.
    Holmström, Oskar
    et al.
    Linköpings universitet, Institutionen för datavetenskap, Artificiell intelligens och integrerade datorsystem. Linköpings universitet, Tekniska fakulteten.
    Kunz, Jenny
    Linköpings universitet, Institutionen för datavetenskap, Artificiell intelligens och integrerade datorsystem. Linköpings universitet, Tekniska fakulteten.
    Kuhlmann, Marco
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten. Linköpings universitet, Institutionen för datavetenskap, Artificiell intelligens och integrerade datorsystem.
    Bridging the Resource Gap: Exploring the Efficacy of English and Multilingual LLMs for Swedish (2023). In: Proceedings of the Second Workshop on Resources and Representations for Under-Resourced Languages and Domains (RESOURCEFUL-2023), Tórshavn, the Faroe Islands, 2023, pp. 92-110. Conference paper (Refereed)
    Abstract [en]

    Large language models (LLMs) have substantially improved natural language processing (NLP) performance, but training these models from scratch is resource-intensive and challenging for smaller languages. With this paper, we want to initiate a discussion on the necessity of language-specific pre-training of LLMs. We propose how the “one model-many models” conceptual framework for task transfer can be applied to language transfer and explore this approach by evaluating the performance of non-Swedish monolingual and multilingual models on tasks in Swedish. Our findings demonstrate that LLMs exposed to limited Swedish during training can be highly capable and transfer competencies from English off-the-shelf, including emergent abilities such as mathematical reasoning, while at the same time showing distinct culturally adapted behaviour. Our results suggest that there are resourceful alternatives to language-specific pre-training when creating useful LLMs for small languages.

  • 14.
    Kallmeyer, Laura
    et al.
    Heinrich Heine University Düsseldorf, Düsseldorf, Germany.
    Kuhlmann, Marco
    Uppsala universitet, Institutionen för lingvistik och filologi.
    A Formal Model for Plausible Dependencies in Lexicalized Tree Adjoining Grammar (2012). In: Proceedings of the 11th International Workshop on Tree Adjoining Grammars and Related Formalisms, 2012, pp. 108-116. Conference paper (Other academic)
    Abstract [en]

    Several authors have pointed out that the correspondence between LTAG derivation trees and dependency structures is not as direct as it may seem at first glance, and various proposals have been made to overcome this divergence. In this paper we propose to view the correspondence between derivation trees and dependency structures as a tree transformation during which the direction of some of the original edges is reversed. We show that, under this transformation, LTAG is able to induce both ill-nested dependency trees and dependency trees with gap-degree greater than 1, which is not possible under the direct reading of derivation trees as dependency trees.

  • 15.
    Koller, Alexander
    et al.
    Dept. of Linguistics, University of Potsdam, Potsdam, Germany.
    Kuhlmann, Marco
    Uppsala universitet, Institutionen för lingvistik och filologi.
    A Generalized View on Parsing and Translation (2011). In: Proceedings of the Twelfth International Conference on Parsing Technologies (IWPT), Stroudsburg, PA, USA: Association for Computational Linguistics, 2011, pp. 2-13. Conference paper (Other academic)
    Abstract [en]

    We present a formal framework that generalizes a variety of monolingual and synchronous grammar formalisms for parsing and translation. Our framework is based on regular tree grammars that describe derivation trees, which are interpreted in arbitrary algebras. We obtain generic parsing algorithms by exploiting closure properties of regular tree languages.

  • 16.
    Koller, Alexander
    et al.
    University of Potsdam, Potsdam, Germany.
    Kuhlmann, Marco
    Uppsala universitet, Institutionen för lingvistik och filologi.
    Decomposing TAG Parsing Algorithms Using Simple Algebraizations (2012). In: Proceedings of the 11th International Workshop on Tree Adjoining Grammars and Related Formalisms, 2012, pp. 135-143. Conference paper (Other academic)
    Abstract [en]

    We review a number of different ‘algebraic’ perspectives on TAG and STAG in the framework of interpreted regular tree grammars (IRTGs). We then use this framework to derive a new parsing algorithm for TAGs, based on two algebras that describe strings and derived trees. Our algorithm is extremely modular, and can easily be adapted to the synchronous case.

  • 17.
    Koller, Alexander
    et al.
    Department of Computational Linguistics, Saarland University, Saarbrücken, Germany.
    Kuhlmann, Marco
    Uppsala universitet, Institutionen för lingvistik och filologi.
    Dependency Trees and the Strong Generative Capacity of CCG (2009). In: Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics, Stroudsburg, PA, USA: Association for Computational Linguistics, 2009, pp. 460-468. Conference paper (Refereed)
    Abstract [en]

    We propose a novel algorithm for extracting dependencies from CCG derivations. Unlike earlier proposals, our dependency structures are always tree-shaped. We then use these dependency trees to compare the strong generative capacities of CCG and TAG and obtain surprising results: Although both formalisms generate the same string languages, their strong generative capacities are equivalent if we ignore word order, and incomparable if we take it into account.

  • 18.
    Kornai, András
    et al.
    Hungarian Academy of Sciences, Hungary.
    Kuhlmann, Marco
    Uppsala University, Sweden.
    Proceedings of the 13th Meeting on the Mathematics of Language (MoL) (2013). Proceedings (editorship) (Other academic)
    Abstract [en]

    The Mathematics of Language (MoL) special interest group traces its origins to a meeting held in October 1984 at Ann Arbor, Michigan. While MoL is among the oldest SIGs of the ACL, it is the first time that the proceedings are produced by our parent organization. The first volume was published by Benjamins, later ones became special issues of the Annals of Mathematics and Artificial Intelligence and Linguistics and Philosophy, and for the last three occasions (really six years, since MoL only meets every second year) we relied on the Springer LNCS series. Perhaps the main reason for this aloofness was that the past three decades have brought the ascendancy of statistical methods in computational linguistics, with the formal, grammar-based methods that were the mainstay of mathematical linguistics viewed with increasing suspicion.

    To make matters worse, the harsh anti-formal rhetoric of leading linguists relegated important attempts at formalizing Government-Binding and later Minimalist theory to the fringes of syntax. Were it not for phonology and morphology, where the incredibly efficient finite state methods pioneered by Kimmo Koskenniemi managed to bridge the gap between computational practice and linguistic theory, and were it not for the realization that the mathematical approach has no alternative in machine learning, MoL could have easily disappeared from the frontier of research.

    The current volume marks a time when we can begin to see the computational and the theoretical linguistics camps together again. The selection of papers, while still strong on phonology (Heinz and Lai, Heinz and Rogers) and morphology (Kornai et al.), extends well to syntax (Hunter and Dyer, Fowlie) and semantics (Clark et al., Fernando). Direct computational concerns such as machine translation (Martzoukos et al.), decoding (Corlett and Penn), and complexity (Berglund et al.) are now clearly seen as belonging to the core focus of the field.

    The 10 papers presented in this volume were selected by the Program Committee from 16 submissions. We would like to thank the authors, the members of the Program Committee, and our invited speaker for their contributions to the planning and execution of the workshop, and the ACL conference organizers, especially Aoife Cahill and Qun Liu (workshops), and Roberto Navigli and Jing-Shin Chang (publications) for their significant contributions to the overall management of the workshop and their direction in preparing the publication of the proceedings.

    Download full text (pdf)
  • 19.
    Kuhlmann, Marco
    Uppsala universitet, Institutionen för lingvistik och filologi.
    Dependency Structures and Lexicalized Grammars: An Algebraic Approach (2010). Book (Other academic)
  • 20.
    Kuhlmann, Marco
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska högskolan.
    Linköping: Cubic-Time Graph Parsing with a Simple Scoring Scheme (2014). In: Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), Association for Computational Linguistics, 2014, pp. 395-399. Conference paper (Refereed)
    Abstract [en]

    We turn the Eisner algorithm for parsing to projective dependency trees into a cubic-time algorithm for parsing to a restricted class of directed graphs. To extend the algorithm into a data-driven parser, we combine it with an edge-factored feature model and online learning. We report and discuss results on the SemEval-2014 Task 8 data sets (Oepen et al., 2014).

  • 21.
    Kuhlmann, Marco
    Uppsala universitet, Institutionen för lingvistik och filologi.
    Mildly Non-Projective Dependency Grammar (2013). In: Computational Linguistics, ISSN 0891-2017, E-ISSN 1530-9312, Vol. 39, no. 2, pp. 355-387. Journal article (Refereed)
    Abstract [en]

    Syntactic representations based on word-to-word dependencies have a long-standing tradition in descriptive linguistics, and receive considerable interest in many applications. Nevertheless, dependency syntax has remained somewhat of an island from a formal point of view. Moreover, most formalisms available for dependency grammar are restricted to projective analyses, and thus not able to support natural accounts of phenomena such as wh-movement and cross-serial dependencies. In this article we present a formalism for non-projective dependency grammar in the framework of linear context-free rewriting systems. A characteristic property of our formalism is a close correspondence between the non-projectivity of the dependency trees admitted by a grammar on the one hand, and the parsing complexity of the grammar on the other. We show that parsing with unrestricted grammars is intractable. We therefore study two constraints on non-projectivity, block-degree and well-nestedness. Jointly, these two constraints define a class of “mildly” non-projective dependency grammars that can be parsed in polynomial time. An evaluation on five dependency treebanks shows that these grammars have a good coverage on empirical data.

    Download full text (pdf)
  • 22.
    Kuhlmann, Marco
    et al.
    Uppsala universitet, Institutionen för lingvistik och filologi.
    Gómez-Rodríguez, Carlos
    Universidade da Coruña, A Coruña, Spain.
    Satta, Giorgio
    University of Padua, Padua, Italy.
    Dynamic Programming Algorithms for Transition-Based Dependency Parsers (2011). In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics, Stroudsburg, PA, USA: Association for Computational Linguistics, 2011, pp. 673-682. Conference paper (Other academic)
    Abstract [en]

    We develop a general dynamic programming technique for the tabulation of transition-based dependency parsers, and apply it to obtain novel, polynomial-time algorithms for parsing with the arc-standard and arc-eager models. We also show how to reverse our technique to obtain new transition-based dependency parsers from existing tabular methods. Additionally, we provide a detailed discussion of the conditions under which the feature models commonly used in transition-based parsing can be integrated into our algorithms.

  • 23.
    Kuhlmann, Marco
    et al.
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten.
    Jonsson, Peter
    Linköpings universitet, Institutionen för datavetenskap, Programvara och system. Linköpings universitet, Tekniska fakulteten.
    Parsing to Noncrossing Dependency Graphs (2015). In: Transactions of the Association for Computational Linguistics, ISSN 2307-387X, Vol. 3, pp. 559-570. Journal article (Refereed)
    Abstract [en]

    We study the generalization of maximum spanning tree dependency parsing to maximum acyclic subgraphs. Because the underlying optimization problem is intractable even under an arc-factored model, we consider the restriction to noncrossing dependency graphs. Our main contribution is a cubic-time exact inference algorithm for this class. We extend this algorithm into a practical parser and evaluate its performance on four linguistic data sets used in semantic dependency parsing. We also explore a generalization of our parsing framework to dependency graphs with pagenumber at most k and show that the resulting optimization problem is NP-hard for k ≥ 2.
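The noncrossing restriction that makes exact inference tractable here has a simple combinatorial characterization: drawn above the sentence, no two arcs may interleave. A sketch of the check (arc endpoints are token positions; the examples are illustrative):

```python
def arcs_cross(a, b):
    """Two arcs over token positions cross iff their endpoints interleave."""
    i, j = sorted(a)
    k, l = sorted(b)
    return i < k < j < l or k < i < l < j

def is_noncrossing(arcs):
    """True iff no two arcs in the dependency graph cross each other."""
    return all(not arcs_cross(a, b)
               for idx, a in enumerate(arcs)
               for b in arcs[idx + 1:])
```

For example, the nested arcs (1, 4) and (2, 3) are allowed, while (1, 3) and (2, 4) cross.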

  • 24.
    Kuhlmann, Marco
    et al.
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten.
    Kanazawa, Makoto
    National Institute of Informatics, Japan.
    Kobele, Gregory
    University of Chicago, USA.
    MoL 2015: The 14th Meeting on the Mathematics of Language. Proceedings, July 25-26, Chicago, USA (2015). Proceedings (editorship) (Refereed)
    Abstract [en]

    This volume contains eleven regular papers and two invited papers. The regular papers, which were selected by the Program Committee from a total of twenty-two submissions, feature a broad variety of work on mathematics of language, including phonology, formal language theory, natural language semantics, and language learning. The invited papers are presented by two distinguished researchers in the field: David McAllester, Professor and Chief Academic Officer at the Toyota Technological Institute at Chicago, and Ryo Yoshinaka, Assistant Professor at Kyoto University.

    Download full text (pdf)
    fulltext
  • 25.
    Kuhlmann, Marco
    et al.
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten.
    Koller, Alexander
    University of Potsdam, Germany.
    Satta, Giorgio
    University of Padua, Italy.
    Lexicalization and Generative Power in CCG (2015). In: Computational Linguistics - Association for Computational Linguistics (Print), ISSN 0891-2017, E-ISSN 1530-9312, Vol. 41, no. 2, pp. 215-247. Journal article (peer-reviewed)
    Abstract [en]

    The weak equivalence of Combinatory Categorial Grammar (CCG) and Tree-Adjoining Grammar (TAG) is a central result of the literature on mildly context-sensitive grammar formalisms. However, the categorial formalism for which this equivalence has been established differs significantly from the versions of CCG that are in use today. In particular, it allows the restriction of combinatory rules on a per-grammar basis, whereas modern CCG assumes a universal set of rules, isolating all cross-linguistic variation in the lexicon. In this article we investigate the formal significance of this difference. Our main result is that lexicalized versions of the classical CCG formalism are strictly less powerful than TAG.

    Download full text (pdf)
    fulltext
  • 26.
    Kuhlmann, Marco
    et al.
    Uppsala universitet, Institutionen för lingvistik och filologi.
    Koller, Alexander
    Cluster of Excellence, Saarland University, Saarbrücken, Germany.
    Satta, Giorgio
    Dept. of Information Engineering, University of Padua, Padua, Italy.
    The Importance of Rule Restrictions in CCG (2010). In: Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, Stroudsburg, PA, USA: Association for Computational Linguistics, 2010, pp. 534-543. Conference paper (other academic)
    Abstract [en]

    Combinatory Categorial Grammar (CCG) is generally construed as a fully lexicalized formalism, where all grammars use one and the same universal set of rules, and cross-linguistic variation is isolated in the lexicon. In this paper, we show that the weak generative capacity of this ‘pure’ form of CCG is strictly smaller than that of CCG with grammar-specific rules, and of other mildly context-sensitive grammar formalisms, including Tree Adjoining Grammar (TAG). Our result also carries over to a multi-modal extension of CCG.

  • 27.
    Kuhlmann, Marco
    et al.
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten.
    Maletti, Andreas
    University of Leipzig.
    Schiffer, Lena Katharina
    University of Leipzig.
    The Tree-Generative Capacity of Combinatory Categorial Grammars (2019). In: 39th IARCS Annual Conference on Foundations of Software Technology and Theoretical Computer Science, FSTTCS 2019, December 11-13, 2019, Bombay, India, 2019, pp. 44:1-44:14. Conference paper (peer-reviewed)
    Abstract [en]

    The generative capacity of combinatory categorial grammars as acceptors of forests is investigated. It is demonstrated that the forests obtained in this way can also be generated by simple monadic context-free tree grammars. However, the subclass of pure combinatory categorial grammars cannot even accept all regular forests. Additionally, the forests accepted by combinatory categorial grammars with limited rule degrees are characterized: If only application rules are allowed, then they can accept only a proper subset of the regular forests, whereas they can accept exactly the regular forests once first-degree composition rules are permitted.

  • 28.
    Kuhlmann, Marco
    et al.
    Linköpings universitet, Institutionen för datavetenskap, Artificiell intelligens och integrerade datorsystem. Linköpings universitet, Tekniska fakulteten.
    Maletti, Andreas
    Univ Leipzig, Germany.
    Schiffer, Lena Katharina
    Univ Leipzig, Germany.
    The tree-generative capacity of combinatory categorial grammars (2022). In: Journal of Computer and System Sciences (Print), ISSN 0022-0000, E-ISSN 1090-2724, Vol. 124, pp. 214-233. Journal article (peer-reviewed)
    Abstract [en]

    The generative capacity of combinatory categorial grammars (CCGs) as generators of tree languages is investigated. It is demonstrated that the tree languages generated by CCGs can also be generated by simple monadic context-free tree grammars. However, the important subclass of pure combinatory categorial grammars cannot even generate all regular tree languages. Additionally, the tree languages generated by combinatory categorial grammars with limited rule degrees are characterized: If only application rules are allowed, then these grammars can generate only a proper subset of the regular tree languages, whereas they can generate exactly the regular tree languages once first-degree composition rules are permitted.

    Download full text (pdf)
    fulltext
  • 29.
    Kuhlmann, Marco
    et al.
    Uppsala universitet, Institutionen för lingvistik och filologi.
    Niehren, Joachim
    INRIA Lille, France.
    Logics and Automata for Totally Ordered Trees (2008). In: Rewriting Techniques and Applications, Proceedings / [ed] Voronkov, A., Springer Berlin/Heidelberg, 2008, pp. 217-231. Conference paper (peer-reviewed)
    Abstract [en]

    A totally ordered tree is a tree equipped with an additional total order on its nodes. It provides a formal model for data that comes with both a hierarchical and a sequential structure; one example of such data is natural language sentences, where the sequential structure is given by word order and the hierarchical structure by grammatical relations between words. In this paper, we study monadic second-order logic (MSO) for totally ordered terms. We show that the MSO satisfiability problem of unrestricted structures is undecidable, but give a decision procedure for practically relevant subclasses, based on tree automata.

  • 30.
    Kuhlmann, Marco
    et al.
    Uppsala universitet, Institutionen för lingvistik och filologi.
    Nivre, Joakim
    Uppsala universitet, Institutionen för lingvistik och filologi.
    Transition-Based Techniques for Non-Projective Dependency Parsing (2010). In: Northern European Journal of Language Technology (NEJLT), ISSN 2000-1533, Vol. 2, no. 1, pp. 1-19. Journal article (peer-reviewed)
    Abstract [en]

    We present an empirical evaluation of three methods for the treatment of non-projective structures in transition-based dependency parsing: pseudo-projective parsing, non-adjacent arc transitions, and online reordering. We compare both the theoretical coverage and the empirical performance of these methods using data from Czech, English and German. The results show that although online reordering is the only method with complete theoretical coverage, all three techniques exhibit high precision but somewhat lower recall on non-projective dependencies and can all improve overall parsing accuracy provided that non-projective dependencies are frequent enough. We also find that the use of non-adjacent arc transitions may lead to a drop in accuracy on projective dependencies in the presence of long-distance non-projective dependencies, an effect that is not found for the two other techniques.
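    As a hedged illustration (not code from the paper): projectivity, the property that the three techniques above are designed to relax, requires every token between a head and its dependent to be a descendant of that head. A small sketch, where `heads` is a hypothetical 1-based head array with 0 marking the root:

```python
def is_projective(heads):
    # heads[i-1] is the head of token i (1-based); 0 marks the root.
    # Projective iff for every arc (h, d), all tokens strictly between
    # h and d are descendants of h.
    def ancestors(t):
        seen = set()
        while t != 0 and t not in seen:
            seen.add(t)
            t = heads[t - 1]
        return seen
    for d in range(1, len(heads) + 1):
        h = heads[d - 1]
        if h == 0:
            continue  # with a single root, a root arc cannot violate projectivity
        lo, hi = sorted((h, d))
        if any(h not in ancestors(t) for t in range(lo + 1, hi)):
            return False
    return True

print(is_projective([2, 0, 2]))     # True
print(is_projective([3, 0, 2, 2]))  # False: arcs (3 -> 1) and (2 -> 4) cross
```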

    Download full text (pdf)
    fulltext
  • 31.
    Kuhlmann, Marco
    et al.
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten.
    Oepen, Stephan
    Department of Informatics, University of Oslo.
    Towards a Catalogue of Linguistic Graph Banks (2016). In: Computational Linguistics - Association for Computational Linguistics (Print), ISSN 0891-2017, E-ISSN 1530-9312, Vol. 42, no. 4, pp. 819-827. Journal article (peer-reviewed)
    Abstract [en]

    Graphs exceeding the formal complexity of rooted trees are of growing relevance to much NLP research. Although formally well understood in graph theory, there is substantial variation in the types of linguistic graphs, as well as in the interpretation of various structural properties. To provide a common terminology and transparent statistics across different collections of graphs in NLP, we propose to establish a shared community resource with an open-source reference implementation for common statistics.
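    A hedged sketch of the kind of shared, transparent statistics the abstract argues for; the property names here are invented for illustration and are not taken from the actual catalogue or its reference implementation:

```python
def graph_stats(num_nodes, edges):
    # edges: (head, dependent) pairs over nodes 1..num_nodes.
    indegree = {}
    for _, d in edges:
        indegree[d] = indegree.get(d, 0) + 1
    return {
        "density": len(edges) / num_nodes,                        # edges per node
        "reentrant": sum(1 for v in indegree.values() if v > 1),  # nodes with >1 head
        "roots": num_nodes - len(indegree),                       # nodes with no head
    }

print(graph_stats(3, [(1, 2), (1, 3), (2, 3)]))
# {'density': 1.0, 'reentrant': 1, 'roots': 1}
```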

    Download full text (pdf)
    fulltext
  • 32.
    Kuhlmann, Marco
    et al.
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska högskolan.
    Satta, Giorgio
    University of Padua.
    A New Parsing Algorithm for Combinatory Categorial Grammar (2014). In: Transactions of the Association for Computational Linguistics, ISSN 2307-387X, Vol. 2 (2014), pp. 405-418. Journal article (peer-reviewed)
    Abstract [en]

    We present a polynomial-time parsing algorithm for CCG, based on a new decomposition of derivations into small, shareable parts. Our algorithm has the same asymptotic complexity, O(n⁶), as a previous algorithm by Vijay-Shanker and Weir (1993), but is easier to understand, implement, and prove correct.

  • 33.
    Kuhlmann, Marco
    et al.
    Uppsala universitet, Institutionen för lingvistik och filologi.
    Satta, Giorgio
    University of Padua, Padua, Italy.
    Tree-Adjoining Grammars are not Closed Under Strong Lexicalization (2012). In: Computational Linguistics - Association for Computational Linguistics (Print), ISSN 0891-2017, E-ISSN 1530-9312, Vol. 38, no. 3, pp. 617-629. Journal article (peer-reviewed)
    Abstract [en]

    A lexicalized tree-adjoining grammar is a tree-adjoining grammar where each elementary tree contains some overt lexical item. Such grammars are being used to give lexical accounts of syntactic phenomena, where an elementary tree defines the domain of locality of the syntactic and semantic dependencies of its lexical items. It has been claimed in the literature that for every tree-adjoining grammar, one can construct a strongly equivalent lexicalized version. We show that such a procedure does not exist: Tree-adjoining grammars are not closed under strong lexicalization.

  • 34.
    Kuhlmann, Marco
    et al.
    Uppsala universitet, Institutionen för lingvistik och filologi.
    Satta, Giorgio
    Department of Information Engineering, University of Padua, Padua, Italy.
    Treebank Grammar Techniques for Non-Projective Dependency Parsing (2009). In: Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics, Stroudsburg, PA, USA: Association for Computational Linguistics, 2009, pp. 478-486. Conference paper (peer-reviewed)
    Abstract [en]

    An open problem in dependency parsing is the accurate and efficient treatment of non-projective structures. We propose to attack this problem using chart-parsing algorithms developed for mildly context-sensitive grammar formalisms. In this paper, we provide two key tools for this approach. First, we show how to reduce non-projective dependency parsing to parsing with Linear Context-Free Rewriting Systems (LCFRS), by presenting a technique for extracting LCFRS from dependency treebanks. For efficient parsing, the extracted grammars need to be transformed in order to minimize the number of nonterminal symbols per production. Our second contribution is an algorithm that computes this transformation for a large, empirically relevant class of grammars.

  • 35.
    Kuhlmann, Marco
    et al.
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten.
    Satta, Giorgio
    Univ Padua, Italy.
    Jonsson, Peter
    Linköpings universitet, Institutionen för datavetenskap, Programvara och system. Linköpings universitet, Tekniska fakulteten.
    On the Complexity of CCG Parsing (2018). In: Computational Linguistics - Association for Computational Linguistics (Print), ISSN 0891-2017, E-ISSN 1530-9312, Vol. 44, no. 3, pp. 447-482. Journal article (peer-reviewed)
    Abstract [en]

    We study the parsing complexity of Combinatory Categorial Grammar (CCG) in the formalism of Vijay-Shanker and Weir (1994). As our main result, we prove that any parsing algorithm for this formalism will take in the worst case exponential time when the size of the grammar, and not only the length of the input sentence, is included in the analysis. This sets the formalism of Vijay-Shanker and Weir (1994) apart from weakly equivalent formalisms such as Tree Adjoining Grammar, for which parsing can be performed in time polynomial in the combined size of grammar and input sentence. Our results contribute to a refined understanding of the class of mildly context-sensitive grammars, and inform the search for new, mildly context-sensitive versions of CCG.

    Download full text (pdf)
    fulltext
  • 36.
    Kuhlmann, Marco
    et al.
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten.
    Scheffler, Tatjana
    University of Potsdam.
    Proceedings of the 13th International Workshop on Tree Adjoining Grammars and Related Formalisms (2017). Proceedings (editorship) (other academic)
  • 37.
    Kuhlmann, Marco
    et al.
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten.
    Wurm, Christian
    University of Düsseldorf, Germany.
    Finite-State Methods and Mathematics of Language: Introduction to the Special Issue (2017). In: Journal of Language Modelling, ISSN 2299-856X, E-ISSN 2299-8470, Vol. 5, no. 1, pp. 1-2. Journal article (other academic)
    Download full text (pdf)
    fulltext
  • 38.
    Kunz, Jenny
    et al.
    Linköpings universitet, Institutionen för datavetenskap, Artificiell intelligens och integrerade datorsystem. Linköpings universitet, Tekniska fakulteten.
    Jirénius, Martin
    Linköpings universitet, Institutionen för datavetenskap. Linköpings universitet, Tekniska fakulteten.
    Holmström, Oskar
    Linköpings universitet, Institutionen för datavetenskap, Artificiell intelligens och integrerade datorsystem. Linköpings universitet, Tekniska fakulteten.
    Kuhlmann, Marco
    Linköpings universitet, Institutionen för datavetenskap, Artificiell intelligens och integrerade datorsystem. Linköpings universitet, Tekniska fakulteten.
    Human Ratings Do Not Reflect Downstream Utility: A Study of Free-Text Explanations for Model Predictions (2022). In: Proceedings of the Fifth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP, 2022, Vol. 5, pp. 164-177, article id 2022.blackboxnlp-1.14. Conference paper (peer-reviewed)
    Abstract [en]

    Models able to generate free-text rationales that explain their output have been proposed as an important step towards interpretable NLP for “reasoning” tasks such as natural language inference and commonsense question answering. However, the relative merits of different architectures and types of rationales are not well understood and hard to measure. In this paper, we contribute two insights to this line of research: First, we find that models trained on gold explanations learn to rely on these but, in the case of the more challenging question answering data set we use, fail when given generated explanations at test time. However, additional fine-tuning on generated explanations teaches the model to distinguish between reliable and unreliable information in explanations. Second, we compare explanations by a generation-only model to those generated by a self-rationalizing model and find that, while the former score higher in terms of validity, factual correctness, and similarity to gold explanations, they are not more useful for downstream classification. We observe that the self-rationalizing model is prone to hallucination, which is punished by most metrics but may add useful context for the classification step.

  • 39.
    Kunz, Jenny
    et al.
    Linköpings universitet, Institutionen för datavetenskap. Linköpings universitet, Tekniska fakulteten.
    Kuhlmann, Marco
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Institutionen för datavetenskap, Artificiell intelligens och integrerade datorsystem.
    Classifier Probes May Just Learn from Linear Context Features (2020). In: Proceedings of the 28th International Conference on Computational Linguistics, 2020, Vol. 28, pp. 5136-5146, article id 450. Conference paper (peer-reviewed)
    Abstract [en]

    Classifiers trained on auxiliary probing tasks are a popular tool to analyze the representations learned by neural sentence encoders such as BERT and ELMo. While many authors are aware of the difficulty of distinguishing between “extracting the linguistic structure encoded in the representations” and “learning the probing task,” the validity of probing methods calls for further research. Using a neighboring word identity prediction task, we show that the token embeddings learned by neural sentence encoders contain a significant amount of information about the exact linear context of the token, and hypothesize that, with such information, learning standard probing tasks may be feasible even without additional linguistic structure. We develop this hypothesis into a framework in which analysis efforts can be scrutinized and argue that, with current models and baselines, conclusions that representations contain linguistic structure are not well-founded. Current probing methodology, such as restricting the classifier’s expressiveness or using strong baselines, can help to better estimate the complexity of learning, but it does not provide a foundation for claims about the nature of the linguistic structure encoded in the learned representations.
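    The neighboring-word identity prediction task mentioned above can be set up in a few lines; this is an illustrative sketch (the function name and toy data are invented), pairing each token with the identity of the token at a fixed linear offset:

```python
def neighbor_identity_dataset(sentences, offset=1):
    # Probe input: a token; probe label: the identity of the token at the
    # given linear offset (offset=1 means "predict the next word").
    data = []
    for sent in sentences:
        for i, tok in enumerate(sent):
            j = i + offset
            if 0 <= j < len(sent):
                data.append((tok, sent[j]))
    return data

print(neighbor_identity_dataset([["the", "cat", "sat"]]))
# [('the', 'cat'), ('cat', 'sat')]
```

    In a real probing setup the first element of each pair would be replaced by that token's contextual embedding before training the classifier.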

    Download full text (pdf)
    fulltext
  • 40.
    Kunz, Jenny
    et al.
    Linköpings universitet, Institutionen för datavetenskap, Artificiell intelligens och integrerade datorsystem. Linköpings universitet, Tekniska fakulteten.
    Kuhlmann, Marco
    Linköpings universitet, Institutionen för datavetenskap, Artificiell intelligens och integrerade datorsystem. Linköpings universitet, Tekniska fakulteten.
    Test Harder Than You Train: Probing with Extrapolation Splits (2021). In: Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP / [ed] Jasmijn Bastings, Yonatan Belinkov, Emmanuel Dupoux, Mario Giulianelli, Dieuwke Hupkes, Yuval Pinter, Hassan Sajjad, Punta Cana, Dominican Republic, 2021, Vol. 5, pp. 15-25, article id 2. Conference paper (peer-reviewed)
    Abstract [en]

    Previous work on probing word representations for linguistic knowledge has focused on interpolation tasks. In this paper, we instead analyse probes in an extrapolation setting, where the inputs at test time are deliberately chosen to be ‘harder’ than the training examples. We argue that such an analysis can shed further light on the open question whether probes actually decode linguistic knowledge, or merely learn the diagnostic task from shallow features. To quantify the hardness of an example, we consider scoring functions based on linguistic, statistical, and learning-related criteria, all of which are applicable to a broad range of NLP tasks. We discuss the relative merits of these criteria in the context of two syntactic probing tasks, part-of-speech tagging and syntactic dependency labelling. From our theoretical and experimental analysis, we conclude that distance-based and hard statistical criteria show the clearest differences between interpolation and extrapolation settings, while at the same time being transparent, intuitive, and easy to control.

    Download full text (pdf)
    fulltext
  • 41.
    Kunz, Jenny
    et al.
    Linköpings universitet, Institutionen för datavetenskap, Artificiell intelligens och integrerade datorsystem. Linköpings universitet, Tekniska fakulteten.
    Kuhlmann, Marco
    Linköpings universitet, Institutionen för datavetenskap, Artificiell intelligens och integrerade datorsystem. Linköpings universitet, Tekniska fakulteten.
    Where Does Linguistic Information Emerge in Neural Language Models? Measuring Gains and Contributions across Layers (2022). In: Proceedings of the 29th International Conference on Computational Linguistics / [ed] Nicoletta Calzolari, Chu-Ren Huang, Hansaem Kim, James Pustejovsky, Leo Wanner, Key-Sun Choi, Pum-Mo Ryu, Hsin-Hsi Chen, Lucia Donatelli, Heng Ji, Sadao Kurohashi, Patrizia Paggio, Nianwen Xue, Seokhwan Kim, Younggyun Hahm, Zhong He, Tony Kyungil Lee, Enrico Santus, Francis Bond, Seung-Hoon Na, 2022, pp. 4664-4676, article id 1.413. Conference paper (peer-reviewed)
    Abstract [en]

    Probing studies have extensively explored where in neural language models linguistic information is located. The standard approach to interpreting the results of a probing classifier is to focus on the layers whose representations give the highest performance on the probing task. We propose an alternative method that asks where the task-relevant information emerges in the model. Our framework consists of a family of metrics that explicitly model local information gain relative to the previous layer and each layer’s contribution to the model’s overall performance. We apply the new metrics to two pairs of syntactic probing tasks with different degrees of complexity and find that the metrics confirm the expected ordering only for one of the pairs. Our local metrics show a massive dominance of the first layers, indicating that the features that contribute the most to our probing tasks are not as high-level as global metrics suggest.

    Download full text (pdf)
    fulltext
  • 42.
    Kurtz, Robin
    et al.
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten.
    Kuhlmann, Marco
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten.
    Exploiting Structure in Parsing to 1-Endpoint-Crossing Graphs (2017). In: Proceedings of the 15th International Conference on Parsing Technologies, Association for Computational Linguistics, 2017. Conference paper (peer-reviewed)
    Abstract [en]

    Deep dependency parsing can be cast as the search for maximum acyclic subgraphs in weighted digraphs. Because this search problem is intractable in the general case, we consider its restriction to the class of 1-endpoint-crossing (1ec) graphs, which has high coverage on standard data sets. Our main contribution is a characterization of 1ec graphs as a subclass of the graphs with pagenumber at most 3. Building on this we show how to extend an existing parsing algorithm for 1-endpoint-crossing trees to the full class. While the runtime complexity of the extended algorithm is polynomial in the length of the input sentence, it features a large constant, which poses a challenge for practical implementations.

  • 43.
    Kurtz, Robin
    et al.
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten.
    Kuhlmann, Marco
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten.
    The Interplay Between Loss Functions and Structural Constraints in Dependency Parsing (2019). In: Northern European Journal of Language Technology (NEJLT), ISSN 2000-1533, Vol. 6, pp. 43-66. Journal article (peer-reviewed)
    Abstract [en]

    Dependency parsing can be cast as a combinatorial optimization problem with the objective to find the highest-scoring graph, where edge scores are learnt from data. Several of the decoding algorithms that have been applied to this task employ structural restrictions on candidate solutions, such as the restriction to projective dependency trees in syntactic parsing, or the restriction to noncrossing graphs in semantic parsing. In this paper we study the interplay between structural restrictions and a common loss function in neural dependency parsing, the structural hinge loss. We show how structural constraints can make networks trained under this loss function diverge and propose a modified loss function that solves this problem. Our experimental evaluation shows that the modified loss function can yield improved parsing accuracy, compared to the unmodified baseline.
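    For orientation, the structural hinge loss in its common cost-augmented form can be written in a few lines; this is a generic sketch with made-up candidate names and scores, not the paper's modified loss:

```python
def structured_hinge(scores, costs, gold):
    # scores: model score per candidate structure; costs: structured cost of
    # each candidate relative to the gold structure (costs[gold] == 0).
    # Loss = max_y (score(y) + cost(y)) - score(gold), clipped at zero.
    augmented = max(scores[y] + costs[y] for y in scores)
    return max(0.0, augmented - scores[gold])

scores = {"gold": 5.0, "wrong": 4.5}
costs = {"gold": 0.0, "wrong": 1.0}
print(structured_hinge(scores, costs, "gold"))  # 0.5
```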

    Download full text (pdf)
    fulltext
  • 44.
    Kurtz, Robin
    et al.
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten.
    Oepen, Stephan
    Univ Oslo, Dept Informat, Oslo, Norway.
    Kuhlmann, Marco
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten.
    End-to-End Negation Resolution as Graph Parsing (2020). In: Proceedings of the 16th International Conference on Parsing Technologies and the IWPT 2020 Shared Task on Parsing into Enhanced Universal Dependencies / [ed] Bouma, Gosse and Matsumoto, Yuji and Oepen, Stephan and Sagae, Kenji and Seddah, Djamé and Sun, Weiwei and Søgaard, Anders and Tsarfaty, Reut and Zeman, Dan, Association for Computational Linguistics, 2020, pp. 14-24. Conference paper (peer-reviewed)
    Abstract [en]

    We present a neural end-to-end architecture for negation resolution based on a formulation of the task as a graph parsing problem. Our approach allows for the straightforward inclusion of many types of graph-structured features without the need for representation-specific heuristics. In our experiments, we specifically gauge the usefulness of syntactic information for negation resolution. Despite the conceptual simplicity of our architecture, we achieve state-of-the-art results on the Conan Doyle benchmark dataset, including a new top result for our best model.

    Download full text (pdf)
    fulltext
  • 45.
    Kurtz, Robin
    et al.
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten.
    Roxbo, Daniel
    Linköpings universitet, Tekniska fakulteten. Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system.
    Kuhlmann, Marco
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten.
    Improving Semantic Dependency Parsing with Syntactic Features (2019). In: Proceedings of the First NLPL Workshop on Deep Learning for Natural Language Processing, Linköping University Electronic Press, 2019, pp. 12-21. Conference paper (peer-reviewed)
    Abstract [en]

    We extend a state-of-the-art deep neural architecture for semantic dependency parsing with features defined over syntactic dependency trees. Our empirical results show that only gold-standard syntactic information leads to consistent improvements in semantic parsing accuracy, and that the magnitude of these improvements varies with the specific combination of the syntactic and the semantic representation used. In contrast, automatically predicted syntax does not seem to help semantic parsing. Our error analysis suggests that there is a significant overlap between syntactic and semantic representations.

    Download full text (pdf)
    fulltext
  • 46.
    Nivre, Joakim
    et al.
    Uppsala universitet, Institutionen för lingvistik och filologi.
    Kuhlmann, Marco
    Uppsala universitet, Institutionen för lingvistik och filologi.
    Hall, Johan
    Uppsala universitet, Institutionen för lingvistik och filologi.
    An Improved Oracle for Dependency Parsing with Online Reordering (2009). In: Proceedings of the 11th International Conference on Parsing Technologies (IWPT), Stroudsburg, PA, USA: Association for Computational Linguistics, 2009, pp. 73-76. Conference paper (peer-reviewed)
    Abstract [en]

    We present an improved training strategy for dependency parsers that use online reordering to handle non-projective trees. The new strategy improves both efficiency and accuracy by reducing the number of swap operations performed on non-projective trees by up to 80%. We present state-of-the-art results for five languages, with the best ever reported results for Czech.

  • 47.
    Norlund, Tobias
    et al.
    Chalmers University of Technology, Sweden; Recorded Future.
    Doostmohammadi, Ehsan
    Linköpings universitet, Institutionen för datavetenskap, Artificiell intelligens och integrerade datorsystem. Linköpings universitet, Tekniska fakulteten.
    Johansson, Richard
    Chalmers University of Technology, Sweden; University of Gothenburg, Sweden.
    Kuhlmann, Marco
    Linköpings universitet, Institutionen för datavetenskap, Artificiell intelligens och integrerade datorsystem. Linköpings universitet, Tekniska fakulteten.
    On the Generalization Ability of Retrieval-Enhanced Transformers (2023). In: Findings of the Association for Computational Linguistics, Association for Computational Linguistics, 2023, pp. 1485-1493. Conference paper (peer-reviewed)
    Abstract [en]

    Recent work on the Retrieval-Enhanced Transformer (Retro) model has shown that offloading memory from trainable weights to a retrieval database can significantly improve language modeling and match the performance of non-retrieval models that are an order of magnitude larger in size. It has been suggested that at least some of this performance gain is due to non-trivial generalization based on both model weights and retrieval. In this paper, we try to better understand the relative contributions of these two components. We find that the performance gains from retrieval largely originate from overlapping tokens between the database and the test data, suggesting less non-trivial generalization than previously assumed. More generally, our results point to the challenges of evaluating the generalization of retrieval-augmented language models such as Retro, as even limited token overlap may significantly decrease test-time loss. We release our code and model at https://github.com/TobiasNorlund/retro
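    The token-overlap effect described above can be approximated by measuring how many test n-grams also occur verbatim in the retrieval database; an illustrative sketch (the function name, n-gram length, and toy data are invented, not the paper's actual methodology):

```python
def ngram_overlap(test_tokens, db_tokens, n=4):
    # Fraction of n-grams in the test data that also occur verbatim in the
    # retrieval database -- a rough proxy for retrieval overlap.
    grams = lambda toks: {tuple(toks[i:i + n]) for i in range(len(toks) - n + 1)}
    test, db = grams(test_tokens), grams(db_tokens)
    return len(test & db) / max(1, len(test))

db = "a b c d e f".split()
test = "a b c d x y".split()
print(ngram_overlap(test, db))  # 1/3: one of the three test 4-grams is in the database
```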

  • 48.
    Oepen, Stephan
    et al.
    University of Oslo.
    Abend, Omri
    The Hebrew University of Jerusalem.
    Hajic, Jan
    Charles University in Prague.
    Hershcovich, Daniel
    University of Copenhagen.
    Kuhlmann, Marco
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten.
    O’Gorman, Tim
    University of Colorado at Boulder.
    Xue, Nianwen
    Brandeis University.
    Chun, Jayeol
    Brandeis University.
    Straka, Milan
    Charles University in Prague.
    Urešová, Zdeňka
    Charles University in Prague.
    MRP 2019: Cross-Framework Meaning Representation Parsing (2019). In: Proceedings of the Shared Task on Cross-Framework Meaning Representation Parsing at the 2019 Conference on Natural Language Learning, Association for Computational Linguistics, 2019, pp. 1-27. Conference paper (other academic)
    Abstract [en]

    The 2019 Shared Task at the Conference for Computational Language Learning (CoNLL) was devoted to Meaning Representation Parsing (MRP) across frameworks. Five distinct approaches to the representation of sentence meaning in the form of directed graphs were represented in the training and evaluation data for the task, packaged in a uniform abstract graph representation and serialization. The task received submissions from eighteen teams, of which five did not participate in the official ranking because they arrived after the closing deadline, made use of additional training data, or involved one of the task co-organizers. All technical information regarding the task, including system submissions, official results, and links to supporting resources and software, is available from the task web site at: http://mrp.nlpl.eu

  • 49.
    Oepen, Stephan
    et al.
    University of Oslo.
    Abend, Omri
    The Hebrew University of Jerusalem.
    Hajič, Jan
    Charles University in Prague.
    Hershcovich, Daniel
    University of Copenhagen.
    Kuhlmann, Marco
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten.
    O’Gorman, Tim
    University of Colorado at Boulder.
    Xue, Nianwen
    Brandeis University.
    Proceedings of the Shared Task on Cross-Framework Meaning Representation Parsing at the 2019 Conference on Natural Language Learning, 2019. Proceedings (editorship) (Other academic)
  • 50.
    Oepen, Stephan
    et al.
    University of Oslo, Department of Informatics.
    Kuhlmann, Marco
    Linköpings universitet, Institutionen för datavetenskap, Interaktiva och kognitiva system. Linköpings universitet, Tekniska fakulteten.
    Miyao, Yusuke
    National Institute of Informatics, Tokyo.
    Zeman, Daniel
    Charles University in Prague, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics.
    Cinková, Silvie
    Charles University in Prague, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics.
    Flickinger, Dan
    Stanford University, Center for the Study of Language and Information.
    Hajič, Jan
    Charles University in Prague, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics.
    Ivanova, Angelina
    University of Oslo, Department of Informatics.
    Urešová, Zdeňka
    Charles University in Prague, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics.
    Towards Comparability of Linguistic Graph Banks for Semantic Parsing, 2016. In: Proceedings of the 10th International Conference on Language Resources and Evaluation (LREC), European Language Resources Association, 2016, pp. 3991-3995. Conference paper (Peer-reviewed)
    Abstract [en]

    We announce a new language resource for research on semantic parsing, a large, carefully curated collection of semantic dependency graphs representing multiple linguistic traditions. This resource is called SDP 2016 and provides an update and extension to previous versions used as Semantic Dependency Parsing target representations in the 2014 and 2015 Semantic Evaluation Exercises (SemEval). For a common core of English text, this third edition comprises semantic dependency graphs from four distinct frameworks, packaged in a unified abstract format and aligned at the sentence and token levels. SDP 2016 is the first general release of this resource and is available for licensing from the Linguistic Data Consortium from May 2016. The data is accompanied by an open-source SDP utility toolkit and system results from previous contrastive parsing evaluations against these target representations.
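    A token-aligned semantic dependency graph of the kind collected in SDP 2016 pairs sentence tokens with labeled predicate-argument edges. The following is a minimal illustrative data structure, not the actual SDP release format (which is column-based); the class names and the toy sentence are assumptions:

    ```python
    from dataclasses import dataclass, field

    @dataclass
    class Node:
        id: int      # token position in the sentence
        form: str    # surface form of the token

    @dataclass
    class Edge:
        src: int     # predicate node id
        tgt: int     # argument node id
        label: str   # semantic role label

    @dataclass
    class SemanticGraph:
        nodes: list
        edges: list = field(default_factory=list)

        def predicates(self):
            """Node ids with at least one outgoing edge."""
            return sorted({e.src for e in self.edges})

    g = SemanticGraph(nodes=[Node(1, "Dogs"), Node(2, "chase"), Node(3, "cats")])
    g.edges.append(Edge(2, 1, "ARG1"))
    g.edges.append(Edge(2, 3, "ARG2"))
    print(g.predicates())  # [2]
    ```

    Aligning all four frameworks' graphs at the sentence and token levels, as the abstract describes, is what allows their analyses of the same text to be compared node by node.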
