Abstract
The aim of my research is to devise Natural Language Processing (NLP) systems that learn to generate, distill, and use knowledge from unstructured text. Driven by an intrinsic desire to acquire new knowledge [Aristotle and Ross, 1933], humans have consistently developed tools for this purpose, and machine learning systems are no exception. Such systems should be designed to provide new knowledge, which accordingly becomes a valuable criterion for their evaluation and trustworthiness. In NLP, this requires methods capable of understanding text in order to distill and organize novel structured knowledge. Moreover, these capabilities should guarantee desirable properties of autonomous support systems, such as model transparency, efficiency, robustness to text representations, adaptability to different contexts, and computational scalability. Concretely, we require a paradigm shift in how machine learning problems are formulated: from the traditional "Given input X, provide output Y", where the focus is on Y, to "Given input X and knowledge K, provide output Y and update K", where the focus is also on K. Here, the knowledge K refers to any piece of information that contributes to the general human understanding of a given problem, including information about the task, the domain, and the model of interest. This paradigm shift implies that when we design a system for an NLP problem, our purpose and measure of success should not be limited to model accuracy: we should also account for the reasons behind that accuracy. In practice, this means a model that not only explains its individual predictions, but also aggregates those explanations to summarize its decision-making process into more general criteria.
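The paradigm shift above can be sketched as a minimal interface. All names here (`KnowledgeStore`, `KnowledgeAwareModel`, the toy matching rule) are hypothetical illustrations of the "Given input X and knowledge K, provide output Y and update K" formulation, not an existing system or API.

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeStore:
    """Holds pieces of knowledge K (information about the task, domain, or model)."""
    facts: set[str] = field(default_factory=set)

    def update(self, new_facts: set[str]) -> None:
        """Incorporate newly distilled knowledge into K."""
        self.facts |= new_facts

class KnowledgeAwareModel:
    """Traditional paradigm: predict(X) -> Y.
    Shifted paradigm: predict(X, K) -> (Y, updated K)."""

    def predict(self, x: str, k: KnowledgeStore) -> tuple[str, KnowledgeStore]:
        # Toy decision rule: the output Y depends on both the input X
        # and the current knowledge K.
        y = "known" if any(fact in x for fact in k.facts) else "novel"
        # Distill structured knowledge from the raw text and update K,
        # so K is an output of the system, not just an input.
        k.update({x.lower()})
        return y, k

# Usage: the model both answers and grows its knowledge base.
store = KnowledgeStore()
model = KnowledgeAwareModel()
y1, store = model.predict("arguments relate opinions", store)
y2, store = model.predict("arguments relate opinions", store)
```

Here the first call returns "novel" (K is empty) while the second returns "known", illustrating that success is measured not only by the output Y but by the knowledge K the system accumulates.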
Altogether, these capabilities constitute important desiderata for several real-world applications where extracted knowledge is the basis for understanding underlying factors such as human sociodynamics, dialogical interactions, and reasoning. Notable examples include Argument Mining, where understanding how humans convey and relate opinions is paramount [Lawrence and Reed, 2019]; Hate Speech Detection, where knowledge about the cultural context of content is necessary to identify hateful content [Yu et al., 2022]; and Legal Analytics, where the decision-making process of legal experts is inherently grounded in factual knowledge [Xu et al., 2020]. During my research activity, I have taken several steps towards this ultimate goal of learning with knowledge, with applications in Argument Mining and Legal Analytics. I will continue these efforts by focusing on two main directions:
- Unstructured Knowledge Integration. The capability of models to leverage a large amount of unstructured textual knowledge to address specific problems.
- Structured Knowledge Extraction from Text. The capability of models to extract structured knowledge from raw text.
References
- [Aristotle and Ross, 1933] Aristotle and Ross, W. D. (1933). Metaphysics, volume 1. Harvard University Press Cambridge, MA.
- [Lawrence and Reed, 2019] Lawrence, J. and Reed, C. (2019). Argument mining: A survey. Computational Linguistics, 45(4):765–818.
- [Yu et al., 2022] Yu, X., Blanco, E., and Hong, L. (2022). Hate speech and counter speech detection: Conversational context does matter. In Carpuat, M., de Marneffe, M.-C., and Meza Ruiz, I. V., editors, Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 5918–5930, Seattle, United States. Association for Computational Linguistics.
- [Xu et al., 2020] Xu, N., Wang, P., Chen, L., Pan, L., Wang, X., and Zhao, J. (2020). Distinguish confusing law articles for legal judgment prediction. In Jurafsky, D., Chai, J., Schluter, N., and Tetreault, J., editors, Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 3086–3095, Online. Association for Computational Linguistics.