Colson, Jean-Pierre
[UCL]
As the Transformer architecture takes the broad context into account, it should (in principle) cope very well with PHRASEOLOGY, FORMULAE and (maybe) CONSTRUCTIONS. However, this leaves us with a number of MISSING LINKS. The THEORETICAL UNDERPINNINGS of the Transformer are, in the first place, not explicitly stated: Vaswani et al. (2017) make no reference to distributional semantics (although they use distributions), nor to information retrieval (although they use linear algebra and metric clusters based on probability). In probability theory, modelling the distribution of continuous values requires domain expertise, which in this case (language) was hardly brought to bear. From a PRACTICAL POINT OF VIEW, there are also a number of missing links. Is idiomatic information present in Large Language Models (LLMs) based on the Transformer? Is constructional information present? Is it the same type of information? Is a vector approach (cosine similarity or distance) appropriate for measuring phraseology and constructions? Is fine-tuning of these models necessary for phraseology / CxG? Can we partly answer those questions by studying the results of Neural Machine Translation (NMT), which is also based on the Transformer?
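The vector approach questioned above can be made concrete with a minimal sketch (not part of the talk): comparing contextual embeddings of an idiomatic sentence with a figurative paraphrase and a literal reading via cosine similarity. The model choice (bert-base-uncased), the example sentences and the mean-pooling strategy are illustrative assumptions, not the author's method.

```python
# Minimal sketch of a vector (cosine-similarity) approach to phraseology.
# Assumptions: bert-base-uncased, mean pooling, invented example sentences.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed(sentence: str) -> torch.Tensor:
    """Mean-pool the last hidden states into a single sentence vector."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0)

idiom = embed("He kicked the bucket last night.")        # idiomatic use
figurative = embed("He passed away last night.")          # figurative meaning
literal = embed("He knocked over the pail last night.")   # literal reading

cos = torch.nn.functional.cosine_similarity
print("idiom vs figurative:", cos(idiom, figurative, dim=0).item())
print("idiom vs literal:   ", cos(idiom, literal, dim=0).item())
```

Whether such similarity scores actually capture idiomatic (as opposed to merely distributional) information is precisely one of the open questions raised in the abstract.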
Bibliographic reference: Colson, Jean-Pierre. Artificial intelligence, phraseology and constructions: a few missing links. Colloquium of the Work Group on Germanic Linguistics, Heinrich Heine University Düsseldorf (Düsseldorf, 12/11/2024).
Permanent URL: http://hdl.handle.net/2078.1/293616