Senicic, Danica
[UCL]
Fairon, Cédrick
[UCL]
The aim of this thesis is to explore current systems for sentence alignment and adapt one for the automatic extraction and pairing of sentences in Serbian and English. For this purpose, the EXtraction and ALignment Pipeline (EXALP) was developed. EXALP takes unstructured data as input, extracts the sentences, aligns them, and presents them in a readable format. Evaluated against manually extracted and aligned data, EXALP achieves an accuracy of 84%. The pipeline can easily be adapted to other language pairs, and its output can be applied in various domains of NLP and linguistics.
Bibliographic reference |
Senicic, Danica. Automatic alignment of bilingual sentences: the case of English and Serbian. Faculté de philosophie, arts et lettres, Université catholique de Louvain, 2017. Supervisor: Fairon, Cédrick. |
Permalink |
http://hdl.handle.net/2078.1/thesis:11186 |