Community Question Answering Ranking: Methodology Survey
The Egyptian Journal of Language Engineering
Article 1, Volume 9, Issue 2, September 2022, Pages 1-22
Document Type: Original Article
DOI: 10.21608/ejle.2022.138720.1031
Authors
Ahmed Zaazaa 1; Mohsen Rashwan 2; Ossama Emam 3
1 Faculty of Engineering, Cairo University
2 Electronics and Communication Department, Faculty of Engineering, Cairo University, Giza, Egypt
3 IBM
Abstract
This paper surveys the evolution of word embeddings alongside the methodologies used in Community Question Answering (cQA), and how these methodologies use word embeddings to achieve higher performance metrics. The paper first discusses vector modelling and its effect on Natural Language Processing (NLP) as a whole, then details some of the approaches used, such as one-hot encoding and word2vec. It then discusses contextualized embeddings and how they improve on the earlier techniques, and sheds some light on language modelling and the new attention-based architectures (Transformers), briefly explaining how they work and how they have affected not only cQA but NLP in general. Finally, the paper briefly discusses the field's shift from model-based AI, where most of the focus is on producing a model with high performance metrics, to Data-Centric AI, where the focus is on labelling the data in a systematic way so that generating a high-performance model becomes easier.
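To make concrete the contrast the abstract draws between one-hot encoding and dense embeddings such as word2vec, the following is a minimal illustrative sketch (not taken from the paper; the toy vocabulary and the hand-picked dense vectors are assumptions for illustration). It shows that any two distinct one-hot vectors are orthogonal, so they carry no notion of word similarity, whereas dense vectors can place related words close together under cosine similarity.

```python
import math

vocab = ["question", "answer", "ranking", "forum"]  # toy vocabulary (assumed)

def one_hot(word, vocab):
    """Return the one-hot vector for `word` over `vocab`: all zeros except a single 1."""
    vec = [0.0] * len(vocab)
    vec[vocab.index(word)] = 1.0
    return vec

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Distinct one-hot vectors are orthogonal: similarity is exactly 0,
# so "question" and "answer" look as unrelated as "question" and "forum".
print(cosine(one_hot("question", vocab), one_hot("answer", vocab)))  # 0.0

# Hand-picked 2-d dense vectors (hypothetical, standing in for learned
# word2vec embeddings) can encode relatedness: "question" and "answer"
# point in similar directions, "ranking" does not.
dense = {
    "question": [0.9, 0.1],
    "answer":   [0.8, 0.2],
    "ranking":  [0.1, 0.9],
}
print(cosine(dense["question"], dense["answer"]) > cosine(dense["question"], dense["ranking"]))  # True
```

In a real system the dense vectors would be learned from a corpus rather than hand-set; the point here is only the geometric difference between the two representations that the survey traces.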
Keywords
Machine Learning (ML); Natural Language Processing (NLP); Community Question Answering (cQA); Ranking