Large Language Models Across Languages - Pavel Král

My presentation, "Large Language Models Across Languages", explores large language models (LLMs) and how their efficiency varies from one language to another.

My talk is grounded in extensive research publicly available on GitHub, providing a robust and transparent foundation for the insights shared.

Key Points of the Talk:

* Token Economy Across Languages: An analysis of which languages require the most tokens to express the same content. This segment will dive into the tokenization process of LLMs and compare major languages on their token consumption for equivalent texts.
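A rough intuition for this comparison can be sketched in a few lines. A real analysis would run an actual LLM tokenizer (e.g. tiktoken or Hugging Face tokenizers) over parallel texts; here, as a stand-in, UTF-8 byte length serves as a crude proxy, since byte-level BPE tokenizers operate on UTF-8 bytes and non-Latin scripts need more bytes per character. The sample sentences are hypothetical placeholders, not data from the talk.

```python
# Sketch: comparing how much "token budget" equivalent sentences might
# consume in different languages. UTF-8 byte length is used as a crude
# proxy for token count (assumption: a byte-level BPE tokenizer, where
# more bytes generally means more tokens).

SENTENCES = {  # hypothetical translations of the same sentence
    "English": "The weather is nice today.",
    "German": "Das Wetter ist heute schön.",
    "Czech": "Dnes je hezké počasí.",
    "Russian": "Сегодня хорошая погода.",
}

def byte_length(text: str) -> int:
    """UTF-8 byte count, a rough stand-in for token consumption."""
    return len(text.encode("utf-8"))

for language, sentence in SENTENCES.items():
    print(f"{language:8s} chars={len(sentence):3d} bytes={byte_length(sentence):3d}")
```

Note how the Russian sentence is no longer in characters than the English one, yet consumes roughly twice the bytes; under a byte-level tokenizer with an English-heavy vocabulary, that gap widens further.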

* Cross-Lingual Embeddings Effectiveness: Evaluating how effectively LLMs' embeddings capture and express information across different languages. This part will focus on the quality of embeddings in bridging language gaps and maintaining semantic integrity.
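The standard measurement behind this kind of evaluation is cosine similarity between sentence embeddings. The sketch below uses made-up four-dimensional vectors purely for illustration; in practice the vectors would come from a multilingual embedding model, and translations of the same sentence should score close to 1.0 while unrelated sentences score lower.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings (placeholders, not real model output):
# an English sentence, its Czech translation, and an unrelated sentence.
english = [0.8, 0.1, 0.3, 0.5]
czech_translation = [0.7, 0.2, 0.3, 0.6]
unrelated = [0.1, 0.9, 0.6, 0.1]

print(cosine_similarity(english, czech_translation))  # high: meanings align
print(cosine_similarity(english, unrelated))          # lower: meanings differ
```

How tightly translations cluster in this similarity measure is one way to quantify whether the embedding space truly bridges the language gap.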

* Reasoning Abilities in Various Languages: Investigating the reasoning capabilities of LLMs when operating in multiple languages. The focus will be on comparing the models' performance across a spectrum of languages.
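A cross-lingual reasoning comparison of this kind typically reduces to a per-language accuracy score. The harness below is a minimal sketch with fabricated answers; a real evaluation would translate the same reasoning problems into each language, query the model, and score the replies.

```python
# Sketch: scoring reasoning accuracy per language. The (model_answer,
# gold_answer) pairs are fabricated placeholders, not the talk's data.

results = {
    "English": [("4", "4"), ("12", "12"), ("7", "9")],
    "Czech":   [("4", "4"), ("12", "10"), ("9", "9")],
    "Russian": [("4", "4"), ("11", "12"), ("9", "9")],
}

def accuracy(pairs: list[tuple[str, str]]) -> float:
    """Fraction of answers that exactly match the gold answer."""
    correct = sum(1 for model, gold in pairs if model == gold)
    return correct / len(pairs)

for language, pairs in results.items():
    print(f"{language}: {accuracy(pairs):.2f}")
```

Exact-match scoring is the simplest choice; a fuller harness would normalize answers (whitespace, number formats) before comparison so that scoring artifacts are not mistaken for language gaps.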
