HiTZ Center develops Latxa, the largest language model for Basque
This first version, developed by the HiTZ center, a LANGUNE member, will be essential for building chatbot-type tools for the general public.
A large language model (LLM) is an artificial intelligence model that uses machine learning techniques to understand and produce human language; it draws on knowledge learned from massive data sets. Basque now has its own LLM: Latxa. It is based on Meta's LLaMA models and comes in sizes ranging from 7 to 70 billion parameters. Today’s LLMs, such as ChatGPT or Bard, perform amazingly in well-resourced languages such as English. However, in the case of Basque and other poorly-resourced languages, their performance is close to random guessing. This widens the technological gap between well-resourced and poorly-resourced languages, at least as far as digital tools are concerned. Latxa was developed by the Langune member HiTZ Basque Centre for Language Technology (UPV/EHU) to overcome these limitations and to promote the development of LLM-based products and innovations in Basque. This work was supported by the Basque Government (within the IKER-GAITU project).