Latin America Launches Collaborative Open‑Source AI Model, Latam‑GPT

The Chilean National Center for Artificial Intelligence (CENIA) is spearheading Latam‑GPT, an open‑source large language model built for Latin America and the Caribbean. Backed by more than thirty strategic partners, the project has gathered a corpus of more than eight terabytes of regional text and is training a model with 50 billion parameters. A new supercomputing facility at the University of Tarapacá, with twelve nodes of NVIDIA H200 GPUs, provides the computing power for training. Latam‑GPT aims to match the performance of commercial models while offering deeper cultural relevance, with planned applications in sectors such as education, health and agriculture.

Project Overview

Latam‑GPT is a collaborative artificial‑intelligence initiative led by the Chilean National Center for Artificial Intelligence (CENIA). The effort brings together more than thirty institutions across Latin America and the Caribbean to develop a large language model tailored to the region’s languages, dialects and cultural contexts. By focusing on open‑source principles, the project seeks to provide a free, adaptable AI resource that can be customized for specific local applications.

Data Collection and Model Scale

Partners have assembled a corpus exceeding eight terabytes of text, representing millions of documents from twenty countries and Spain. The data set balances regional representation so that no single country dominates the content. Using this corpus, CENIA is training a model with 50 billion parameters, a size comparable to commercially available models such as GPT‑3.5. The model is expected to perform well on general tasks while offering deeper knowledge of topics specific to Latin America.

Technical Infrastructure

A cornerstone of the initiative is a new supercomputing center at the University of Tarapacá in Arica, Chile. The facility includes twelve nodes, each equipped with eight NVIDIA H200 GPUs (96 GPUs in total), providing computing capacity unprecedented in the region. This infrastructure allows large‑scale training to be conducted locally, reducing reliance on external cloud services and supporting the goal of technological sovereignty.
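
To put the reported hardware in perspective, the following is a rough back‑of‑envelope sketch, not a description of CENIA's actual training setup. The node count, GPUs per node and parameter count come from the article. The 141 GB of memory per H200 is NVIDIA's published specification, and the 2 bytes per parameter (16‑bit weights) is an assumption made here for illustration. The function name cluster_summary is hypothetical.

```python
# Back-of-envelope sizing. Values marked ASSUMPTION are not from the article.
NODES = 12            # from the article
GPUS_PER_NODE = 8     # from the article
PARAMS = 50e9         # from the article: 50 billion parameters

GPU_MEMORY_GB = 141   # ASSUMPTION: NVIDIA's published H200 memory capacity
BYTES_PER_PARAM = 2   # ASSUMPTION: weights stored in 16-bit precision


def cluster_summary(nodes=NODES, gpus_per_node=GPUS_PER_NODE,
                    gpu_memory_gb=GPU_MEMORY_GB, params=PARAMS,
                    bytes_per_param=BYTES_PER_PARAM):
    """Return total GPU count, aggregate GPU memory and raw weight size."""
    total_gpus = nodes * gpus_per_node
    total_memory_gb = total_gpus * gpu_memory_gb
    # Size of the weights alone, in gigabytes (1 GB = 1e9 bytes).
    weights_gb = params * bytes_per_param / 1e9
    return {
        "total_gpus": total_gpus,
        "total_memory_gb": total_memory_gb,
        "weights_gb": weights_gb,
    }


if __name__ == "__main__":
    for key, value in cluster_summary().items():
        print(f"{key}: {value}")
```

Under these assumptions the weights alone occupy about 100 GB, a small fraction of the cluster's aggregate GPU memory. Training needs considerably more than the weights, since gradients, optimizer state and activations must also be held in memory, so this estimate is only a lower bound on the real footprint.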

Regional Impact and Applications

Latam‑GPT is designed to serve a variety of sectors. Early plans envision adaptations for education, health services, agriculture and cultural preservation. By grounding its responses in regional history, sports, politics and indigenous heritage, the model can offer more relevant assistance than generic global models, which often default to examples from other parts of the world.

Future Development

The first version of Latam‑GPT will be launched this year, with plans to expand the model family to include multimodal capabilities such as image and video processing. Ongoing collaboration aims to incorporate additional languages, including indigenous tongues, and to refine the data set for broader topic coverage. The long‑term vision is for Latin American institutions to become creators and owners of advanced AI technology rather than merely consumers.
