LeoLM 70B is a German-optimized large language model that beats Llama 2




Update from December 2, 2023:

LAION has released the 70-billion-parameter version of LeoLM, trained on 65 billion tokens. It is based on Llama-2-70b, but according to LAION it outperforms Meta’s base model in both German and English.

“With this release, we hope to bring a new wave of opportunities to German open-source and commercial LLM research and accelerate adoption,” the team writes.

According to LAION, LeoLM surpasses the translation performance of gpt-3.5-turbo-instruct in few-shot settings and achieves better benchmark results than the Llama-2 base model. All models, including a chat version, are available from Hugging Face under the Llama license.



Original article from September 29, 2023:

LAION and Hessian.AI launch the German language model LeoLM (Linguistically Enhanced Open Language Model).

LAION and Hessian.AI have jointly developed LeoLM, the first commercially viable open-source “German Foundation Language Model”. It is based on Meta’s Llama 2 and was trained on the Hessian.AI supercomputer “42” with an extensive corpus of high-quality German and country-specific texts.

The newly released models LeoLM/leo-hessianai-7b and LeoLM/leo-hessianai-13b, as well as the upcoming LeoLM-70B, are intended to advance the German LLM landscape for open-source and commercial applications.

All models feature an 8K context window. The most powerful model is leo-hessianai-13b-chat, which comes close to the performance of GPT-3.5 in the humanities category of “MT-Bench”, a benchmark that uses GPT-4 as the judge.
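For readers who want to try the released checkpoints, the following is a minimal sketch of how they might be loaded with the Hugging Face transformers library. The two repository IDs are the ones named above; the helper names (`resolve_model_id`, `fits_context`, `generate`), the reading of “8K” as 8192 tokens, and the default generation length are assumptions made for illustration, not details confirmed by LAION. The model card may also recommend further loading options that are not shown here.

```python
# Repository IDs as named in the article; everything else is illustrative.
MODEL_IDS = {
    "7b": "LeoLM/leo-hessianai-7b",
    "13b": "LeoLM/leo-hessianai-13b",
}

# Assumption: the "8K context window" is taken to mean 8192 tokens.
CONTEXT_WINDOW = 8192


def resolve_model_id(size: str) -> str:
    """Map a short size label such as "7b" to its Hugging Face repository ID."""
    key = size.lower()
    if key not in MODEL_IDS:
        raise ValueError(f"unknown size {size!r}; expected one of {sorted(MODEL_IDS)}")
    return MODEL_IDS[key]


def fits_context(prompt_tokens: int, max_new_tokens: int) -> bool:
    """Check that prompt plus requested output stays within the context window."""
    return prompt_tokens + max_new_tokens <= CONTEXT_WINDOW


def generate(size: str, prompt: str, max_new_tokens: int = 64) -> str:
    """Download a LeoLM base model and continue the given prompt."""
    # Imported lazily so the helpers above work without transformers/torch installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = resolve_model_id(size)
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    prompt_tokens = inputs["input_ids"].shape[1]
    if not fits_context(prompt_tokens, max_new_tokens):
        raise ValueError("prompt plus requested output exceeds the context window")

    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `generate("7b", "Die Hauptstadt von Hessen ist")` would download several gigabytes of weights on first use and requires PyTorch alongside transformers, so it is best tried on a machine with sufficient memory.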


The models are available on Hugging Face. For production use of a language model on German-language tasks, the announced 70-billion-parameter model should be where things get interesting.
