Ahxt

Models by this creator


LiteLlama-460M-1T

ahxt

Total Score

158

LiteLlama-460M-1T is an open-source reproduction of Meta AI's LLaMa 2 at a much smaller scale. The model has 460M parameters and was trained on approximately 1 trillion tokens from part of the RedPajama dataset, using the GPT2Tokenizer. The training curve can be viewed on the WandB project.

Model inputs and outputs

Inputs

Text data

Outputs

Generated text

Capabilities

On the MMLU task, LiteLlama-460M-1T scores 21.13 in zero-shot and 26.39 in 5-shot evaluation. On the Open LLM Leaderboard, its average score is 26.65. MMLU questions are four-way multiple choice, so random guessing yields about 25; both MMLU scores sit close to that baseline.

What can I use it for?

The LiteLlama-460M-1T model can be used for natural language generation tasks such as text summarization, language modeling, and content creation. Its small size makes it an attractive option for deployment in resource-constrained environments. Developers can load and use the model with the Transformers library.

Things to try

With its small footprint and easy integration with popular libraries, LiteLlama-460M-1T is a practical option for developers who want to experiment with a reduced-scale version of the LLaMa 2 model. Potential use cases include building language-based applications, evaluating model performance, and exploring the capabilities of smaller-scale language models.
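As a minimal sketch of the Transformers loading path mentioned above, the snippet below loads the model and continues a prompt. It assumes the Hugging Face Hub id is `ahxt/LiteLlama-460M-1T` (creator name plus model name as listed on this page) and that `transformers` and PyTorch are installed. The helper `estimate_weight_memory_gb` and the sample prompt are illustrative additions, not part of the model's own documentation; the helper gives a rough sense of why a 460M-parameter model suits resource-constrained deployments.

```python
# Assumed Hub id: creator name + model name as shown on this page.
MODEL_ID = "ahxt/LiteLlama-460M-1T"


def estimate_weight_memory_gb(n_params: int, bytes_per_param: int) -> float:
    """Rough size of the weights alone (ignores activations and KV cache)."""
    return n_params * bytes_per_param / 1e9


def generate(prompt: str, max_new_tokens: int = 20) -> str:
    """Load the model and continue `prompt` with default generation settings."""
    # Imported lazily so the memory helper works without transformers installed.
    # return_tensors="pt" below additionally requires PyTorch.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example usage (downloads the weights on first call):
# print(generate("Q: What is the largest bird?\nA:"))
```

By this estimate, 460M parameters occupy about 0.92 GB in 16-bit precision and about 1.84 GB in 32-bit precision, before counting activations or the generation cache.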


Updated 5/28/2024