Sarvamai

Models by this creator


OpenHathi-7B-Hi-v0.1-Base

sarvamai

Total Score

89

OpenHathi-7B-Hi-v0.1-Base is a large language model developed by Sarvam AI that is based on Llama 2 and trained on Hindi, English, and Hinglish data. It is a 7 billion parameter model, making it a mid-sized model compared to similar offerings like the alpaca-30b and PMC_LLAMA_7B models. This base model is designed to be fine-tuned on specific tasks rather than used directly.

Model inputs and outputs

OpenHathi-7B-Hi-v0.1-Base is a text-to-text model, meaning it takes in text and generates new text in response. The model can handle a variety of language inputs, including Hindi, English, and code.

Inputs

Text prompts in Hindi, English, or Hinglish

Outputs

Generated text in response to the input prompt

Capabilities

OpenHathi-7B-Hi-v0.1-Base has broad capabilities in language generation, from open-ended conversation to task-oriented outputs. The model can be used for tasks like text summarization, question answering, and creative writing. It also has the potential to be fine-tuned for more specialized use cases, such as code generation or domain-specific language modeling.

What can I use it for?

The OpenHathi-7B-Hi-v0.1-Base model could be useful for a variety of applications that require language understanding and generation in Hindi, English, or a mix of the two. Some potential use cases include:

Building virtual assistants or chatbots that can communicate in Hindi and English

Generating content like news articles, product descriptions, or creative writing in multiple languages

Translating between Hindi and English

Providing language support for applications targeting Indian users

Things to try

One interesting thing to try with OpenHathi-7B-Hi-v0.1-Base would be to fine-tune it on a specific domain or task, such as customer service, technical writing, or programming. This could help the model learn the nuances and specialized vocabulary of that area, allowing it to generate more relevant and useful text. Additionally, exploring the model's performance on code-switching between Hindi and English could yield insights into its language understanding capabilities.
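To make the text-in, text-out description above concrete, here is a minimal generation sketch using the Hugging Face transformers library. The repository id sarvamai/OpenHathi-7B-Hi-v0.1-Base, the sampling settings, and the Hindi prompt are illustrative assumptions rather than details taken from this page; as a base model, it continues the prompt rather than following instructions.

```python
# Minimal sketch: prompting a base (non-chat) model with Hugging Face transformers.
# The repo id "sarvamai/OpenHathi-7B-Hi-v0.1-Base" is assumed; adjust it if the model
# is hosted under a different name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sarvamai/OpenHathi-7B-Hi-v0.1-Base"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # a 7B model in bf16 needs roughly 14 GB of accelerator memory
    device_map="auto",
)

# A base model simply continues the text it is given, so phrase the prompt as text to complete.
prompt = "भारत एक विशाल देश है"  # "India is a vast country" (example prompt, not from the model card)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

For task-specific behaviour (chat, summarization, code generation), the same loading code is typically reused as the starting point for fine-tuning rather than prompted directly.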


Updated 5/28/2024


sarvam-2b-v0.5

sarvamai

Total Score

69

The sarvam-2b-v0.5 is an early checkpoint of the sarvam-2b language model, a small yet powerful model pre-trained from scratch on 2 trillion tokens. It is trained to be good at 10 Indic languages (Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Oriya, Punjabi, Tamil, and Telugu) plus English. The final checkpoint of sarvam-2b will be released soon; it will be trained on a data mixture of 4 trillion tokens, containing equal parts English (2T) and Indic (2T) tokens. This early checkpoint has not undergone any post-training, but you can see its current capabilities in this video. The model was trained with the NVIDIA NeMo Framework on the Yotta Shakti Cloud using HGX H100 systems. Similar models include OpenHathi-7B-Hi-v0.1-Base and orca_mini_3b, both of which are based on the LLaMA-2 architecture.

Model inputs and outputs

Inputs

Text prompts: The model accepts text prompts as input, which can be in any of the 11 supported languages (10 Indic languages plus English).

Outputs

Text completions: The model generates text completions based on the input prompt, continuing the sequence of text.

Capabilities

The sarvam-2b-v0.5 model has demonstrated strong performance on a variety of Indic language tasks, including text generation, translation, and understanding. Its tokenizer is designed to be efficient for Indic languages, with an average fertility score (tokens per word) significantly lower than that of other popular models like LLaMA-3.1, Gemma-2, and GPT-4. This allows the model to handle Indic languages more effectively than some of its counterparts.

What can I use it for?

The sarvam-2b-v0.5 model can be used for a variety of natural language processing tasks in the Indic language domain, such as:

Text generation: The model can generate coherent and fluent text in any of the 10 Indic languages or English.

Translation: The model can be fine-tuned for translation tasks between Indic languages and English.

Question answering: The model can be fine-tuned on question-answering datasets to provide accurate answers in Indic languages.

To get started with using the model, you can check out this notebook on Google Colab.

Things to try

One interesting thing to try with the sarvam-2b-v0.5 model is to explore its multilingual capabilities. Since it is trained on a mix of Indic languages and English, you could experiment with prompts that combine multiple languages, or try generating text that transitions seamlessly between languages. This could be useful for applications that need to handle code-switching or multilingual content. Another area to explore is the model's performance on different Indic language tasks, such as translation, summarization, or dialogue generation. By fine-tuning the model on task-specific datasets, you could unlock its full potential for real-world applications in the Indic language domain.
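As a rough illustration of the prompt-completion behaviour described above, the sketch below loads the checkpoint with the Hugging Face transformers library, continues a Hindi prompt, and runs a crude tokenizer-fertility check. The repository id sarvamai/sarvam-2b-v0.5, the prompts, and the generation settings are assumptions made for illustration; the Colab notebook mentioned above remains the authoritative starting point.

```python
# Minimal sketch: text completion with the sarvam-2b-v0.5 early checkpoint.
# The repo id "sarvamai/sarvam-2b-v0.5" is assumed; adjust it to the actual hosting location.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sarvamai/sarvam-2b-v0.5"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")

# No post-training has been applied, so treat this as pure next-token completion:
# give the model the start of a sentence and let it continue it.
prompt = "भारतीय भाषाओं में मशीन लर्निंग का उपयोग"  # example Hindi prompt, not from the model card
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=80, do_sample=True, temperature=0.8, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

# Crude fertility estimate (average tokens per whitespace-separated word) for an Indic sentence,
# to sanity-check the tokenizer-efficiency claim above. Lower is better.
text = "मुझे भारतीय भाषाओं में काम करना पसंद है"
print(len(tokenizer(text).input_ids) / len(text.split()))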


Updated 9/18/2024