bloom

Maintainer: bigscience

Total Score

4.6K

Last updated 5/28/2024

Run this model: Run on HuggingFace
API spec: View on HuggingFace
GitHub link: No GitHub link provided
Paper link: No paper link provided


Model overview

BLOOM is a large language model developed by the BigScience collective, a group of over 1,000 researchers from around the world. It is a 176 billion parameter decoder-only transformer model trained on a dataset of over 1.5 TB of text data in 46 natural languages and 13 programming languages. Like other GPT-style models, BLOOM is trained to continue text from a prompt, producing coherent and contextually relevant output.

Similar models include the bloom-7b1 and bloomz variants, which are smaller models finetuned from BLOOM for different applications. The BLOOMChat-176B-v1 model, developed by SambaNova Systems, is an instruction-tuned version of BLOOM for conversational tasks.

Model inputs and outputs

BLOOM takes a text prompt as input and generates continuation text as output. The model can understand and generate text in 46 natural languages and 13 programming languages. Key highlights include the model's large scale, its multilingual coverage, and its use of ALiBi (Attention with Linear Biases) positional embeddings, which help the model handle long-range dependencies.
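The ALiBi scheme mentioned above replaces learned positional embeddings with a static, head-specific penalty added to attention scores before the softmax: the further a key token is from the query token, the larger the penalty. The sketch below illustrates the idea in plain Python; the function names are ours, and this is a didactic simplification rather than BLOOM's actual implementation.

```python
def alibi_slopes(n_heads):
    # Head-specific slopes form a geometric sequence 2^(-8/n), 2^(-16/n), ...
    # (the scheme described in the ALiBi paper for power-of-two head counts).
    return [2 ** (-8 * (k + 1) / n_heads) for k in range(n_heads)]

def alibi_bias(seq_len, slope):
    # Bias added to causal attention scores: -slope * distance between the
    # query position i and each earlier key position j (j <= i).
    return [[-slope * (i - j) for j in range(i + 1)] for i in range(seq_len)]

slopes = alibi_slopes(8)          # one slope per attention head
bias = alibi_bias(4, slopes[0])   # per-position biases for the first head
# bias[3] penalizes distant tokens linearly: [-1.5, -1.0, -0.5, 0.0]
```

Because the penalty grows linearly with distance rather than being tied to trained position indices, ALiBi-style models tend to degrade gracefully on sequences longer than those seen in training.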

Inputs

  • Text prompt: A sequence of text, which the model will use to generate a continuation.
  • Sequence length: BLOOM accepts sequences up to 2048 tokens in length.

Outputs

  • Generated text: A continuation of the prompt, produced one token at a time according to the model's predicted next-token probabilities (via greedy selection or sampling).
  • Likelihood: A score for the generated text, derived from the probabilities the model assigns to each of its tokens.
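The likelihood above is typically reported as the sum (or per-token mean) of log-probabilities over the generated tokens. A minimal, library-free sketch of that bookkeeping (the function name is our own):

```python
import math

def sequence_log_likelihood(token_probs):
    """Sum of log-probabilities the model assigned to each generated token.

    token_probs: the probability of each token at the step it was generated.
    Higher (less negative) totals mean the model found the sequence more likely.
    """
    return sum(math.log(p) for p in token_probs)

# A 3-token continuation where the model was fairly confident at each step:
ll = sequence_log_likelihood([0.9, 0.8, 0.7])
per_token = ll / 3  # mean log-likelihood, useful when comparing different lengths
```

Normalizing by length, as in `per_token`, matters when comparing candidate continuations of different sizes, since longer sequences accumulate more negative log-probability by construction.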

Capabilities

BLOOM is a highly capable language model suited to a wide variety of text-related tasks. It can be used for open-ended generation, such as creative writing or story generation, and for more structured tasks like translation, summarization, and question answering when those are framed as text-generation problems.

What can I use it for?

BLOOM's large scale and multilingual capabilities make it a powerful tool for research and development in natural language processing. Researchers can use BLOOM as a starting point for fine-tuning on specific tasks, or analyze its internal representations to gain insights into language learning. Developers can also integrate BLOOM into applications that require language understanding and generation, such as chatbots, virtual assistants, and language learning tools.

However, it's important to note that BLOOM is not intended for use in high-stakes or safety-critical applications, as it can produce incorrect or biased information. Users should carefully evaluate the model's outputs and take appropriate precautions when deploying BLOOM-based systems.

Things to try

One interesting aspect of BLOOM is its ability to work across languages. You could try prompting the model in one language and asking it to continue or translate the text in another. Another interesting experiment is to explore BLOOM's performance on programming-language tasks, such as code generation or code explanation.

Additionally, you could investigate BLOOM's few-shot or zero-shot learning capabilities by framing tasks as text generation problems and seeing how the model performs without fine-tuning. This could provide insights into the model's general language understanding abilities.
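With a decoder-only model like BLOOM, zero- and few-shot prompting amounts to concatenating an instruction, some worked examples, and the query into a single prompt, then letting the model continue the text. A minimal sketch of that prompt assembly (the template format here is our own illustration, not an official BLOOM convention):

```python
def build_few_shot_prompt(task_description, examples, query):
    """Assemble a few-shot prompt: instructions, worked examples, then the query.

    examples: list of (input_text, output_text) pairs the model should imitate.
    The model is expected to continue the text after the final 'Output:'.
    """
    lines = [task_description, ""]
    for inp, out in examples:
        lines.append(f"Input: {inp}")
        lines.append(f"Output: {out}")
        lines.append("")
    lines.append(f"Input: {query}")
    lines.append("Output:")
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    "Translate English to French.",
    [("Hello", "Bonjour"), ("Thank you", "Merci")],
    "Good night",
)
```

Passing an empty `examples` list yields a zero-shot prompt; comparing the model's completions across zero-, one-, and few-shot variants of the same query is a simple way to probe its in-context learning.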



This summary was produced with help from an AI and may contain inaccuracies; check the links to read the original source documents!

Related Models

bloom-1b1

bigscience

Total Score

53

bloom-1b1 is a large open-source multilingual language model developed by the BigScience research workshop. It is a transformer-based model trained on a diverse dataset of 45 natural languages and 12 programming languages, spanning over 1.5TB of text data. With 1,065,314,304 parameters, it is a substantial language model capable of generating coherent text across a wide range of topics and languages. The bloom-1b1 model is similar in scale and capabilities to other BigScience models like bloom-7b1 and bloom-1b7; these models share the same underlying architecture and training approach but differ in the total number of parameters.

Model inputs and outputs

Inputs

  • Natural language prompts in any of the 45 supported languages
  • Programming language prompts in any of the 12 supported languages

Outputs

  • Coherent text continuations of the provided prompts, reflecting the model's ability to understand and generate language across a diverse set of domains

Capabilities

The bloom-1b1 model generates fluent, coherent text in response to a wide variety of prompts, across both natural languages and programming languages. It can be used for tasks like language translation, question answering, summarization, and creative writing. Its large scale and broad training data allow it to draw insights and make connections that can lead to novel and interesting outputs.

What can I use it for?

The bloom-1b1 model is well suited to research and experimentation with large language models. Researchers can use it to explore phenomena like multilingual language understanding, zero-shot learning, and the capabilities and limitations of transformer-based models at scale. Developers may find it useful as a starting point for applications that require natural language processing or generation, such as chatbots, content creation tools, or language learning platforms. The model's broad capabilities and licensing make it an accessible resource for a variety of use cases.

Things to try

One interesting aspect of the bloom-1b1 model is its ability to generate text in programming languages. Developers could experiment with using the model to assist with code generation, documentation writing, or even creative programming tasks. Its multilingual capabilities also open up possibilities for building language-agnostic applications or exploring cross-cultural perspectives.

Another avenue to explore is the model's performance on specialized tasks or domains. While the model was trained on a diverse dataset, its outputs may still reflect biases or limitations in the training data. Evaluating its behavior on tasks related to sensitive topics, such as politics or social issues, can provide valuable insight into its strengths and weaknesses.


bloom-7b1

bigscience

Total Score

184

bloom-7b1 is a 7 billion parameter multilingual language model developed by the BigScience collaborative research workshop. It was pretrained on a large, diverse dataset of 341.6 billion tokens in 46 languages. The model uses a transformer-based architecture similar to GPT-2, with modifications such as layer normalization on the word embeddings, ALiBi positional encodings, and GeLU activation functions. bloom-7b1 is part of the larger BLOOM model family, which includes variants ranging from 560 million to 176 billion parameters. The BLOOMZ model is a finetuned version of bloom-7b1 that has been optimized for cross-lingual tasks and understanding.

Model inputs and outputs

bloom-7b1 is a text-to-text model that can be used for a variety of natural language processing tasks. It takes text as input and generates relevant text as output.

Inputs

  • Free-form text in multiple languages, such as prompts, instructions, or questions

Outputs

  • Relevant text responses generated based on the input; the model can be used for tasks like translation, question answering, and open-ended text generation

Capabilities

bloom-7b1 has strong multilingual capabilities and can understand and generate text in 46 different languages. The model has shown promising performance on a variety of benchmarks, including translation, language understanding, and open-ended generation tasks.

What can I use it for?

bloom-7b1 can be used for a wide range of natural language processing applications, such as:

  • Translation: translating text between supported languages
  • Question answering: answering questions based on provided context
  • Summarization: generating concise summaries of longer text
  • Text generation: producing coherent, human-like text based on prompts

The model's multilingual capabilities make it particularly useful for projects that involve working with text in multiple languages. Developers and researchers can fine-tune bloom-7b1 on domain-specific data to adapt it for their particular use cases.

Things to try

Some interesting things to try with bloom-7b1 include:

  • Experimenting with different prompting techniques to see how the model responds to various types of input
  • Evaluating the model's performance on specialized benchmarks or datasets relevant to your application
  • Exploring the model's ability to handle long-form text, such as generating multi-paragraph responses
  • Investigating how performance varies across different languages and language pairs

By leveraging the capabilities of bloom-7b1, you can unlock new possibilities for your natural language processing projects.


bloom-3b

bigscience

Total Score

85

The bloom-3b is a large language model developed by the BigScience workshop, a collaborative research effort to create open-access multilingual language models. It is a transformer-based model trained on a diverse dataset of 46 natural languages and 13 programming languages, totaling 1.6TB of preprocessed text. It sits between models like bloom-1b1 and bloom-7b1 in scale, sharing their architecture and language coverage but differing in the total number of parameters.

Model inputs and outputs

The bloom-3b is an autoregressive language model, meaning it takes text as input and generates additional text as output. It can be instructed to perform a variety of text generation tasks, such as continuing a given prompt, rewriting text with a different tone or perspective, or answering questions.

Inputs

  • Text prompt: A sequence of text that the model will use to generate additional content.

Outputs

  • Generated text: The model's continuation of the input prompt, producing coherent and contextually relevant text.

Capabilities

The bloom-3b model has impressive multilingual capabilities and can generate fluent text in 46 natural languages and 13 programming languages. It can be used for a variety of text-based tasks, such as language translation, code generation, and creative writing. However, the model may exhibit biases and limitations, and its outputs should not be treated as factual or reliable in high-stakes settings.

What can I use it for?

The bloom-3b model can be used for a variety of language-related tasks, such as text generation, language translation, and code generation. For example, you could use it to generate creative stories, summarize long documents, or write code in multiple programming languages. Its multilingual capabilities also make it a useful tool for cross-language communication and collaboration.

Things to try

One interesting thing to try with the bloom-3b model is to give it prompts that combine multiple languages or mix natural language and code. This can reveal insights about the model's understanding of language structure and its ability to switch between different modes of expression. You can also experiment with prompts that require a specific tone, style, or perspective, and observe how the model adapts its generated text accordingly.


bloom-560m

bigscience

Total Score

326

The bloom-560m is a large language model developed by the BigScience research collective. It is a transformer-based model trained on a vast multilingual dataset spanning 45 natural languages and 12 programming languages. The model is part of the BLOOM family of language models, which also includes the larger bloom-1b1 and bloom-1b7 models. These models are designed to enable public research on large language models and can be used for a variety of text generation tasks.

Model inputs and outputs

The bloom-560m model takes text prompts as input and generates coherent text outputs in response. The model was trained on a diverse dataset, allowing it to understand and generate text in multiple languages.

Inputs

  • Text prompts in a variety of languages, including natural languages and programming languages

Outputs

  • Generated text in response to the input prompts; the output can be in the same language as the prompt, or in a different language if the model is instructed to translate or to generate text in a specific language

Capabilities

The bloom-560m model is capable of generating coherent and contextually relevant text in a wide range of languages. It can be used for tasks like language translation, text summarization, and creative writing. Its multilingual capabilities make it a valuable tool for researchers and developers working on multilingual applications.

What can I use it for?

The bloom-560m model can be used for a variety of text-based tasks, such as:

  • Text generation: generating coherent text in response to prompts, for creative writing, content generation, and more
  • Language modeling: exploring the characteristics of the language the model generates, which can provide insights into language use and patterns
  • Language translation: translating text from one language to another, leveraging the model's multilingual capabilities
  • Downstream tasks: using bloom-560m as a pre-trained base for fine-tuning on specific tasks, such as question answering, information extraction, or summarization

Researchers and developers can use the bloom-560m model to explore the capabilities of large language models and build applications that leverage them.

Things to try

One interesting aspect of the bloom-560m model is its ability to generate text in a wide range of programming languages. Developers can experiment with using the model to generate code snippets, explore how it represents programming concepts, or fine-tune it on specific programming tasks.

Another direction to explore is the model's multilingual capabilities. Users can provide prompts in different languages and observe how the model responds, or experiment with cross-lingual tasks like translating between languages.

Overall, the bloom-560m model offers a rich set of capabilities for researchers and developers to explore, and the similar models and related research papers linked here can serve as a valuable starting point for further investigation.
