mbart-large-50

Maintainer: facebook

Total Score

120

Last updated 5/27/2024

  • Run this model: Run on HuggingFace
  • API spec: View on HuggingFace
  • Github link: No Github link provided
  • Paper link: No paper link provided


Model Overview

mbart-large-50 is a multilingual Sequence-to-Sequence model pre-trained using the "Multilingual Denoising Pretraining" objective. It was introduced in the Multilingual Translation with Extensible Multilingual Pretraining and Finetuning paper. The model was developed by Facebook and can be used for multilingual machine translation tasks.

Similar models include the XLM-RoBERTa (large-sized) and XLM-RoBERTa (base-sized) models, which are also multilingual transformer-based language models. The roberta-large-mnli and bart-large-mnli models are fine-tuned versions of RoBERTa and BART for natural language inference tasks.

Model Inputs and Outputs

The mbart-large-50 model is a multilingual Sequence-to-Sequence model, meaning it takes a sequence of text as input and generates a sequence of text as output.

Inputs

  • Source text: The text to be translated or transformed, in any of the 50 supported languages.
  • Language IDs: A special language ID token is used as a prefix in both the source and target text to indicate the language.

Outputs

  • Target text: The translated or transformed text, in the target language specified by the language ID.
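To make the language ID convention concrete, here is a minimal sketch (assuming the Hugging Face transformers library and its MBart50TokenizerFast class; the example sentence is just for illustration) showing the language ID token prefixed to the encoded source text:

```python
from transformers import MBart50TokenizerFast

# Declare the source and target languages by their language IDs
tokenizer = MBart50TokenizerFast.from_pretrained(
    "facebook/mbart-large-50", src_lang="en_XX", tgt_lang="ro_RO"
)

encoded = tokenizer("UN Chief Says There Is No Military Solution in Syria",
                    return_tensors="pt")

# The encoded source sequence starts with the language ID token and ends with </s>,
# e.g. ['en_XX', '▁UN', '▁Chief', ..., '</s>']
print(tokenizer.convert_ids_to_tokens(encoded["input_ids"][0]))
```

The target side follows the same convention, with the target-language ID token leading the output sequence.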

Capabilities

The mbart-large-50 model is primarily intended to be fine-tuned for multilingual machine translation. Once fine-tuned, it can translate between any of the 50 supported languages, including low-resource ones. The model can also be fine-tuned on other Sequence-to-Sequence tasks such as summarization and text generation.
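As a rough sketch of what a single translation fine-tuning step might look like (assuming a recent version of the Hugging Face transformers library; the English-Romanian sentence pair is illustrative, and a real run would iterate over a dataset with an optimizer):

```python
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50")
tokenizer = MBart50TokenizerFast.from_pretrained(
    "facebook/mbart-large-50", src_lang="en_XX", tgt_lang="ro_RO"
)

src_text = "UN Chief Says There Is No Military Solution in Syria"
tgt_text = "Şeful ONU declară că nu există o soluţie militară în Siria"

# The tokenizer builds the encoder inputs and the decoder labels in one call
batch = tokenizer(src_text, text_target=tgt_text, return_tensors="pt")

# One forward pass; the returned loss is what a fine-tuning loop would minimize
loss = model(**batch).loss
```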

What can I use it for?

You can use mbart-large-50 to build multilingual machine translation applications, where users can input text in one language and receive a translation in another. This could be useful for international businesses, travel apps, language learning platforms, and more.

The model can also be fine-tuned on other Sequence-to-Sequence tasks, like summarizing news articles in multiple languages or generating product descriptions in various languages. Developers can explore these possibilities on the Hugging Face model hub.

Things to try

One interesting thing to try with mbart-large-50 is zero-shot translation, where you input text in a language the model wasn't fine-tuned on and ask it to translate to another language. This can be a powerful capability for building flexible, multilingual applications.

You can also experiment with using the model for other Sequence-to-Sequence tasks beyond translation, like text summarization or data-to-text generation. The multilingual nature of the model may enable interesting cross-lingual capabilities in these areas as well.



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Models


mbart-large-50-many-to-many-mmt

facebook

Total Score

223

mbart-large-50-many-to-many-mmt is a multilingual machine translation model that can translate directly between any pair of 50 languages. It is a fine-tuned checkpoint of the mBART-large-50 model, introduced in the paper Multilingual Translation with Extensible Multilingual Pretraining and Finetuning. The model was developed by Facebook. Similar multilingual translation models include mbart-large-50-many-to-one-mmt, which can translate to English from the same 50 languages, and Llama2-13b-Language-translate, which can translate from English to the 49 other languages.

Model Inputs and Outputs

Inputs

  • Source text: The text to be translated, in one of the 50 supported languages.
  • Target language: The language to translate the source text into, specified by its language code.

Outputs

  • Translated text: The source text translated into the target language.

Capabilities

mbart-large-50-many-to-many-mmt can translate directly between any pair of the 50 supported languages, which include Arabic, Chinese, Hindi, and Spanish. This allows for high-quality multilingual translation without the need for pivot languages.

What can I use it for?

You can use mbart-large-50-many-to-many-mmt for a variety of multilingual translation tasks, such as:

  • Translating web content, documents, or other text between any of the 50 supported languages.
  • Facilitating cross-lingual communication and collaboration in multinational organizations.
  • Improving the accessibility of information for speakers of different languages.
  • Enhancing machine translation capabilities for commercial or research purposes.

See the model hub to explore more fine-tuned versions of the mBART-50 model.

Things to try

Try experimenting with different language combinations to see the model's performance across various language pairs. You can also fine-tune the model further on domain-specific data to improve its translation quality for your particular use case.
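A minimal sketch of direct translation with this checkpoint, using the Hugging Face transformers API (the Hindi source sentence and the Hindi-to-French direction are just for illustration):

```python
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50-many-to-many-mmt")
tokenizer = MBart50TokenizerFast.from_pretrained("facebook/mbart-large-50-many-to-many-mmt")

article_hi = "संयुक्त राष्ट्र के प्रमुख का कहना है कि सीरिया में कोई सैन्य समाधान नहीं है"

# Set the source language, then force the decoder to start with the French
# language ID token so the output comes out in French.
# (On newer transformers versions, tokenizer.convert_tokens_to_ids("fr_XX") also works.)
tokenizer.src_lang = "hi_IN"
encoded = tokenizer(article_hi, return_tensors="pt")
generated = model.generate(**encoded, forced_bos_token_id=tokenizer.lang_code_to_id["fr_XX"])
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```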


mbart-large-50-many-to-one-mmt

facebook

Total Score

54

mbart-large-50-many-to-one-mmt is a fine-tuned checkpoint of the mBART-large-50 model. It was introduced in the paper "Multilingual Translation with Extensible Multilingual Pretraining and Finetuning" and is a multilingual machine translation model that can translate directly into English from any of 50 languages. The underlying mBART-50 model is an extension of the original mBART, adding support for an additional 25 languages to create a 50-language multilingual translation system. It was pre-trained using a "Multilingual Denoising Pretraining" objective, where the model is tasked with reconstructing the original text from a noised version. This allows the model to learn a multilingual representation that can be effectively fine-tuned for translation tasks. Some similar models include the Llama2-13b-Language-translate model, which is also a fine-tuned multilingual translation model, and the M2M100-1.2B model, which can directly translate between 9,900 language directions across 100 languages.

Model Inputs and Outputs

Inputs

  • Source text in any of the 50 supported languages

Outputs

  • Translated text in English

Capabilities

The mbart-large-50-many-to-one-mmt model can translate directly into English from any of the 50 supported languages, which include a diverse set of languages such as Arabic, Chinese, Hindi, Russian, and more. This makes it a powerful tool for multilingual translation tasks.

What can I use it for?

The mbart-large-50-many-to-one-mmt model can be used for a variety of translation tasks, such as:

  • Translating content (e.g. articles, documents, websites) from different languages into English
  • Enabling cross-lingual communication and collaboration
  • Providing language support for global businesses or organizations
  • Assisting with language learning and education

See the model hub to explore other fine-tuned versions of the mBART-50 model that may be better suited for your specific use case.

Things to try

One interesting thing to try with this model is to explore how it handles translation into English from languages that are linguistically distant from it, such as Asian or low-resource languages. The model's multilingual pre-training should allow it to capture cross-lingual relationships, but the quality of the translations may vary depending on the source language. Translating from low-resource languages in particular can provide insight into the model's generalization capabilities.
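A minimal translation-into-English sketch with the Hugging Face transformers API (the Arabic source sentence is illustrative); because the target is always English, no forced BOS token is needed:

```python
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50-many-to-one-mmt")
tokenizer = MBart50TokenizerFast.from_pretrained("facebook/mbart-large-50-many-to-one-mmt")

article_ar = "الأمين العام للأمم المتحدة يقول إنه لا يوجد حل عسكري في سوريا."

# Only the source language needs to be set; the model always decodes into English.
tokenizer.src_lang = "ar_AR"
encoded = tokenizer(article_ar, return_tensors="pt")
generated = model.generate(**encoded)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```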


mbart-large-cc25

facebook

Total Score

60

mbart-large-cc25 is a pretrained multilingual mBART model from Facebook. It is a multilingual Sequence-to-Sequence model pre-trained using the "Multilingual Denoising Pretraining" objective. The model can be fine-tuned for a variety of tasks, including multilingual machine translation and summarization. Similar models include mBART-50, which is an extended version of mBART covering 50 languages, and mBART-50 fine-tuned models for specific tasks like many-to-many and many-to-one machine translation.

Model Inputs and Outputs

mbart-large-cc25 is a text-to-text transformer model. It takes in text in one of the 25 supported languages and generates translated text in another supported language.

Inputs

  • Text in one of the 25 supported languages, including Arabic, Czech, German, English, Spanish, and many others.

Outputs

  • Translated text in any of the 25 supported languages.

Capabilities

mbart-large-cc25 is capable of multilingual translation between any pair of its 25 supported languages. It can translate text from one language to another, and has been shown to perform well on a variety of translation tasks.

What can I use it for?

You can use mbart-large-cc25 for a range of multilingual text generation and translation use cases, such as:

  • Translating text between different languages
  • Building multilingual chatbots or virtual assistants
  • Powering language learning applications
  • Generating multilingual content for websites or apps

The model is particularly useful when you need to work with content in multiple languages.

Things to try

One interesting thing to try with mbart-large-cc25 is zero-shot cross-lingual transfer learning. Since the model is pre-trained on a wide range of languages, you may be able to fine-tune it on a task in one language and have it perform well on that task in other languages, without fine-tuning on data in those other languages. This could be a powerful technique for building multilingual NLP applications with limited training data.
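A brief sketch of a fine-tuning forward pass for this 25-language checkpoint; note that it uses the MBartTokenizer class rather than MBart50TokenizerFast (the English-Romanian pair is illustrative, and a real fine-tuning run would loop over a dataset with an optimizer):

```python
from transformers import MBartForConditionalGeneration, MBartTokenizer

tokenizer = MBartTokenizer.from_pretrained(
    "facebook/mbart-large-cc25", src_lang="en_XX", tgt_lang="ro_RO"
)
model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-cc25")

batch = tokenizer(
    "UN Chief Says There Is No Military Solution in Syria",
    text_target="Şeful ONU declară că nu există o soluţie militară în Siria",
    return_tensors="pt",
)
loss = model(**batch).loss  # the quantity a fine-tuning loop would minimize
```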


bart-large

facebook

Total Score

158

The bart-large model is a large-sized BART (Bidirectional and Auto-Regressive Transformer) model pre-trained on English. BART is a transformer encoder-decoder (seq2seq) model with a bidirectional (BERT-like) encoder and an autoregressive (GPT-like) decoder. It is pre-trained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original text. BART is particularly effective when fine-tuned for text generation (e.g. summarization, translation) but also works well for comprehension tasks (e.g. text classification, question answering). The bart-base model is a base-sized BART model with a similar architecture and training procedure. The bart-large-cnn model is the bart-large model fine-tuned on the CNN Daily Mail dataset, making it particularly effective for text summarization. The mbart-large-cc25 and mbart-large-50 models are multilingual BART models that can be used for various cross-lingual tasks. The roberta-large model is a large RoBERTa model, a transformer pre-trained on a large corpus of English data using a masked language modeling objective.

Model Inputs and Outputs

Inputs

  • Text: The bart-large model takes text as input, which can be a single sentence or a longer passage.

Outputs

  • Text: The bart-large model outputs text, which can be used for tasks like text generation, summarization, and translation.

Capabilities

The bart-large model is particularly effective at text generation and understanding tasks. It can be used for text summarization, translation, and question answering. For example, when fine-tuned on the CNN Daily Mail dataset, the bart-large-cnn model can generate concise summaries of news articles.

What can I use it for?

You can use the bart-large model for a variety of text-to-text tasks, such as summarization, translation, and text generation. The model hub has various fine-tuned versions of the BART model for different tasks, which you can use as a starting point for your own applications.

Things to try

One interesting thing to try with the bart-large model is text infilling, where you mask out parts of the input text and have the model generate the missing text. This can be useful for tasks like language modeling and text generation. You can also explore fine-tuning the model on your own dataset to adapt it to your specific use case.
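As a rough sketch of the text infilling idea mentioned above, using the Hugging Face transformers API (the example sentence and the top-5 readout are illustrative):

```python
from transformers import BartTokenizer, BartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large")

text = "My friends are <mask> but they eat too many carbs."
input_ids = tokenizer([text], return_tensors="pt")["input_ids"]
logits = model(input_ids).logits

# Inspect the model's top guesses for the masked position
masked_index = (input_ids[0] == tokenizer.mask_token_id).nonzero().item()
probs = logits[0, masked_index].softmax(dim=0)
values, predictions = probs.topk(5)
print(tokenizer.decode(predictions).split())
```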
