
# gte-small

Supabase


## Model overview

The gte-small model is the smallest member of the General Text Embeddings (GTE) family developed by the Alibaba DAMO Academy. The GTE models are based on the BERT framework and come in several sizes: GTE-large, GTE-base, and GTE-small. They are trained on a large-scale corpus of relevance text pairs covering a wide range of domains and scenarios, which allows them to be applied to various downstream text-embedding tasks such as information retrieval, semantic textual similarity, and text reranking. At just 0.07 GB, with an embedding dimension of 384, gte-small is more lightweight and efficient than the larger GTE models. According to the reported metrics, it performs well on a variety of tasks, including clustering, pair classification, retrieval, and semantic textual similarity.

## Model inputs and outputs

### Inputs

- Text inputs of up to 512 tokens

### Outputs

- Numeric text embeddings representing the semantic meaning of the input text

## Capabilities

The gte-small model generates high-quality text embeddings that capture the semantic meaning of input text. These embeddings can be used for a variety of natural language processing tasks, such as information retrieval, text classification, and semantic search. The model's performance on the MTEB benchmark suggests it can be a useful tool for these kinds of applications.

## What can I use it for?

The gte-small model can be used for any natural language processing task that relies on text embeddings. For example, you could use it for:

- **Information retrieval**: Retrieve relevant documents or web pages for a user's query by comparing the query's embedding to the embeddings of the documents.
- **Semantic textual similarity**: Measure the semantic similarity between two pieces of text by comparing their embeddings.
- **Text reranking**: Reorder a list of documents by their relevance to a given query using the text embeddings.

## Things to try

One interesting experiment is to compare gte-small's performance on your downstream tasks against the larger GTE models, GTE-large and GTE-base. This can help you understand the trade-offs between model size, complexity, and performance for your specific use case. You could also try fine-tuning gte-small on your own dataset to see whether you can further improve its performance on your particular task.
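All of the use cases above reduce to the same core operation: comparing embedding vectors, most commonly with cosine similarity. Below is a minimal sketch of similarity-based ranking. The toy 4-dimensional vectors are stand-ins for gte-small's 384-dimensional embeddings; in practice you would first obtain the vectors by running your texts through the model.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_by_similarity(query_emb: np.ndarray, doc_embs: list) -> list:
    """Return document indices sorted by descending similarity to the query."""
    scores = [cosine_similarity(query_emb, d) for d in doc_embs]
    return sorted(range(len(doc_embs)), key=lambda i: scores[i], reverse=True)

# Toy 4-dimensional vectors standing in for gte-small's 384-dim embeddings.
query = np.array([1.0, 0.0, 0.5, 0.0])
docs = [
    np.array([0.9, 0.1, 0.4, 0.0]),  # points in nearly the same direction as the query
    np.array([0.0, 1.0, 0.0, 0.8]),  # orthogonal to the query (unrelated)
]
print(rank_by_similarity(query, docs))  # → [0, 1]
```

The same ranking step serves retrieval (rank a corpus against a query) and reranking (reorder a candidate list); semantic textual similarity is just the pairwise `cosine_similarity` score itself.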


Updated 6/29/2024