NuExtract-large

Maintainer: numind

Total Score: 102

Last updated 7/26/2024


  • Run this model: Run on HuggingFace
  • API spec: View on HuggingFace
  • Github link: No Github link provided
  • Paper link: No paper link provided


Model overview

NuExtract-large is a version of the Phi-3-small model, fine-tuned by NuMind on a private high-quality synthetic dataset for information extraction. It is a text-to-text model designed for extracting structured information from input text.

Compared to similar models like NuNER-v0.1 and NuNER-multilingual-v0.1, which focus on entity recognition, NuExtract-large is specialized for more general information extraction tasks. It can extract relevant information from input text based on a provided JSON template.

Model inputs and outputs

NuExtract-large is a text-to-text model: it takes input text and a JSON template, and generates the extracted information as output.

Inputs

  • Input text: The text from which the model will extract information, up to 2000 tokens long.
  • JSON template: A JSON template that describes the information the user wants to extract from the input text.
  • Example output: An optional example of the desired output formatting to help the model understand the task.

Outputs

  • Extracted information: The model's attempt at extracting the requested information from the input text, formatted according to the provided JSON template.
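The inputs above are typically assembled into a single prompt string before being passed to the model. The sketch below shows one way to do this; the `<|input|>`/`<|output|>` markers and `###` section headers are assumptions that should be verified against the model card on HuggingFace:

```python
import json

def build_nuextract_prompt(text, template, example=""):
    """Assemble input text, a JSON template, and an optional example
    output into one prompt string. The <|input|>/<|output|> markers and
    section headers are assumptions, not the model's confirmed format."""
    parts = ["<|input|>", "### Template:", json.dumps(template, indent=4)]
    if example:
        parts += ["### Example:", example]
    parts += ["### Text:", text, "<|output|>"]
    return "\n".join(parts)

prompt = build_nuextract_prompt(
    "NuMind released NuExtract-large in 2024.",
    {"company": "", "model": "", "year": ""},
)
print(prompt)
```

Keeping the template as a Python dict and serializing it with `json.dumps` avoids malformed JSON in the prompt, which the model has no way to recover from.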

Capabilities

NuExtract-large is capable of extracting structured information from input text based on a provided template. It can handle a variety of information extraction tasks, from extracting key entities and facts to summarizing longer passages of text.

The model's fine-tuning on a high-quality synthetic dataset gives it strong performance on information extraction, as evidenced by its benchmark results: it outperforms the base Phi-3-small model on these tasks.

What can I use it for?

NuExtract-large could be useful for a variety of applications that require extracting structured information from text, such as:

  • Automating data entry from documents or web pages
  • Summarizing long passages of text into key facts and entities
  • Powering intelligent search and question-answering systems
  • Streamlining business processes by extracting relevant information

Companies could potentially monetize NuExtract-large by building applications and services that leverage its information extraction capabilities, such as NuExtract from the model's maintainer NuMind.

Things to try

One interesting thing to try with NuExtract-large is using it to extract information from longer, more complex input texts. The model's fine-tuning on a high-quality dataset suggests it may be able to handle these types of inputs well, going beyond simple entity extraction to summarize key facts and relationships.

Another idea is to experiment with providing different levels of detail in the JSON template and example output to see how it affects the model's performance. This could help refine the template and instructions to get the most accurate extractions for your specific use case.



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Models


NuExtract

numind

Total Score: 140

NuExtract is a version of the phi-3-mini model, fine-tuned by numind on a private high-quality synthetic dataset for information extraction tasks. Compared to the base model, NuExtract is tailored for extracting specific information from input text. Other similar models from numind include the larger NuExtract-large and smaller NuExtract-tiny versions.

Model inputs and outputs

The NuExtract model takes two main inputs: a text passage (up to 2000 tokens) and a JSON template describing the information to extract. The model is purely extractive, meaning its output will consist of text directly present in the original input. Users can also provide an example output format to help the model understand the task more precisely.

Inputs

  • Text passage: A text document up to 2000 tokens in length
  • JSON template: A JSON object describing the information to extract from the text

Outputs

  • Extracted information: The relevant text from the input passage, formatted according to the provided JSON template or example

Capabilities

The NuExtract model excels at extracting specific pieces of information from input text. It can handle a variety of extraction tasks, such as pulling key facts, entities, or other structured data from documents. By fine-tuning the base phi-3-mini model, NuExtract has gained specialized capabilities for this type of information extraction while maintaining the strong reasoning and language understanding abilities of the original model.

What can I use it for?

The NuExtract model could be useful for any application that requires extracting structured data from text, such as:

  • Automating information retrieval from business documents or reports
  • Populating databases or knowledge graphs from unstructured data sources
  • Powering intelligent search or question-answering systems
  • Summarizing key details from lengthy technical or scientific papers

Since NuExtract is a fine-tuned version of a larger language model, it can also serve as a starting point for further customization and fine-tuning to meet the needs of specific domains or use cases.

Things to try

One interesting aspect of NuExtract is its ability to handle both the text input and the JSON template in a unified way. This allows for greater flexibility in how the extraction task is specified, as users can experiment with different template formats or even provide examples to guide the model's output. Developers could also explore combining NuExtract with other numind models, such as the SOTA Multilingual Entity Recognition Foundation Model, to tackle more complex information extraction challenges.
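Because the model is purely extractive, a simple sanity check on its output is to verify that every extracted string literally occurs in the source text. The helper below is an illustrative sketch, not part of any NuExtract API:

```python
import json

def check_extractive(source_text, extraction_json):
    """Return a mapping of extracted field -> whether its value is
    literally present in the source text (empty values are skipped)."""
    extracted = json.loads(extraction_json)
    report = {}

    def walk(value, path):
        # Recurse through nested dicts/lists, checking leaf strings.
        if isinstance(value, dict):
            for k, v in value.items():
                walk(v, f"{path}.{k}" if path else k)
        elif isinstance(value, list):
            for i, v in enumerate(value):
                walk(v, f"{path}[{i}]")
        elif isinstance(value, str) and value:
            report[path] = value in source_text

    walk(extracted, "")
    return report

source = "We introduce Mistral 7B, a 7-billion-parameter language model."
output = '{"model": {"name": "Mistral 7B", "size": "7-billion-parameter"}}'
print(check_extractive(source, output))
# -> {'model.name': True, 'model.size': True}
```

Any field that comes back `False` indicates the model paraphrased or hallucinated rather than extracted, which is a useful signal when tuning templates.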



NuNER-v0.1

numind

Total Score: 57

The NuNER-v0.1 model is an English language entity recognition model fine-tuned from the RoBERTa-base model by the team at NuMind. This model provides strong token embeddings for entity recognition tasks in English. It was the prototype for the NuNER v1.0 model, which is the version reported in the paper introducing the model.

The NuNER-v0.1 model outperforms the base RoBERTa-base model on entity recognition, achieving an F1 macro score of 0.7500 compared to 0.7129 for RoBERTa-base. Combining the last and second-to-last hidden states further improves performance to 0.7686 F1 macro. Other notable entity recognition models include bert-base-NER, a BERT-base model fine-tuned on the CoNLL-2003 dataset, and roberta-large-ner-english, a RoBERTa-large model fine-tuned for English NER.

Model inputs and outputs

Inputs

  • Text: The model takes in raw text as input, which it then tokenizes and encodes for processing.

Outputs

  • Entity predictions: The model outputs a sequence of entity predictions for the input text, classifying each token as belonging to one of the four entity types: location (LOC), organization (ORG), person (PER), or miscellaneous (MISC).
  • Token embeddings: The model can also be used to extract token-level embeddings, which can be useful for downstream tasks. The author suggests using the concatenation of the last and second-to-last hidden states for better quality embeddings.

Capabilities

The NuNER-v0.1 model is highly capable at recognizing entities in English text, surpassing the base RoBERTa model on the CoNLL-2003 NER dataset. It can accurately identify locations, organizations, people, and miscellaneous entities within input text. This makes it a powerful tool for applications that require understanding the entities mentioned in documents, such as information extraction, knowledge graph construction, or content analysis.

What can I use it for?

The NuNER-v0.1 model can be used for a variety of applications that involve identifying and extracting entities from English text. Some potential use cases include:

  • Information extraction: The model can be used to automatically extract key entities (people, organizations, locations, etc.) from documents, articles, or other text-based data sources.
  • Knowledge graph construction: The entity predictions from the model can be used to populate a knowledge graph with structured information about the entities mentioned in a corpus.
  • Content analysis: By understanding the entities present in text, the model can enable more sophisticated content analysis tasks, such as topic modeling, sentiment analysis, or text summarization.
  • Chatbots and virtual assistants: The entity recognition capabilities of the model can be leveraged to improve the natural language understanding of chatbots and virtual assistants, allowing them to better comprehend user queries and respond appropriately.

Things to try

One interesting aspect of the NuNER-v0.1 model is its ability to produce high-quality token embeddings by concatenating the last and second-to-last hidden states. These embeddings could be used as input features for a wide range of downstream NLP tasks, such as text classification, named entity recognition, or relation extraction. Experimenting with different ways of utilizing these embeddings, such as fine-tuning on domain-specific datasets or combining them with other model architectures, could lead to exciting new applications and performance improvements.

Another avenue to explore would be comparing the NuNER-v0.1 model's performance on different types of text data, beyond the news-based CoNLL-2003 dataset used for evaluation. Trying the model on more informal, conversational text (e.g., social media, emails, chat logs) could uncover interesting insights about its generalization capabilities and potential areas for improvement.
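The embedding trick described above, concatenating the last and second-to-last hidden states, can be sketched numerically. The hidden states below are random NumPy stand-ins; in practice they would come from running the model with `output_hidden_states=True` in `transformers`, which returns one array per layer of shape `(batch, seq_len, hidden_size)`:

```python
import numpy as np

# Stand-ins for the model's hidden states: a RoBERTa-base model with
# output_hidden_states=True yields 13 layers (embeddings + 12 blocks),
# each of shape (batch, seq_len, hidden_size=768).
batch, seq_len, hidden_size = 1, 8, 768
rng = np.random.default_rng(0)
hidden_states = [
    rng.standard_normal((batch, seq_len, hidden_size)) for _ in range(13)
]

# Concatenate the last and second-to-last layers along the feature axis,
# producing one 1536-dimensional embedding per token.
token_embeddings = np.concatenate([hidden_states[-1], hidden_states[-2]], axis=-1)
print(token_embeddings.shape)  # (1, 8, 1536)
```

Each token's embedding doubles in width, which downstream classifiers (e.g. a linear probe for NER) consume directly as input features.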


๐Ÿ…

bert-large-NER

dslim

Total Score: 127

bert-large-NER is a fine-tuned BERT model that is ready to use for Named Entity Recognition and achieves state-of-the-art performance for the NER task. It has been trained to recognize four types of entities: location (LOC), organization (ORG), person (PER), and miscellaneous (MISC). Specifically, this model is a bert-large-cased model that was fine-tuned on the English version of the standard CoNLL-2003 Named Entity Recognition dataset. If you'd like to use a smaller BERT model fine-tuned on the same dataset, a bert-base-NER version is also available from the same maintainer, dslim.

Model inputs and outputs

Inputs

  • A text sequence to analyze for named entities

Outputs

  • A list of recognized entities, their type (LOC, ORG, PER, MISC), and their position in the input text

Capabilities

bert-large-NER can accurately identify and classify named entities in English text, such as people, organizations, locations, and miscellaneous entities. It outperforms previous state-of-the-art models on the CoNLL-2003 NER benchmark.

What can I use it for?

You can use bert-large-NER for a variety of applications that involve named entity recognition, such as:

  • Information extraction from text documents
  • Knowledge base population by identifying key entities
  • Chatbots and virtual assistants to understand user queries
  • Content analysis and categorization

The high performance of this model makes it a great starting point for building NER-based applications.

Things to try

One interesting thing to try with bert-large-NER is analyzing text from different domains beyond news articles, which was the primary focus of the CoNLL-2003 dataset. The model may perform differently on text from social media, scientific publications, or other genres. Experimenting with fine-tuning or ensembling the model for specialized domains could lead to further performance improvements.
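The model's per-token IOB predictions (B-XXX, I-XXX, O) are usually grouped into entity spans before use. The sketch below shows that post-processing step on hand-written tags rather than real model output; in `transformers`, `pipeline("ner", aggregation_strategy="simple")` performs this grouping for you:

```python
def group_entities(tokens, tags):
    """Group IOB-tagged tokens (B-XXX / I-XXX / O) into entity spans."""
    entities, current = [], None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A B- tag starts a new entity, closing any open one.
            if current:
                entities.append(current)
            current = {"type": tag[2:], "text": token}
        elif tag.startswith("I-") and current and tag[2:] == current["type"]:
            # An I- tag of the same type continues the open entity.
            current["text"] += " " + token
        else:
            # O (or a mismatched I-) closes the open entity.
            if current:
                entities.append(current)
            current = None
    if current:
        entities.append(current)
    return entities

tokens = ["Wolfgang", "lives", "in", "Berlin", "."]
tags = ["B-PER", "O", "O", "B-LOC", "O"]
print(group_entities(tokens, tags))
# -> [{'type': 'PER', 'text': 'Wolfgang'}, {'type': 'LOC', 'text': 'Berlin'}]
```

Handling the B-/I- distinction explicitly matters when two entities of the same type are adjacent, a case naive tag-merging gets wrong.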



Phi-3-small-8k-instruct

microsoft

Total Score: 108

The Phi-3-small-8k-instruct is a 7B parameter, lightweight, state-of-the-art open model from Microsoft. It is part of the Phi-3 family of models, which includes variants with different context lengths: 8K and 128K. The Phi-3 models are trained on a combination of synthetic data and filtered public websites, with a focus on high-quality and reasoning-dense properties. The Phi-3-small-8k-instruct model has undergone a post-training process that incorporates both supervised fine-tuning and direct preference optimization to enhance its ability to follow instructions and adhere to safety measures. When evaluated on benchmarks testing common sense, language understanding, math, code, long context, and logical reasoning, the model demonstrated robust and state-of-the-art performance among models of similar size.

Model inputs and outputs

Inputs

  • Text prompts, best suited for the chat format

Outputs

  • Generated text responses to the input prompts

Capabilities

The Phi-3-small-8k-instruct model excels at tasks that require strong reasoning, such as math, coding, and logical analysis. It can provide detailed and coherent responses across a wide range of topics.

What can I use it for?

The Phi-3-small-8k-instruct model is intended for broad commercial and research use in English. It can be used in general-purpose AI systems and applications that require memory/compute constrained environments, low-latency scenarios, or robust reasoning capabilities. The model can accelerate research on language and multimodal models, and serve as a building block for generative AI-powered features.

Things to try

One interesting aspect of the Phi-3-small-8k-instruct model is its ability to provide step-by-step explanations and solutions for math and coding problems. You can try prompting the model with math equations or coding challenges and observe how it breaks down the problem and walks through the solution.

Another interesting area to explore is the model's language understanding and common sense reasoning capabilities. You can provide it with prompts that require an understanding of the physical world, social norms, or abstract concepts, and see how it responds.
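Prompts in the chat format are normally passed as a list of role/content messages; in `transformers`, `tokenizer.apply_chat_template` renders them using the template that ships with the tokenizer. The manual rendering below is only a sketch, and the `<|user|>`/`<|assistant|>`/`<|end|>` tags are assumptions to check against the model card:

```python
def render_chat(messages):
    """Render chat messages into a single prompt string.
    The tag names here are assumptions; the authoritative template
    ships with the tokenizer (tokenizer.apply_chat_template)."""
    parts = [f"<|{m['role']}|>\n{m['content']}<|end|>" for m in messages]
    # Leave the assistant turn open so the model generates the reply.
    return "\n".join(parts) + "\n<|assistant|>\n"

messages = [
    {"role": "user", "content": "Solve step by step: what is 12 * 17?"},
]
prompt = render_chat(messages)
print(prompt)
```

Ending the prompt with an open assistant tag is what cues an instruct-tuned model to produce its turn rather than continue the user's text.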
