Zero-Shot Hierarchical Classification on the Common Procurement Vocabulary Taxonomy

Read original: arXiv:2405.09983 - Published 5/31/2024 by Federico Moiraghi, Matteo Palmonari, Davide Allavena, Federico Morando

Overview

This paper presents a zero-shot hierarchical classification approach for assigning public tenders to categories of the Common Procurement Vocabulary (CPV), a large taxonomy defined by the European Union.
The authors build on a pre-trained language model that relies only on label descriptions and respects the label taxonomy, and train it on Italian public contracts collected by contrattipubblici.org.
According to the abstract, the proposed model classifies low-frequency classes better than three baselines and can also predict classes it never saw during training.

Plain English Explanation

In this research, the authors tackled the problem of classifying public tenders into a hierarchical taxonomy called the Common Procurement Vocabulary (CPV). This is a complex task because the CPV taxonomy has a deep, multi-level structure with thousands of categories, some of which have few or no labeled examples while others are far more frequent than average.

The key idea is to use a pre-trained language model, which has already learned general language patterns from a large amount of text. Such a model can relate the text of a tender to the textual description of each CPV category, which makes it possible to predict a category from its description alone.

Here, "zero-shot" means the model can assign categories for which it saw no labeled examples during training, because it relies on label descriptions rather than on per-class training data. The researchers report that this approach beats three baselines on low-frequency classes, which suggests that pre-trained language models are a good fit for classification tasks with many rare categories.

Technical Explanation

The authors propose a zero-shot hierarchical classification approach for the CPV taxonomy, a large-scale, multi-level hierarchy of procurement categories. They build on a pre-trained language model that relies only on label descriptions and respects the label taxonomy, and they train it on public contracts stipulated in Italy over the last 25 years, collected by contrattipubblici.org (a service by SpazioDati s.r.l.).
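To make the "multi-level hierarchy" concrete, here is a minimal sketch of how a CPV code encodes its own ancestors. It assumes the standard CPV layout (eight digits plus an optional check digit, where the first two digits identify the division, the first three the group, the first four the class, and the first five the category). The function name `cpv_ancestors` is hypothetical and is not taken from the paper, and the sketch does not check that each derived ancestor actually exists in the published vocabulary.

```python
from __future__ import annotations


def cpv_ancestors(code: str) -> list[str]:
    """Return the chain of 8-digit codes from the division down to `code`.

    Assumption: ancestors are obtained by zero-padding prefixes of the code.
    An optional "-Y" check-digit suffix is ignored.
    """
    digits = code.split("-")[0]
    if len(digits) != 8 or not digits.isdigit():
        raise ValueError(f"expected an 8-digit CPV code, got {code!r}")

    # Trailing zeros carry no information; a division always keeps 2 digits.
    significant = max(len(digits.rstrip("0")), 2)

    chain: list[str] = []
    for n in range(2, significant + 1):
        prefix = digits[:n].ljust(8, "0")
        # An inner zero digit yields the same padded code twice; keep it once.
        if not chain or chain[-1] != prefix:
            chain.append(prefix)
    return chain


if __name__ == "__main__":
    print(cpv_ancestors("03111000"))
```

A classifier that "respects the label taxonomy" can use such a chain to ensure that a predicted fine-grained code is consistent with the coarser levels above it.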

Specifically, the pre-trained language model is used to represent both the tender descriptions and the CPV label descriptions as text embeddings. Each tender is then classified by finding the closest matching CPV category in this shared embedding space, following the structure of the taxonomy.
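The closest-match idea described above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: a toy bag-of-words vectorizer (`embed`) stands in for the pre-trained model's encoder, the six-node `TAXONOMY` and its label texts are invented for the example and are not real CPV entries, and the top-down descent in `classify` is one simple way to respect a hierarchy.

```python
import math
from collections import Counter

# Hypothetical miniature taxonomy: code -> (label description, parent code).
# These labels are illustrative only, not real CPV entries.
TAXONOMY = {
    "A": ("construction work buildings roads", None),
    "B": ("office equipment computers software", None),
    "A1": ("road construction asphalt paving", "A"),
    "A2": ("building construction housing schools", "A"),
    "B1": ("computers laptops servers hardware", "B"),
    "B2": ("software licences development services", "B"),
}


def embed(text):
    """Toy stand-in for a pre-trained encoder: bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def children(parent):
    """Codes whose parent is `parent` (None selects the top level)."""
    return [code for code, (_, p) in TAXONOMY.items() if p == parent]


def classify(text):
    """Descend the taxonomy, picking the most similar child at each level."""
    query = embed(text)
    path, node = [], None
    while children(node):
        node = max(
            children(node),
            key=lambda code: cosine(query, embed(TAXONOMY[code][0])),
        )
        path.append(node)
    return path


if __name__ == "__main__":
    print(classify("paving of asphalt roads"))
    print(classify("purchase of software licences"))
```

Because every prediction is driven by a label description, a category with no training examples can still be selected, which is the sense of "zero-shot" used here. The toy encoder only matches exact words; a pre-trained language model would also be expected to match paraphrases and related terms.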

Experiments on Italian public-contract data show that the proposed model outperforms three different baselines on low-frequency classes and can predict never-seen classes. A plausible explanation is that the pre-trained language model captures semantic relationships between tender text and label descriptions, which is crucial for navigating the complex CPV hierarchy.

Critical Analysis

The paper presents a compelling approach to tackle a challenging hierarchical classification task using zero-shot learning. However, there are a few potential limitations and areas for further research:

The authors only evaluate their method on CPV-labeled Italian public contracts, a dataset with its own unique characteristics. It would be valuable to assess the generalizability of the approach on other hierarchical classification benchmarks.
While the zero-shot performance is impressive, it's unclear how the method would scale as the CPV taxonomy grows even larger. Investigating the model's robustness to changes in the hierarchy would be an interesting direction.
The paper does not provide much insight into the types of errors the model makes or the specific categories it struggles with. A more in-depth error analysis could yield valuable insights for improving the approach.
The authors mention that incorporating additional domain-specific knowledge, such as procurement regulations, could further boost performance. Exploring ways to effectively integrate such external information would be a worthwhile area for future research.

Conclusion

This paper presents a zero-shot hierarchical classification approach that leverages a pre-trained language model and CPV label descriptions to tackle the challenging CPV taxonomy. The results show better performance than three baselines on low-frequency classes, along with the ability to predict never-seen classes.

The proposed method has the potential to significantly simplify the development of hierarchical classification systems, especially in domains where obtaining labeled data is costly or time-consuming. As the authors suggest, integrating domain-specific knowledge could further improve the approach, making it a promising direction for future research in this area.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Zero-Shot Hierarchical Classification on the Common Procurement Vocabulary Taxonomy

Federico Moiraghi, Matteo Palmonari, Davide Allavena, Federico Morando

Classifying public tenders is a useful task for both companies that are invited to participate and for inspecting fraudulent activities. To facilitate the task for both participants and public administrations, the European Union presented a common taxonomy (Common Procurement Vocabulary, CPV) which is mandatory for tenders of certain importance; however, the contracts in which a CPV label is mandatory are the minority compared to all the Public Administrations activities. Classifying over a real-world taxonomy introduces some difficulties that can not be ignored. First of all, some fine-grained classes have an insufficient (if any) number of observations in the training set, while other classes are far more frequent (even thousands of times) than the average. To overcome those difficulties, we present a zero-shot approach, based on a pre-trained language model that relies only on label description and respects the label taxonomy. To train our proposed model, we used industrial data, which comes from contrattipubblici.org, a service by SpazioDati s.r.l. that collects public contracts stipulated in Italy in the last 25 years. Results show that the proposed model achieves better performance in classifying low-frequent classes compared to three different baselines, and is also able to predict never-seen classes.

5/31/2024

LLM meets Vision-Language Models for Zero-Shot One-Class Classification

Yassir Bendou, Giulia Lioi, Bastien Pasdeloup, Lukas Mauch, Ghouthi Boukli Hacene, Fabien Cardinaux, Vincent Gripon

We consider the problem of zero-shot one-class visual classification, extending traditional one-class classification to scenarios where only the label of the target class is available. This method aims to discriminate between positive and negative query samples without requiring examples from the target class. We propose a two-step solution that first queries large language models for visually confusing objects and then relies on vision-language pre-trained models (e.g., CLIP) to perform classification. By adapting large-scale vision benchmarks, we demonstrate the ability of the proposed method to outperform adapted off-the-shelf alternatives in this setting. Namely, we propose a realistic benchmark where negative query samples are drawn from the same original dataset as positive ones, including a granularity-controlled version of iNaturalist, where negative samples are at a fixed distance in the taxonomy tree from the positive ones. To our knowledge, we are the first to demonstrate the ability to discriminate a single category from other semantically related ones using only its label.

5/28/2024

TELEClass: Taxonomy Enrichment and LLM-Enhanced Hierarchical Text Classification with Minimal Supervision

Yunyi Zhang, Ruozhen Yang, Xueqiang Xu, Rui Li, Jinfeng Xiao, Jiaming Shen, Jiawei Han

Hierarchical text classification aims to categorize each document into a set of classes in a label taxonomy. Most earlier works focus on fully or semi-supervised methods that require a large amount of human annotated data which is costly and time-consuming to acquire. To alleviate human efforts, in this paper, we work on hierarchical text classification with the minimal amount of supervision: using the sole class name of each node as the only supervision. Recently, large language models (LLM) show competitive performance on various tasks through zero-shot prompting, but this method performs poorly in the hierarchical setting, because it is ineffective to include the large and structured label space in a prompt. On the other hand, previous weakly-supervised hierarchical text classification methods only utilize the raw taxonomy skeleton and ignore the rich information hidden in the text corpus that can serve as additional class-indicative features. To tackle the above challenges, we propose TELEClass, Taxonomy Enrichment and LLM-Enhanced weakly-supervised hierarchical text Classification, which (1) automatically enriches the label taxonomy with class-indicative terms to facilitate classifier training and (2) utilizes LLMs for both data annotation and creation tailored for the hierarchical label space. Experiments show that TELEClass can outperform previous weakly-supervised methods and LLM-based zero-shot prompting methods on two public datasets.

6/18/2024

Taxes Are All You Need: Integration of Taxonomical Hierarchy Relationships into the Contrastive Loss

Kiran Kokilepersaud, Yavuz Yarici, Mohit Prabhushankar, Ghassan AlRegib

In this work, we propose a novel supervised contrastive loss that enables the integration of taxonomic hierarchy information during the representation learning process. A supervised contrastive loss operates by enforcing that images with the same class label (positive samples) project closer to each other than images with differing class labels (negative samples). The advantage of this approach is that it directly penalizes the structure of the representation space itself. This enables greater flexibility with respect to encoding semantic concepts. However, the standard supervised contrastive loss only enforces semantic structure based on the downstream task (i.e. the class label). In reality, the class label is only one level of a hierarchy of different semantic relationships known as a taxonomy. For example, the class label is oftentimes the species of an animal, but between different classes there are higher order relationships such as all animals with wings being "birds". We show that by explicitly accounting for these relationships with a weighting penalty in the contrastive loss we can out-perform the supervised contrastive loss. Additionally, we demonstrate the adaptability of the notion of a taxonomy by integrating our loss into medical and noise-based settings that show performance improvements by as much as 7%.

6/12/2024