Controlled Query Evaluation through Epistemic Dependencies

Read original: arXiv:2405.02458 - Published 5/7/2024 by Gianluca Cima, Domenico Lembo, Lorenzo Marconi, Riccardo Rosati, Domenico Fabio Savo

📉

Overview

This paper introduces a framework for "controlled query evaluation" (CQE), which aims to protect sensitive information in databases while still allowing some queries to be answered.
The key idea is to use "epistemic dependencies" - relationships between different pieces of information - to determine which queries can be safely answered without revealing sensitive data.
The authors provide algorithms for deciding which queries can be answered, and show how this approach can offer better privacy protections than previous CQE methods.

Plain English Explanation

Imagine you have a database with some sensitive information, like people's personal details or financial records. You want to allow users to ask questions about the data, but you don't want to accidentally reveal anything private or confidential.

The problem is that even if you don't directly show the sensitive information, the answers to certain questions might indirectly give away that private data. For example, if you have information about someone's salary and their job title, and a user asks "What is the highest paid job at the company?", you might be able to figure out that person's salary.

This paper introduces a way to address this challenge, called "controlled query evaluation" (CQE). The key idea is to look at the relationships between different pieces of information in the database - what the authors call "epistemic dependencies". By understanding these dependencies, the system can figure out which queries are safe to answer without revealing anything sensitive.

The paper presents algorithms that can analyze a database and determine which queries can be safely answered. This allows users to get useful information from the data, while still protecting people's privacy. The authors show that this CQE approach can provide better privacy guarantees than some previous methods.

Technical Explanation

The paper introduces a framework for controlled query evaluation (CQE) that uses epistemic dependencies to determine which queries can be safely answered without revealing sensitive information.

The authors first define a formal model for representing databases and the relationships between different pieces of information, called epistemic dependencies. These dependencies capture how answering one query could potentially reveal the answer to another, more sensitive query.

Building on this model, the paper presents algorithms for deciding which queries can be safely answered given the epistemic dependencies in the database. The authors prove that this CQE approach can provide stronger privacy guarantees than previous CQE methods, which did not fully account for the complex relationships between data.

The paper also discusses efficient enumeration of recursive plans and zero-shot logical query reasoning as potential techniques for implementing the CQE framework in practice.

Critical Analysis

The paper provides a thorough formal foundation for CQE using epistemic dependencies, and the authors prove that their approach can offer stronger privacy guarantees than previous methods. However, the practical implementation of the CQE framework is still an open challenge.

The authors acknowledge that determining the full set of epistemic dependencies in a database may be computationally expensive, and suggest approximation techniques as an area for future research. Additionally, the paper does not address the potential for epistemic uncertainty in pre-trained neural networks to introduce new privacy risks in CQE systems.

Further research is needed to understand the scalability and robustness of the CQE approach, as well as its integration with other privacy-preserving techniques like query workload minimization and unique characterizability of temporal queries.

Conclusion

This paper introduces a novel framework for controlled query evaluation that uses epistemic dependencies to determine which queries can be safely answered without revealing sensitive information. The authors prove that their CQE approach can offer stronger privacy guarantees than previous methods.

While the formal foundations of the framework are well-established, the practical implementation and scalability of CQE systems remain open challenges. Further research is needed to address these issues and integrate CQE with other privacy-preserving techniques. Overall, this work provides an important contribution to the field of privacy-preserving data management.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Follow @aimodelsfyi on 𝕏 →

Related Papers

📉

Controlled Query Evaluation through Epistemic Dependencies

Gianluca Cima, Domenico Lembo, Lorenzo Marconi, Riccardo Rosati, Domenico Fabio Savo

In this paper, we propose the use of epistemic dependencies to express data protection policies in Controlled Query Evaluation (CQE), which is a form of confidentiality-preserving query answering over ontologies and databases. The resulting policy language goes significantly beyond those proposed in the literature on CQE so far, allowing for very rich and practically interesting forms of data protection rules. We show the expressive abilities of our framework and study the data complexity of CQE for (unions of) conjunctive queries when ontologies are specified in the Description Logic DL-Lite_R. Interestingly, while we show that the problem is in general intractable, we prove tractability for the case of acyclic epistemic dependencies by providing a suitable query rewriting algorithm. The latter result paves the way towards the implementation and practical application of this new approach to CQE.

5/7/2024

Structure Your Data: Towards Semantic Graph Counterfactuals

Angeliki Dimitriou, Maria Lymperaiou, Giorgos Filandrianos, Konstantinos Thomas, Giorgos Stamou

Counterfactual explanations (CEs) based on concepts are explanations that consider alternative scenarios to understand which high-level semantic features contributed to particular model predictions. In this work, we propose CEs based on the semantic graphs accompanying input data to achieve more descriptive, accurate, and human-aligned explanations. Building upon state-of-the-art (SoTA) conceptual attempts, we adopt a model-agnostic edit-based approach and introduce leveraging GNNs for efficient Graph Edit Distance (GED) computation. With a focus on the visual domain, we represent images as scene graphs and obtain their GNN embeddings to bypass solving the NP-hard graph similarity problem for all input pairs, an integral part of the CE computation process. We apply our method to benchmark and real-world datasets with varying difficulty and availability of semantic annotations. Testing on diverse classifiers, we find that our CEs outperform previous SoTA explanation models based on semantics, including both white and black-box as well as conceptual and pixel-level approaches. Their superiority is proven quantitatively and qualitatively, as validated by human subjects, highlighting the significance of leveraging semantic edges in the presence of intricate relationships. Our model-agnostic graph-based approach is widely applicable and easily extensible, producing actionable explanations across different contexts.

7/23/2024

✨

Unique Characterisability and Learnability of Temporal Queries Mediated by an Ontology

Jean Christoph Jung, Vladislav Ryzhikov, Frank Wolter, Michael Zakharyaschev

Algorithms for learning database queries from examples and unique characterisations of queries by examples are prominent starting points for developing automated support for query construction and explanation. We investigate how far recent results and techniques on learning and unique characterisations of atemporal queries mediated by an ontology can be extended to temporal data and queries. Based on a systematic review of the relevant approaches in the atemporal case, we obtain general transfer results identifying conditions under which temporal queries composed of atemporal ones are (polynomially) learnable and uniquely characterisable.

7/30/2024

Process Trace Querying using Knowledge Graphs and Notation3

William Van Woensel

In process mining, a log exploration step allows making sense of the event traces; e.g., identifying event patterns and illogical traces, and gaining insight into their variability. To support expressive log exploration, the event log can be converted into a Knowledge Graph (KG), which can then be queried using general-purpose languages. We explore the creation of semantic KG using the Resource Description Framework (RDF) as a data model, combined with the general-purpose Notation3 (N3) rule language for querying. We show how typical trace querying constraints, inspired by the state of the art, can be implemented in N3. We convert case- and object-centric event logs into a trace-based semantic KG; OCEL2 logs are hereby flattened into traces based on object paths through the KG. This solution offers (a) expressivity, as queries can instantiate constraints in multiple ways and arbitrarily constrain attributes and relations (e.g., actors, resources); (b) flexibility, as OCEL2 event logs can be serialized as traces in arbitrary ways based on the KG; and (c) extensibility, as others can extend our library by leveraging the same implementation patterns.

9/10/2024