Error Bounds for the Network Scale-Up Method

Read original: arXiv:2407.10640 - Published 7/16/2024 by Sergio D'iaz-Aranda, Juan Marcos Ram'irez, Mohit Daga, Jaya Prakash Champati, Jos'e Aguilar, Rosa Elvira Lillo, Antonio Fern'andez Anta

🌐

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Follow @aimodelsfyi on 𝕏 →

Related Papers

🌐

Error Bounds for the Network Scale-Up Method

Sergio D'iaz-Aranda, Juan Marcos Ram'irez, Mohit Daga, Jaya Prakash Champati, Jos'e Aguilar, Rosa Elvira Lillo, Antonio Fern'andez Anta

Epidemiologists and social scientists have used the Network Scale-Up Method (NSUM) for over thirty years to estimate the size of a hidden sub-population within a social network. This method involves querying a subset of network nodes about the number of their neighbours belonging to the hidden sub-population. In general, NSUM assumes that the social network topology and the hidden sub-population distribution are well-behaved; hence, the NSUM estimate is close to the actual value. However, bounds on NSUM estimation errors have not been analytically proven. This paper provides analytical bounds on the error incurred by the two most popular NSUM estimators. These bounds assume that the queried nodes accurately provide their degree and the number of neighbors belonging to the hidden population. Our key findings are twofold. First, we show that when an adversary designs the network and places the hidden sub-population, then the estimate can be a factor of $Omega(sqrt{n})$ off from the real value (in a network with $n$ nodes). Second, we also prove error bounds when the underlying network is randomly generated, showing that a small constant factor can be achieved with high probability using samples of logarithmic size $O(log{n})$. We present improved analytical bounds for Erdos-Renyi and Scale-Free networks. Our theoretical analysis is supported by an extensive set of numerical experiments designed to determine the effect of the sample size on the accuracy of the estimates in both synthetic and real networks.

7/16/2024

🤯

Scalable Subsampling Inference for Deep Neural Networks

Kejin Wu, Dimitris N. Politis

Deep neural networks (DNN) has received increasing attention in machine learning applications in the last several years. Recently, a non-asymptotic error bound has been developed to measure the performance of the fully connected DNN estimator with ReLU activation functions for estimating regression models. The paper at hand gives a small improvement on the current error bound based on the latest results on the approximation ability of DNN. More importantly, however, a non-random subsampling technique--scalable subsampling--is applied to construct a `subagged' DNN estimator. Under regularity conditions, it is shown that the subagged DNN estimator is computationally efficient without sacrificing accuracy for either estimation or prediction tasks. Beyond point estimation/prediction, we propose different approaches to build confidence and prediction intervals based on the subagged DNN estimator. In addition to being asymptotically valid, the proposed confidence/prediction intervals appear to work well in finite samples. All in all, the scalable subsampling DNN estimator offers the complete package in terms of statistical inference, i.e., (a) computational efficiency; (b) point estimation/prediction accuracy; and (c) allowing for the construction of practically useful confidence and prediction intervals.

5/15/2024

📈

Dynamic Size Counting in the Population Protocol Model

Dominik Kaaser, Maximilian Lohmann

The population protocol model describes collections of distributed agents that interact in pairs to solve a common task. We consider a dynamic variant of this prominent model, where we assume that an adversary may change the population size at an arbitrary point in time. In this model we tackle the problem of counting the population size: in the dynamic size counting problem the goal is to design an algorithm that computes an approximation of $log n$. This estimate can be used to turn static, non-uniform population protocols, i.e., protocols that depend on the population size $n$, into dynamic and loosely-stabilizing protocols. Our contributions in this paper are three-fold. Starting from an arbitrary initial configuration, we first prove that the agents converge quickly to a valid configuration where each agent has a constant-factor approximation of $log n$, and once the agents reach such a valid configuration, they stay in it for a polynomial number of time steps. Second, we show how to use our protocol to define a uniform and loosely-stabilizing phase clock for the population protocol model. Finally, we support our theoretical findings by empirical simulations that show that our protocols work well in practice.

5/9/2024

Learning the mechanisms of network growth

Lourens Touwen, Doina Bucur, Remco van der Hofstad, Alessandro Garavaglia, Nelly Litvak

We propose a novel model-selection method for dynamic networks. Our approach involves training a classifier on a large body of synthetic network data. The data is generated by simulating nine state-of-the-art random graph models for dynamic networks, with parameter range chosen to ensure exponential growth of the network size in time. We design a conceptually novel type of dynamic features that count new links received by a group of vertices in a particular time interval. The proposed features are easy to compute, analytically tractable, and interpretable. Our approach achieves a near-perfect classification of synthetic networks, exceeding the state-of-the-art by a large margin. Applying our classification method to real-world citation networks gives credibility to the claims in the literature that models with preferential attachment, fitness and aging fit real-world citation networks best, although sometimes, the predicted model does not involve vertex fitness.

5/28/2024