Revisiting the power of reinsertion for optimal targets of network attack

Understanding and improving the robustness of networks has significant applications in various areas, such as bioinformatics, transportation, critical infrastructure, and social networks. Recently, there has been a large amount of work on network dismantling, which focuses on removing an optimal set of nodes to break the network into components of sub-extensive size. However, in our experiments, we found that these state-of-the-art methods, although seemingly different, utilize the same refinement technique, namely reinsertion, to improve performance. Despite being mentioned only in passing, this technique essentially plays the key role in the final performance. Without reinsertion, the current best method deteriorates to worse than the simplest heuristics; with reinsertion, even the random removal strategy achieves results on par with the best. As a consequence, we, for the first time, systematically revisit the power of reinsertion in network dismantling problems. We re-implemented and compared 10 heuristic and approximate competing methods on both synthetic networks generated by four classical network models and 18 real-world networks covering seven different domains with varying scales. The comprehensive ablation results show that: i) HBA (High Betweenness Adaptive, no reinsertion) is the most effective network dismantling strategy; however, it is only applicable to small-scale networks; ii) HDA (High Degree Adaptive, with reinsertion) achieves the best balance between effectiveness and efficiency; iii) Reinsertion improves the performance of most current methods; iv) The strategy that adds back the node joining the clusters minimizing the product of their number and sizes is the most effective reinsertion strategy for most methods. Our results can serve as a survey reference to help further understand the current methods and thereafter design better ones.


Introduction
Many real-world systems can be described from the complex network perspective, including air transport [17], power grids [3], malicious organizations [9,10], the Internet [3] and inter-personal networks [15]. One of the most important topics on these networks is robustness, i.e., the capacity to maintain functionality after a major failure [29]. Since connectivity is the fundamental basis for almost all behaviors on networks, researchers thus try to quantify how connectivity is affected by node (or link) removal, which leads to the well-defined network dismantling problem [1]: identifying an optimal sequence of nodes whose removal maximizes the damage to the network connectivity [5]. Such analysis yields a wide range of practical applications, such as immunizing against epidemic propagation in populations [23], blocking rumor spreading on social networks [15], preventing virus diffusion in computer networks [7], etc. However, the exact solution is computationally intractable for medium and large networks due to the problem's NP-hard nature [5], thus a large number of approximate methods have been proposed, including heuristic methods [4,11,12,21,23,31] and message-passing algorithms [5,22]. The former often greedily select target nodes based on local metrics, like node degree, which often leads to sub-optimal solutions; the latter are more accurate and global, but they need to iterate a certain number of steps over the whole network to select suitable candidate nodes [31], which sacrifices some efficiency.
*Correspondence: fanchangjun09@163.com. † Changjun Fan and Li Zeng contributed equally to this work. College of Systems Engineering, National University of Defense Technology, Changsha, Hunan, China. Full list of author information is available at the end of the article.
Although these methods look different from each other, many of them [5,21,22,24,31] share the same refinement technique, named reinsertion (introduced in detail in "Reinsertion technique" section), which is only briefly mentioned in the respective literature, yet has a significant influence on the final results. As illustrated in Fig. 1, we draw the robustness curves ("Robustness measure" section) of random removal, the simplest heuristic HDA, and the representative CI (details of these methods are introduced in "Competing methods" section) on the real-world Gnutella31 network [18]. We can see that without reinsertion, the representative method CI cannot even beat the simplest heuristic HDA, while with reinsertion, the random removal strategy achieves performance comparable to the state-of-the-art results. In some literature, authors compare their methods enhanced with reinsertion against others without reinsertion, and then report a 'fake' superiority of their model, since it is unclear whether the superiority comes from the model itself or just the reinsertion. Such confounded results prevent us from selecting the best algorithm for the application at hand.
In this paper, we systematically investigate the power of reinsertion in the current methods for network dismantling. As far as we know, there are no previous efforts that conduct such comprehensive ablation studies for reinsertion. We aim to answer the following three questions: i) Which is the current best method if none uses reinsertion? ii) Which is the best if all use reinsertion? iii) Which is the best reinsertion strategy?
To achieve this, we conduct ablation studies (with/without reinsertion) for all the current network dismantling methods, including both traditional heuristics and the state-of-the-art message-passing ones, on synthetic and real-world networks. We use four random network models, namely ER [8], WS [30], BA [3] and PLC [13], to generate diverse graphs with varying sizes and structures by controlling the model parameters. For real-world networks, we select 18 real networks covering 7 domains and different scales. Considering that network robustness can be described by different measures, we choose the area under the robustness curve as the main evaluation metric, since it captures the response over the whole dismantling process. Extensive experiments demonstrate that reinsertion can significantly improve performance regardless of network type and method. Since reinsertion is so effective for the network dismantling problem, perhaps researchers should focus on this technique itself rather than other aspects when designing a better attack strategy.
The main contributions of this paper are summarized as follows: 1. We conduct comprehensive ablation studies, with and without reinsertion, for the network dismantling problem. We compare 10 competing methods on both synthetic graphs generated from four random network types and 18 real-world networks covering seven domains, with scales up to hundreds of thousands of nodes; 2. We design two other reinsertion strategies, and empirically show that they surpass the previous reinsertion technique by a large margin; 3. The results obtained in this paper provide a valuable guide for selecting and designing the most appropriate method for practical network dismantling problems.
The rest of the paper is organized as follows. We introduce the reinsertion method, robustness measures, competing methods and experimental data in "Method" section. We analyze the comprehensive ablation results and the effects of different reinsertion strategies in "Results" section. Finally, we conclude the paper in "Conclusion" section.

Method
In this section, we introduce the experimental setups. We first introduce the robustness measure to evaluate the dismantling efficacy, then we introduce the reinsertion technique that is widely adopted in most current competitors. After that, we describe the competitors we are to analyze and the experimental data, including both synthetic graphs and real-world networks.

Robustness measure
Network dismantling is to identify a sequence of nodes whose removal degrades the network connectivity maximally, and this connectivity disintegration is often measured as the relative reduction in the size of the giant (largest) connected component (GCC) [5,21]. The smaller the remaining GCC, the more the network is considered to have been disintegrated.
We consider the area under the robustness curve as the evaluation metric; the curve is plotted with the horizontal axis being the fraction of nodes removed and the vertical axis being the relative remaining GCC size. It is defined as:

R = (1/N) * Σ_{Q=1}^{N} s(Q),   (1)

where N is the number of graph nodes and s(Q) is the relative remaining GCC size after removing Q nodes. Intuitively, this measure is equivalent to assessing how many nodes the GCC contains each time a new node is deleted from the network, summed over all nodes [29]. Note that Eq. 1 captures the network's response to the dismantling throughout the whole process; since the computation of R requires a ranking of the nodes, we are interested in minimizing R over all possible node orders.
In this paper, we evaluate the ablation performance of reinsertion for this robustness measure.
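As a concrete illustration, R can be computed from a removal order with a few lines of networkx code (a minimal sketch of Eq. 1, taking s(Q) to be the relative GCC size; the helper name is ours):

```python
import networkx as nx

def robustness(G, removal_order):
    """Eq. 1: R = (1/N) * sum over Q of s(Q), where s(Q) is the relative
    GCC size after removing the first Q nodes of the order."""
    H = G.copy()
    N = G.number_of_nodes()
    total = 0.0
    for v in removal_order:
        H.remove_node(v)
        if H.number_of_nodes() > 0:
            total += max(len(c) for c in nx.connected_components(H)) / N
    return total / N
```

For a path of three nodes, for example, removing the middle node first gives R = 2/9, while removing an endpoint first gives the larger value 1/3, reflecting the slower disintegration.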

Reinsertion technique
Reinsertion was first proposed as an independent strategy for network destruction and immunization [27], and later developed into an important refinement technique for other dismantling strategies. Reinsertion starts from the point where the network has been fully dismantled by a certain strategy; it then adds back one of the removed nodes, chosen such that, once reinserted, it joins the smallest number of clusters. When a node is reinserted, the edges with its neighbors already in the network are restored (but not those with neighbors not yet reinserted, if any). This procedure is repeated until all nodes are back in the network.
As shown in Fig. 2, each node is assigned an index c(i) given by the number of clusters it would join if reinserted into the network. The red node has c(red) = 2, the blue one c(blue) = 4, and the green node c(green) = 3. The node with the smallest c(i), i.e., the red node, is reinserted. After that, the c(i) values are recalculated, and the new node with the smallest c(i) is found and reinserted. These steps are repeated until the termination criterion is met.
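The procedure can be sketched as follows (our own minimal implementation on top of networkx; it recomputes the components at every step for clarity, whereas an efficient version would maintain a union-find structure):

```python
import networkx as nx

def reinsert(G, removed):
    """Greedy reinsertion: repeatedly add back the removed node that
    would join the fewest clusters, restoring only the edges to
    already-present neighbors. Returns the refined removal order
    (the reverse of the reinsertion order)."""
    H = G.copy()
    H.remove_nodes_from(removed)
    remaining = set(removed)
    reinserted = []
    while remaining:
        comp = {v: i for i, c in enumerate(nx.connected_components(H))
                for v in c}
        # c(i): number of distinct clusters the node would join
        v = min(remaining,
                key=lambda x: len({comp[u] for u in G.neighbors(x)
                                   if u in comp}))
        remaining.remove(v)
        reinserted.append(v)
        H.add_node(v)
        H.add_edges_from((v, u) for u in G.neighbors(v) if u in H)
    return list(reversed(reinserted))
```

Reversing the reinsertion order yields a refined attack order: the last node reinserted (the most disruptive one) is removed first.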
We will later show with extensive experiments how powerful such a simple technique is to the current network dismantling methods.

Competing methods
In this paper, we compare 10 competing methods, including a Random removal strategy as a worst-case baseline. The first five are traditional heuristics based on a local or global structural centrality: degree, betweenness, closeness, PageRank, or collective influence. The remaining four are specifically designed for dismantling networks.
High Degree Adaptive (HDA) [23]. HDA is an adaptive version of the high degree method [2]. At each step, the node with the highest degree is removed, and the degrees of the remaining nodes are updated.
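For instance, HDA can be written in a few lines (a sketch using networkx; ties are broken arbitrarily):

```python
import networkx as nx

def hda_order(G):
    """High Degree Adaptive: repeatedly remove the node with the
    currently highest degree, recomputing degrees after each removal."""
    H = G.copy()
    order = []
    while H.number_of_nodes() > 0:
        v = max(H.degree, key=lambda pair: pair[1])[0]
        order.append(v)
        H.remove_node(v)
    return order
```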
High Betweenness Adaptive (HBA) [12]. HBA is the adaptive version of the high betweenness method, where the betweenness centrality of the remaining nodes is recomputed after each node removal. The betweenness centrality of a node equals the sum over all node pairs of the fraction of shortest paths that pass through this node. It is a very useful centrality measure that benefits many network-related applications such as community detection and network vulnerability analysis. However, its high computational cost prohibits its use in large-scale settings.
High Closeness Adaptive (HCA) [4]. HCA is the adaptive version of the high closeness method. Closeness centrality describes how close a node is to all the other nodes in the graph; it is calculated as the reciprocal of the average distance from one node to all the others. Similar to HBA, its high computational cost prevents its application to large networks.
High PageRank Adaptive (HPRA) [6]. HPRA is the adaptive version of the high PageRank method. PageRank has been widely employed in search engines, as it provides a global ranking of all web pages, regardless of their content, based solely on their location in the Web's graph structure [6]. PageRank computes the probability of a random-walking agent reaching each node in the network, which is also regarded as a useful indicator for guiding a network attack.
Collective Influence (CI) [21]. The Collective Influence measure is defined as the product of a node's reduced degree (i.e., original degree minus one) with the sum of the reduced degrees of the nodes a constant number of hops away from it. This measure describes the proportion of other nodes that can be reached from a given node, assuming that nodes with higher CI values play more crucial roles in the network. The CI method sequentially removes the node with the highest CI value and recalculates the collective influence of the remaining nodes.
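As an illustration, the CI score of a single node can be computed as follows (a sketch; following the original definition, we sum over the frontier of the ball of radius ℓ, i.e., the nodes exactly ℓ hops away):

```python
import networkx as nx

def collective_influence(G, v, ell=2):
    """CI_ell(v) = (k_v - 1) * sum of (k_u - 1) over the nodes u
    exactly ell hops away from v."""
    dist = nx.single_source_shortest_path_length(G, v, cutoff=ell)
    frontier = [u for u, d in dist.items() if d == ell]
    return (G.degree(v) - 1) * sum(G.degree(u) - 1 for u in frontier)
```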
MinSum [5]. MinSum is proposed to address the network dismantling problem. It consists of three stages: it first uses a variant of a message-passing algorithm to break all cycles, and then breaks the remaining tree into small components by removing a fraction of nodes that vanishes in the large-size limit. In the third stage, it greedily reinserts some nodes that close cycles without increasing the size of the largest component too much, to reduce the total number of nodes removed.
Belief Propagation-guided Decimation (BPD) [22]. BPD is very similar to MinSum, containing the same three stages. The difference lies in the fact that BPD treats the decycling problem as minimum-FVS construction, where the FVS (feedback vertex set) is a set of nodes whose deletion turns the network into a forest. To solve this problem, BPD proposes a belief propagation-guided decimation algorithm. Afterwards, it conducts the same subsequent steps, including tree breaking and node reinsertion.
CoreHD [31]. CoreHD contains the same three stages; the only difference lies in the decycling stage. Instead of a message-passing or belief-propagation algorithm, CoreHD seeks to remove the minimum number of nodes needed to empty the 2-core subgraph of the network, since a network is acyclic if and only if its 2-core is empty. CoreHD greedily removes the highest-degree node in the 2-core subgraph until the 2-core is empty.
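The decycling stage of CoreHD can be sketched as follows (our own simplified version, recomputing the 2-core with networkx at each step for clarity; an efficient implementation would update it incrementally):

```python
import networkx as nx

def corehd_decycling(G):
    """Remove the highest-degree node of the current 2-core until the
    2-core is empty, at which point the remaining graph is a forest."""
    H = G.copy()
    removed = []
    core = nx.k_core(H, 2)
    while core.number_of_nodes() > 0:
        v = max(core.degree, key=lambda pair: pair[1])[0]
        removed.append(v)
        H.remove_node(v)
        core = nx.k_core(H, 2)
    return removed
```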
GND [25]. GND is the state-of-the-art method for network dismantling with non-unit removal costs. It first defines a node-weighted Laplacian, then proposes a simple and elegant approximate algorithm to compute its second-smallest eigenvector, based on which a set of nodes is removed. GND repeats this process until the end. Note that unit-cost GND reduces to the spectral cut method.
We use the SNAP software to implement the heuristic methods Random, HDA, HBA, HCA and HPRA. For the other baselines, we use the source codes released online, with the default parameter settings for each method.

Synthetic graphs
We evaluate all competitors on various synthetic networks. Synthetic networks are the result of applying a generative function, and present the advantage of displaying specific topological features that are both a priori known and tunable [29]. More specifically, we select a collection of the 4 most common network types, summarized in Table 1. Note that there are many other random network models, such as regular graphs, circle graphs, grid graphs, ladder graphs, etc.; we do not consider them since they are not difficult to dismantle, and effective heuristic methods for them already exist.
Erdos-Renyi (ER) [8]. The ER model was first introduced by Paul Erdos and Alfred Renyi; it returns a G(n, p) graph, where n is the number of nodes and p is the edge creation probability: each of the possible edges is included independently with probability p. This model can be used in the probabilistic method to prove the existence of graphs satisfying various properties, or to provide a rigorous definition of what it means for a property to hold for almost all graphs [8].
Watts-Strogatz (WS) [30]. WS is a random generative model that produces graphs with small-world properties, including short average path lengths and high clustering. It was proposed by Duncan J. Watts and Steven Strogatz in 1998. The tunable parameters include the number of nodes n, the number k of nearest neighbors each node is joined with in a ring topology, and the probability p of rewiring each edge.
Barabasi-Albert (BA) [3]. BA is a model that generates random scale-free networks using a preferential attachment mechanism. Many real-world networks are thought to be approximately scale-free and contain a few nodes (called hubs) with unusually high degree compared to the other nodes. The BA model tries to explain the existence of such nodes in real networks. The algorithm is named after its inventors Albert-Laszlo Barabasi and Reka Albert, and is a special case of a more general model called Price's model [28]. It generates a graph of n nodes by attaching new nodes one at a time, each with m edges that are preferentially attached to existing nodes of high degree.
Powerlaw-Cluster (PLC) [13]. PLC is a model for generating graphs with a power-law degree distribution and approximate average clustering. It is essentially the BA growth model with an extra step: each random edge is followed, with some probability, by an edge to one of the new neighbor's neighbors (thus forming a triangle) [13]. The model improves on BA in the sense that it enables a higher average clustering to be attained if desired. The tunable parameters include the number of nodes n, the number of random edges m to add for each new node, and the probability p of adding a triangle after adding a random edge. Figure 3 visualizes one instance of each of the above four network types.
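All four model types can be instantiated directly with networkx; the parameter values below are illustrative placeholders, not the settings of Table 1:

```python
import networkx as nx

n = 1000  # illustrative size only; see Table 1 for the actual settings
graphs = {
    "ER":  nx.erdos_renyi_graph(n, p=0.004),          # G(n, p)
    "WS":  nx.watts_strogatz_graph(n, k=4, p=0.1),    # ring + rewiring
    "BA":  nx.barabasi_albert_graph(n, m=2),          # preferential attachment
    "PLC": nx.powerlaw_cluster_graph(n, m=2, p=0.1),  # BA + triangle step
}
```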

Real-world networks
We also conduct experiments on 18 real-world networks, which cover a wide range of domains, including malicious networks, PPI networks, infrastructure networks, social networks, citation networks, communication networks, etc. Specifically, they are: Corruption [26], a malicious network where nodes are people listed in scandals, and ties indicate that two people were involved in the same corruption scandal; Crime [16], a malicious network from the projection of a bipartite network of persons and crimes, where each node denotes a person and an edge represents that two persons are involved in the same crime; USairport [16], a network of flights between US airports in 2010, where each node is an airport and each edge represents a connection from one airport to another; Hamster [16], a network of friendships and family links between users of the website hamster.com; Figeys [16], a network of interactions between proteins in humans (Homo sapiens), from the first large-scale study of protein-protein interactions in human cells using a mass spectrometry-based approach; CA-GrQc [18], a collaboration network from the e-print arXiv covering scientific collaborations between authors of papers submitted to the General Relativity and Quantum Cosmology category; HI-II-14, the corresponding Human Interactome dataset covering Space II, reported in 2014, where each node represents a distinct protein and each edge denotes the interaction between the corresponding proteins; Powergrid [16], the power grid network of the Western States of the United States of America, where an edge represents a power supply line and a node is either a generator, a transformer or a substation; CA-HepPh [18], a collaboration network from the e-print arXiv covering scientific collaborations between authors of papers submitted to the High Energy Physics - Phenomenology category; DBLP [16], a citation network of DBLP, a database of scientific publications such as papers and books.
Each node in the network is a publication, and each edge represents a citation of one publication by another; Cora [16], a citation network of Cora, where nodes represent scientific papers and an edge between two nodes indicates that one paper cites the other; Digg [16], a reply network of the social news website Digg, where each node is a user of the website and each edge denotes that a user replied to another user; Email-Enron [20], the Enron email communication network, which covers all the email communication within a dataset of around half a million emails, where each node is an email address and an edge denotes at least one email communication; Brightkite [16], a social network containing user-user friendship relations from Brightkite, a former location-based social network where users shared their locations, with nodes representing users and edges indicating friendships; Gnutella31 [19], one of a sequence of snapshots of the Gnutella peer-to-peer file sharing network from August 2002, where nodes represent hosts in the Gnutella network topology and edges represent connections between the Gnutella hosts; Facebook [16], friendship data of a small subset of Facebook users, where a node represents a user and an edge represents a friendship between two users; Epinion [16], the trust network of the online social network Epinions, where nodes are users of Epinions and directed edges represent trust between users; Douban [16], a social network of Douban, a Chinese online recommendation site, where a node represents a user of Douban and an edge represents a friendship between two users.
We treat all networks as undirected and remove self-loops, then extract the largest connected component. Basic statistics of the extracted networks are reported in Table 2.
We also draw the degree distributions of these networks in Fig. 4, where the x-axis is the node degree and the y-axis is the number of nodes with that degree, on double-logarithmic axes so that scale-free structure can be judged more intuitively. We can see that most real networks (except the Corruption network) share an approximately scale-free structure, which exhibits the well-known resilience against random failures but disintegrates rapidly under intentional attacks targeting key nodes [2].

Results
In this section, we first demonstrate the effectiveness of the reinsertion technique on both synthetic graphs and real-world networks, then we explore the effects of different reinsertion techniques.

Synthetic results
We test all methods with and without the reinsertion technique on synthetic graphs randomly generated by the four classic models introduced in "Synthetic graphs" section. For each model, we generate 100 graphs with the parameters in Table 1, and report mean and standard deviation. Table 3 shows the comparison results of Eq. 1 without reinsertion. We can clearly see that HBA is the best across different types of networks, which is widely validated by previous research [14,27], since HBA adaptively removes the highest-betweenness nodes, which are key to the whole network's connectivity. HCA, which adaptively removes the highest-closeness nodes, also performs excellently for similar reasons. However, considering the high computational costs of these two methods (Table 6), they are not practical on large or even medium-scale networks. We can also see that most methods achieve good results on ER graphs, since these graphs are purely random and have no 'critical' nodes that determine the graph connectivity.
In Table 4, we enhance each method with the reinsertion technique introduced in "Reinsertion technique" section and report the refined results; we also show the promotion, i.e., the relative reduction of R after reinsertion (Eq. 2), in Table 5.
We can see that most methods (except HBA, HCA and GND) improve after using reinsertion, and on average HPRA (reinserted) performs the best among all. We also observe two interesting things: i) The best-performing HBA deteriorates greatly when combined with reinsertion; however, even the best reinserted method (HPRA) cannot beat vanilla HBA (Table 3). This indicates that vanilla HBA has achieved close-to-optimal performance for the network dismantling problem, at which point reinsertion is no longer a refinement but a hindrance; ii) The pure Random strategy improves greatly with reinsertion, bringing the reinserted random strategy close to the manually designed state of the art.
However, taking running time into account, we find that the simple heuristic HDA actually achieves the best balance between effectiveness and efficiency (Tables 3, 4 and 6). The reinserted HDA is only 1.74% worse than the best result (vanilla HBA), while being hundreds of times faster (Table 6). Note that we do not list the time for the Random strategy, since it takes essentially no time to obtain a random solution.

Real-world results
Now we examine the effects of reinsertion on real-world networks. Since HBA and HCA are computationally prohibitive on medium or large networks (e.g., HBA takes over 5 days to finish on the Cora network, with 23,166 nodes and 89,157 edges), we do not compare with them in this section. Table 7 shows the results of the vanilla methods without reinsertion. We can see that HDA, HPRA and GND perform relatively better than the other methods, with HPRA the best (0.0986) among all, followed by HDA (0.1043). Table 8 gives the results after reinsertion, and Table 9 shows the promotion results. Consistent with the observations from the synthetic results, most methods improve to different degrees with the refinement of reinsertion. For example, the random strategy obtains an average 71.56% gain (Table 9) with reinsertion, making it even beat the state-of-the-art MinSum strategy (Table 8). Among the reinserted methods, HDA achieves the highest performance, with an average robustness score (Eq. 1) of 0.0938 (Table 8). However, GND deteriorates on some networks when refined with reinsertion (Table 8); the reason remains to be explored. When considering execution time, HDA is far more efficient than the others, e.g., it is about 767 times faster than HPRA, which is very close to HDA in effectiveness.

Effects of different reinsertion strategies
We have observed the impressive gains brought by the reinsertion technique in "Synthetic results" and "Real-world results" sections; now we may ask: Is the reinsertion strategy in "Reinsertion technique" section the best one? Do more effective reinsertion methods exist? In this section, we try to answer these questions by exploring other potential reinsertion techniques (Table 10).
We name the previous reinsertion method Reinsert_I, and here we propose two other ones, Reinsert_II and Reinsert_III. In general, a reinsertion technique adds back one of the removed nodes (together with its adjacent edges), chosen according to some criterion, until all nodes are back in the network. Different reinsertion methods define different criteria, based on which we define the following three strategies:
• Reinsert_I: once reinserted, the node joins the smallest number of clusters;
• Reinsert_II: once reinserted, the node joins the clusters of smallest total size;
• Reinsert_III: once reinserted, the node joins the clusters minimizing the product of their number and total size.
In Fig. 2, each node is assigned an index c(i) given by the criterion of the reinsertion technique. For Reinsert_I, c(red) = 2, c(blue) = 4, c(green) = 3, so the red node is reinserted; for Reinsert_II, c(red) = 10, c(blue) = 5, c(green) = 6, so the blue node is reinserted; for Reinsert_III, c(red) = 20, c(blue) = 20, c(green) = 18, so the green node is reinserted. After that, the c(i) values are recalculated, and the new node with the smallest c(i) is found and reinserted; these steps are repeated until the end. As a consequence, different reinsertion strategies determine different nodes to be reinserted first, leading to different refinement results. To decide which one is better in practice, we compare the average performance promotion of each method on both synthetic graphs and real-world networks (Tables 11 and 12). It can be clearly observed in Tables 11 and 12 that Reinsert_III achieves the largest promotions for most methods (except CI) on both synthetic and real-world networks, and outperforms the current strategy Reinsert_I by a significant margin. For the CI method, Reinsert_I tends to be more effective. All three reinsertion strategies fail for HBA and GND.
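The three criteria can be expressed in one scoring function (a sketch; here G is the original graph, H the partially rebuilt one, and the function name is ours):

```python
import networkx as nx

def criterion(G, H, v, strategy):
    """Score c(v) for reinserting v into the partially rebuilt graph H.
    I: number of clusters joined; II: total size of the clusters joined;
    III: product of the number and the total size."""
    comp = {u: frozenset(c) for c in nx.connected_components(H) for u in c}
    joined = {comp[u] for u in G.neighbors(v) if u in comp}
    num = len(joined)
    size = sum(len(c) for c in joined)
    return {"I": num, "II": size, "III": num * size}[strategy]
```

For example, a node bridging a singleton cluster and a two-node cluster scores c = 2, 3 and 6 under the three strategies, respectively; at each step the pending node with the smallest score is reinserted.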
To illustrate the effects of these three strategies more intuitively, we draw the robustness curves on the CA-GrQc network for different methods with different reinsertion strategies in Fig. 5, plotted with the horizontal axis being the fraction of removed nodes and the vertical axis the remaining giant connected component size. The value of Eq. 1 approximates the area under the robustness curve. The figure clearly shows that reinsertion greatly helps reduce the area under the curve compared to the original method, with Reinsert_III the most effective, while all the reinsertion strategies produce negative effects on the GND method.

Conclusion
In this paper, we, for the first time, systematically explore the effects of reinsertion techniques on the network dismantling problem. Previous research tends to compare reinserted results against un-reinserted baseline methods, which may mislead the selection of the actual best dismantling strategy for the application at hand. We conduct comprehensive ablation studies on both synthetic graphs generated by four classical random network models, i.e., ER, WS, BA and PLC, and on 18 real-world networks across seven different domains and different scales. The results show that: i) HBA (no reinsertion) is the most effective network dismantling strategy; however, it is only applicable to small-scale networks; ii) HDA (with reinsertion) achieves the best balance between effectiveness and efficiency. It is surprising that such a simple heuristic beats most state-of-the-art methods when enhanced with reinsertion; iii) Reinsertion improves the performance of most current methods, except HBA, HCA and GND (on small-world graphs); iv) Reinsert_III, which selects the node that joins the clusters minimizing the product of their number and total size, is the most effective reinsertion strategy for most methods (except CI, where Reinsert_I suits best). We believe the results in this paper can serve as a reference for choosing and designing the most effective strategy for realistic network dismantling applications.
However, we still lack a deep understanding of why such a simple reinsertion technique works so well for the network dismantling problem, which would be a very meaningful topic for future research. We will later release the code and data to support research in this direction.