Discovering Influential Nodes in Social Networks based on modified independent cascade model and Fuzzy Multi-Objective Genetic Algorithm

Message:
Article Type:
Research/Original Article (دارای رتبه معتبر)
Abstract:

In recent years, social networks have become an integral part of people's lives and play a significant role in the real world. The primary aim of influence maximization problem is finding a set of nodes in the network which can maximize the influence if the diffusion process starts from them. Therefore, the problem’s goal is to find influential people in large scale real social networks. The penetration phenomenon is carried out according to an influence model in the network. Two independent cascade and linear threshold influence models, the most common of which is the independent cascade model, are utilized for broadcasting in the network. Theoreticaly, optimizing the selection influential nodes problem is NP-hard in both models. The problem will start by considering the social network’s graph, a specific influence model and a given number k. The problem’s goal is to select k nodes (users) from the graph (network) as influential nodes, so that the number of active nodes is maximized at the end of the diffusion process. Due to the influence maximization problem and finding influential people is an NP-hard optimization problem in the social network, meta-heuristic algorithms can be used to solve the problem. With regard to the privous researches, there is just one objective function as the problem’s goal and it is maximizing the number of effective nodes of the diffusion model. While other objective functions such as maximizing the number of effective nodes in the diffusion model and minimizing the budget value k (the number of initial nodes as seed) are not considered, minimizing the time required for effective diffusion can be achieved by having k initial nodes. Although various models for the problem and various optimization algorithms have been presented to discover influential nodes in social networks, paying attention to the multi-objective nature of the problem and improving the performance of the proposed optimization algorithms are a serious research challenge in this field. In this paper. A fuzzy version of the NSGA-II as a multi-objective genetic algorithm, whose mutation and crossover rates are adjusted by Fuzzy Inference System, is utilized to simultaneously optimize the three objectives of maximizing the number of effective nodes, minimizing the number of initial nodes and minimizing the required diffusion time. In the proposed method, the Expected Diffusion Value (EDV) of the diffusion model is replaced instead the simulation of the independent cascade diffusion model with heavy calculations to calculate the diffusion spread (the number of effective nodes of the diffusion model). Therefore, the EDV function is satisfied as the thied objective (minimizing the required diffusion time). The second objective function can also be converted into a maximization function by considering N-k nodes, where k is the number of selected initial nodesand N is the total graph nodes. The decimal numerical coding with fixed length is used in the proposed method. Based on the coding, each chromosome has k genes in the search space. The integer part of each gene is the selected node number. The decimal part is also used to determine whether that initial node exists or not. In the maximizing influence process using the fuzzy NSGA-II algorithm, the solution space (chromosomes) consists of k number of initial nodes, which should be encoded into the fuzzy NSGA-II comprehensible space. Also, a Fuzzy Inference System is proposed to adjust the mutation and recombination rates for filling up a serious challenge of genetic algorithms. In the fuzzy system, NF and FitBest are considered as two input variables, and Pm (mutation rate) and Pc (crossover rate) are returned as outputs NF is the number of chromosomes in the first frontier (F1) of the multi-objective genetic algorithm and FitBest is the average of the normalized objective functions for the chromosomes in F1. To analysis the efficiency of the proposed method, the obtained exprimental results have been compared with conventional maximizing influence methods, i.e., degree centrality, distance centrality, closeness centerality, betweenness, eigenvector and page rank methods, non-fuzzy version of NSGA-II, the latest methods presented for maximizing penetration based on multi-objective meta-heuristic algorithms, i.e. µGP multi-objective evolutionary algorithm, multi-objective crow search algorithm (MOCSA), greedy randomized adaptive search process algorithm (GRASP) and the Multi-Transformation Evolutionary Framework (MTEF) on five benchmark graph datasets Arenasjazz, Canetscience, EgoFacebook, Higgs-Reply and Slashdot. This comparison is based on four criteria: EDV, cost (the number of nodes selected as seed), the influence expansion criterion σ(s), i.e. the number of active nodes with the independent cascade (IC) propagation model, and the execution time of the method in seconds. The obtained results show the superiority of the proposed method over the other methods.

Language:
Persian
Published:
Signal and Data Processing, Volume:21 Issue: 4, 2025
Pages:
29 to 48
https://www.magiran.com/p2846211  
سامانه نویسندگان
  • Kherad، Mahdi
    Author (2)
    Kherad, Mahdi
    Phd Student Information Technology,Department of Computer Engineering, Faculty of Engineering, University of Qom, University of Qom, Qom, Iran
اطلاعات نویسنده(گان) توسط ایشان ثبت و تکمیل شده‌است. برای مشاهده مشخصات و فهرست همه مطالب، صفحه رزومه را ببینید.
مقالات دیگری از این نویسنده (گان)