Publications | Science of Security Virtual Organization

Fakequipo: Deep Fake Detection

Deep learning have a variety of applications in different fields such as computer vision, automated self-driving cars, natural language processing tasks and many more. One of such deep learning adversarial architecture changed the fundamentals of the data manipulation. The inception of Generative Adversarial Network (GAN) in the computer vision domain drastically changed the way how we saw and manipulated the data. But this manipulation of data using GAN has found its application in various type of malicious activities like creating fake images, swapped videos, forged documents etc. But now, these generative models have become so efficient at manipulating the data, especially image data, such that it is creating real life problems for the people. The manipulation of images and videos done by the GAN architectures is done in such a way that humans cannot differentiate between real and fake images/videos. Numerous researches have been conducted in the field of deep fake detection. In this paper, we present a structured survey paper explaining the advantages, gaps of the existing work in the domain of deep fake detection.

Authored by Pramod Bide, Varun, Gaurav Patil, Samveg Shah, Sakshi Patil

A CNN-RNN Based Fake News Detection Model Using Deep Learning

False news has become widespread in the last decade in political, economic, and social dimensions. This has been aided by the deep entrenchment of social media networking in these dimensions. Facebook and Twitter have been known to influence the behavior of people significantly. People rely on news/information posted on their favorite social media sites to make purchase decisions. Also, news posted on mainstream and social media platforms has a significant impact on a particular country’s economic stability and social tranquility. Therefore, there is a need to develop a deceptive system that evaluates the news to avoid the repercussions resulting from the rapid dispersion of fake news on social media platforms and other online platforms. To achieve this, the proposed system uses the preprocessing stage results to assign specific vectors to words. Each vector assigned to a word represents an intrinsic characteristic of the word. The resulting word vectors are then applied to RNN models before proceeding to the LSTM model. The output of the LSTM is used to determine whether the news article/piece is fake or otherwise.

Authored by Qamber Abbas, Muhammad Zeshan, Muhammad Asif

Fake news detection: A RNN-LSTM, Bi-LSTM based deep learning approach

Fake news is a new phenomenon that promotes misleading information and fraud via internet social media or traditional news sources. Fake news is readily manufactured and transmitted across numerous social media platforms nowadays, and it has a significant influence on the real world. It is vital to create effective algorithms and tools for detecting misleading information on social media platforms. Most modern research approaches for identifying fraudulent information are based on machine learning, deep learning, feature engineering, graph mining, image and video analysis, and newly built datasets and online services. There is a pressing need to develop a viable approach for readily detecting misleading information. The deep learning LSTM and Bi-LSTM model was proposed as a method for detecting fake news, In this work. First, the NLTK toolkit was used to remove stop words, punctuation, and special characters from the text. The same toolset is used to tokenize and preprocess the text. Since then, GLOVE word embeddings have incorporated higher-level characteristics of the input text extracted from long-term relationships between word sequences captured by the RNN-LSTM, Bi-LSTM model to the preprocessed text. The proposed model additionally employs dropout technology with Dense layers to improve the model's efficiency. The proposed RNN Bi-LSTM-based technique obtains the best accuracy of 94%, and 93% using the Adam optimizer and the Binary cross-entropy loss function with Dropout (0.1,0.2), Once the Dropout range increases it decreases the accuracy of the model as it goes 92% once Dropout (0.3).

Authored by Govind Mahara, Sharad Gangele

Deep fake Image Detection based on Modified minimized Xception Net and DenseNet

This paper deals with the problem of image forgery detection because of the problems it causes. Where The Fake im-ages can lead to social problems, for example, misleading the public opinion on political or religious personages, de-faming celebrities and people, and Presenting them in a law court as evidence, may Doing mislead the court. This work proposes a deep learning approach based on Deep CNN (Convolutional Neural Network) Architecture, to detect fake images. The network is based on a modified structure of Xception net, CNN based on depthwise separable convolution layers. After extracting the feature maps, pooling layers are used with dense connection with Xception output, to in-crease feature maps. Inspired by the idea of a densenet network. On the other hand, the work uses the YCbCr color system for images, which gave better Accuracy of %99.93, more than RGB, HSV, and Lab or other color systems.

Authored by Ihsan Sahib, Tawfiq AlAsady

Fake News Detection using a Decentralized Deep Learning Model and Federated Learning

Social media has beneficial and detrimental impacts on social life. The vast distribution of false information on social media has become a worldwide threat. As a result, the Fake News Detection System in Social Networks has risen in popularity and is now considered an emerging research area. A centralized training technique makes it difficult to build a generalized model by adapting numerous data sources. In this study, we develop a decentralized Deep Learning model using Federated Learning (FL) for fake news detection. We utilize an ISOT fake news dataset gathered from "Reuters.com" (N = 44,898) to train the deep learning model. The performance of decentralized and centralized models is then assessed using accuracy, precision, recall, and F1-score measures. In addition, performance was measured by varying the number of FL clients. We identify the high accuracy of our proposed decentralized FL technique (accuracy, 99.6%) utilizing fewer communication rounds than in previous studies, even without employing pre-trained word embedding. The highest effects are obtained when we compare our model to three earlier research. Instead of a centralized method for false news detection, the FL technique may be used more efficiently. The use of Blockchain-like technologies can improve the integrity and validity of news sources.

Authored by Nirosh Jayakody, Azeem Mohammad, Malka Halgamuge

Deep Learning Methods for Fake News Detection

Nowadays, although it is much more convenient to obtain news with social media and various news platforms, the emergence of all kinds of fake news has become a headache and urgent problem that needs to be solved. Currently, the fake news recognition algorithm for fake news mainly uses GCN, including some other niche algorithms such as GRU, CNN, etc. Although all fake news verification algorithms can reach quite a high accuracy with sufficient datasets, there is still room for improvement for unsupervised learning and semi-supervised. This article finds that the accuracy of the GCN method for fake news detection is basically about 85% through comparison with other neural network models, which is satisfactory, and proposes that the current field lacks a unified training dataset, and that in the future fake news detection models should focus more on semi-supervised learning and unsupervised learning.

Authored by Zhichao Wang

Analysis of “Tripartite and Bilateral” Space Deterrence Based on Signaling Game

A “tripartite and bilateral” dynamic game model was constructed to study the impact of space deterrence on the challenger's military strategy in a military conflict. Based on the signal game theory, the payment matrices and optimal strategies of the sheltering side and challenging side were analyzed. In a theoretical framework, the indicators of the effectiveness of the challenger's response to space deterrence and the influencing factors of the sheltering's space deterrence were examined. The feasibility and effective means for the challenger to respond to the space deterrent in a “tripartite and bilateral” military conflict were concluded.

Authored by Zhiyong Wu, Yanhua Cao

An Enhanced Copy-deterrence scheme for Secure Image Outsourcing in Cloud

In this paper, we propose a novel watermarking-based copy deterrence scheme for identifying data leaks through authorized query users in secure image outsourcing systems. The scheme generates watermarks unique to each query user, which are embedded in the retrieved encrypted images. During unauthorized distribution, the watermark embedded in the image is extracted to determine the untrustworthy query user. Experimental results show that the proposed scheme achieves minimal information loss, faster embedding and better resistance to JPEG compression attacks compared with the state-of-the-art schemes.

Authored by J. Anju, R. Shreelekshmi

Towards Modern Card Games with Large-Scale Action Spaces Through Action Representation

Axie infinity is a complicated card game with a huge-scale action space. This makes it difficult to solve this challenge using generic Reinforcement Learning (RL) algorithms. We propose a hybrid RL framework to learn action representations and game strategies. To avoid evaluating every action in the large feasible action set, our method evaluates actions in a fixed-size set which is determined using action representations. We compare the performance of our method with two baseline methods in terms of their sample efficiency and the winning rates of the trained models. We empirically show that our method achieves an overall best winning rate and the best sample efficiency among the three methods.

Authored by Zhiyuan Yao, Tianyu Shi, Site Li, Yiting Xie, Yuanyuan Qin, Xiongjie Xie, Huan Lu, Yan Zhang

Obnoxious Deterrence

The reigning U.S. paradigm for deterring malicious cyberspace activity carried out by or condoned by other countries is to levy penalties on them. The results have been disappointing. There is little evidence of the permanent reduction of such activity, and the narrative behind the paradigm presupposes a U.S./allied posture that assumes the morally superior role of judge upon whom also falls the burden of proof–-a posture not accepted but nevertheless exploited by other countries. In this paper, we explore an alternative paradigm, obnoxious deterrence, in which the United States itself carries out malicious cyberspace activity that is used as a bargaining chip to persuade others to abandon objectionable cyberspace activity. We then analyze the necessary characteristics of this malicious cyberspace activity, which is generated only to be traded off. It turns out that two fundamental criteria–that the activity be sufficiently obnoxious to induce bargaining but be insufficiently valuable to allow it to be traded away–may greatly reduce the feasibility of such a ploy. Even if symmetric agreements are easier to enforce than pseudo-symmetric agreements (e.g., the XiObama agreement of 2015) or asymmetric red lines (e.g., the Biden demand that Russia not condone its citizens hacking U.S. critical infrastructure), when violations occur, many of today’s problems recur. We then evaluate the practical consequences of this approach, one that is superficially attractive.

Authored by Martin Libicki

Cyber Security and Defense: Proactive Defense and Deterrence

With the development of technology, the invention of computers, the use of cyberspace created by information communication systems and networks, increasing the effectiveness of knowledge in all aspects and the gains it provides have increased further the importance of cyber security day by day. In parallel with the developments in cyber space, the need for cyber defense has emerged with active and passive defense approaches for cyber security against internal and external cyber-attacks of increasing type, severity and complexity. In this framework, proactive cyber defense and deterrence strategies have started to be implemented with new techniques and methods.

Authored by Mustafa Şenol

Smart City Digital Twins for Public Safety: A Deep Learning and Simulation Based Method for Dynamic Sensing and Decision-Making

Technological innovations are expanding rapidly in the public safety sector providing opportunities for more targeted and comprehensive urban crime deterrence and detection. Yet, the spatial dispersion of crimes may vary over time. Therefore, it is unclear whether and how sensors can optimally impact crime rates. We developed a Smart City Digital Twin-based method to dynamically place license plate reader (LPR) sensors and improve their detection and deterrence performance. Utilizing continuously updated crime records, the convolutional long short-term memory algorithm predicted areas crimes were most likely to occur. Then, a Monte Carlo traffic simulation simulated suspect vehicle movements to determine the most likely routes to flee crime scenes. Dynamic LPR placement predictions were made weekly, capturing the spatiotemporal variation in crimes and enhancing LPR performance relative to static placement. We tested the proposed method in Warner Robins, GA, and results support the method's promise in detecting and deterring crime.

Authored by Xiyu Pan, Neda Mohammadi, John Taylor

Deterrence of Cycles in Temporal Knowledge Graphs

Temporal Knowledge Graph Embedding (TKGE) is an extensible (continuous vector space) time-sensitive data structure (tree) and is used to predict future event given historical events. An event consists of current state of a knowledge (subject), and a transition (predicate) that morphs the knowledge to the next state (object). The prediction is accomplished when the historical event data conform to structural model of Temporal Points Processes (TPP), followed by processing it by the behavioral model of Conditional Intensity Function (CIF). The formidable challenge in constructing and maintaining a TKGE is to ensure absence of cycles when historical event data are formed/structured as logical paths. Variations of depth-first search (DFS) are used in constructing TKGE albeit with the challenge of maintaining it as a cycle-free structure. This article presents a simple (tradeoff-based) design that creates and maintains a single-rooted isolated-paths TKGE: ipTKGE. In ipTKGE, isolated-paths have their own (local) roots. The local roots trigger the break down of the traditionally-constructed TKGE into isolated (independent) paths alleviating the necessity for using DFS - or its variational forms. This approach is possible at the expense of subject/objec t and predicates redun-dancies in ipTKGE. Isolated-paths allow for simpler algorithmic detection and avoidance of potential cycles in TKGE.

Authored by Seif Azghandi

The Promise and Perils of Allied Offensive Cyber Operations

NATO strategy and policy has increasingly focused on incorporating cyber operations to support deterrence, warfighting, and intelligence objectives. However, offensive cyber operations in particular have presented a delicate challenge for the alliance. As cyber threats to NATO members continue to grow, the alliance has begun to address how it could incorporate offensive cyber operations into its strategy and policy. However, there are significant hurdles to meaningful cooperation on offensive cyber operations, in contrast with the high levels of integration in other operational domains. Moreover, there is a critical gap in existing conceptualizations of the role of offensive cyber operations in NATO policy. Specifically, NATO cyber policy has focused on cyber operations in a warfighting context at the expense of considering cyber operations below the level of conflict. In this article, we explore the potential role for offensive cyber operations not only in wartime but also below the threshold of armed conflict. In doing so, we systematically explore a number of challenges at the political/strategic as well as the operational/tactical levels and provide policy recommendations for next steps for the alliance.

Authored by Erica Lonergan, Mark Montgomery

Analysis of classification based predicted disease using machine learning and medical things model

{Health diseases have been issued seriously harmful in human life due to different dehydrated food and disturbance of working environment in the organization. Precise prediction and diagnosis of disease become a more serious and challenging task for primary deterrence, recognition, and treatment. Thus, based on the above challenges, we proposed the Medical Things (MT) and machine learning models to solve the healthcare problems with appropriate services in disease supervising, forecast, and diagnosis. We developed a prediction framework with machine learning approaches to get different categories of classification for predicted disease. The framework is designed by the fuzzy model with a decision tree to lessen the data complexity. We considered heart disease for experiments and experimental evaluation determined the prediction for categories of classification. The number of decision trees (M) with samples (MS), leaf node (ML), and learning rate (I) is determined as MS=20

Authored by Hemanta Bhuyan, Arun Sai, M. Charan, Vignesh Chowdary, Biswajit Brahma

A Novel Blockchain-Driven Framework for Deterring Fraud in Supply Chain Finance

Frauds in supply chain finance not only result in substantial loss for financial institutions (e.g., banks, trust company, private funds), but also are detrimental to the reputation of the ecosystem. However, such frauds are hard to detect due to the complexity of the operating environment in supply chain finance such as involvement of multiple parties under different agreements. Traditional instruments of financial institutions are time-consuming yet insufficient in countering fraudulent supply chain financing. In this study, we propose a novel blockchain-driven framework for deterring fraud in supply chain finance. Specifically, we use inventory financing in jewelry supply chain as an illustrative scenario. The blockchain technology enables secure and trusted data sharing among multiple parties due to its characteristics of immutability and traceability. Consequently, information on manufacturing, brand license, and warehouse status are available to financial institutions in real time. Moreover, we develop a novel rule-based fraud check module to automatically detect suspicious fraud cases by auditing documents shared by multiple parties through a blockchain network. To validate the effectiveness of the proposed framework, we employ agent-based modeling and simulation. Experimental results show that our proposed framework can effectively deter fraudulent supply chain financing as well as improve operational efficiency.

Authored by Ruiyun Xu, Zhanbo Wang, Leon Zhao

Sensitivity Support in Data Privacy Algorithms

Personal data privacy is a great concern by governments across the world as citizens generate huge amount of data continuously and industries using this for betterment of user centric services. There must be a reasonable balance between data privacy and utility of data. Differential privacy is a promise by data collector to the customer’s personal privacy. Centralised Differential Privacy (CDP) is performing output perturbation of user’s data by applying required privacy budget. This promises the inclusion or exclusion of individual’s data in data set not going to create significant change for a statistical query output and it offers -Differential privacy guarantee. CDP is holding a strong belief on trusted data collector and applying global sensitivity of the data. Local Differential Privacy (LDP) helps user to locally perturb his data and there by guaranteeing privacy even with untrusted data collector. Many differential privacy algorithms handles parameters like privacy budget, sensitivity and data utility in different ways and mostly trying to keep trade-off between privacy and utility of data. This paper evaluates differential privacy algorithms in regard to the privacy support it offers according to the sensitivity of the data. Generalized application of privacy budget is found ineffective in comparison to the sensitivity based usage of privacy budget.

Authored by Geocey Shejy, Pallavi Chavan

DP-BEGAN: A Generative Model of Differential Privacy Algorithm

In recent years, differential privacy has gradually become a standard definition in the field of data privacy protection. Differential privacy does not need to make assumptions about the prior knowledge of privacy adversaries, so it has a more stringent effect than existing privacy protection models and definitions. This good feature has been used by researchers to solve the in-depth learning problem restricted by the problem of privacy and security, making an important breakthrough, and promoting its further large-scale application. Combining differential privacy with BEGAN, we propose the DP-BEGAN framework. The differential privacy is realized by adding carefully designed noise to the gradient of Gan model training, so as to ensure that Gan can generate unlimited synthetic data that conforms to the statistical characteristics of source data and does not disclose privacy. At the same time, it is compared with the existing methods on public datasets. The results show that under a certain privacy budget, this method can generate higher quality privacy protection data more efficiently, which can be used in a variety of data analysis tasks. The privacy loss is independent of the amount of synthetic data, so it can be applied to large datasets.

Authored by Er-Mei Shi, Jia-Xi Liu, Yuan-Ming Ji, Liang Chang

Differential Privacy under Incalculable Sensitivity

Differential privacy mechanisms have been proposed to guarantee the privacy of individuals in various types of statistical information. When constructing a probabilistic mechanism to satisfy differential privacy, it is necessary to consider the impact of an arbitrary record on its statistics, i.e., sensitivity, but there are situations where sensitivity is difficult to derive. In this paper, we first summarize the situations in which it is difficult to derive sensitivity in general, and then propose a definition equivalent to the conventional definition of differential privacy to deal with them. This definition considers neighboring datasets as in the conventional definition. Therefore, known differential privacy mechanisms can be applied. Next, as an example of the difficulty in deriving sensitivity, we focus on the t-test, a basic tool in statistical analysis, and show that a concrete differential privacy mechanism can be constructed in practice. Our proposed definition can be treated in the same way as the conventional differential privacy definition, and can be applied to cases where it is difficult to derive sensitivity.

Authored by Tomoaki Mimoto, Masayuki Hashimoto, Hiroyuki Yokoyama, Toru Nakamura, Takamasa Isohara, Ryosuke Kojima, Aki Hasegawa, Yasushi Okuno

Differential Privacy High-dimensional Data Publishing Method Based on Bayesian Network

Ensuring high data availability while realizing privacy protection is a research hotspot in the field of privacy-preserving data publishing. In view of the instability of data availability in the existing differential privacy high-dimensional data publishing methods based on Bayesian networks, this paper proposes an improved MEPrivBayes privacy-preserving data publishing method, which is mainly improved from two aspects. Firstly, in view of the structural instability caused by the random selection of Bayesian first nodes, this paper proposes a method of first node selection and Bayesian network construction based on the Maximum Information Coefficient Matrix. Then, this paper proposes a privacy budget elastic allocation algorithm: on the basis of pre-setting differential privacy budget coefficients for all branch nodes and all leaf nodes in Bayesian network, the influence of branch nodes on their child nodes and the average correlation degree between leaf nodes and all other nodes are calculated, then get a privacy budget strategy. The SVM multi-classifier is constructed with privacy preserving data as training data set, and the original data set is used as input to evaluate the prediction accuracy in this paper. The experimental results show that the MEPrivBayes method proposed in this paper has higher data availability than the classical PrivBayes method. Especially when the privacy budget is small (noise is large), the availability of the data published by MEPrivBayes decreases less.

Authored by Xiaotian Lu, Chunhui Piao, Jianghe Han

RDP-WGAN: Image Data Privacy Protection Based on Rényi Differential Privacy

In recent years, artificial intelligence technology based on image data has been widely used in various industries. Rational analysis and mining of image data can not only promote the development of the technology field but also become a new engine to drive economic development. However, the privacy leakage problem has become more and more serious. To solve the privacy leakage problem of image data, this paper proposes the RDP-WGAN privacy protection framework, which deploys the Rényi differential privacy (RDP) protection techniques in the training process of generative adversarial networks to obtain a generative model with differential privacy. This generative model is used to generate an unlimited number of synthetic datasets to complete various data analysis tasks instead of sensitive datasets. Experimental results demonstrate that the RDP-WGAN privacy protection framework provides privacy protection for sensitive image datasets while ensuring the usefulness of the synthetic datasets.

Authored by Xuebin Ma, Ren Yang, Maobo Zheng

Improved differential privacy K-means clustering algorithm for privacy budget allocation

In the differential privacy clustering algorithm, the added random noise causes the clustering centroids to be shifted, which affects the usability of the clustering results. To address this problem, we design a differential privacy K-means clustering algorithm based on an adaptive allocation of privacy budget to the clustering effect: Adaptive Differential Privacy K-means (ADPK-means). The method is based on the evaluation results generated at the end of each iteration in the clustering algorithm. First, it dynamically evaluates the effect of the clustered sets at the end of each iteration by measuring the separation and tightness between the clustered sets. Then, the evaluation results are introduced into the process of privacy budget allocation by weighting the traditional privacy budget allocation. Finally, different privacy budgets are assigned to different sets of clusters in the iteration to achieve the purpose of adaptively adding perturbation noise to each set. In this paper, both theoretical and experimental results are analyzed, and the results show that the algorithm satisfies e-differential privacy and achieves better results in terms of the availability of clustering results for the three standard datasets.

Authored by Liquan Han, Yushan Xie, Di Fan, Jinyuan Liu

Differential Privacy Protection Algorithm Based on Zero Trust Architecture for Industrial Internet

The Zero Trust Architecture is an important part of the industrial Internet security protection standard. When analyzing industrial data for enterprise-level or industry-level applications, differential privacy (DP) is an important technology for protecting user privacy. However, the centralized and local DP used widely nowadays are only applicable to the networks with fixed trust relationship and cannot cope with the dynamic security boundaries in Zero Trust Architecture. In this paper, we design a differential privacy scheme that can be applied to Zero Trust Architecture. It has a consistent privacy representation and the same noise mechanism in centralized and local DP scenarios, and can balance the strength of privacy protection and the flexibility of privacy mechanisms. We verify the algorithm in the experiment, that using maximum expectation estimation method it is able to obtain equal or even better result of the utility with the same level of security as traditional methods.

Authored by Yuning Song, Liping Ding, Xuehua Liu, Mo Du

Differential Privacy Techniques for Healthcare Data

This paper analyzes techniques to enable differential privacy by adding Laplace noise to healthcare data. First, as healthcare data contain natural constraints for data to take only integral values, we show that drawing only integral values does not provide differential privacy. In contrast, rounding randomly drawn values to the nearest integer provides differential privacy. Second, when a variable is constructed using two other variables, noise must be added to only one of them. Third, if the constructed variable is a fraction, then noise must be added to its constituent private variables, and not to the fraction directly. Fourth, the accuracy of analytics following noise addition increases with the privacy budget, ϵ, and the variance of the independent variable. Finally, the accuracy of analytics following noise addition increases disproportionately with an increase in the privacy budget when the variance of the independent variable is greater. Using actual healthcare data, we provide evidence supporting the two predictions on the accuracy of data analytics. Crucially, to enable accuracy of data analytics with differential privacy, we derive a relationship to extract the slope parameter in the original dataset using the slope parameter in the noisy dataset.

Authored by Rishabh Subramanian

Privacy-Preserving Cloud Data Model based on Differential Approach

With the variety of cloud services, the cloud service provider delivers the machine learning service, which is used in many applications, including risk assessment, product recommen-dation, and image recognition. The cloud service provider initiates a protocol for the classification service to enable the data owners to request an evaluation of their data. The owners may not entirely rely on the cloud environment as the third parties manage it. However, protecting data privacy while sharing it is a significant challenge. A novel privacy-preserving model is proposed, which is based on differential privacy and machine learning approaches. The proposed model allows the various data owners for storage, sharing, and utilization in the cloud environment. The experiments are conducted on Blood transfusion service center, Phoneme, and Wilt datasets to lay down the proposed model's efficiency in accuracy, precision, recall, and Fl-score terms. The results exhibit that the proposed model specifies high accuracy, precision, recall, and Fl-score up to 97.72%, 98.04%, 97.72%, and 98.80%, respectively.

Authored by Rishabh Gupta, Ashutosh Singh