Change detection in dynamic attributed networks
Degree GrantorUniversity of Canterbury
Degree NameDoctor of Philosophy
A network provides a powerful means of representing complex relationships between entities by abstracting entities as vertices, and relationships as edges connecting vertices in a graph. Beyond the presence or absence of relationships, a network may contain additional information that can be attributed to the entities and their relationships. Attaching these additional attribute data to the corresponding vertices and edges yields an attributed graph. Moreover, in the majority of real-world applications, such as online social networks, financial networks and transactional networks, relationships between entities evolve over time. This dynamic behaviour can be modelled using a time sequence of attributed graphs. Change detection in dynamic attributed networks is an important problem in many areas, such as fraud detection, cyber intrusion detection and health care monitoring. It is a challenging problem because it involves a time sequence of attributed graphs, each of which is usually very large and can contain many attributes attached to the vertices and edges, resulting in a complex, high dimensional mathematical object.
Spectral embedding methods provide an effective way to transform a graph to a lower dimensional latent Euclidean space that preserves the underlying structure of the network. Although change detection methods that use spectral embedding are available, they do not address sparsity and degree heterogeneity that occur in most real-world graphs. Furthermore, a majority of these methods focus on changes in the behaviour of the overall network. This is a drawback for applications where it is more important to detect changes in entity behaviour rather than the behaviour of the overall network.
In this thesis, we propose a novel concept of applying Procrustes techniques to embedded points for vertices in a graph to detect changes in entity behaviour. By adapting previously developed techniques in spectral graph theory, we first formulate a strategy to extract an embedding from the weighted adjacency matrix of each graph in the sequence. Our spectral embedding approach not only addresses sparsity and degree heterogeneity issues, but also obtains an estimate of the appropriate embedding dimension. Each embedded point represents the behaviour of an entity at a given time instant. We employ a Procrustes analysis procedure to compare embeddings across time and to quantify changes. We call this method CDP (change detection using Procrustes analysis). We demonstrate the performance of CDP for undirected, unipartite graphs through extensive simulation experiments. CDP successfully detects various types of vertex- based changes considered in the experiments, including (i) changes in vertex degree, (ii) changes in community membership of vertices, and (iii) unusual increase or decrease in edge weight between vertices. Furthermore, when applied to the dynamic network for the Enron email dataset, CDP successfully detects, at the appropriate times, several fraudsters who were known to be involved in the scandal. For both simulation and real-world experiments, we compare the change detection performance of CDP with two other baseline methods (the method in Id´e and Kashima (2004) and improved version) that employ alternative spectral embedding approaches. In both cases, CDP generally show superior performance.
A multi-view network depicts multiple types of relationships between a set of entities, that may either be obtained from the same source or from different sources. A dynamic multi-view network can be represented as a time sequence of three-dimensional weighted adjacency tensors. The second part of this thesis extends CDP to handle multi-view change detection based on tensors. We propose two methods to obtain em- bedded points for vertices. The first method, MCDP-I (multi-view change detection using Procrustes analysis-I), does this by using higher order singular value decomposi- tion to factorize a tensor. The second method, MCDP-II (multi-view change detection using Procrustes analysis-II), first embeds each slice of the tensor separately using matrix factorization, and then employs Procrustes techniques to calculate an average embedding. We systematically investigate the performance of MCDP-I and MCDP-II based on several simulation experiments. In all experiments, the change detection performance of our methods are compared against five other methods (Han et al., 2015; Liu et al., 2013; Tang et al., 2012) that differ from each other mainly in the way that information from different tensor slices are combined to give an embedding for the vertices. Both MCDP-I and MCDP-II successfully detect all types of vertex-based changes considered in the experiments. MCDP-I generally performs better than MCDP-II when the change information contained in the tensor is consistent across all slices; otherwise, MCDP-II is generally better. When applied to the multi-view Enron email network, MCDP-II successfully detects several known fraudsters while MCDP-I detects only one of them.
The novel concept of applying Procrustes techniques to embedded points for vertices introduces a new direction for change detection. The algorithms we have developed can be applied to different areas, such as cyber-intrusion detection, fraud detection, healthcare monitoring and detection of natural disturbances in the eco-system.