mtcazzolato/tgraph-spot

TgraphSpot

TgraphSpot: Fast and Effective Anomaly Detection for Time-Evolving Graphs

Authors: Mirela T. Cazzolato1,2, Saranya Vijayakumar1, Xinyi Zheng1, Namyong Park1, Meng-Chieh Lee1, Pedro Fidalgo3,4, Bruno Lages3, Agma J. M. Traina2, Christos Faloutsos1.

Affiliations: 1 Carnegie Mellon University (CMU), 2 University of São Paulo (USP), 3 Mobileum, 4 ISCTE-IUL

Conference: IEEE International Conference on Big Data (Big Data), 2022 @ Osaka, Japan.

Please cite the paper as:

@inproceedings{cazzolato2022tgraphspot,
  title={{TgraphSpot:} Fast and Effective Anomaly Detection for Time-Evolving Graphs},
  author={Cazzolato, M.T. and Vijayakumar, S. and Zheng, X. and Park, N. and Lee, M-C. and Fidalgo, P. and Lages, B. and Traina, A.J.M. and Faloutsos, C.},
  booktitle={2022 IEEE International Conference on Big Data (Big Data)},
  year={2022},
  organization={IEEE},
}

Code Updates:

  • May 10-12, 2023
    -- Organizing modules with tabs
    -- Adding a single module for data input
    -- Making "MEASURE" and "TIMESTAMP" columns optional
    -- Updating requirements.txt file

Requirements

See the file requirements.txt for the list of dependencies.

To create and use a virtual environment, type:

python -m venv tgraph_venv
source tgraph_venv/bin/activate
pip install -r requirements.txt

Running the app

Run the app with the following command in your terminal:

make

or

streamlit run app/tgraphspot.py --server.maxUploadSize 5000
  • The parameter --server.maxUploadSize 5000 is optional. It raises the size limit for uploaded input files (the value is in megabytes).

Data Sample

We provide a toy sample dataset in the folder data/. See the file sample_raw_data.csv.

Acknowledgement

Matrix cross-associations: The code for generating matrix cross-associations is originally from this GitHub repository. The work was proposed in this paper:

Deepayan Chakrabarti, S. Papadimitriou, D. Modha, C. Faloutsos. Fully automatic cross-associations. Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining. 2004. DOI:10.1145/1014052.1014064.


TgraphSpot: Video tutorial

Step-by-step tutorial on how to use TgraphSpot to generate features and visualize the results

Step 1: Feature Extraction

Provide the path of a file containing columns corresponding to source, destination, and measure (e.g., call duration). We provide a sample file in the repository as an example. Check the option "Use example file" to use it in the application. After loading the file, click on "Run t-graph" and wait until the task is done. The application saves the file with the generated features in the folder "data/".

1.FeatureExtraction.mov
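
To make the idea of per-node features more concrete, here is a minimal sketch that aggregates a few simple features (distinct neighbors and total measure, incoming and outgoing) from an edge list. It is not the feature set computed by TgraphSpot. The column names SOURCE and DESTINATION and the function node_features are assumptions made for illustration; only the optional MEASURE column is named in this README.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names; the real input file may use different headers.
SRC, DST, MEASURE = "SOURCE", "DESTINATION", "MEASURE"


def node_features(rows):
    """Aggregate simple per-node features from an iterable of edge dicts."""
    feats = defaultdict(lambda: {"out_degree": 0, "in_degree": 0,
                                 "out_measure": 0.0, "in_measure": 0.0})
    out_nbrs, in_nbrs = defaultdict(set), defaultdict(set)
    for row in rows:
        src, dst = row[SRC], row[DST]
        # MEASURE is optional: fall back to 1.0 so each row counts once.
        weight = float(row.get(MEASURE) or 1.0)
        out_nbrs[src].add(dst)
        in_nbrs[dst].add(src)
        feats[src]["out_measure"] += weight
        feats[dst]["in_measure"] += weight
    for node in feats:
        # Degrees count distinct neighbors, not repeated calls.
        feats[node]["out_degree"] = len(out_nbrs.get(node, ()))
        feats[node]["in_degree"] = len(in_nbrs.get(node, ()))
    return dict(feats)


# Tiny in-memory example standing in for a CSV file on disk.
sample = """SOURCE,DESTINATION,MEASURE
A,B,10
A,C,5
B,A,2
A,B,3
"""
features = node_features(csv.DictReader(io.StringIO(sample)))
for node, values in sorted(features.items()):
    print(node, values)
```

In this toy input, node A calls two distinct destinations with a total measure of 18, while node C only receives calls.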

Step 2: HexBin scatter plot

Load the file with the extracted features (from Step 1), and select pairs of features to visualize. The chart is automatically updated. Labels can also be loaded and visualized separately.

2.HexBin.mov

Step 3: Lasso selection and parallel coordinates

Load the extracted features and the file with phone calls. Then select a pair of features to visualize. The application allows the user to make a lasso selection of points of interest. The selected points are listed below the chart. From the selected nodes, the application generates the corresponding EgoNet and plots the adjacency matrix and the cross-associations found. Generating the cross-associations can take some time. The user can control the maximum size of the EgoNet to generate the corresponding visualization (see the parameter in the left panel). Finally, at the bottom of the page, the application shows a plot with parallel coordinates, allowing the user to visualize many features at once.

3.LassoSelectoinParallelCoordinates.mov
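
The EgoNet and adjacency matrix described in this step can be sketched as follows. This is a simplified illustration and not the application's code: it assumes a 1-step EgoNet (the selected nodes plus their direct neighbors) and assumes the size limit is a cap on the number of nodes. The functions egonet and adjacency_matrix and the parameter max_nodes are hypothetical names.

```python
def egonet(edges, selected, max_nodes=50):
    """Return the selected nodes plus their direct neighbors, capped at max_nodes."""
    nodes = list(dict.fromkeys(selected))  # keep order, drop duplicates
    selected_set = set(nodes)
    seen = set(nodes)
    for src, dst in edges:
        # Look at the edge in both directions: callers and callees are neighbors.
        for ego, other in ((src, dst), (dst, src)):
            if ego in selected_set and other not in seen and len(nodes) < max_nodes:
                nodes.append(other)
                seen.add(other)
    return nodes


def adjacency_matrix(edges, nodes):
    """Build a 0/1 matrix where entry [i][j] marks an edge from nodes[i] to nodes[j]."""
    index = {node: i for i, node in enumerate(nodes)}
    matrix = [[0] * len(nodes) for _ in nodes]
    for src, dst in edges:
        if src in index and dst in index:
            matrix[index[src]][index[dst]] = 1
    return matrix


edges = [("A", "B"), ("B", "C"), ("C", "A"), ("D", "E"), ("A", "D")]
ego_nodes = egonet(edges, selected=["A"])
ego_matrix = adjacency_matrix(edges, ego_nodes)
print(ego_nodes)
for row in ego_matrix:
    print(row)
```

Capping the number of nodes keeps the matrix small enough to plot, which mirrors the purpose of the size parameter in the left panel.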

Step 4: Interactive scatter matrix

The interactive scatter matrix allows the user to visualize many scatter plots simultaneously, combining many features of interest. There are also pre-set feature combinations, defined by experts to assist in finding abnormal behavior in logs of phone calls. As in Step 3, the user can select points of interest and generate the EgoNet and the matrix visualizations.

4.ScatterMatrix.mov

Step 5: Deep dive

In the deep dive module, the user can visualize the incoming and outgoing behavior of the nodes from the generated EgoNet over time. In the selected period, the user can further select a node and visualize the total duration of incoming and outgoing calls per hour.

5.DeepDive.mov
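
The per-hour aggregation shown in the deep dive can be approximated with the sketch below. It assumes each call is a (source, destination, duration, timestamp) tuple with an ISO-formatted timestamp. The tuple layout and the function hourly_duration are illustrative assumptions, not the application's internal format.

```python
from collections import defaultdict
from datetime import datetime


def hourly_duration(calls, node):
    """Sum incoming and outgoing call duration per hour for one node."""
    totals = defaultdict(lambda: {"incoming": 0.0, "outgoing": 0.0})
    for src, dst, duration, timestamp in calls:
        # Truncate the timestamp to the start of its hour.
        hour = datetime.fromisoformat(timestamp).replace(
            minute=0, second=0, microsecond=0)
        if src == node:
            totals[hour]["outgoing"] += duration
        if dst == node:
            totals[hour]["incoming"] += duration
    # Sort by hour so the result reads as a time series.
    return dict(sorted(totals.items()))


calls = [
    ("A", "B", 60, "2022-12-17 09:05:00"),
    ("A", "C", 30, "2022-12-17 09:40:00"),
    ("B", "A", 120, "2022-12-17 10:10:00"),
    ("C", "B", 45, "2022-12-17 10:20:00"),
]
per_hour = hourly_duration(calls, "A")
for hour, totals in per_hour.items():
    print(hour, totals)
```

Hours in which the node neither makes nor receives a call are omitted from the result.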

Step 6: Manage negative-list

The negative-list can be used to remove numbers (or nodes) that usually receive or make many calls but should be ignored during the analysis. Examples of such cases are emergency and service numbers.

6.NegativeList.mov
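
Conceptually, applying a negative-list amounts to dropping every edge that touches a listed node before the analysis. The sketch below shows that idea under the assumption that edges are (source, destination) pairs. The function apply_negative_list and the example numbers are hypothetical.

```python
def apply_negative_list(edges, negative_list):
    """Drop every edge whose source or destination is in the negative-list."""
    blocked = set(negative_list)
    return [(src, dst) for src, dst in edges
            if src not in blocked and dst not in blocked]


# "911" and "0800" stand in for emergency and service numbers.
raw_edges = [("555-0101", "911"), ("555-0101", "555-0102"),
             ("0800", "555-0102"), ("555-0103", "555-0101")]
kept_edges = apply_negative_list(raw_edges, negative_list=["911", "0800"])
print(kept_edges)
```

Filtering at the edge level removes the listed nodes from any features or EgoNets computed afterwards.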
