FaultFormer

Data

Data is available at: http://manufacturingnet.io/html/datasets.html
The codebase expects data in raw, unfeaturized form. In most cases, this means tensors of shape (batch_size, seq_len).
The data used with this codebase was organized into the following folder structure. Any other structure can be used, but the data paths in the code will need to be updated to match.
CWRU Data is organized in the /CWRU folder.
- /kfold
  - Holds main training and testing data.
  - data is a torch tensor of shape (n_samples, n_data_points): (2240, 1600) for training and (560, 1600) for testing.
  - labels is a torch tensor of shape (n_samples): (2240) for training and (560) for testing.
  - Each fold has a unique train-test split of the raw data/labels. The best model was trained on fold 1.
- signal_data.npy
  - Raw data of shape (2800, 1600)
- signal_data_labels.npy
  - Raw labels of shape (2800)
Paderborn Data is organized in the same way.
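
The fold sizes listed above (2240 training and 560 test samples out of 2800) are consistent with a 5-fold split. The following is a minimal sketch of how such index splits could be generated, assuming 5 folds and a shuffled split. The function name `kfold_indices` and the seed are hypothetical, and this is not necessarily how the folds in /kfold were produced.

```python
import random


def kfold_indices(n_samples, n_folds, seed=0):
    """Yield (train_idx, test_idx) lists of sample indices, one pair per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the folds reproducible
    fold_size = n_samples // n_folds
    for k in range(n_folds):
        start = k * fold_size
        # The last fold absorbs any remainder when n_samples % n_folds != 0.
        stop = start + fold_size if k < n_folds - 1 else n_samples
        test_idx = indices[start:stop]
        train_idx = indices[:start] + indices[stop:]
        yield train_idx, test_idx


# Assumption: 5 folds over the 2800 raw CWRU samples.
folds = list(kfold_indices(n_samples=2800, n_folds=5))
train_idx, test_idx = folds[0]
print(len(train_idx), len(test_idx))  # 2240 560
```

The resulting index lists can be used to select rows from the raw arrays (for example, the contents of signal_data.npy and signal_data_labels.npy) before converting them to tensors.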

Hyperparameters on the CWRU Dataset:
- params = { "batch_size": 1024, "epochs": 11000, "d_in": 3, "d_model": 140, "nhead": 20, "d_hid": 300, "nlayers": 6, "dropout": 0.3, "warmup": 4000, "seq_len": 40, "n_classes": 10, "model": "Transformer_cls", "fourier": True, "p_no_aug": .1, "p_two_aug": .5, }
Hyperparameters on the Paderborn Dataset:
- params = { "batch_size": 1024, "epochs": 1000, "d_in": 3, "d_model": 256, "nhead": 16, "d_hid": 512, "nlayers": 6, "dropout": 0.3, "warmup": 4000, "seq_len": 40, "n_classes": 3, "model": "Transformer_cls", "fourier": True, "p_no_aug": .1, "p_two_aug": .5, }
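
Two quick sanity checks on these settings can be sketched as follows. Standard multi-head attention requires d_model to be divisible by nhead, and the number of optimizer steps follows from the batch size, epoch count, and training-set size. The helper names `head_dim` and `total_steps` are hypothetical, and the step count assumes one optimizer step per batch with no dropped partial batches.

```python
import math

# Only the keys needed for the checks are copied from the settings above.
cwru = {"batch_size": 1024, "epochs": 11000, "d_model": 140, "nhead": 20, "warmup": 4000}
paderborn = {"batch_size": 1024, "epochs": 1000, "d_model": 256, "nhead": 16, "warmup": 4000}


def head_dim(params):
    """Return the per-head width, raising if d_model is not divisible by nhead."""
    if params["d_model"] % params["nhead"] != 0:
        raise ValueError("d_model must be divisible by nhead")
    return params["d_model"] // params["nhead"]


def total_steps(params, n_train):
    """Optimizer steps over the run, assuming one step per batch (partial batches kept)."""
    return math.ceil(n_train / params["batch_size"]) * params["epochs"]


print(head_dim(cwru), head_dim(paderborn))  # 7 16
print(total_steps(cwru, n_train=2240))      # 33000
```

If "warmup" counts optimizer steps (an assumption; the README does not say), the CWRU run would spend 4000 of its 33000 steps in warmup.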
examples.ipynb contains examples showing how to use the codebase for training, pretraining, and visualizing results.
