pydgc.models
pydgc.models.ae
- class AE(logger, cfg)[source]
Bases:
DGCModel
Autoencoder model with MLP as encoder and decoder. Performs kmeans on embeddings.
- Parameters:
logger (Logger) – Logger.
cfg (CN) – Config.
- loss(x, hat_x)[source]
Model loss function.
- Parameters:
x (Tensor) –
hat_x (Tensor) –
- Return type:
Tensor
- train_model(data, cfg=None, flag='TRAIN AE')[source]
Model training function.
- Parameters:
data (Data) –
cfg (CfgNode | None) –
flag (str) –
- clustering(data, method='kmeans_gpu')[source]
Clustering function.
- Parameters:
data (Data) –
method (str) –
- Return type:
Tuple[Tensor, Tensor, Tensor]
- training: bool
pydgc.models.agcdrr
- new_graph(edge_index, weight, n, device)[source]
Create a new graph with the given edge index, weight, and number of nodes.
- Parameters:
edge_index (Tensor) – Edge index.
weight (Tensor) – Edge weight.
n (int) – Number of nodes.
device (torch.device) – Device.
- Returns:
New graph.
- Return type:
Tensor
- normalize(mx)[source]
Row-normalize sparse matrix.
- Parameters:
mx (scipy.sparse.csr_matrix) – Sparse matrix.
- Returns:
Row-normalized sparse matrix.
- Return type:
scipy.sparse.csr_matrix
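As a rough illustration of row normalization, the sketch below divides each row of a matrix by its row sum. It uses a dense NumPy array as a stand-in for the `scipy.sparse.csr_matrix` input, and the name `row_normalize_sketch` and the handling of all-zero rows are assumptions, not the library's implementation.

```python
import numpy as np

def row_normalize_sketch(mx: np.ndarray) -> np.ndarray:
    """Divide each row by its sum; all-zero rows are left as zeros."""
    mx = mx.astype(float)
    row_sum = mx.sum(axis=1, keepdims=True)
    # Guard against division by zero for isolated nodes (rows summing to 0).
    inv = np.divide(1.0, row_sum, out=np.zeros_like(row_sum), where=row_sum != 0)
    return mx * inv

adj = np.array([[0.0, 1.0, 1.0],
                [1.0, 0.0, 0.0],
                [0.0, 0.0, 0.0]])
normalized = row_normalize_sketch(adj)
```

Every non-empty row of `normalized` sums to 1, while the empty third row stays all zeros.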
- class GNNLayer(in_features, out_features)[source]
Bases:
Module
Graph Neural Network Layer.
- Parameters:
in_features (int) – Input feature dimension.
out_features (int) – Output feature dimension.
- forward(features, adj, active)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class IGAE_encoder(gae_n_enc_1, gae_n_enc_2, gae_n_enc_3, n_input)[source]
Bases:
Module
IGAE encoder.
- Parameters:
gae_n_enc_1 (int) – Number of hidden units in the first layer.
gae_n_enc_2 (int) – Number of hidden units in the second layer.
gae_n_enc_3 (int) – Number of hidden units in the third layer.
n_input (int) – Input feature dimension.
- forward(x, adj)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class Cluster_layer(in_dims, out_dims)[source]
Bases:
Module
Clustering layer.
- Parameters:
in_dims (int) – Input feature dimension.
out_dims (int) – Output feature dimension.
- forward(h)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class IGAE(gae_n_enc_1, gae_n_enc_2, gae_n_enc_3, n_input, clusters)[source]
Bases:
Module
IGAE model.
- Parameters:
gae_n_enc_1 (int) – Number of hidden units in the first layer.
gae_n_enc_2 (int) – Number of hidden units in the second layer.
gae_n_enc_3 (int) – Number of hidden units in the third layer.
n_input (int) – Input feature dimension.
clusters (int) – Number of clusters.
- forward(x, adj)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class ViewLearner(encoder, embedding_dim)[source]
Bases:
Module
View learner.
- Parameters:
encoder (nn.Module) – Encoder model.
embedding_dim (int) – Embedding dimension.
- forward(x, adj, edge_index)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class AGCDRR(logger, cfg)[source]
Bases:
DGCModel
Attributed Graph Clustering with Dual Redundancy Reduction.
- Parameters:
logger (Logger) – Logger.
cfg (CN) – Config.
- train_model(data, cfg=None, flag='TRAIN AGCDRR')[source]
Model training function.
- Parameters:
data (Data) –
cfg (CfgNode | None) –
flag (str) –
- training: bool
pydgc.models.ccgc
- init_clustering(feature, cluster_num)[source]
Initialize clustering with kmeans.
- Parameters:
feature (Tensor) – Input feature.
cluster_num (int) – Number of clusters.
- Returns:
predict_labels (Tensor) – Predicted labels.
dis (Tensor) – Pairwise distance.
- class CCGC(logger, cfg)[source]
Bases:
DGCModel
Cluster-Guided Contrastive Graph Clustering Network.
Reference: https://ojs.aaai.org/index.php/AAAI/article/view/26285
- Parameters:
logger (Logger) – Logger.
cfg (CN) – Config.
- train_model(data, cfg=None, flag='TRAIN CCGC')[source]
Model training function.
- Parameters:
data (Data) –
cfg (CfgNode | None) –
flag (str) –
- training: bool
pydgc.models.daegc
- class GATE(logger, cfg)[source]
Bases:
DGCModel
Graph Attentional Autoencoder.
- Parameters:
logger (Logger) – Logger.
cfg (CN) – Config.
- loss(adj_label, hat_adj)[source]
Model loss function.
- Parameters:
adj_label (Tensor) –
hat_adj (Tensor) –
- Return type:
Tensor
- train_model(data, cfg=None, flag='TRAIN GATE')[source]
Model training function.
- Parameters:
data (Data) –
cfg (CfgNode | None) –
flag (str) –
- Return type:
List
- clustering(data, method='kmeans_gpu')[source]
Clustering function.
- Return type:
Tuple[Tensor, Tensor, Tensor]
- training: bool
- class DAEGC(logger, cfg)[source]
Bases:
DGCModel
Attributed Graph Clustering: A Deep Attentional Embedding Approach.
Reference: https://arxiv.org/abs/1906.06532
- Parameters:
logger (Logger) – Logger.
cfg (CN) – Config.
- loss(adj_label, hat_adj, q)[source]
Model loss function.
- Parameters:
adj_label (Tensor) –
hat_adj (Tensor) –
q (Tensor) –
- Return type:
Tensor
- pretrain(data, cfg=None, flag='PRETRAIN GATE')[source]
- Parameters:
data (Data) –
cfg (CfgNode | None) –
flag (str) –
- train_model(data, cfg=None, flag='TRAIN DAEGC')[source]
Model training function.
- Parameters:
data (Data) –
cfg (CfgNode | None) –
flag (str) –
- training: bool
pydgc.models.dcrn
- normalize_adj(adj, self_loop=True, symmetry=False)[source]
Normalize the adj matrix.
- Parameters:
adj (np.ndarray) – Input adj matrix.
self_loop (bool, optional) – Whether to add self-loops. Defaults to True.
symmetry (bool, optional) – Whether to apply symmetric normalization. Defaults to False.
- Returns:
The normalized adj matrix.
- Return type:
np.ndarray
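The two flags can be pictured with the NumPy sketch below. It assumes the common conventions that `symmetry=True` means D^{-1/2} (A + I) D^{-1/2} and `symmetry=False` means the random-walk form D^{-1} (A + I); the actual formulas used by `normalize_adj` are not stated here, and `normalize_adj_sketch` is a hypothetical name.

```python
import numpy as np

def normalize_adj_sketch(adj, self_loop=True, symmetry=False):
    """Normalize a dense adjacency matrix (assumed conventions, see lead-in)."""
    a = adj.astype(float)
    if self_loop:
        a = a + np.eye(a.shape[0])  # add self-loops: A + I
    deg = a.sum(axis=1)
    nonzero = deg > 0
    if symmetry:
        # Symmetric form: D^{-1/2} A D^{-1/2}
        d_inv_sqrt = np.zeros_like(deg)
        d_inv_sqrt[nonzero] = deg[nonzero] ** -0.5
        return d_inv_sqrt[:, None] * a * d_inv_sqrt[None, :]
    # Random-walk form: D^{-1} A, so each row sums to 1
    d_inv = np.zeros_like(deg)
    d_inv[nonzero] = 1.0 / deg[nonzero]
    return d_inv[:, None] * a

# Path graph on three nodes: 0 - 1 - 2
path = np.array([[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]])
rw = normalize_adj_sketch(path, self_loop=True, symmetry=False)
sym = normalize_adj_sketch(path, self_loop=True, symmetry=True)
```

Under these assumptions the random-walk result has rows that sum to 1, while the symmetric result stays a symmetric matrix.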
- numpy_to_torch(a, sparse=False)[source]
Convert numpy array to torch tensor.
- Parameters:
a (np.ndarray) – Input numpy array.
sparse (bool, optional) – Whether to return a sparse tensor. Defaults to False.
- Returns:
Output torch tensor.
- Return type:
torch.Tensor
- remove_edge(A, similarity, remove_rate=0.1, device='cuda')[source]
Remove edge based on embedding similarity.
- Parameters:
A (np.ndarray) – The original adjacency matrix.
similarity (np.ndarray) – Cosine similarity matrix of the embeddings.
remove_rate (float, optional) – The rate at which edges are removed. Defaults to 0.1.
device (str, optional) – Device. Defaults to 'cuda'.
- Returns:
Edge-masked adjacency matrix.
- Return type:
np.ndarray
- reconstruction_loss(X, A_norm, X_hat, Z_hat, A_hat)[source]
Reconstruction loss $L_{rec}$.
- Parameters:
X (torch.Tensor) – The original feature matrix.
A_norm (torch.Tensor) – The normalized adj.
X_hat (torch.Tensor) – The reconstructed X.
Z_hat (torch.Tensor) – The reconstructed Z.
A_hat (torch.Tensor) – The reconstructed A.
- Returns:
The reconstruction loss.
- Return type:
torch.Tensor
- target_distribution(Q)[source]
Calculate the target distribution (student-t distribution).
- Parameters:
Q (torch.Tensor) – The soft assignment distribution.
- Returns:
The target distribution P.
- Return type:
torch.Tensor
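For orientation, a target distribution of this kind is usually computed as in DEC: square the soft assignments, divide by the per-cluster frequency, and renormalize each row. The sketch below assumes that formula and uses NumPy instead of torch; `target_distribution_sketch` is a hypothetical name and may differ from the library code.

```python
import numpy as np

def target_distribution_sketch(q: np.ndarray) -> np.ndarray:
    """Sharpen soft assignments Q (rows = nodes, columns = clusters)."""
    # Square each assignment and divide by the soft cluster frequency.
    weight = q ** 2 / q.sum(axis=0)
    # Renormalize so every row is again a probability distribution.
    return weight / weight.sum(axis=1, keepdims=True)

q = np.array([[0.6, 0.4],
              [0.3, 0.7]])
p = target_distribution_sketch(q)
```

The result pushes each row toward its dominant cluster, which is what makes P usable as a self-training target.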
- distribution_loss(Q, P)[source]
Clustering guidance loss $L_{KL}$.
- Parameters:
Q (torch.Tensor) – The soft assignment distribution.
P (torch.Tensor) – The target distribution.
- Returns:
The clustering guidance loss.
- Return type:
torch.Tensor
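A clustering guidance loss of this form is typically the KL divergence from the target distribution P to the soft assignment Q. The NumPy sketch below assumes a single Q and a batch-mean reduction; the name `kl_guidance_loss_sketch`, the `eps` smoothing, and the reduction are illustrative assumptions rather than the library's exact behavior.

```python
import numpy as np

def kl_guidance_loss_sketch(q: np.ndarray, p: np.ndarray, eps: float = 1e-12) -> float:
    """KL(P || Q) summed over all entries and averaged over the rows (nodes)."""
    kl = p * (np.log(p + eps) - np.log(q + eps))
    return float(kl.sum() / q.shape[0])

q = np.array([[0.6, 0.4],
              [0.3, 0.7]])
p = np.array([[0.8, 0.2],
              [0.1, 0.9]])
loss_same = kl_guidance_loss_sketch(q, q)
loss_diff = kl_guidance_loss_sketch(q, p)
```

The loss is zero when Q already equals P and positive otherwise.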
- r_loss(AZ, Z, eps=1e-08, clamp_val=0.0001)[source]
Propagated regularization loss $L_{R}$.
- Parameters:
AZ (torch.Tensor) – The propagated embedding.
Z (torch.Tensor) – The embedding.
eps (float, optional) – The epsilon value. Defaults to 1e-8.
clamp_val (float, optional) – The clamp value. Defaults to 1e-4.
- Returns:
The propagated regularization loss.
- Return type:
torch.Tensor
- off_diagonal(x)[source]
Off-diagonal elements of x.
- Parameters:
x (torch.Tensor) – Input matrix.
- Returns:
Off-diagonal elements of x.
- Return type:
torch.Tensor
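A minimal NumPy sketch of extracting off-diagonal elements is shown below. It assumes a square input and returns the elements flattened in row-major order; `off_diagonal_sketch` is a hypothetical name and the library's torch version may be written differently.

```python
import numpy as np

def off_diagonal_sketch(x: np.ndarray) -> np.ndarray:
    """Return all elements of a square matrix except the main diagonal."""
    n, m = x.shape
    if n != m:
        raise ValueError("expected a square matrix")
    mask = ~np.eye(n, dtype=bool)  # True everywhere except the diagonal
    return x[mask]

x = np.arange(9).reshape(3, 3)
off = off_diagonal_sketch(x)
```

For the 3x3 example the diagonal entries 0, 4 and 8 are dropped and the remaining six values are returned as a flat array.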
- cross_correlation(Z_v1, Z_v2)[source]
Cross-view correlation matrix S.
- Parameters:
Z_v1 (torch.Tensor) – The first view embedding.
Z_v2 (torch.Tensor) – The second view embedding.
- Returns:
The cross-view correlation matrix S.
- Return type:
torch.Tensor
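One common way to build a cross-view correlation matrix is to L2-normalize the rows of both embeddings and take their product, so that each entry is a cosine similarity. The sketch below assumes that definition and uses NumPy; `cross_correlation_sketch` is a hypothetical name and the library may normalize differently.

```python
import numpy as np

def cross_correlation_sketch(z_v1: np.ndarray, z_v2: np.ndarray) -> np.ndarray:
    """Cosine similarity between every row of z_v1 and every row of z_v2."""
    z1 = z_v1 / np.linalg.norm(z_v1, axis=1, keepdims=True)
    z2 = z_v2 / np.linalg.norm(z_v2, axis=1, keepdims=True)
    return z1 @ z2.T

z = np.array([[1.0, 0.0],
              [0.0, 2.0]])
s = cross_correlation_sketch(z, z)
```

When both views are identical and their rows are orthogonal, S is the identity matrix.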
- correlation_reduction_loss(S)[source]
Correlation reduction loss $L_{CR}$.
- Parameters:
S (torch.Tensor) – The cross-view correlation matrix S.
- Returns:
The correlation reduction loss.
- Return type:
torch.Tensor
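A correlation reduction loss generally pushes the cross-view correlation matrix S toward the identity. The sketch below assumes the loss is the mean squared deviation of the diagonal from 1 plus the mean squared off-diagonal value; this form and the name `correlation_reduction_loss_sketch` are assumptions for illustration.

```python
import numpy as np

def correlation_reduction_loss_sketch(s: np.ndarray) -> float:
    """Penalize diagonal entries away from 1 and off-diagonal entries away from 0."""
    n = s.shape[0]
    on_diag = np.mean((np.diag(s) - 1.0) ** 2)
    off_diag = np.mean(s[~np.eye(n, dtype=bool)] ** 2)
    return float(on_diag + off_diag)

loss_identity = correlation_reduction_loss_sketch(np.eye(3))
loss_ones = correlation_reduction_loss_sketch(np.ones((2, 2)))
```

An identity matrix gives zero loss, whereas a matrix of all ones is penalized only through its off-diagonal entries.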
- dicr_loss(name, Z_ae, Z_igae, AZ, Z, gamma_value)[source]
Dual Information Correlation Reduction loss $L_{DICR}$.
- Parameters:
name (str) – Dataset name.
Z_ae (list of torch.Tensor) – AE embedding including two-view node embedding [0, 1] and two-view cluster-level embedding [2, 3].
Z_igae (list of torch.Tensor) – IGAE embedding including two-view node embedding [0, 1] and two-view cluster-level embedding [2, 3].
AZ (torch.Tensor) – The propagated fusion embedding AZ.
Z (torch.Tensor) – The fusion embedding Z.
gamma_value (float) – Gamma value.
- Returns:
The DICR loss.
- Return type:
torch.Tensor
- gaussian_noised_feature(X, device='cuda')[source]
Add gaussian noise to the attribute matrix X.
- Parameters:
X (torch.Tensor) – The attribute matrix.
device (str) – Device.
- Returns:
The noised attribute matrix X_tilde.
- Return type:
torch.Tensor
- class AE_encoder(ae_n_enc_1, ae_n_enc_2, ae_n_enc_3, n_input, n_z)[source]
Bases:
Module
AE encoder.
- Parameters:
ae_n_enc_1 (int) – The number of neurons in the first layer of the encoder.
ae_n_enc_2 (int) – The number of neurons in the second layer of the encoder.
ae_n_enc_3 (int) – The number of neurons in the third layer of the encoder.
n_input (int) – The number of input features.
n_z (int) – The number of latent features.
- Returns:
The encoded latent features.
- Return type:
torch.Tensor
- forward(x)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class AE_decoder(ae_n_dec_1, ae_n_dec_2, ae_n_dec_3, n_input, n_z)[source]
Bases:
Module
AE decoder.
- Parameters:
ae_n_dec_1 (int) – The number of neurons in the first layer of the decoder.
ae_n_dec_2 (int) – The number of neurons in the second layer of the decoder.
ae_n_dec_3 (int) – The number of neurons in the third layer of the decoder.
n_input (int) – The number of input features.
n_z (int) – The number of latent features.
- Returns:
The decoded features.
- Return type:
torch.Tensor
- forward(z_ae)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class AE(ae_n_enc_1, ae_n_enc_2, ae_n_enc_3, ae_n_dec_1, ae_n_dec_2, ae_n_dec_3, n_input, n_z, device='cuda')[source]
Bases:
Module
AE module.
- Parameters:
ae_n_enc_1 (int) – The number of neurons in the first layer of the encoder.
ae_n_enc_2 (int) – The number of neurons in the second layer of the encoder.
ae_n_enc_3 (int) – The number of neurons in the third layer of the encoder.
ae_n_dec_1 (int) – The number of neurons in the first layer of the decoder.
ae_n_dec_2 (int) – The number of neurons in the second layer of the decoder.
ae_n_dec_3 (int) – The number of neurons in the third layer of the decoder.
n_input (int) – The number of input features.
n_z (int) – The number of latent features.
device (str) – Device.
- Returns:
The decoded features.
- Return type:
torch.Tensor
- forward(x)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- pretrain(logger, data, cfg=None, flag='PRETRAIN AE')[source]
- Parameters:
data (Data) –
cfg (CfgNode | None) –
flag (str) –
- training: bool
- class GNNLayer(name, in_features, out_features)[source]
Bases:
Module
GNN layer.
- Parameters:
name (str) – Name of the GNN layer.
in_features (int) – Number of input features.
out_features (int) – Number of output features.
- Returns:
Output features.
- Return type:
torch.Tensor
- forward(features, adj, active=False)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class IGAE_encoder(name, gae_n_enc_1, gae_n_enc_2, gae_n_enc_3, n_input)[source]
Bases:
Module
IGAE encoder.
- Parameters:
name (str) – Name of the GNN layer.
gae_n_enc_1 (int) – The number of neurons in the first layer of the encoder.
gae_n_enc_2 (int) – The number of neurons in the second layer of the encoder.
gae_n_enc_3 (int) – The number of neurons in the third layer of the encoder.
n_input (int) – The number of input features.
- Returns:
Output features.
- Return type:
torch.Tensor
- forward(x, adj)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class IGAE_decoder(name, gae_n_dec_1, gae_n_dec_2, gae_n_dec_3, n_input)[source]
Bases:
Module
- forward(z_igae, adj)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class IGAE(name, gae_n_enc_1, gae_n_enc_2, gae_n_enc_3, gae_n_dec_1, gae_n_dec_2, gae_n_dec_3, n_input, device='cuda')[source]
Bases:
Module
IGAE model.
- Parameters:
name (str) – Name of the GNN layer.
gae_n_enc_1 (int) – The number of neurons in the first layer of the encoder.
gae_n_enc_2 (int) – The number of neurons in the second layer of the encoder.
gae_n_enc_3 (int) – The number of neurons in the third layer of the encoder.
gae_n_dec_1 (int) – The number of neurons in the first layer of the decoder.
gae_n_dec_2 (int) – The number of neurons in the second layer of the decoder.
gae_n_dec_3 (int) – The number of neurons in the third layer of the decoder.
n_input (int) – The number of input features.
device (str) – Device.
- Returns:
Output features.
- Return type:
torch.Tensor
- forward(x, adj)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- pretrain(logger, data, cfg=None, flag='PRETRAIN IGAE')[source]
- Parameters:
logger (Logger) –
data (Data) –
cfg (CfgNode | None) –
flag (str) –
- training: bool
- class Readout(K)[source]
Bases:
Module
Readout layer.
- Parameters:
K (int) – Number of clusters.
- Returns:
Cluster-level embedding.
- Return type:
torch.Tensor
- forward(Z)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class DCRN(logger, cfg)[source]
Bases:
DGCModel
Deep Graph Clustering via Dual Correlation Reduction.
Reference: https://ojs.aaai.org/index.php/AAAI/article/view/20726
- Parameters:
logger (Logger) – Logger.
cfg (CN) – Configuration.
- Returns:
Output features.
- Return type:
torch.Tensor
- q_distribute(Z, Z_ae, Z_igae)[source]
Calculate the soft assignment distribution based on the embedding and the cluster centers.
- Parameters:
Z – Fusion node embedding.
Z_ae – Node embedding encoded by AE.
Z_igae – Node embedding encoded by IGAE.
- Returns:
The soft assignment distribution Q.
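A soft assignment between embeddings and cluster centers is commonly computed with a Student's t kernel, as in DEC. The NumPy sketch below shows that computation for a single embedding matrix; the kernel choice, the degrees-of-freedom parameter `v`, and the name `soft_assignment_sketch` are assumptions, and how `q_distribute` combines the three embeddings is not covered here.

```python
import numpy as np

def soft_assignment_sketch(z: np.ndarray, centers: np.ndarray, v: float = 1.0) -> np.ndarray:
    """Soft assignment of each embedding (row of z) to each cluster center."""
    # Squared Euclidean distance between every embedding and every center.
    diff = z[:, None, :] - centers[None, :, :]
    dist_sq = np.sum(diff ** 2, axis=2)
    # Student's t kernel with v degrees of freedom.
    q = (1.0 + dist_sq / v) ** (-(v + 1.0) / 2.0)
    # Normalize each row so it is a distribution over clusters.
    return q / q.sum(axis=1, keepdims=True)

z = np.array([[0.0, 0.0],
              [3.0, 4.0]])
centers = np.array([[0.0, 0.0],
                    [3.0, 4.0]])
q = soft_assignment_sketch(z, centers)
```

Each row of `q` sums to 1 and assigns the highest probability to the nearest center.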
- pretrain(data, cfg=None, flag='PRETRAIN AE_IGAE')[source]
- Parameters:
data (Data) –
cfg (CfgNode | None) –
flag (str) –
- train_model(data, cfg=None, flag='TRAIN DCRN')[source]
Model training function.
- Parameters:
data (Data) –
cfg (CfgNode | None) –
flag (str) –
- clustering(data, method='kmeans_gpu')[source]
Clustering function.
- Parameters:
data (Data) –
method (str) –
- Return type:
Tuple[Tensor, Tensor, Tensor]
- training: bool
pydgc.models.dfcn
- target_distribution(q)[source]
Target distribution.
- Parameters:
q (torch.Tensor) – Input tensor.
- Returns:
Target distribution.
- Return type:
torch.Tensor
- class AE_encoder(ae_n_enc_1, ae_n_enc_2, ae_n_enc_3, n_input, n_z, device='cuda')[source]
Bases:
Module
Autoencoder encoder.
- Parameters:
ae_n_enc_1 (int) – The number of neurons in the first layer of the encoder.
ae_n_enc_2 (int) – The number of neurons in the second layer of the encoder.
ae_n_enc_3 (int) – The number of neurons in the third layer of the encoder.
n_input (int) – The number of input features.
n_z (int) – The number of latent features.
device (str) – Device.
- Returns:
Output features.
- Return type:
torch.Tensor
- forward(x)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class AE_decoder(ae_n_dec_1, ae_n_dec_2, ae_n_dec_3, n_input, n_z, device='cuda')[source]
Bases:
Module
Autoencoder decoder.
- Parameters:
ae_n_dec_1 (int) – The number of neurons in the first layer of the decoder.
ae_n_dec_2 (int) – The number of neurons in the second layer of the decoder.
ae_n_dec_3 (int) – The number of neurons in the third layer of the decoder.
n_input (int) – The number of input features.
n_z (int) – The number of latent features.
device (str) – Device.
- Returns:
Output features.
- Return type:
torch.Tensor
- forward(z_ae)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class AE(ae_n_enc_1, ae_n_enc_2, ae_n_enc_3, ae_n_dec_1, ae_n_dec_2, ae_n_dec_3, n_input, n_z, device='cuda')[source]
Bases:
Module
Autoencoder.
- Parameters:
ae_n_enc_1 (int) – The number of neurons in the first layer of the encoder.
ae_n_enc_2 (int) – The number of neurons in the second layer of the encoder.
ae_n_enc_3 (int) – The number of neurons in the third layer of the encoder.
ae_n_dec_1 (int) – The number of neurons in the first layer of the decoder.
ae_n_dec_2 (int) – The number of neurons in the second layer of the decoder.
ae_n_dec_3 (int) – The number of neurons in the third layer of the decoder.
n_input (int) – The number of input features.
n_z (int) – The number of latent features.
device (str) – Device.
- Returns:
Output features.
- Return type:
torch.Tensor
- forward(x)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- pretrain(logger, data, cfg=None, flag='PRETRAIN AE')[source]
- Parameters:
data (Data) –
cfg (CfgNode | None) –
flag (str) –
- training: bool
- class GNNLayer(name, in_features, out_features, device='cuda')[source]
Bases:
Module
Graph neural network layer.
- Parameters:
name (str) – Name of the dataset.
in_features (int) – Number of input features.
out_features (int) – Number of output features.
device (str) – Device.
- Returns:
Output features.
- Return type:
torch.Tensor
- forward(features, adj, active=False)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class IGAE_encoder(name, gae_n_enc_1, gae_n_enc_2, gae_n_enc_3, n_input, device='cuda')[source]
Bases:
Module
IGAE encoder.
- Parameters:
name (str) – Name of the dataset.
gae_n_enc_1 (int) – The number of neurons in the first layer of the encoder.
gae_n_enc_2 (int) – The number of neurons in the second layer of the encoder.
gae_n_enc_3 (int) – The number of neurons in the third layer of the encoder.
n_input (int) – The number of input features.
device (str) – Device.
- Returns:
Output features.
- Return type:
torch.Tensor
- forward(x, adj)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class IGAE_decoder(name, gae_n_dec_1, gae_n_dec_2, gae_n_dec_3, n_input, device='cuda')[source]
Bases:
Module
IGAE decoder.
- Parameters:
name (str) – Name of the dataset.
gae_n_dec_1 (int) – The number of neurons in the first layer of the decoder.
gae_n_dec_2 (int) – The number of neurons in the second layer of the decoder.
gae_n_dec_3 (int) – The number of neurons in the third layer of the decoder.
n_input (int) – The number of input features.
device (str) – Device.
- Returns:
Output features.
- Return type:
torch.Tensor
- forward(z_igae, adj)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- class IGAE(name, gae_n_enc_1, gae_n_enc_2, gae_n_enc_3, gae_n_dec_1, gae_n_dec_2, gae_n_dec_3, n_input, device='cuda')[source]
Bases:
Module
IGAE model.
- Parameters:
name (str) – Name of the dataset.
gae_n_enc_1 (int) – The number of neurons in the first layer of the encoder.
gae_n_enc_2 (int) – The number of neurons in the second layer of the encoder.
gae_n_enc_3 (int) – The number of neurons in the third layer of the encoder.
gae_n_dec_1 (int) – The number of neurons in the first layer of the decoder.
gae_n_dec_2 (int) – The number of neurons in the second layer of the decoder.
gae_n_dec_3 (int) – The number of neurons in the third layer of the decoder.
n_input (int) – The number of input features.
device (str) – Device.
- Returns:
Output features.
- Return type:
torch.Tensor
- forward(x, adj)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- pretrain(logger, data, cfg=None, flag='PRETRAIN IGAE')[source]
- Parameters:
logger (Logger) –
data (Data) –
cfg (CfgNode | None) –
flag (str) –
- training: bool
- class DFCN(logger, cfg)[source]
Bases:
DGCModel
Deep Fusion Clustering Network.
Reference: https://ojs.aaai.org/index.php/AAAI/article/view/17198
- Parameters:
logger (Logger) – Logger.
cfg (CN) – Configuration.
- Returns:
Output features.
- Return type:
torch.Tensor
- pretrain(data, cfg=None, flag='PRETRAIN AE_IGAE')[source]
- Parameters:
data (Data) –
cfg (CfgNode | None) –
flag (str) –
- train_model(data, cfg=None, flag='TRAIN DFCN')[source]
Model training function.
- Parameters:
data (Data) –
cfg (CfgNode | None) –
flag (str) –
- training: bool
pydgc.models.dgc_model
- class DGCModel(logger, cfg)[source]
Bases:
Module, ABC
Deep Graph Clustering base model.
Subclasses implement the abstract methods reset_parameters, forward, loss, train_model, get_embedding, clustering, and evaluate.
- Parameters:
logger (Logger) – Logger.
cfg (CN) – Configuration.
- Returns:
Output features.
- Return type:
torch.Tensor
- abstract train_model(*args, **kwargs)[source]
Model training function.
- Return type:
Tuple[List, List, Tensor, Tensor, Dict]
- abstract clustering(*args, **kwargs)[source]
Clustering function.
- Return type:
Tuple[Tensor, Tensor, Tensor]
- training: bool
pydgc.models.dgcluster
- convert_scipy_torch_sp(sp_adj)[source]
Convert scipy sparse matrix to torch sparse matrix.
- Parameters:
sp_adj (scipy.sparse.csr_matrix) – Input sparse matrix.
- Returns:
Output sparse matrix.
- Return type:
torch.sparse_coo_tensor
- aux_objective(output, s, oh_labels)[source]
Auxiliary objective function.
- Parameters:
output (torch.Tensor) – Output tensor.
s (torch.Tensor) – Sample indices.
oh_labels (torch.Tensor) – One-hot labels.
- Returns:
Auxiliary objective loss.
- Return type:
torch.Tensor
- regularization(output, s)[source]
Regularization function.
- Parameters:
output (torch.Tensor) – Output tensor.
s (torch.Tensor) – Sample indices.
- Returns:
Regularization loss.
- Return type:
torch.Tensor
- class DGCLUSTER(logger, cfg)[source]
Bases:
DGCModel
DGCLUSTER: A Neural Framework for Attributed Graph Clustering via Modularity Maximization.
Reference: https://ojs.aaai.org/index.php/AAAI/article/view/28983
- Parameters:
logger (Logger) – Logger object.
cfg (CN) – Configuration object.
- training: bool
- train_model(data, cfg=None, flag='TRAIN DGCLUSTER')[source]
Model training function.
- Parameters:
data (Data) –
cfg (CfgNode | None) –
flag (str) –
pydgc.models.gae
- class GAE(logger, cfg)[source]
Bases:
DGCModel
Variational Graph Auto-Encoders.
Reference: https://arxiv.org/abs/1611.07308
- Parameters:
logger (Logger) – Logger object.
cfg (CN) – Configuration object.
- loss(edge_index, hat_adj)[source]
Model loss function.
- Parameters:
hat_adj (Tensor) –
- Return type:
Tensor
- train_model(data, cfg=None, flag='TRAIN GAE')[source]
Model training function.
- Parameters:
data (Data) –
cfg (CfgNode | None) –
flag (str) –
- clustering(data, method='kmeans_gpu')[source]
Clustering function.
- Parameters:
data (Data) –
method (str) –
- Return type:
Tuple[Tensor, Tensor, Tensor]
- training: bool
pydgc.models.gae_ssc
- class GAESSC(logger, cfg)[source]
Bases:
DGCModel
Graph-autoencoder with self-supervised clustering used in DEC.
- Parameters:
logger (Logger) – Logger object.
cfg (CN) – Configuration object.
- loss(edge_index, hat_adj, q)[source]
Model loss function.
- Parameters:
hat_adj (Tensor) –
q (Tensor) –
- Return type:
Tensor
- pretrain(data, cfg=None, flag='PRETRAIN GAE')[source]
- Parameters:
data (Data) –
cfg (CfgNode | None) –
flag (str) –
- train_model(data, cfg=None, flag='TRAIN GAE-SSC')[source]
Model training function.
- Parameters:
data (Data) –
cfg (CfgNode | None) –
flag (str) –
- training: bool
pydgc.models.hsan
- comprehensive_similarity(Z1, Z2, E1, E2, alpha)[source]
Comprehensive similarity function.
- Parameters:
Z1 (torch.Tensor) – Latent representation of the first view.
Z2 (torch.Tensor) – Latent representation of the second view.
E1 (torch.Tensor) – Latent representation of the first view.
E2 (torch.Tensor) – Latent representation of the second view.
alpha (float) – Weight of the similarity function.
- Returns:
Comprehensive similarity matrix.
- Return type:
torch.Tensor
- hard_sample_aware_infoNCE(S, M, pos_neg_weight, pos_weight, node_num)[source]
Hard sample aware InfoNCE loss function.
- Parameters:
S (torch.Tensor) – Comprehensive similarity matrix.
M (torch.Tensor) – Mask matrix.
pos_neg_weight (float) – Weight of the negative samples.
pos_weight (float) – Weight of the positive samples.
node_num (int) – Number of nodes.
- Returns:
InfoNCE loss.
- Return type:
torch.Tensor
- square_euclid_distance(Z, center)[source]
Square Euclidean distance function.
- Parameters:
Z (torch.Tensor) – Latent representation.
center (torch.Tensor) – Clustering centers.
- Returns:
Square Euclidean distance matrix.
- Return type:
torch.Tensor
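The squared Euclidean distance between every latent vector and every cluster center can be computed without explicit loops via the expansion ||z - c||^2 = ||z||^2 - 2 z·c + ||c||^2. The sketch below shows this in NumPy rather than torch; `square_euclid_distance_sketch` is a hypothetical name and the library function may be implemented differently.

```python
import numpy as np

def square_euclid_distance_sketch(z: np.ndarray, center: np.ndarray) -> np.ndarray:
    """Return an (n_samples, n_centers) matrix of squared Euclidean distances."""
    zz = np.sum(z ** 2, axis=1, keepdims=True)   # shape (n, 1)
    cc = np.sum(center ** 2, axis=1)[None, :]    # shape (1, k)
    zc = z @ center.T                            # shape (n, k)
    return zz - 2.0 * zc + cc

z = np.array([[0.0, 0.0],
              [3.0, 4.0]])
center = np.array([[0.0, 0.0],
                   [3.0, 0.0]])
dist = square_euclid_distance_sketch(z, center)
```

Entry (i, j) of `dist` is the squared distance from sample i to center j.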
- phi(embedding, cluster_num)[source]
Clustering function.
- Parameters:
embedding (torch.Tensor) – Latent representation.
cluster_num (int) – Number of clusters.
- Returns:
Clustering labels and clustering centers.
- Return type:
Tuple[torch.Tensor, torch.Tensor]
- class HSAN(logger, cfg)[source]
Bases:
DGCModel
Hard Sample Aware Network for Contrastive Deep Graph Clustering.
Reference: https://ojs.aaai.org/index.php/AAAI/article/view/26071
- Parameters:
logger (Logger) – Logger object.
cfg (CN) – Configuration object.
- train_model(data, cfg=None, flag='TRAIN HSAN')[source]
Model training function.
- Parameters:
data (Data) –
cfg (CfgNode | None) –
flag (str) –
- training: bool
pydgc.models.magi
- class Encoder(in_channels, hidden_channels, base_model=<class 'torch_geometric.nn.conv.gcn_conv.GCNConv'>, dropout=0.5, ns=0.5)[source]
Bases:
Module
Encoder for MAGI.
- Parameters:
in_channels (int) – Number of input channels.
hidden_channels (list) – List of hidden channels.
base_model (torch.nn.Module) – Base model for graph convolution.
dropout (float) – Dropout rate.
ns (float) – Negative slope for leaky ReLU.
- forward(x, edge_index=None, adjs=None, dropout=True)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this since the former takes care of running the registered hooks while the latter silently ignores them.
- Parameters:
x (Tensor) –
- training: bool
- class Loss(temperature=0.07, scale_by_temperature=True, scale_by_weight=False)[source]
Bases:
Module
Loss function for MAGI.
- Parameters:
temperature (float) – Temperature parameter for softmax.
scale_by_temperature (bool) – Whether to scale loss by temperature.
scale_by_weight (bool) – Whether to scale loss by weight.
- forward(out, mask)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the
Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- clustering(feature, n_clusters, true_labels, kmeans_device='cpu', batch_size=100000, tol=0.0001, device=device(type='cuda', index=0), spectral_clustering=False)[source]
Clustering function.
- Parameters:
feature (torch.Tensor) – Latent representation.
n_clusters (int) – Number of clusters.
true_labels (torch.Tensor) – True labels.
kmeans_device (str) – Device for kmeans.
batch_size (int) – Batch size.
tol (float) – Tolerance.
device (torch.device) – Device.
spectral_clustering (bool) – Whether to use spectral clustering.
- Returns:
Clustering labels; the clustering centers are returned as None.
- Return type:
torch.Tensor
- scale(z)[source]
Scale the latent representation.
- Parameters:
z (torch.Tensor) – Latent representation.
- Returns:
Scaled latent representation.
- Return type:
torch.Tensor
- class MAGI(logger, cfg)[source]
Bases:
DGCModel
Revisiting Modularity Maximization for Graph Clustering: A Contrastive Learning Perspective.
Reference: https://doi.org/10.1145/3637528.3671967
- Parameters:
logger (Logger) – Logger object.
cfg (CN) – Configuration object.
- train_model(data, cfg=None, flag='TRAIN MAGI')[source]
Model training function.
- Parameters:
data (Data) –
cfg (CfgNode | None) –
flag (str) –
- training: bool
pydgc.models.magi_batch
- class Encoder(in_channels, hidden_channels, base_model=<class 'torch_geometric.nn.conv.sage_conv.SAGEConv'>, dropout=0.5, ns=0.5)[source]
Bases:
Module
Encoder model for MAGI-Batch.
- Parameters:
in_channels (int) – Input feature dimension.
hidden_channels (list) – Hidden layer dimensions.
base_model (torch.nn.Module) – Base model for graph convolution.
dropout (float) – Dropout rate.
ns (float) – Negative slope for LeakyReLU.
- forward(x, edge_index=None, adjs=None, dropout=True)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the
Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- Parameters:
x (Tensor) –
- training: bool
- class Loss(temperature=0.07, scale_by_temperature=True, scale_by_weight=False)[source]
Bases:
Module
Loss function for MAGI-Batch.
- Parameters:
temperature (float) – Temperature parameter for softmax.
scale_by_temperature (bool) – Whether to scale the loss by temperature.
scale_by_weight (bool) – Whether to scale the loss by weight.
- forward(out, mask)[source]
Define the computation performed at every call.
Should be overridden by all subclasses.
Note
Although the recipe for the forward pass needs to be defined within this function, one should call the
Module instance afterwards instead of this, since the former takes care of running the registered hooks while the latter silently ignores them.
- training: bool
- clustering(feature, n_clusters, true_labels, kmeans_device='cpu', batch_size=100000, tol=0.0001, device=device(type='cuda', index=0), spectral_clustering=False)[source]
Clustering function.
- Parameters:
feature (torch.Tensor) – Latent representation.
n_clusters (int) – Number of clusters.
true_labels (torch.Tensor) – True labels.
kmeans_device (str) – Device for kmeans.
batch_size (int) – Batch size for kmeans.
tol (float) – Tolerance for kmeans.
device (torch.device) – Device for kmeans.
spectral_clustering (bool) – Whether to use spectral clustering.
- Returns:
Clustering labels; the clustering centers are returned as None.
- Return type:
torch.Tensor
- scale(z)[source]
Scale the latent representation.
- Parameters:
z (torch.Tensor) – Latent representation.
- Returns:
Scaled latent representation.
- Return type:
torch.Tensor
- class EdgeIndex(edge_index, e_id, size)[source]
Bases:
tuple
- Parameters:
edge_index (Tensor) –
e_id (Tensor | None) –
size (Tuple[int, int]) –
- edge_index: Tensor
Alias for field number 0
- e_id: Tensor | None
Alias for field number 1
- size: Tuple[int, int]
Alias for field number 2
- class Adj(adj_t, e_id, size)[source]
Bases:
tuple
- Parameters:
adj_t (SparseTensor) –
e_id (Tensor | None) –
size (Tuple[int, int]) –
- adj_t: SparseTensor
Alias for field number 0
- e_id: Tensor | None
Alias for field number 1
- size: Tuple[int, int]
Alias for field number 2
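The "Alias for field number N" entries above are the standard way named-tuple fields are documented: each field can be read by name or by position. The sketch below uses a hypothetical `EdgeIndexSketch` with plain lists standing in for tensors, so it runs without PyTorch; it mirrors the field order documented for `EdgeIndex` but is not the library's class.

```python
from typing import NamedTuple, Optional, Tuple


class EdgeIndexSketch(NamedTuple):
    """Hypothetical stand-in with the same three fields, in the same order."""

    edge_index: list           # stand-in for a Tensor
    e_id: Optional[list]       # stand-in for Tensor | None
    size: Tuple[int, int]


item = EdgeIndexSketch(edge_index=[[0, 1], [1, 2]], e_id=None, size=(3, 2))

# Field number 0, 1 and 2 are the same objects as the named attributes.
first_field = item[0]
# Named tuples also unpack positionally, like ordinary tuples.
edge_index, e_id, size = item
```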
- class NeighborSampler(edge_index, adj, sizes, is_train=False, wt=20, wl=4, drop_last=False, node_idx=None, num_nodes=None, return_e_id=True, transform=None, **kwargs)[source]
Bases:
DataLoader
Neighbor sampler for graph convolution.
This code is adapted from PyTorch Geometric (https://github.com/pyg-team/pytorch_geometric/blob/master/torch_geometric/loader/neighbor_sampler.py).
- drop_last: bool
- dataset: Dataset[T_co]
- batch_size: int | None
- num_workers: int
- pin_memory: bool
- timeout: float
- sampler: Sampler | Iterable
- pin_memory_device: str
- prefetch_factor: int | None
- get_mask(adj)[source]
Get mask for positive edges.
- Parameters:
adj (SparseTensor) – Adjacency matrix.
- Returns:
Masked adjacency matrix.
- Return type:
SparseTensor
- class MAGIBatch(logger, cfg)[source]
Bases:
DGCModel
Revisiting Modularity Maximization for Graph Clustering: A Contrastive Learning Perspective.
Reference: https://doi.org/10.1145/3637528.3671967
- Parameters:
logger (Logger) – Logger object.
cfg (CN) – Configuration object.
- train_model(data, cfg=None, flag='TRAIN MAGI')[source]
Model training function.
- Parameters:
data (Data) –
cfg (CfgNode | None) –
flag (str) –
- training: bool
pydgc.models.ns4gc
- mask_feat(X, mask_prob)[source]
Mask features with the given mask probability.
- Parameters:
X (torch.Tensor) – Feature matrix.
mask_prob (float) – Mask probability.
- Returns:
Masked feature matrix.
- Return type:
torch.Tensor
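One common way to mask features is to zero out whole feature columns, each chosen independently with probability `mask_prob`. The sketch below assumes that column-wise scheme; whether `mask_feat` masks columns or individual entries is not stated above, and `mask_feat_sketch` is a hypothetical name.

```python
import torch


def mask_feat_sketch(X: torch.Tensor, mask_prob: float) -> torch.Tensor:
    """Zero out each feature column independently with probability mask_prob."""
    # One uniform draw per column; values lie in [0, 1).
    drop = torch.rand(X.size(1), device=X.device) < mask_prob
    X_masked = X.clone()          # leave the input untouched
    X_masked[:, drop] = 0.0
    return X_masked


X = torch.arange(12, dtype=torch.float32).reshape(3, 4) + 1.0
X_none = mask_feat_sketch(X, 0.0)   # nothing is masked
X_all = mask_feat_sketch(X, 1.0)    # every column is masked
X_half = mask_feat_sketch(X, 0.5)   # random subset of columns is masked
```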
- drop_edge(A, drop_prob)[source]
Drop edges with the given drop probability.
- Parameters:
A (torch.sparse.Tensor) – Adjacency matrix.
drop_prob (float) – Drop probability.
- Returns:
Dropped adjacency matrix.
- Return type:
torch.sparse.Tensor
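Edge dropping on a sparse adjacency matrix can be sketched as an independent Bernoulli decision per stored entry. This is an assumption about the general technique, not a description of `drop_edge` itself; in particular, the sketch treats the two directions of an undirected edge independently, and `drop_edge_sketch` is a hypothetical name.

```python
import torch


def drop_edge_sketch(A: torch.Tensor, drop_prob: float) -> torch.Tensor:
    """Remove each stored entry of a sparse COO matrix with probability drop_prob."""
    A = A.coalesce()
    indices, values = A.indices(), A.values()
    # Keep an entry when its uniform draw in [0, 1) is at least drop_prob.
    keep = torch.rand(values.numel(), device=values.device) >= drop_prob
    return torch.sparse_coo_tensor(indices[:, keep], values[keep], A.shape).coalesce()


A = torch.tensor([[0.0, 1.0, 1.0],
                  [1.0, 0.0, 0.0],
                  [1.0, 0.0, 0.0]]).to_sparse()
A_keep_all = drop_edge_sketch(A, 0.0)
A_drop_all = drop_edge_sketch(A, 1.0)
```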
- add_self_loop(A)[source]
Add self-loops to the adjacency matrix.
- Parameters:
A (torch.sparse.Tensor) – Adjacency matrix.
- Returns:
Adjacency matrix with self-loops.
- Return type:
torch.sparse.Tensor
- normalize(A, add_self_loops=True, returnA=False)[source]
Normalize the graph's adjacency matrix in torch.sparse.Tensor format.
- Parameters:
A (torch.sparse.Tensor) – Adjacency matrix.
add_self_loops (bool) – Whether to add self loops.
returnA (bool) – Whether to return the original adjacency matrix.
- Returns:
Normalized adjacency matrix.
- Return type:
torch.sparse.Tensor
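The documentation above does not say which normalization is applied. A frequent choice for GCN-style models is the symmetric form D^{-1/2} (A + I) D^{-1/2}, and the sketch below assumes that form. `normalize_sketch` is a hypothetical name, the `returnA` option is not modeled, and the input is assumed to have no self-loops already.

```python
import torch


def normalize_sketch(A: torch.Tensor, add_self_loops: bool = True) -> torch.Tensor:
    """Symmetric normalization of a sparse COO adjacency matrix (assumed scheme)."""
    n = A.size(0)
    A = A.coalesce()
    indices, values = A.indices(), A.values()
    if add_self_loops:
        loop = torch.arange(n, device=indices.device)
        indices = torch.cat([indices, torch.stack([loop, loop])], dim=1)
        ones = torch.ones(n, dtype=values.dtype, device=values.device)
        values = torch.cat([values, ones])
    # Weighted degree of every node: sum of the entries in its row.
    deg = torch.zeros(n, dtype=values.dtype, device=values.device)
    deg.index_add_(0, indices[0], values)
    d_inv_sqrt = deg.pow(-0.5)
    d_inv_sqrt[torch.isinf(d_inv_sqrt)] = 0.0   # isolated nodes get weight 0
    row, col = indices
    norm_values = d_inv_sqrt[row] * values * d_inv_sqrt[col]
    return torch.sparse_coo_tensor(indices, norm_values, (n, n)).coalesce()


A = torch.tensor([[0.0, 1.0], [1.0, 0.0]]).to_sparse()
A_hat = normalize_sketch(A)
```

For this two-node graph every node has degree 2 after self-loops are added, so each entry of the normalized matrix is 1 / (sqrt(2) * sqrt(2)) = 0.5.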
- sparse_identity(dim, device)[source]
Create a sparse identity matrix.
- Parameters:
dim (int) – Dimension of the identity matrix.
device (torch.device) – Device to create the matrix on.
- Returns:
Sparse identity matrix.
- Return type:
torch.sparse.Tensor
- sparse_diag(V)[source]
Create a sparse diagonal matrix.
- Parameters:
V (torch.Tensor) – Diagonal values.
- Returns:
Sparse diagonal matrix.
- Return type:
torch.sparse.Tensor
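Sparse identity and diagonal matrices can both be built from a single list of diagonal coordinates. The following is a minimal sketch using `torch.sparse_coo_tensor`; the function names are hypothetical and the library's own construction may differ.

```python
import torch


def sparse_diag_sketch(V: torch.Tensor) -> torch.Tensor:
    """Sparse COO matrix with the 1-D tensor V on its diagonal."""
    n = V.numel()
    idx = torch.arange(n, device=V.device)
    indices = torch.stack([idx, idx])            # coordinates (i, i), shape (2, n)
    return torch.sparse_coo_tensor(indices, V, (n, n))


def sparse_identity_sketch(dim: int, device="cpu") -> torch.Tensor:
    """Sparse identity matrix, expressed as a diagonal of ones."""
    return sparse_diag_sketch(torch.ones(dim, device=device))


D = sparse_diag_sketch(torch.tensor([1.0, 2.0, 3.0]))
I = sparse_identity_sketch(3)
```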
- augment(A, X, edge_mask_rate, feat_drop_rate)[source]
Augment the graph and feature matrix.
- Parameters:
A (torch.sparse.Tensor) – Adjacency matrix.
X (torch.Tensor) – Feature matrix.
edge_mask_rate (float) – Edge mask rate.
feat_drop_rate (float) – Feature drop rate.
- Returns:
Augmented adjacency matrix and augmented feature matrix.
- Return type:
Tuple[torch.sparse.Tensor, torch.Tensor]
- class GCNConv(in_dim, out_dim, activation=None)[source]
Bases:
Module
Implementation of Graph Convolutional Network (GCN) layer.
- Parameters:
in_dim (int) – Input dimensionality of the layer.
out_dim (int) – Output dimensionality of the layer.
activation (callable, optional) – Activation function to use for the final representations. Defaults to None.
- forward(A_norm, X)[source]
Computes GCN representations according to input features and input graph.
- Parameters:
A_norm (torch.sparse.Tensor) – Normalized (n*n) sparse graph adjacency matrix.
X (torch.Tensor) – (n*in_dim) node feature matrix.
- Returns:
An (n*out_dim) node representation matrix.
- Return type:
torch.Tensor
- training: bool
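The shapes documented for `forward` above (an n*n normalized adjacency, an n*in_dim feature matrix, an n*out_dim output) match the usual GCN propagation rule, activation(A_norm · X · W). The sketch below assumes that rule and uses `nn.Linear` for the weight; `GCNConvSketch` is a hypothetical class, not the layer documented here.

```python
import torch
from torch import nn


class GCNConvSketch(nn.Module):
    """Hypothetical GCN layer: activation(A_norm @ (X W + b))."""

    def __init__(self, in_dim, out_dim, activation=None):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)
        self.activation = activation

    def forward(self, A_norm, X):
        # Transform features first, then aggregate over neighbors.
        H = torch.sparse.mm(A_norm, self.linear(X))
        return H if self.activation is None else self.activation(H)


n, in_dim, out_dim = 4, 3, 2
A_norm = torch.eye(n).to_sparse()      # trivial graph: self-loops only
X = torch.randn(n, in_dim)
layer = GCNConvSketch(in_dim, out_dim, activation=torch.relu)
H = layer(A_norm, X)                   # call the instance, not forward()
```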
- class NS4GC(logger, cfg)[source]
Bases:
DGCModel
Reliable Node Similarity Matrix Guided Contrastive Graph Clustering.
Reference: https://ieeexplore.ieee.org/abstract/document/10614738/
- Parameters:
logger (Logger) – Logger object.
cfg (CN) – Configuration object.
- train_model(data, cfg=None, flag='TRAIN NS4GC')[source]
Model training function.
- Parameters:
data (Data) –
cfg (CfgNode | None) –
flag (str) –
- Return type:
Tuple[List, List, Tensor, Tensor, Dict]
- clustering(data, method='kmeans_gpu')[source]
Clustering function.
- Parameters:
data (Data) –
method (str) –
- Return type:
Tuple[Tensor, Tensor, Tensor]
- training: bool
pydgc.models.sdcn
- class SDCN(logger, cfg)[source]
Bases:
DGCModel
Structural Deep Clustering Network.
Reference: https://doi.org/10.1145/3366423.3380214
- Parameters:
logger (Logger) – Logger object.
cfg (CN) – Configuration object.
- forward(data, sigma=0.5)[source]
Model forward pass.
- Parameters:
data (Data) –
sigma (float) –
- Return type:
Any
- pretrain(data, cfg=None, flag='PRETRAIN AE')[source]
- Parameters:
data (Data) –
cfg (CfgNode | None) –
flag (str) –
- train_model(data, cfg=None, flag='TRAIN SDCN')[source]
Model training function.
- Parameters:
data (Data) –
cfg (CfgNode | None) –
flag (str) –
- training: bool