pydgc.utils

pydgc.utils.command

parse_arguments(dataset_name='ACM', arg_config=None)[source]

Parse arguments.

Parameters:

dataset_name (str) – Dataset name.
arg_config (dict) – Custom arguments.

Returns:

Arguments.

Return type:

argparse.Namespace

pydgc.utils.config

validate_and_create_path(save_path)[source]

Check whether save_path is valid. If it includes a directory that is valid but does not yet exist, create that directory.

Parameters: save_path (str) – Save path.
Returns: True if save_path is valid, False otherwise.
Return type: bool
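
The check-then-create behaviour described for validate_and_create_path can be illustrated with the standard library alone. This is a minimal sketch under assumptions, not pydgc's implementation: the name validate_and_create_path_sketch and the rule that an empty string is invalid are hypothetical.

```python
import os
import tempfile


def validate_and_create_path_sketch(save_path):
    """Hypothetical stand-in: return True if save_path is usable.

    The parent directory is created when it does not exist yet.
    """
    if not isinstance(save_path, str) or not save_path.strip():
        # Assumption: empty or non-string paths count as invalid.
        return False
    directory = os.path.dirname(save_path)
    if not directory:
        # Bare filename: there is no directory to create.
        return True
    try:
        os.makedirs(directory, exist_ok=True)
    except OSError:
        # The directory part cannot be created (bad name, no permission, ...).
        return False
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "runs", "exp1", "config.yaml")
    created = validate_and_create_path_sketch(target)
    parent_exists = os.path.isdir(os.path.dirname(target))
```

Only the directory part is created; the file itself is left for the caller to write.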

default_cfg(dataset_name)[source]

Default configuration.

Parameters: dataset_name (str) – Dataset name.
Returns: Default configuration.
Return type: CN

yaml_to_cfg(yaml_data)[source]

Transform YAML into CfgNode.

Parameters: yaml_data (dict) – Data loaded from YAML.
Returns: Transformed CfgNode.
Return type: CN

dump_cfg(cfg, save_path=None)[source]

Record the configuration of the experiment.

Parameters:

cfg (CN) – Configuration.
save_path (str, optional) – Save path. Defaults to None.

load_dataset_specific_cfg(cfg_file_path, dataset_name)[source]

Load the config for the specified dataset.

Parameters:

cfg_file_path (str) – Path of config file.
dataset_name (str) – Name of specific dataset.

Returns:

Config of specific dataset.

Return type:

CN

check_required_cfg(cfg, dataset_name, auto_complete=True)[source]

Check required config items.

Parameters:

cfg (CN) – Configuration.
dataset_name (str) – Name of specific dataset.
auto_complete (bool, optional) – Whether to auto-complete missing config items. Defaults to True.

Returns:

True if all required config items are present, False otherwise.

Return type:

bool

generate_default_cfg(datasets, save_path=None)[source]

Generate default config.

Parameters:

datasets (str or list) – Name(s) of dataset(s).
save_path (str, optional) – Save path. Defaults to None.

Returns:

Default config.

Return type:

CN

pydgc.utils.device

@Reference: https://github.com/snap-stanford/GraphGym/blob/master/graphgym/utils/device.py

count_parameters(model)[source]

Count the number of parameters in the input model.

Note: The return value is expressed in millions (M) if the count exceeds 1,000,000.

Parameters: model (torch.nn.Module) – The model whose parameters are counted.
Returns: The number of model parameters, in millions (M).
Return type: float

get_gpu_memory_map()[source]

Get the current memory usage of all GPUs.

Returns: The current GPU memory usage.
Return type: np.ndarray

get_current_gpu_usage(gpu_mem, device)[source]

Get the current GPU memory usage.

Parameters:

gpu_mem (np.ndarray) – The current GPU memory usage.
device (str) – The device.

Returns:

The current GPU memory usage.

Return type:

int

auto_select_device(logger, cfg, memory_max=8000, memory_bias=200, strategy='random')[source]

Automatically select a device for the experiment. Useful when multiple GPUs are available.

Parameters:

logger – Logger.
cfg (CN) – Config.
memory_max (int, optional) – Threshold of existing GPU memory usage. GPUs with memory usage beyond this threshold will be deprioritized. Defaults to 8000.
memory_bias (int, optional) – A bias added to the memory usage of every GPU to avoid a division-by-zero error. Defaults to 200.
strategy (str, optional) – ‘random’ (select a GPU at random) or ‘greedy’ (select a GPU greedily). Defaults to ‘random’.

Returns:

Config.

Return type:

CN
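
The interplay of memory_max, memory_bias and strategy can be sketched in plain Python. This is an illustration only: pick_gpu is a hypothetical helper and is not part of pydgc. It assumes that ‘random’ samples a GPU with probability inversely proportional to its biased memory usage, and it models "deprioritized" as a zero sampling weight. It also assumes that ‘greedy’ takes the GPU with the least memory in use. None of these details are confirmed by this page.

```python
import random


def pick_gpu(memory_raw, memory_max=8000, memory_bias=200, strategy="random", rng=None):
    """Hypothetical helper: return the index of the GPU to use.

    memory_raw lists the memory currently in use on each GPU.
    """
    rng = rng or random.Random()
    all_busy = all(m > memory_max for m in memory_raw)
    if strategy == "greedy" or all_busy:
        # Take the GPU with the least memory in use.
        return min(range(len(memory_raw)), key=lambda i: memory_raw[i])
    if strategy == "random":
        # Weight each GPU by the inverse of its biased usage. The bias keeps
        # an idle GPU (usage 0) from causing a division by zero.
        weights = [
            0.0 if m > memory_max else 1.0 / (m + memory_bias)
            for m in memory_raw
        ]
        return rng.choices(range(len(memory_raw)), weights=weights, k=1)[0]
    raise ValueError(f"Unknown strategy: {strategy!r}")


usage = [0, 9500, 3000]
greedy_choice = pick_gpu(usage, strategy="greedy")
random_choice = pick_gpu(usage, strategy="random", rng=random.Random(0))
```

With these numbers the idle GPU 0 is always the greedy choice, while random sampling can land on GPU 0 or GPU 2 but never on GPU 1, whose usage exceeds memory_max.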

pydgc.utils.logger

get_formatted_time()[source]

Get formatted time.

Returns: Formatted time in the format ‘YYYY-MM-DD HH-MM-SS’.
Return type: str
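
As an illustration of the ‘YYYY-MM-DD HH-MM-SS’ format, the following sketch produces the same layout with datetime.strftime. The function name formatted_time_sketch and its optional now argument are hypothetical and exist only to make the example reproducible.

```python
from datetime import datetime


def formatted_time_sketch(now=None):
    """Hypothetical stand-in: format a timestamp as 'YYYY-MM-DD HH-MM-SS'."""
    if now is None:
        now = datetime.now()
    # Hyphens replace the usual colons in the time part.
    return now.strftime("%Y-%m-%d %H-%M-%S")


stamp = formatted_time_sketch(datetime(2024, 1, 31, 9, 5, 7))
print(stamp)  # 2024-01-31 09-05-07
```

A colon-free timestamp like this can be used in file names on any common file system.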

create_logger(logger_name, log_mode='both', log_file_path=None, encoding='utf-8')[source]

Create logger.

Parameters:

logger_name (str) – Name of the logger.
log_mode (str, optional) – Output mode. Options: ‘file’, ‘stdout’, ‘both’. Defaults to ‘both’.
log_file_path (str, optional) – Path of the log file. Required when output goes to a file. Defaults to None.
encoding (str, optional) – Encoding of the log file. Defaults to ‘utf-8’.

Returns:

Logger.

Return type:

Logger
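
The three log_mode options can be pictured with the standard logging module. This is a hedged sketch, not pydgc's code: create_logger_sketch is a hypothetical name, it returns a plain logging.Logger rather than the Logger class documented here, and the message format is an assumption.

```python
import logging
import os
import sys
import tempfile


def create_logger_sketch(logger_name, log_mode="both", log_file_path=None, encoding="utf-8"):
    """Hypothetical stand-in built on the standard library."""
    if log_mode not in ("file", "stdout", "both"):
        raise ValueError(f"Unsupported log_mode: {log_mode!r}")
    if log_mode in ("file", "both") and log_file_path is None:
        raise ValueError("log_file_path is required when logging to a file")

    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()  # avoid duplicate output on repeated calls
    formatter = logging.Formatter("%(asctime)s - %(levelname)s - %(message)s")

    if log_mode in ("stdout", "both"):
        stream_handler = logging.StreamHandler(sys.stdout)
        stream_handler.setFormatter(formatter)
        logger.addHandler(stream_handler)
    if log_mode in ("file", "both"):
        file_handler = logging.FileHandler(log_file_path, encoding=encoding)
        file_handler.setFormatter(formatter)
        logger.addHandler(file_handler)
    return logger


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "train.log")
    demo_logger = create_logger_sketch("demo", log_mode="both", log_file_path=log_path)
    demo_logger.info("epoch 1 finished")
    # Close the handlers so the file can be read back and removed cleanly.
    for handler in list(demo_logger.handlers):
        handler.close()
        demo_logger.removeHandler(handler)
    with open(log_path, encoding="utf-8") as fh:
        log_text = fh.read()
```

In ‘both’ mode the same record reaches the console and the file, which is why log_file_path becomes mandatory as soon as a file is involved.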

class Logger(name)[source]

Bases: object

Logger.

Parameters: name (str) – Name of logger.

info(message)[source]

Info level log.

Parameters: message (str) – Log message.

error(message)[source]

Error level log.

Parameters: message (str) – Log message.

debug(message)[source]

Debug level log.

Parameters: message (str) – Log message.

warning(message)[source]

Warning level log.

Parameters: message (str) – Log message.

flag(message)[source]

Print a flag line that separates the output above it from the output below it.

Parameters: message (str) – Log message.

static table(results_dir, dataset_name, results_dict, decimal=4)[source]

Create table.

Parameters:

results_dir (str) – Results directory.
dataset_name (str) – Dataset name.
results_dict (dict) – Results dictionary.
decimal (int, optional) – Number of decimal places. Defaults to 4.

loss(epoch, loss, decimal=6)[source]

Log the loss value for an epoch.

Parameters:

epoch (int) – Epoch.
loss (float) – Loss.
decimal (int, optional) – Number of decimal places. Defaults to 6.

model_info(model)[source]

Log model information.

Parameters: model (nn.Module) – Model.

pydgc.utils.random

setup_seed(seed)[source]

Fix the random seed.

Parameters: seed (int) – The random seed.
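
Fixing the seed usually means seeding every random number generator in use. The sketch below is an assumption about what such a helper typically covers, not a description of pydgc's setup_seed: the name setup_seed_sketch is hypothetical, and NumPy and PyTorch are seeded only if they are installed.

```python
import random


def setup_seed_sketch(seed):
    """Hypothetical stand-in: seed the common random number generators."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass  # PyTorch not installed; nothing to seed


setup_seed_sketch(42)
first_draw = [random.random() for _ in range(3)]
setup_seed_sketch(42)
second_draw = [random.random() for _ in range(3)]
```

Re-seeding with the same value reproduces the same sequence, which is the property a fixed seed is meant to give repeated experiment runs.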

pydgc.utils.transform

get_M(adj, t=2)[source]

Calculate the matrix M using the equation $M = (B^1 + B^2 + \dots + B^t) / t$.

Parameters:

adj (torch.Tensor) – The adjacency matrix.
t (int, optional) – The highest power of B in the sum. Defaults to 2.

Returns:

The matrix M.

Return type:

torch.Tensor
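
A small NumPy sketch makes the averaging of matrix powers concrete. Two points are assumptions rather than facts from this page: B is taken to be the row-normalized transition matrix derived from adj, and NumPy stands in for torch. The name get_m_sketch is hypothetical.

```python
import numpy as np


def get_m_sketch(adj, t=2):
    """Hypothetical stand-in: average the first t powers of B.

    Assumption: B is adj with each row divided by its sum.
    """
    adj = np.asarray(adj, dtype=float)
    degree = adj.sum(axis=1, keepdims=True)
    degree[degree == 0] = 1.0  # isolated nodes keep an all-zero row
    b = adj / degree

    power = np.eye(adj.shape[0])
    total = np.zeros_like(b)
    for _ in range(t):
        power = power @ b  # B^1, B^2, ..., B^t in turn
        total += power
    return total / t


# Path graph 0 - 1 - 2.
path_adj = np.array([[0, 1, 0],
                     [1, 0, 1],
                     [0, 1, 0]])
m = get_m_sketch(path_adj, t=2)
```

For this path graph every row of M equals (0.25, 0.5, 0.25), so nodes 0 and 2 become related through their shared neighbor even though they are not directly connected.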

target_distribution(q)[source]

Target distribution.

Parameters: q (torch.Tensor) – The input tensor.
Returns: The target distribution.
Return type: torch.Tensor
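
This page does not state the formula. A common choice in deep clustering is the sharpened target from Deep Embedded Clustering, $p_{ij} = (q_{ij}^2 / f_j) / \sum_{j'} (q_{ij'}^2 / f_{j'})$ with $f_j = \sum_i q_{ij}$. The sketch below assumes that definition and uses NumPy in place of torch. The name target_distribution_sketch is hypothetical.

```python
import numpy as np


def target_distribution_sketch(q):
    """Hypothetical stand-in: DEC-style sharpening of soft assignments.

    q has shape (n_samples, n_clusters) and each row sums to 1.
    """
    q = np.asarray(q, dtype=float)
    cluster_frequency = q.sum(axis=0)        # f_j
    weight = q ** 2 / cluster_frequency      # q_ij^2 / f_j
    return weight / weight.sum(axis=1, keepdims=True)


q = np.array([[0.6, 0.4],
              [0.3, 0.7]])
p = target_distribution_sketch(q)
```

Squaring pushes each row toward its dominant cluster, and dividing by the cluster frequency stops large clusters from dominating.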

diffusion_adj(adj, mode='ppr', transport_rate=0.2)[source]

Graph diffusion.

Parameters:

adj (torch.Tensor) – The adjacency matrix.
mode (str, optional) – The mode of graph diffusion. Defaults to “ppr”.
transport_rate (float, optional) – The transport rate. Defaults to 0.2.

Returns:

The diffused adjacency matrix.

Return type:

torch.Tensor
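
For the ‘ppr’ mode, a frequently used closed form for personalized PageRank diffusion is $S = \alpha (I - (1 - \alpha) \hat{A})^{-1}$, where $\alpha$ is the transport rate and $\hat{A}$ is a normalized adjacency matrix. The sketch below assumes this form, assumes that self-loops are added and assumes symmetric normalization. None of this is confirmed by this page. NumPy stands in for torch, and ppr_diffusion_sketch is a hypothetical name.

```python
import numpy as np


def ppr_diffusion_sketch(adj, transport_rate=0.2):
    """Hypothetical stand-in: closed-form PPR diffusion of a dense graph."""
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    adj_loop = adj + np.eye(n)                        # assumption: add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(adj_loop.sum(axis=1))  # degrees are >= 1 here
    norm_adj = d_inv_sqrt[:, None] * adj_loop * d_inv_sqrt[None, :]
    # S = alpha * (I - (1 - alpha) * A_hat)^(-1)
    return transport_rate * np.linalg.inv(np.eye(n) - (1.0 - transport_rate) * norm_adj)


path_adj = np.array([[0, 1, 0],
                     [1, 0, 1],
                     [0, 1, 0]])
s = ppr_diffusion_sketch(path_adj, transport_rate=0.2)
```

The result is dense: nodes 0 and 2, which share no edge, still receive a positive weight because diffusion accumulates contributions from paths of every length.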

add_gaussian_noise(x, mean=0, std_dev=0.1)[source]

Add Gaussian noise to x.

Parameters:

x (torch.Tensor) – The input tensor.
mean (int, optional) – The mean of the Gaussian noise. Defaults to 0.
std_dev (float, optional) – The standard deviation of the Gaussian noise. Defaults to 0.1.

Returns:

The tensor with Gaussian noise added.

Return type:

torch.Tensor
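
Additive Gaussian noise amounts to drawing one sample per element and adding it to the input. The following NumPy sketch shows the idea. It is not pydgc's code: add_gaussian_noise_sketch and its seed argument are hypothetical, and NumPy replaces torch.

```python
import numpy as np


def add_gaussian_noise_sketch(x, mean=0.0, std_dev=0.1, seed=None):
    """Hypothetical stand-in: return x plus element-wise N(mean, std_dev^2) noise."""
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    noise = rng.normal(loc=mean, scale=std_dev, size=x.shape)
    return x + noise


features = np.ones((4, 3))
noisy = add_gaussian_noise_sketch(features, mean=0.0, std_dev=0.1, seed=0)
shifted = add_gaussian_noise_sketch(features, mean=1.0, std_dev=0.0)
```

With std_dev set to 0 the noise collapses to a constant shift by mean, which is a quick way to check the parameter roles.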

perturb_data(data, cfg)[source]

Perturb the data.

Parameters:

data (Data) – The input data.
cfg (CN) – The configuration.

Returns:

The perturbed data.

Return type:

Data

sparse_mx_to_torch_sparse_tensor(sparse_mx)[source]

Convert a scipy sparse matrix to a torch sparse tensor.

Parameters: sparse_mx (scipy.sparse.csr_matrix) – The input scipy sparse matrix.
Returns: The torch sparse tensor.
Return type: torch.Tensor (sparse COO layout)

normalize_adj_torch(adj, symmetry=True)[source]

Normalize the adjacency matrix.

Parameters:

adj (torch.Tensor) – The input adjacency matrix.
symmetry (bool, optional) – Whether to apply symmetric normalization. Defaults to True.

Returns:

The normalized adjacency matrix.

Return type:

torch.Tensor
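
The two normalization variants usually meant by such a flag are the symmetric form $D^{-1/2} A D^{-1/2}$ and the row-wise form $D^{-1} A$. The sketch below assumes those two definitions and adds no self-loops. Whether normalize_adj_torch behaves the same way is not stated on this page. NumPy replaces torch, and normalize_adj_sketch is a hypothetical name.

```python
import numpy as np


def normalize_adj_sketch(adj, symmetry=True):
    """Hypothetical stand-in: normalize a dense adjacency matrix by node degree."""
    adj = np.asarray(adj, dtype=float)
    degree = adj.sum(axis=1)
    safe_degree = np.where(degree > 0, degree, 1.0)  # avoid dividing by zero
    if symmetry:
        # D^{-1/2} A D^{-1/2}: keeps a symmetric matrix symmetric.
        d_inv_sqrt = np.where(degree > 0, 1.0 / np.sqrt(safe_degree), 0.0)
        return d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]
    # D^{-1} A: every non-isolated row sums to 1.
    d_inv = np.where(degree > 0, 1.0 / safe_degree, 0.0)
    return d_inv[:, None] * adj


star_adj = np.array([[0, 1, 1],
                     [1, 0, 0],
                     [1, 0, 0]])
sym_norm = normalize_adj_sketch(star_adj, symmetry=True)
row_norm = normalize_adj_sketch(star_adj, symmetry=False)
```

The symmetric form keeps the matrix symmetric, while the row-wise form turns it into a random-walk transition matrix.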

pydgc.utils.visualization

class DGCVisual(save_path='.', save_format='png', font_family='sans-serif', font_size=20)[source]

Bases: object

A class for visualizing data.

Parameters:

save_path (str, optional) – The path to save the images. Defaults to ‘.’.
save_format (str, optional) – The format of the images. Defaults to ‘png’.
font_family (str or list, optional) – The font family. Defaults to ‘sans-serif’.
font_size (int, optional) – The font size. Defaults to 20.

static check_save_format(save_format)[source]

Check if the save format is supported.

Parameters: save_format (str) – The save format, e.g., ‘png’, ‘pdf’, ‘jpg’, ‘jpeg’, ‘bmp’, ‘tiff’, ‘gif’, ‘svg’, ‘eps’.
Raises: ValueError – If the save format is not supported.
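
A format check of this kind is typically a membership test against a fixed set. The sketch below uses the formats listed as examples for save_format. The real supported set may differ, and check_save_format_sketch is a hypothetical name.

```python
# Assumption: the example formats from the parameter description are the full set.
SUPPORTED_FORMATS = {"png", "pdf", "jpg", "jpeg", "bmp", "tiff", "gif", "svg", "eps"}


def check_save_format_sketch(save_format):
    """Hypothetical stand-in: raise ValueError for an unsupported format."""
    if save_format not in SUPPORTED_FORMATS:
        raise ValueError(
            f"Unsupported save format {save_format!r}; "
            f"expected one of {sorted(SUPPORTED_FORMATS)}"
        )


check_save_format_sketch("png")  # passes silently

try:
    check_save_format_sketch("docx")
    rejected = False
except ValueError:
    rejected = True
```

Validating the format before any figure is rendered means a typo in the format fails early.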

plot_clustering(data, labels, method='tsne', palette='viridis', fig_size=(10, 8), filename='tsne_plot', show_axis=False, legend=False, dpi=300, random_state=42)[source]

Plot the clustering results after t-SNE or UMAP dimensionality reduction.

Parameters:

data (np.ndarray) – The input data, shape (n_samples, n_features).
labels (np.ndarray) – The data labels.
method (str, optional) – The dimensionality reduction method, ‘tsne’ or ‘umap’. Defaults to ‘tsne’.
palette (str, optional) – The color palette. Defaults to “viridis”.
fig_size (Tuple[int, int], optional) – The figure size. Defaults to (10, 8).
filename (str, optional) – The filename to save the plot. Defaults to “tsne_plot”.
show_axis (bool, optional) – Whether to show the axis. Defaults to False.
legend (bool, optional) – Whether to show the legend. Defaults to False.
dpi (int, optional) – The DPI of the plot. Defaults to 300.
random_state (int, optional) – The random state. Defaults to 42.

plot_heatmap(data, labels, method='inner_product', color_map='YlGnBu', fig_size=(8, 8), filename='heatmap_plot', show_color_bar=False, show_axis=False, dpi=300)[source]

Plot the heatmap of the data.

Parameters:

data (np.ndarray) – The input data, shape (n_samples, n_features).
labels (np.ndarray) – The data labels.
method (str, optional) – The similarity method, ‘cosine’ or ‘euclidean’ or ‘inner_product’. Defaults to ‘inner_product’.
color_map (str, optional) – The color map. Defaults to “YlGnBu”.
fig_size (Tuple[int, int], optional) – The figure size. Defaults to (8, 8).
filename (str, optional) – The filename to save the plot. Defaults to “heatmap_plot”.
show_color_bar (bool, optional) – Whether to show the color bar. Defaults to False.
show_axis (bool, optional) – Whether to show the axis. Defaults to False.
dpi (int, optional) – The DPI of the plot. Defaults to 300.
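
The three method options correspond to standard pairwise measures between samples. The sketch below computes each one with NumPy. It covers only the matrix computation, not the plotting, and similarity_matrix_sketch is a hypothetical helper. For ‘euclidean’ it returns plain distances, since this page does not say whether or how plot_heatmap converts them into similarities.

```python
import numpy as np


def similarity_matrix_sketch(data, method="inner_product"):
    """Hypothetical helper: pairwise (n_samples, n_samples) matrix for a heatmap."""
    data = np.asarray(data, dtype=float)
    if method == "inner_product":
        return data @ data.T
    if method == "cosine":
        norms = np.linalg.norm(data, axis=1, keepdims=True)
        unit = data / np.clip(norms, 1e-12, None)  # guard against zero vectors
        return unit @ unit.T
    if method == "euclidean":
        diff = data[:, None, :] - data[None, :, :]
        return np.linalg.norm(diff, axis=-1)       # distances, not similarities
    raise ValueError(f"Unknown method: {method!r}")


samples = np.array([[1.0, 0.0],
                    [0.0, 2.0],
                    [3.0, 0.0]])
inner = similarity_matrix_sketch(samples, "inner_product")
cosine = similarity_matrix_sketch(samples, "cosine")
distance = similarity_matrix_sketch(samples, "euclidean")
```

Samples 0 and 2 point in the same direction, so their cosine similarity is 1 even though their inner product (3) and Euclidean distance (2) reflect the difference in length.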

plot_loss(losses, metrics=None, metrics_name=None, fig_size=(3.149606299212598, 2.3622047244094486), marker='o', line_style='-', color='blue', line_width=2, title=None, dpi=300, filename='loss_curve_plot')[source]

Plot the loss curve and, if metrics are provided, the metrics curve.

Parameters:

losses (list) – The loss values.
metrics (list, optional) – The metrics values. Defaults to None.
metrics_name (str, optional) – The metrics name. Defaults to None.
fig_size (Tuple[float, float], optional) – The figure size. Defaults to (8/2.54, 6/2.54), approximately (3.15, 2.36).
marker (str, optional) – The marker style. Defaults to ‘o’.
line_style (str, optional) – The line style. Defaults to ‘-’.
color (str, optional) – The line color. Defaults to ‘blue’.
line_width (int, optional) – The line width. Defaults to 2.
title (str, optional) – The title. Defaults to None.
dpi (int, optional) – The DPI. Defaults to 300.
filename (str, optional) – The filename. Defaults to “loss_curve_plot”.