pydgc.utils
pydgc.utils.command
pydgc.utils.config
- validate_and_create_path(save_path)
Check whether save_path is valid. If it includes a directory that is valid but does not exist yet, create that directory.
- Parameters:
save_path (str) – Save path.
- Returns:
True if save_path is valid, False otherwise.
- Return type:
bool
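The behaviour described above can be sketched with the standard library alone. This is a minimal illustration, not the pydgc implementation: the name validate_and_create_path_sketch and the exact validity rules (a non-empty string whose directory part can be created) are assumptions.

```python
import os
import tempfile


def validate_and_create_path_sketch(save_path):
    """Hypothetical stand-in: True if save_path is usable, creating its directory if needed."""
    if not isinstance(save_path, str) or not save_path.strip():
        return False  # assumed rule: empty or non-string paths are invalid
    directory = os.path.dirname(save_path)
    if directory and not os.path.exists(directory):
        try:
            os.makedirs(directory, exist_ok=True)  # create the missing directory
        except OSError:
            return False  # the directory part could not be created
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "results", "run1", "cfg.yaml")
    created = validate_and_create_path_sketch(target)
    directory_exists = os.path.isdir(os.path.dirname(target))
```

Only the directory is created; the file itself is left for the caller to write.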
- default_cfg(dataset_name)
Return the default configuration for a dataset.
- Parameters:
dataset_name (str) – Dataset name.
- Returns:
Default configuration.
- Return type:
CN
- yaml_to_cfg(yaml_data)
Transform loaded YAML data into a CfgNode.
- Parameters:
yaml_data (dict) – Data loaded from a YAML file.
- Returns:
Transformed CfgNode.
- Return type:
CN
- dump_cfg(cfg, save_path=None)
Record the configuration of this experiment.
- Parameters:
cfg (CN) – Configuration.
save_path (str, optional) – Save path. Defaults to None.
- load_dataset_specific_cfg(cfg_file_path, dataset_name)
Load the config for the specified dataset.
- Parameters:
cfg_file_path (str) – Path of config file.
dataset_name (str) – Name of specific dataset.
- Returns:
Config of specific dataset.
- Return type:
CN
- check_required_cfg(cfg, dataset_name, auto_complete=True)
Check required config items.
- Parameters:
cfg (CN) – Configuration.
dataset_name (str) – Name of specific dataset.
auto_complete (bool, optional) – Whether to auto-complete missing config items. Defaults to True.
- Returns:
True if all required config items are present, False otherwise.
- Return type:
bool
pydgc.utils.device
@Reference: https://github.com/snap-stanford/GraphGym/blob/master/graphgym/utils/device.py
- count_parameters(model)
Count the number of parameters in the input model.
Note: The return value is expressed in millions (M) if it exceeds 1,000,000.
- Parameters:
model (torch.nn.Module) – The model instance you want to count.
- Returns:
The number of model parameters, in millions (M).
- Return type:
float
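The unit rule in the note can be illustrated without torch. This sketch assumes the count is returned unchanged below the threshold, which the note implies but does not state outright; to_millions_if_large and the layer shapes are made up for the example.

```python
import math


def to_millions_if_large(num_params):
    """Hypothetical helper: express a raw count in millions once it exceeds 1,000,000."""
    if num_params > 1_000_000:
        return num_params / 1_000_000
    return float(num_params)


# Made-up weight and bias shapes for a small two-layer model.
layer_shapes = [(3703, 512), (512,), (512, 7), (7,)]
total = sum(math.prod(shape) for shape in layer_shapes)
in_millions = to_millions_if_large(total)
```

With a real torch.nn.Module, the raw count is typically obtained as sum(p.numel() for p in model.parameters()).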
- get_gpu_memory_map()
Get the current GPU memory usage.
- Returns:
The current GPU memory usage.
- Return type:
np.ndarray
- get_current_gpu_usage(gpu_mem, device)
Get the current GPU memory usage.
- Parameters:
gpu_mem (np.ndarray) – The current GPU memory usage.
device (str) – The device.
- Returns:
The current GPU memory usage.
- Return type:
int
- auto_select_device(logger, cfg, memory_max=8000, memory_bias=200, strategy='random')
Automatically select a device for the experiment. Useful when multiple GPUs are available.
- Parameters:
logger – Logger.
cfg (CN) – Config.
memory_max (int, optional) – Threshold of existing GPU memory usage. GPUs with memory usage beyond this threshold will be deprioritized. Defaults to 8000.
memory_bias (int, optional) – A bias added to the memory usage of every GPU to avoid a division-by-zero error. Defaults to 200.
strategy (str, optional) – ‘random’ (randomly select a GPU) or ‘greedy’ (greedily select a GPU). Defaults to ‘random’.
- Returns:
Config.
- Return type:
CN
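How memory_max, memory_bias and strategy might interact is sketched here. The rule is modelled on the GraphGym file referenced at the top of this module and is an assumption about pydgc, not its confirmed behaviour; pick_gpu_sketch and its rng argument are hypothetical.

```python
import random


def pick_gpu_sketch(memory_used, memory_max=8000, memory_bias=200,
                    strategy="random", rng=None):
    """Hypothetical selection rule; returns the index of the chosen GPU."""
    rng = rng or random.Random()
    all_busy = all(m > memory_max for m in memory_used)
    if strategy == "greedy" or all_busy:
        # Take the GPU with the least memory currently in use.
        return min(range(len(memory_used)), key=lambda i: memory_used[i])
    if strategy != "random":
        raise ValueError(f"unknown strategy: {strategy!r}")
    # Weight each GPU by the inverse of its biased usage. The bias keeps the
    # denominator non-zero for an idle GPU; GPUs over memory_max get weight 0.
    weights = [
        0.0 if m > memory_max else 1.0 / (m + memory_bias)
        for m in memory_used
    ]
    return rng.choices(range(len(memory_used)), weights=weights, k=1)[0]


greedy_choice = pick_gpu_sketch([5000, 100, 9000], strategy="greedy")
random_choice = pick_gpu_sketch([0, 9000, 9500], rng=random.Random(0))
```

In this sketch the random strategy still favours lightly loaded GPUs, but it spreads concurrent jobs across devices instead of sending them all to the same one.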
pydgc.utils.logger
- get_formatted_time()
Get the current time as a formatted string.
- Returns:
Formatted time in the format of ‘YYYY-MM-DD HH-MM-SS’.
- Return type:
str
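A string in the documented ‘YYYY-MM-DD HH-MM-SS’ layout can be produced with time.strftime. This is a sketch of the format only; formatted_time_sketch is a hypothetical name, and using local time is an assumption.

```python
import re
import time


def formatted_time_sketch():
    """Hypothetical stand-in: local time as 'YYYY-MM-DD HH-MM-SS'."""
    return time.strftime("%Y-%m-%d %H-%M-%S", time.localtime())


stamp = formatted_time_sketch()
# Check the layout with a regular expression instead of a fixed value.
matches_layout = re.fullmatch(r"\d{4}-\d{2}-\d{2} \d{2}-\d{2}-\d{2}", stamp) is not None
```

Separating the time fields with hyphens instead of colons keeps the string usable in file names on Windows.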
- create_logger(logger_name, log_mode='both', log_file_path=None, encoding='utf-8')
Create a logger.
- Parameters:
logger_name (str) – Name of the logger.
log_mode (str, optional) – Output mode. Options: ‘file’, ‘stdout’, or ‘both’. Defaults to ‘both’.
log_file_path (str, optional) – Path of the log file. Must be specified when output is written to a file. Defaults to None.
encoding (str, optional) – Encoding of the log file. Defaults to ‘utf-8’.
- Returns:
Logger.
- Return type:
Logger
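The three log_mode options can be illustrated with the standard logging module. This sketch is not the pydgc implementation: create_logger_sketch, the message format, and raising ValueError when a file path is missing are all assumptions.

```python
import logging
import os
import sys
import tempfile


def create_logger_sketch(logger_name, log_mode="both", log_file_path=None,
                         encoding="utf-8"):
    """Hypothetical stand-in built on the standard logging module."""
    if log_mode not in ("file", "stdout", "both"):
        raise ValueError(f"unsupported log_mode: {log_mode!r}")
    if log_mode in ("file", "both") and log_file_path is None:
        # Assumed behaviour: a file mode without a path is rejected.
        raise ValueError("log_file_path is required when logging to a file")
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    formatter = logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    if log_mode in ("stdout", "both"):
        stream_handler = logging.StreamHandler(sys.stdout)
        stream_handler.setFormatter(formatter)
        logger.addHandler(stream_handler)
    if log_mode in ("file", "both"):
        file_handler = logging.FileHandler(log_file_path, encoding=encoding)
        file_handler.setFormatter(formatter)
        logger.addHandler(file_handler)
    return logger


with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "run.log")
    logger = create_logger_sketch("demo", log_mode="file", log_file_path=log_path)
    logger.info("training started")
    # Close the file handler before the temporary directory is removed.
    for handler in list(logger.handlers):
        handler.close()
        logger.removeHandler(handler)
    with open(log_path, encoding="utf-8") as fh:
        logged_text = fh.read()
```

With the documented signature, the corresponding pydgc call would be create_logger("demo", log_mode="file", log_file_path=log_path).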
- class Logger(name)
Bases: object
Logger.
- Parameters:
name (str) – Name of logger.
- flag(message)
Print a flag line that separates the log output above it from the output below it.
- Parameters:
message (str) – Log message.
- static table(results_dir, dataset_name, results_dict, decimal=4)
Create a table of results.
- Parameters:
results_dir (str) – Results directory.
dataset_name (str) – Dataset name.
results_dict (dict) – Results dictionary.
decimal (int, optional) – Number of decimal places. Defaults to 4.
pydgc.utils.random
pydgc.utils.transform
- get_M(adj, t=2)
Calculate the matrix M by the equation:
$M = (B^1 + B^2 + \dots + B^t) / t$
- Parameters:
adj (torch.Tensor) – The adjacency matrix.
t (int, optional) – The number of powers of B to average. Defaults to 2.
- Returns:
The matrix M.
- Return type:
torch.Tensor
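The equation for M can be worked through on a small matrix in plain Python. The sketch takes B as given, since how B is derived from adj is not stated here; average_of_powers and matmul are hypothetical helpers, and nested lists stand in for torch.Tensor.

```python
def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def average_of_powers(B, t=2):
    """Hypothetical helper: M = (B^1 + B^2 + ... + B^t) / t."""
    n = len(B)
    power = [row[:] for row in B]  # current power, starting at B^1
    total = [row[:] for row in B]  # running sum of the powers
    for _ in range(t - 1):
        power = matmul(power, B)
        total = [[total[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return [[value / t for value in row] for row in total]


B = [[0.0, 1.0], [0.5, 0.5]]
M = average_of_powers(B, t=2)
# B^2 = [[0.5, 0.5], [0.25, 0.75]], so M = [[0.25, 0.75], [0.375, 0.625]]
```

When every row of B sums to 1, as in this example, every row of M also sums to 1.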
- target_distribution(q)
Compute the target distribution.
- Parameters:
q (torch.Tensor) – The input tensor.
- Returns:
The target distribution.
- Return type:
torch.Tensor
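The formula is not given here. A common choice in deep clustering is the DEC-style target distribution, which squares the soft assignments, divides by the per-cluster totals, and renormalizes each row. The sketch below assumes that formulation; target_distribution_sketch is a hypothetical name and nested lists stand in for torch.Tensor.

```python
def target_distribution_sketch(q):
    """Assumed DEC-style rule: p_ij is proportional to q_ij^2 / sum_i q_ij."""
    n_clusters = len(q[0])
    # Soft cluster frequencies: the column sums of q.
    cluster_freq = [sum(row[j] for row in q) for j in range(n_clusters)]
    weight = [
        [row[j] ** 2 / cluster_freq[j] for j in range(n_clusters)]
        for row in q
    ]
    # Renormalize each row so that it sums to 1 again.
    return [[w / sum(row) for w in row] for row in weight]


q = [[0.9, 0.1], [0.6, 0.4], [0.2, 0.8]]
p = target_distribution_sketch(q)
```

Under this rule, confident assignments become sharper: the first row of p puts more than 0.9 on the first cluster.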
- diffusion_adj(adj, mode='ppr', transport_rate=0.2)
Apply graph diffusion to the adjacency matrix.
- Parameters:
adj (torch.Tensor) – The adjacency matrix.
mode (str, optional) – The mode of graph diffusion. Defaults to “ppr”.
transport_rate (float, optional) – The transport rate. Defaults to 0.2.
- Returns:
The diffused adjacency matrix.
- Return type:
torch.Tensor
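For reference, one common closed form of personalized PageRank (PPR) diffusion is given below. Whether diffusion_adj uses exactly this normalization is an assumption. Here $\alpha$ stands in for transport_rate, $\hat{A} = D^{-1/2} A D^{-1/2}$ is the symmetrically normalized adjacency matrix, and $D$ is the degree matrix:
$S = \alpha \left(I - (1 - \alpha)\hat{A}\right)^{-1}$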
- add_gaussian_noise(x, mean=0, std_dev=0.1)
Add Gaussian noise to x.
- Parameters:
x (torch.Tensor) – The input tensor.
mean (int, optional) – The mean of the Gaussian noise. Defaults to 0.
std_dev (float, optional) – The standard deviation of the Gaussian noise. Defaults to 0.1.
- Returns:
The tensor with Gaussian noise.
- Return type:
torch.Tensor
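Element-wise Gaussian noise can be illustrated with the standard library. Nested lists stand in for torch.Tensor; add_gaussian_noise_sketch and its rng argument are hypothetical and are not part of pydgc.

```python
import random


def add_gaussian_noise_sketch(x, mean=0, std_dev=0.1, rng=None):
    """Hypothetical stand-in: add independent N(mean, std_dev^2) noise to each entry."""
    rng = rng or random.Random()
    return [[value + rng.gauss(mean, std_dev) for value in row] for row in x]


x = [[1.0, 2.0], [3.0, 4.0]]
noisy = add_gaussian_noise_sketch(x, rng=random.Random(0))
# With a standard deviation of zero and zero mean, the input comes back unchanged.
unchanged = add_gaussian_noise_sketch(x, std_dev=0.0)
```

With torch tensors, the same idea is usually written as x + torch.randn_like(x) * std_dev + mean.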
- perturb_data(data, cfg)
Perturb the data.
- Parameters:
data (Data) – The input data.
cfg (CN) – The configuration.
- Returns:
The perturbed data.
- Return type:
Data
pydgc.utils.visualization
- class DGCVisual(save_path='.', save_format='png', font_family='sans-serif', font_size=20)
Bases: object
A class for visualizing data.
- Parameters:
save_path (str, optional) – The path to save the images. Defaults to ‘.’.
save_format (str, optional) – The format of the images. Defaults to ‘png’.
font_family (str or list, optional) – The font family. Defaults to ‘sans-serif’.
font_size (int, optional) – The font size. Defaults to 20.
- static check_save_format(save_format)
Check if the save format is supported.
- Parameters:
save_format (str) – The save format, e.g., ‘png’, ‘pdf’, ‘jpg’, ‘jpeg’, ‘bmp’, ‘tiff’, ‘gif’, ‘svg’, ‘eps’.
- Raises:
ValueError – If the save format is not supported.
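A check of this kind might look as follows. The format list is copied from the examples above, which may not be the complete set pydgc accepts; check_save_format_sketch and SUPPORTED_FORMATS are hypothetical names.

```python
# Formats taken from the examples in the parameter description (possibly incomplete).
SUPPORTED_FORMATS = ("png", "pdf", "jpg", "jpeg", "bmp", "tiff", "gif", "svg", "eps")


def check_save_format_sketch(save_format):
    """Hypothetical stand-in: raise ValueError for an unsupported format."""
    if save_format not in SUPPORTED_FORMATS:
        raise ValueError(
            f"Unsupported save format {save_format!r}; "
            f"expected one of {', '.join(SUPPORTED_FORMATS)}"
        )


check_save_format_sketch("png")  # passes silently

try:
    check_save_format_sketch("docx")
    rejected = False
except ValueError:
    rejected = True
```

Validating the format up front means a typo is reported before any time is spent rendering the figure.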
- plot_clustering(data, labels, method='tsne', palette='viridis', fig_size=(10, 8), filename='tsne_plot', show_axis=False, legend=False, dpi=300, random_state=42)
Plot the clustering results after t-SNE or UMAP dimensionality reduction.
- Parameters:
data (np.array) – The input data, shape (n_samples, n_features).
labels (np.array) – The data labels.
method (str, optional) – The dimensionality reduction method, ‘tsne’ or ‘umap’. Defaults to ‘tsne’.
palette (str, optional) – The color palette. Defaults to “viridis”.
fig_size (Tuple[int, int], optional) – The figure size. Defaults to (10, 8).
filename (str, optional) – The filename to save the plot. Defaults to “tsne_plot”.
show_axis (bool, optional) – Whether to show the axis. Defaults to False.
legend (bool, optional) – Whether to show the legend. Defaults to False.
dpi (int, optional) – The DPI of the plot. Defaults to 300.
random_state (int, optional) – The random state. Defaults to 42.
- plot_heatmap(data, labels, method='inner_product', color_map='YlGnBu', fig_size=(8, 8), filename='heatmap_plot', show_color_bar=False, show_axis=False, dpi=300)
Plot the heatmap of the data.
- Parameters:
data (np.array) – The input data, shape (n_samples, n_features).
labels (np.array) – The data labels.
method (str, optional) – The similarity method: ‘cosine’, ‘euclidean’, or ‘inner_product’. Defaults to ‘inner_product’.
color_map (str, optional) – The color map. Defaults to “YlGnBu”.
fig_size (Tuple[int, int], optional) – The figure size. Defaults to (8, 8).
filename (str, optional) – The filename to save the plot. Defaults to “heatmap_plot”.
show_color_bar (bool, optional) – Whether to show the color bar. Defaults to False.
show_axis (bool, optional) – Whether to show the axis. Defaults to False.
dpi (int, optional) – The DPI of the plot. Defaults to 300.
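The three method options correspond to standard pairwise scores between samples. The sketch below computes them in plain Python. It returns the raw Euclidean distance for ‘euclidean’, because how plot_heatmap converts a distance into a similarity is not stated here; pairwise_scores is a hypothetical helper.

```python
import math


def pairwise_scores(data, method="inner_product"):
    """Hypothetical helper: n_samples x n_samples matrix of pairwise scores."""
    def inner(u, v):
        return sum(a * b for a, b in zip(u, v))

    def cosine(u, v):
        return inner(u, v) / (math.sqrt(inner(u, u)) * math.sqrt(inner(v, v)))

    def euclidean(u, v):
        return math.dist(u, v)  # a distance, not a similarity

    scorers = {"inner_product": inner, "cosine": cosine, "euclidean": euclidean}
    if method not in scorers:
        raise ValueError(f"unsupported method: {method!r}")
    score = scorers[method]
    return [[score(u, v) for v in data] for u in data]


data = [[1.0, 0.0], [0.0, 2.0], [3.0, 4.0]]
inner_scores = pairwise_scores(data, "inner_product")
cosine_scores = pairwise_scores(data, "cosine")
distances = pairwise_scores(data, "euclidean")
```

Cosine scores ignore vector length, so they are the natural choice when embeddings differ mainly in scale.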
- plot_loss(losses, metrics=None, metrics_name=None, fig_size=(3.149606299212598, 2.3622047244094486), marker='o', line_style='-', color='blue', line_width=2, title=None, dpi=300, filename='loss_curve_plot')
Plot the loss curve and, if metrics are provided, the metrics curve.
- Parameters:
losses (list) – The loss values.
metrics (list, optional) – The metrics values. Defaults to None.
metrics_name (str, optional) – The metrics name. Defaults to None.
fig_size (Tuple[float, float], optional) – The figure size in inches. Defaults to (8/2.54, 6/2.54), i.e. 8 cm × 6 cm.
marker (str, optional) – The marker style. Defaults to ‘o’.
line_style (str, optional) – The line style. Defaults to ‘-’.
color (str, optional) – The line color. Defaults to ‘blue’.
line_width (int, optional) – The line width. Defaults to 2.
title (str, optional) – The title. Defaults to None.
dpi (int, optional) – The DPI. Defaults to 300.
filename (str, optional) – The filename. Defaults to “loss_curve_plot”.
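The default fig_size in the signature is the result of dividing a size in centimetres by 2.54. This small sketch reproduces that conversion for custom sizes; cm_to_inch is a hypothetical helper and is not part of pydgc.

```python
CM_PER_INCH = 2.54


def cm_to_inch(*lengths_cm):
    """Hypothetical helper: convert lengths in centimetres to a tuple of inches."""
    return tuple(length / CM_PER_INCH for length in lengths_cm)


default_fig_size = cm_to_inch(8, 6)   # the documented default, (8/2.54, 6/2.54)
wide_fig_size = cm_to_inch(17, 6)     # an arbitrary wider 17 cm x 6 cm figure
```

The result can be passed directly as the fig_size argument, for example fig_size=cm_to_inch(17, 6).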