medsegpy.utils

medsegpy.utils.cluster

class medsegpy.utils.cluster.Cluster(name: str, patterns: Union[str, typing.Sequence[str]], data_dir: str = None, results_dir: str = None)[source]

Tracks config of different nodes/clusters.

This class is helpful for managing different storage paths across different nodes/clusters without the overhead of duplicating the codebase across multiple nodes.

To identify the current node, we inspect the hostname. This can be problematic if two machines have the same hostname, though this has not been an issue as of yet.

DO NOT use the node’s public IP address to identify it. Not only is this not returned by socket.gethostname(), but it also raises security concerns.

Note

This class is not thread safe. Saving/deleting configs should be done on the main thread.

__init__(name: str, patterns: Union[str, typing.Sequence[str]], data_dir: str = None, results_dir: str = None)[source]
Parameters:
  • name (str) – The name of the cluster. Name is case-sensitive.
  • patterns (str or Sequence[str]) – Regex pattern(s) for identifying the cluster. The cluster is identified by any(re.match(p, socket.gethostname()) for p in patterns).
  • data_dir (str, optional) – The data directory. Defaults to os.environ['MEDSEGPY_DATASETS'] or "./datasets".
  • results_dir (str, optional) – The results directory. Defaults to os.environ['MEDSEGPY_RESULTS'] or "./results".
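
Example: the hostname check described for patterns can be sketched as follows. This is a minimal illustration, not the library's implementation; the helper name matches_host and its hostname argument are hypothetical and exist only so the check can be tried without depending on the current machine.

```python
import re
import socket
from typing import Optional, Sequence, Union


def matches_host(
    patterns: Union[str, Sequence[str]], hostname: Optional[str] = None
) -> bool:
    """Return True if any regex pattern matches the hostname.

    `hostname` is a hypothetical override for illustration; by default the
    current machine's hostname is used.
    """
    if isinstance(patterns, str):
        # Accept a single pattern as well as a sequence of patterns.
        patterns = [patterns]
    if hostname is None:
        hostname = socket.gethostname()
    return any(re.match(p, hostname) for p in patterns)
```

Because re.match anchors at the start of the string, a pattern such as "node" does not match the hostname "gpu-node-07", while r"gpu-node-\d+" does.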
save()[source]

Save cluster config to yaml file.

delete()[source]

Deletes the config file for this cluster.

filepath()[source]

Returns config file path.

Note

This does not guarantee the config exists. To save the cluster config to a file, use save().

Returns:str – The config file path.
save_dir

Deprecated – Legacy alias for self.results_dir

classmethod cluster()[source]

Searches saved clusters by regex matching with hostname.

Note

The cluster must have been saved to a config file. Also, if there are multiple cluster matches, only the first (sorted alphabetically) will be returned.

Returns:Cluster – The current cluster.
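
Example: the "first match, sorted alphabetically" rule from the note above can be sketched as below. The mapping of cluster names to patterns stands in for the saved config files, and first_matching_cluster is a hypothetical helper, not part of medsegpy.

```python
import re
from typing import Dict, Optional, Sequence


def first_matching_cluster(
    saved: Dict[str, Sequence[str]], hostname: str
) -> Optional[str]:
    """Return the alphabetically first cluster name whose patterns match."""
    for name in sorted(saved):
        if any(re.match(p, hostname) for p in saved[name]):
            return name
    # No saved cluster matches this hostname.
    return None
```

If two saved clusters both match the hostname, only the one whose name sorts first is returned, so overlapping patterns are best avoided.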
classmethod from_config(name)[source]
Parameters:name (str) – Cluster name or path to config file.
Returns:Cluster – The Cluster object
static set_working_cluster()[source]

Sets the working cluster.

Parameters:cluster (str or Cluster) – The cluster name or cluster. If None, will reset cluster to _UNKNOWN, meaning default data and results dirs will be used.
medsegpy.utils.cluster.set_cluster(cluster: Union[str, medsegpy.utils.cluster.Cluster] = None)[source]

Sets the working cluster.

Parameters:cluster (str or Cluster) – The cluster name or cluster. If None, will reset cluster to _UNKNOWN, meaning default data and results dirs will be used.

medsegpy.utils.dl_utils

medsegpy.utils.dl_utils.get_weights(experiment_dir)[source]

Gets the weights file corresponding to lowest validation loss.

Assumes that only the best weights are stored, so searching for the epoch should be enough. TODO: remove this assumption.

Parameters:experiment_dir (str) – Experiment directory where weights are stored.
Returns:str – Path to weights h5 file.
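
Example: a rough sketch of "searching for the epoch" under the stated assumption that only improving weights are saved, so the file with the highest epoch is also the best one. The file naming scheme weights.<epoch>-<val_loss>.h5 and the helper latest_weights are assumptions made for illustration; the actual naming used by medsegpy is not given here.

```python
import os
import re

# Hypothetical naming scheme: weights.<epoch>-<val_loss>.h5
_WEIGHTS_RE = re.compile(r"weights\.(\d+)-[\d.]+\.h5$")


def latest_weights(experiment_dir: str) -> str:
    """Return the path of the weights file with the highest epoch number."""
    best_epoch, best_file = -1, None
    for fname in os.listdir(experiment_dir):
        match = _WEIGHTS_RE.match(fname)
        if match and int(match.group(1)) > best_epoch:
            best_epoch, best_file = int(match.group(1)), fname
    if best_file is None:
        raise FileNotFoundError(f"No weights files found in {experiment_dir}")
    return os.path.join(experiment_dir, best_file)
```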
medsegpy.utils.dl_utils.get_valid_subdirs(root_dir: str, exist_ok: bool = False)[source]

Recursively search for experiments that are ready to be tested.

Different experiments live in different folders. Based on the training protocol, we assume that a valid experiment has completed training if its folder contains the files “config.ini” and “pik_data.dat”.

To avoid recomputing experiments with results, exist_ok=False by default.

Parameters:
  • root_dir (str) – Root folder to search.
  • exist_ok (bool, optional) – If True, recompute results for experiments.
Returns:

List[str] – Experiment directories to test.
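
Example: the recursive search can be sketched with os.walk. The two required file names come from the description above; how existing results are detected is not stated, so the results subfolder name RESULTS_DIRNAME below is purely a placeholder assumption, as is the helper name.

```python
import os
from typing import List

REQUIRED_FILES = ("config.ini", "pik_data.dat")
RESULTS_DIRNAME = "test_results"  # hypothetical marker for existing results


def find_trained_experiments(root_dir: str, exist_ok: bool = False) -> List[str]:
    """Return folders under root_dir that contain all required files.

    Folders that already hold results are skipped unless exist_ok is True.
    """
    found = []
    for dirpath, dirnames, filenames in os.walk(root_dir):
        if not all(f in filenames for f in REQUIRED_FILES):
            continue  # training has not finished in this folder
        has_results = RESULTS_DIRNAME in dirnames
        if has_results and not exist_ok:
            continue  # avoid recomputing results
        found.append(dirpath)
    return sorted(found)
```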

medsegpy.utils.dl_utils.get_available_gpus(num_gpus: int = None)[source]

Get gpu ids for gpus that are >95% free.

Tensorflow does not support checking free memory on gpus. This is a crude method that relies on nvidia-smi to determine which gpus are occupied and which are free.

Parameters:num_gpus – Number of requested gpus. If not specified, ids of all available gpu(s) are returned.
Returns:List[int] – List of GPU ids that are free. Length will equal num_gpus, if specified.
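
Example: one way to approximate the nvidia-smi based check is to query used and total memory per GPU and keep the devices whose free fraction exceeds 95%. This is a sketch, not the library's code: the helper names are hypothetical, and raising an error when too few GPUs are free is an assumption about behavior that the description above does not specify.

```python
import subprocess
from typing import List, Optional

# Standard nvidia-smi query: one CSV row per GPU, values in MiB without units.
QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_free_gpus(smi_csv: str, min_free_fraction: float = 0.95) -> List[int]:
    """Return ids of GPUs whose free memory fraction exceeds the threshold."""
    free = []
    for line in smi_csv.strip().splitlines():
        index, used, total = (float(x) for x in line.split(","))
        if (total - used) / total > min_free_fraction:
            free.append(int(index))
    return free


def free_gpus(num_gpus: Optional[int] = None) -> List[int]:
    """Query nvidia-smi and return up to num_gpus free GPU ids."""
    ids = parse_free_gpus(subprocess.check_output(QUERY, text=True))
    if num_gpus is None:
        return ids
    if len(ids) < num_gpus:
        # Assumption: fail loudly rather than return fewer ids than requested.
        raise ValueError(f"Requested {num_gpus} GPUs, only {len(ids)} free")
    return ids[:num_gpus]
```

Keeping the parsing separate from the subprocess call makes the threshold logic easy to check on machines without a GPU.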

medsegpy.utils.env

medsegpy.utils.logger

medsegpy.utils.logger.setup_logger[source]

Initialize the detectron2 logger and set its verbosity level to “INFO”.

Parameters:
  • output (str) – a file name or a directory to save log. If None, will not save log file. If ends with “.txt” or “.log”, assumed to be a file name. Otherwise, logs will be saved to output/log.txt.
  • name (str) – the root module name of this logger
  • abbrev_name (str) – an abbreviation of the module, to avoid long names in logs. Set to “” to not log the root module in logs. By default, will abbreviate “detectron2” to “d2” and leave other modules unchanged.
Returns:

logging.Logger – a logger
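
Example: the rules for the output argument can be summarized in a few lines. The helper resolve_log_path is hypothetical and only mirrors the description above; it is not a function provided by this module.

```python
import os
from typing import Optional


def resolve_log_path(output: Optional[str]) -> Optional[str]:
    """Map the `output` argument to the log file path, or None for no file."""
    if output is None:
        return None  # no log file is written
    if output.endswith((".txt", ".log")):
        return output  # treated as a file name
    return os.path.join(output, "log.txt")  # treated as a directory
```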

medsegpy.utils.logger.log_first_n(lvl, msg, n=1, *, name=None, key='caller')[source]

Log only for the first n times.

Parameters:
  • lvl (int) – the logging level
  • msg (str) – the message to log
  • n (int) – the maximum number of times to log
  • name (str) – name of the logger to use. Will use the caller’s module by default.
  • key (str or tuple[str]) – the string(s) can be one of "caller" or "message", which defines how to identify duplicated logs. For example, if called with n=1, key="caller", this function will only log the first call from the same caller, regardless of the message content. If called with n=1, key="message", this function will log the same content only once, even if it is logged from different places. If called with n=1, key=("caller", "message"), this function will skip logging only if the same caller has already logged the same message.
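
Example: the effect of key can be illustrated with a counter keyed on the caller, the message, or both. This is a simplified sketch; the real function identifies the caller automatically, whereas here caller is passed in explicitly, and the function name and boolean return value are hypothetical additions for illustration.

```python
import logging
from collections import Counter

_seen = Counter()


def log_first_n_sketch(logger, lvl, msg, caller, n=1, key="caller") -> bool:
    """Log msg at most n times per identity; return True if it was logged.

    `caller` is an explicit stand-in for the automatically detected caller.
    """
    keys = (key,) if isinstance(key, str) else tuple(key)
    values = {"caller": caller, "message": msg}
    # Identity of this log call: which caller and/or which message.
    ident = tuple((k, values[k]) for k in sorted(keys))
    _seen[ident] += 1
    if _seen[ident] <= n:
        logger.log(lvl, msg)
        return True
    return False
```

With key="caller", a second call from the same caller is suppressed even if the message differs; with key=("caller", "message"), it is suppressed only when both are repeated.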
medsegpy.utils.logger.log_every_n(lvl, msg, n=1, *, name=None)[source]

Log once per n times.

Parameters:
  • lvl (int) – the logging level
  • msg (str) – the message to log
  • n (int) – log once every n calls
  • name (str) – name of the logger to use. Will use the caller’s module by default.
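
Example: a counter-based sketch of "once per n times". Which call in each group of n actually logs is an assumption here (the first one); the helper name and the explicit caller argument are hypothetical.

```python
from collections import Counter

_calls = Counter()


def should_log_every_n(caller: str, n: int = 1) -> bool:
    """Return True on the 1st, (n+1)th, (2n+1)th, ... call from a caller."""
    _calls[caller] += 1
    return n == 1 or _calls[caller] % n == 1
```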
medsegpy.utils.logger.log_every_n_seconds(lvl, msg, n=1, *, name=None)[source]

Log no more than once per n seconds.

Parameters:
  • lvl (int) – the logging level
  • msg (str) – the message to log
  • n (int) – the minimum number of seconds between logs
  • name (str) – name of the logger to use. Will use the caller’s module by default.
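
Example: rate limiting by time can be sketched by remembering when each caller last logged. The helper name, the explicit caller argument, and the injectable now argument are hypothetical and exist to keep the sketch deterministic.

```python
import time
from typing import Dict, Optional

_last_logged: Dict[str, float] = {}


def should_log_every_n_seconds(
    caller: str, n: float = 1, now: Optional[float] = None
) -> bool:
    """Return True if at least n seconds passed since this caller last logged."""
    now = time.monotonic() if now is None else now
    last = _last_logged.get(caller)
    if last is None or now - last >= n:
        _last_logged[caller] = now
        return True
    return False
```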
medsegpy.utils.logger.create_small_table(small_dict)[source]

Create a small table using the keys of small_dict as headers. This is only suitable for small dictionaries.

Parameters:small_dict (dict) – a result dictionary of only a few items.
Returns:str – the table as a string.
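
Example: a dependency-free sketch of a one-row table with the dictionary keys as headers. The exact layout produced by create_small_table is not described above, so the formatting choices here (three decimal places for floats, centered cells) are assumptions, and small_table is a hypothetical stand-in.

```python
def small_table(small_dict: dict) -> str:
    """Render a dict as a header row, a rule, and a single row of values."""
    keys = [str(k) for k in small_dict]
    vals = [
        f"{v:.3f}" if isinstance(v, float) else str(v)
        for v in small_dict.values()
    ]
    # Each column is as wide as its longest cell.
    widths = [max(len(k), len(v)) for k, v in zip(keys, vals)]
    header = " | ".join(k.center(w) for k, w in zip(keys, widths))
    rule = "-+-".join("-" * w for w in widths)
    row = " | ".join(v.center(w) for v, w in zip(vals, widths))
    return "\n".join([header, rule, row])
```

Since every key becomes a column, the result quickly becomes too wide to read for large dictionaries, which is why this kind of table suits only a handful of items.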