## Installation

We recommend using a virtual environment (Anaconda, virtualenv) to control package visibility. For a detailed list of requirements, see [environment.yml](https://github.com/ad12/MedSegPy/blob/master/environment.yml).

### Requirements
- Python ≥3.6
- tensorflow(-gpu) ≥1.8.0
- keras ≥2.1.6
- pydot (install via `conda install pydot`)
- graphviz (install via `conda install python-graphviz`)

MedSegPy supports both TensorFlow 1 and 2. For GPU versions of TensorFlow, compatible CUDA drivers and toolkits must be installed. If you do not have a GPU or you are using `tensorflow>=2.0`, replace the `tensorflow-gpu` package with `tensorflow`. More instructions about TensorFlow CUDA compatibility and installation can be found [here](https://www.tensorflow.org/install/source#gpu).

Install these requirements prior to installing MedSegPy.

### Build MedSegPy from Source

```bash
python -m pip install 'git+https://github.com/ad12/MedSegPy.git'
# (add --user if you don't have permission)

# Or, to install it from a local clone (recommended):
git clone https://github.com/ad12/MedSegPy.git
cd MedSegPy && python -m pip install -e .
```

You may need to rebuild MedSegPy after reinstalling TensorFlow and/or Keras.

### Step-by-Step Guide
The following instructions install MedSegPy with Python 3.7, TensorFlow 2.3, and Keras 2.4 into an Anaconda environment. Note that GPU use with TensorFlow 2.3 requires CUDA toolkit 10.1.

```bash
# Create and activate the conda environment
conda create -n medsegpy_env python=3.7
conda activate medsegpy_env

# Install tensorflow and keras dependencies
pip install tensorflow==2.3.1 keras==2.4.3

# Install visualization dependencies
conda install pydot python-graphviz

# Install MedSegPy
git clone https://github.com/ad12/MedSegPy.git
cd MedSegPy && python -m pip install -e .
```

### Configuring Paths (optional)
There are two primary paths that MedSegPy uses:

1. dataset paths: the path to the directory holding the datasets, and
2. result paths: the path to the directory where results should be stored.

***Builtin Dataset Paths***

You can set the location for builtin datasets with `export MEDSEGPY_DATASETS=/path/to/datasets`. If left unset, the default is `./datasets` relative to your current working directory.

***Result Paths***

Similarly, you can set the location for the results directory with `export MEDSEGPY_RESULTS=/path/to/results`. If left unset, the default is `./results` relative to your current working directory.

As a shortcut, we designate the prefix `"results://"` in any filepath to point to the result directory of your choosing. For example, `"results://exp1"` will resolve to `"$MEDSEGPY_RESULTS/exp1"` (e.g. `"/path/to/results/exp1"`).

An example of how to do this in Python (i.e. without export statements) is shown below:

```python
import os

os.environ["MEDSEGPY_DATASETS"] = "/path/to/datasets"
os.environ["MEDSEGPY_RESULTS"] = "/path/to/results"

import medsegpy.utils  # import implicitly registers prefixes

from fvcore.common.file_io import PathManager
PathManager.get_local_path("results://exp1")  # returns "/path/to/results/exp1"
```

You can also define your own prefixes to resolve by adding your own path handler. This is useful if you want to use the same script to run multiple projects. See fvcore's [fileio](https://github.com/facebookresearch/fvcore/blob/master/fvcore/common/file_io.py) for more information; a minimal sketch of a custom handler is shown below.
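The handler below is a minimal sketch of this pattern, assuming the `PathHandler`/`PathManager` interface from the fvcore module linked above; the `myproject://` prefix, the `MYPROJECT_DIR` environment variable, and the `MyProjectHandler` class are hypothetical names used only for illustration.

```python
import os

from fvcore.common.file_io import PathHandler, PathManager


class MyProjectHandler(PathHandler):
    """Resolve the hypothetical prefix "myproject://" against a local directory."""

    PREFIX = "myproject://"

    def _get_supported_prefixes(self):
        return [self.PREFIX]

    def _get_local_path(self, path):
        # Strip the prefix and resolve the remainder against a base directory.
        base_dir = os.environ.get("MYPROJECT_DIR", "./myproject")
        return os.path.join(base_dir, path[len(self.PREFIX):])

    def _open(self, path, mode="r", **kwargs):
        return open(self._get_local_path(path), mode, **kwargs)


PathManager.register_handler(MyProjectHandler())
PathManager.get_local_path("myproject://exp1")  # e.g. "./myproject/exp1"
```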
***Using Clusters***

It can often be difficult to manage paths when multiple machines are used. For example, data on `machineA` could be stored at a different location than on `machineB`, but the code you have written is visible to both machines. In these scenarios, managing centralized code can become quite challenging because of hardcoded paths.

We can use `Cluster` to store our path preferences for different machines. The code below saves the path preferences for each machine to a file. Whenever one of these machines is detected (matched by hostname) while using MedSegPy, the machine-specific dataset and result paths will be used.

```python
from dosma.utils.cluster import Cluster

# Define paths for machineA
machineA = Cluster("machineA", ["machineA_hostname_pattern"], "path/to/machineA/datasets", "path/to/machineA/results")
machineA.save()

# Define paths for machineB
machineB = Cluster("machineB", ["machineB_hostname_pattern"], "path/to/machineB/datasets", "path/to/machineB/results")
machineB.save()
```
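Once these preferences are saved, prefix resolution should follow whichever machine is detected. The check below is a sketch that reuses the `results://` resolution shown earlier; the exact behavior depends on how MedSegPy registers the saved cluster at import time.

```python
import medsegpy.utils  # import implicitly registers prefixes

from fvcore.common.file_io import PathManager

# Expected to resolve to "path/to/machineA/results/exp1" on machineA
# and to "path/to/machineB/results/exp1" on machineB.
print(PathManager.get_local_path("results://exp1"))
```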