Evaluation
Evaluation is a process that takes a number of input/output/time triplets and aggregates them. You can always use the model directly and parse its inputs/outputs manually to perform evaluation. Alternatively, evaluation is implemented in medsegpy through the DatasetEvaluator interface.
MedSegPy includes SemSegEvaluator, an extension of DatasetEvaluator that computes popular semantic segmentation metrics for medical images. You can also implement your own DatasetEvaluator that performs some other job using the input/output pairs. For example, to count how many instances are detected on the validation set:
```python
class Counter(DatasetEvaluator):
    def reset(self):
        self.count = 0

    def process(self, inputs, outputs, time_elapsed):
        for output in outputs:
            self.count += len(output["instances"])

    def evaluate(self):
        # save self.count somewhere, or print it, or return it.
        return {"count": self.count}
```
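To make the lifecycle of such an evaluator concrete, here is a minimal stand-alone sketch that drives a Counter-style evaluator by hand: reset once, process each batch of inputs/outputs with its elapsed time, then evaluate. The DatasetEvaluator stub and the toy batches are assumptions for illustration, not MedSegPy's actual classes:

```python
class DatasetEvaluator:
    """Minimal stand-in for the evaluator interface (illustrative only)."""
    def reset(self): ...
    def process(self, inputs, outputs, time_elapsed): ...
    def evaluate(self): ...


class Counter(DatasetEvaluator):
    def reset(self):
        self.count = 0

    def process(self, inputs, outputs, time_elapsed):
        for output in outputs:
            self.count += len(output["instances"])

    def evaluate(self):
        return {"count": self.count}


# Drive the lifecycle manually: reset once, process every batch, evaluate at the end.
evaluator = Counter()
evaluator.reset()
batches = [
    ([{"image": "a"}], [{"instances": [1, 2]}], 0.1),  # 2 instances
    ([{"image": "b"}], [{"instances": [3]}], 0.1),     # 1 instance
]
for inputs, outputs, elapsed in batches:
    evaluator.process(inputs, outputs, elapsed)
print(evaluator.evaluate())  # {'count': 3}
```

In MedSegPy this loop is run for you by inference_on_dataset, described next.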
Once you have some DatasetEvaluator, you can run it with inference_on_dataset. For example,

```python
val_results = inference_on_dataset(
    model,
    val_data_loader,
    DatasetEvaluators(SemSegEvaluator(...)),
)
```
The inference_on_dataset function also provides accurate speed benchmarks for the given model and dataset.
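Because process receives a time_elapsed value for each batch, a custom evaluator can also aggregate its own timing statistics. The sketch below is illustrative; TimingEvaluator and its result keys are assumptions, not part of MedSegPy's API:

```python
class TimingEvaluator:
    """Illustrative evaluator that aggregates per-batch inference time."""

    def reset(self):
        self.times = []

    def process(self, inputs, outputs, time_elapsed):
        # Record the measured wall-clock time for this batch.
        self.times.append(time_elapsed)

    def evaluate(self):
        n = len(self.times)
        return {
            "num_batches": n,
            "mean_time_s": sum(self.times) / n if n else 0.0,
        }


ev = TimingEvaluator()
ev.reset()
for t in (0.5, 1.5, 1.0):
    ev.process(inputs=[], outputs=[], time_elapsed=t)
print(ev.evaluate())  # {'num_batches': 3, 'mean_time_s': 1.0}
```

Such an evaluator can be combined with metric evaluators (e.g. via DatasetEvaluators) so that one inference pass yields both quality metrics and timing numbers.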