# Evaluation

Evaluation is a process that takes a number of input/output/time triplets and aggregates them. You can always [use the model](models.html) directly and parse its inputs/outputs manually to perform evaluation. Alternatively, evaluation is implemented in MedSegPy through the [DatasetEvaluator](../modules/evaluation.html#medsegpy.evaluation.DatasetEvaluator) interface.

MedSegPy includes [`SemSegEvaluator`](../modules/evaluation.html#medsegpy.evaluation.SemSegEvaluator), an extension of `DatasetEvaluator` that computes popular semantic segmentation metrics for medical images. You can also implement your own `DatasetEvaluator` that performs some other job using the input/output pairs. For example, to count how many instances are detected on the validation set:

```python
from medsegpy.evaluation import DatasetEvaluator

class Counter(DatasetEvaluator):
    def reset(self):
        self.count = 0

    def process(self, inputs, outputs, time_elapsed):
        # Accumulate a running total across all batches.
        for output in outputs:
            self.count += len(output["instances"])

    def evaluate(self):
        # Save self.count somewhere, print it, or return it.
        return {"count": self.count}
```

Once you have some `DatasetEvaluator`, you can run it with [inference_on_dataset](../modules/evaluation.html#medsegpy.evaluation.inference_on_dataset). For example:

```python
val_results = inference_on_dataset(
    model,
    val_data_loader,
    DatasetEvaluators(SemSegEvaluator(...)),
)
```

The `inference_on_dataset` function also provides accurate speed benchmarks for the given model and dataset.
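As a slightly larger illustration of the `reset`/`process`/`evaluate` pattern, below is a minimal sketch of a custom evaluator that accumulates a mean Dice coefficient. This is not how `SemSegEvaluator` is implemented; the `"y_pred"` and `"y_true"` keys on each output dict are hypothetical and stand in for whatever format your model and data loader actually produce.

```python
import numpy as np

from medsegpy.evaluation import DatasetEvaluator


class DiceEvaluator(DatasetEvaluator):
    """Minimal sketch: accumulates a per-scan Dice coefficient.

    Assumes each output dict carries binary masks under the
    hypothetical keys "y_pred" and "y_true".
    """

    def reset(self):
        self.scores = []

    def process(self, inputs, outputs, time_elapsed):
        for output in outputs:
            pred = np.asarray(output["y_pred"], dtype=bool)   # assumed key
            truth = np.asarray(output["y_true"], dtype=bool)  # assumed key
            intersection = np.logical_and(pred, truth).sum()
            denom = pred.sum() + truth.sum()
            # Dice = 2|A∩B| / (|A| + |B|); define it as 1.0 when
            # both masks are empty.
            dice = 2.0 * intersection / denom if denom > 0 else 1.0
            self.scores.append(dice)

    def evaluate(self):
        return {"dice_mean": float(np.mean(self.scores))}
```

Like `Counter` above, such an evaluator plugs directly into `inference_on_dataset`, either in place of or alongside `SemSegEvaluator` via `DatasetEvaluators`.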