We recently released TTI Eval (`text-to-image-eval`), an open-source library for evaluating zero-shot classification models like CLIP, and domain-specific ones like BioCLIP, against your own or Hugging Face (HF) datasets to estimate how well a model will perform on them.
You can evaluate custom and Hugging Face text-to-image/zero-shot image classification models like CLIP, SigLIP, DFN5B, and EVA-CLIP. The evaluation metrics include zero-shot accuracy, linear probe accuracy, image retrieval, and KNN accuracy.
We built this for ML engineers and developers using CLIP models.
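For anyone unfamiliar with the zero-shot accuracy metric mentioned above, here is a rough sketch of the idea in plain Python. It is not TTI Eval's API or implementation. The function names and the toy 2-D embeddings are made up for illustration, and it assumes you already have image embeddings and one text embedding per class prompt from some CLIP-style model.

```python
import math


def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_accuracy(image_embs, class_text_embs, labels):
    """Fraction of images whose most similar class-prompt embedding
    matches the ground-truth label (hypothetical helper, not library code)."""
    correct = 0
    for emb, label in zip(image_embs, labels):
        sims = [cosine(emb, text_emb) for text_emb in class_text_embs]
        predicted = max(range(len(sims)), key=sims.__getitem__)
        correct += int(predicted == label)
    return correct / len(labels)


# Toy data: 4 "images", 2 classes, 2-D embeddings (all values invented).
image_embs = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.6, 0.5]]
class_text_embs = [[1.0, 0.0], [0.0, 1.0]]
labels = [0, 1, 0, 1]

accuracy = zero_shot_accuracy(image_embs, class_text_embs, labels)
print(accuracy)  # 0.75: the last image is closer to class 0 but labeled 1
```

No training happens here: the class prompts act as the classifier, which is why this metric is a quick way to gauge how well a pretrained model fits a new dataset.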
Here's the installation guide if you want to get started: https://github.com/encord-team/text-to-image-eval?tab=readme...
I'd love to hear your thoughts on this. We're open to contributions and feedback from the community. Thank you.