How well does CLIP understand texture?
Abstract
We investigate how well CLIP understands texture in natural images described by natural language. To this end, we analyze CLIP's ability to: (1) perform zero-shot learning on various texture and material classification datasets; (2) represent compositional properties of texture, such as red dots or yellow stripes, on the Describable Texture in Detail (DTDD) dataset; and (3) aid fine-grained categorization of birds in photographs described by the color and texture of their body parts.
- Publication: arXiv e-prints
- Pub Date: March 2022
- DOI: 10.48550/arXiv.2203.11449
- arXiv: arXiv:2203.11449
- Bibcode: 2022arXiv220311449W
- Keywords: Computer Science - Computer Vision and Pattern Recognition
- E-Print: ECCV 2022 CVinW Workshop