[Add] transformers text and image models #132
ariG23498 wants to merge 6 commits into LAION-AI:main from

Conversation
```python
def load_transformers_clip(model_name, pretrained, cache_dir, device):
    ckpt = f"{model_name}/{pretrained}"
```
`ckpt = f"{model_name}/{pretrained}"` may be confusing; it's better to provide `model_name` as the checkpoint on the Hub and hardcode `pretrained` to `True`, IMO. Otherwise it's going to be like:

```python
model_name = "openai"
pretrained = "clip-..."
```
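To make the reviewer's point concrete, here is a minimal sketch of the two naming conventions being discussed (the model names below are illustrative examples, not taken from the PR itself):

```python
# Sketch of how the two CLI arguments get joined into a Hub checkpoint id.

def make_ckpt(model_name: str, pretrained: str) -> str:
    # Current PR behaviour: the Hub checkpoint id is split across the two
    # CLI arguments and re-joined with a slash.
    return f"{model_name}/{pretrained}"

# With the current scheme, the Hub org and repo land in separate arguments:
ckpt = make_ckpt("openai", "clip-vit-base-patch32")
print(ckpt)  # openai/clip-vit-base-patch32

# The reviewer's suggestion: pass the full Hub id as model_name instead,
# e.g. model_name="openai/clip-vit-base-patch32", pretrained=True.
```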
I had to choose this option for better verbosity.
CLIP_benchmark/clip_benchmark/cli.py
Line 247 in a230282
```python
def encode_image(self, image):
    return self.model.get_image_features(image["pixel_values"].squeeze(1))
```
Interestingly, why should we remove the 1st dim?
I wanted to brainstorm on this bit; thanks for asking this question.
The load function should return a model, transforms, and a tokenizer. The load function I have written returns the model, the image processor, and the tokenizer.
The transform is used in the collation function while building the dataloader, which is where it adds another dimension to the tensors. So I noticed that `image["pixel_values"].shape == (b, 1, c, h, w)`.
Is there a way I could extract the transform function from an `image_processor`?
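A stdlib-only sketch of why the extra dimension appears (nested lists stand in for tensors here; this is not the PR's actual code, just a model of the shapes involved):

```python
# Simulate the shape bug: the per-image processor already returns a
# batched (1, c, h, w) array, and collation stacks b of them into
# (b, 1, c, h, w) -- hence the .squeeze(1) in encode_image.

def shape(x):
    """Infer the shape of a nested list, tensor-style."""
    s = []
    while isinstance(x, list):
        s.append(len(x))
        x = x[0]
    return tuple(s)

def image_processor(img):
    # HF-style image processors with return_tensors="pt" return a
    # batched tensor of shape (1, c, h, w) even for a single image.
    return {"pixel_values": [img]}  # adds the leading batch dim

def collate(batch):
    # The dataloader's collation stacks the per-sample (1, c, h, w)
    # tensors, producing (b, 1, c, h, w).
    return {"pixel_values": [sample["pixel_values"] for sample in batch]}

# One fake image with c=3 channels, h=2, w=2:
img = [[[0, 0], [0, 0]] for _ in range(3)]
batch = collate([image_processor(img) for _ in range(4)])
print(shape(batch["pixel_values"]))  # (4, 1, 3, 2, 2)
```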
```python
processor = AutoProcessor.from_pretrained(ckpt)

transforms = partial(processor.image_processor, return_tensors="pt")
tokenizer = partial(processor.tokenizer, return_tensors="pt", padding="max_length")
```
AFAIR it might be good to be able to pass additional args for the tokenizer; e.g. for SigLIP we should specify `padding="max_length", max_length=64`.
I would have loved to pass extra parameters too, but there is no way to pass extra parameters to the load function. The only way we pass parameters is through the arguments provided to the script.
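For reference, the `functools.partial` pattern the PR uses could bake in model-specific kwargs the same way; a stdlib-only sketch (the tokenizer below is a stub, not transformers' real tokenizer):

```python
from functools import partial

# Stub tokenizer that records which kwargs reached the call, so we can
# see what partial application bakes in.
def tokenizer(texts, return_tensors=None, padding=None, max_length=None):
    return {"texts": texts, "return_tensors": return_tensors,
            "padding": padding, "max_length": max_length}

# What the PR currently does:
tok = partial(tokenizer, return_tensors="pt", padding="max_length")

# For a model like SigLIP, the extra max_length=64 would have to be
# baked in the same way, since the CLI cannot forward extra kwargs:
siglip_tok = partial(tokenizer, return_tensors="pt",
                     padding="max_length", max_length=64)

out = siglip_tok(["a photo of a cat"])
print(out["max_length"])  # 64
```

This keeps the load function's signature unchanged while still letting per-model defaults be chosen inside it.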
Worth mentioning with the current integration to Results:
Co-authored-by: Pavel Iakubovskii <qubvel@gmail.com>
Awesome @ariG23498! This is very helpful.

Sorry for the late answer @ariG23498 @qubvel, looking into this now.
This PR adds the provision of evaluating models using `transformers`. One can start a run with the following command: