CLIP-based image captioning via unsupervised cycle-consistency in the latent space

Romain Bielawski, Rufin VanRullen

The 8th Workshop on Representation Learning for NLP (RepL4NLP 2023) Long Paper

TLDR:
Abstract: