BLIP

[ Paper Review ] BLIP (Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation, 2022)

BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
Vision-Language Pre-training (VLP) has advanced the performance for many vision-language tasks. However, most existing pre-trained models only excel in either understanding-based tasks or generation-based tasks. Furthermore, performance improvement has been... (arxiv.org)

0. Abstract: In this paper, vision-la..

2024. 8. 16.