Category: All posts (24)

[ Paper Review ] LoRA (Low-Rank Adaptation of Large Language Models, 2021)
Linked paper: LoRA: Low-Rank Adaptation of Large Language Models (arxiv.org) — "An important paradigm of natural language processing consists of large-scale pre-training on general domain data and adaptation to particular tasks or domains. As we pre-train larger models, full fine-tuning, which retrains all model parameters, becomes…"
0. Abstract — An important paradigm in NLP is to pre-train at large scale on general-domain data and then adapt the model to a particular task or domain… 2024. 8. 30.
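Not from the post preview itself, but since the title names the technique, here is a minimal sketch of the low-rank adaptation idea, assuming a PyTorch-style linear layer (the class and parameter names are illustrative, not from the post): instead of fine-tuning the full weight W, LoRA freezes W and trains two small factors B and A so the effective weight is W + (alpha/r)·BA, which sharply reduces the number of trainable parameters.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Illustrative sketch: a frozen linear weight plus a trainable low-rank update."""

    def __init__(self, in_features: int, out_features: int, r: int = 8, alpha: float = 16.0):
        super().__init__()
        # Pre-trained weight stays frozen during adaptation.
        self.weight = nn.Parameter(torch.randn(out_features, in_features), requires_grad=False)
        # Low-rank factors: A gets small Gaussian noise, B starts at zero,
        # so the effective weight equals the frozen weight at initialization.
        self.A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Effective weight: W + (alpha / r) * B @ A
        return x @ (self.weight + self.scaling * self.B @ self.A).T
```

With r much smaller than the layer dimensions, only r·(in_features + out_features) parameters are trained instead of in_features·out_features.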
[ Paper Review ] Flamingo (a Visual Language Model for Few-Shot Learning, 2022)
Linked paper: Flamingo: a Visual Language Model for Few-Shot Learning (arxiv.org) — "Building models that can be rapidly adapted to novel tasks using only a handful of annotated examples is an open challenge for multimodal machine learning research. We introduce Flamingo, a family of Visual Language Models (VLM) with this ability. We…"
0. Abstract — This work introduces Flamingo, a model that can be rapidly adapted to new tasks using only a small number of annotated examples… 2024. 8. 22.
[ Paper Review ] BLIP (Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation, 2022)
Linked paper: BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation (arxiv.org) — "Vision-Language Pre-training (VLP) has advanced the performance for many vision-language tasks. However, most existing pre-trained models only excel in either understanding-based tasks or generation-based tasks. Furthermore, performance improvement has…"
0. Abstract — In this paper, vision-language… 2024. 8. 16.
[ Paper Review ] CoCa (Contrastive Captioners are Image-Text Foundation Models, 2022)
Linked paper: CoCa: Contrastive Captioners are Image-Text Foundation Models (arxiv.org) — "Exploring large-scale pretrained foundation models is of significant interest in computer vision because these models can be quickly transferred to many downstream tasks. This paper presents Contrastive Captioner (CoCa), a minimalist design to pretrain…"
0. Abstract — This paper presents a minimalist design for pre-training an image-text encoder-decoder foundation model… 2024. 8. 12.