Imran Lab’s Paper Reading Group
Fall 2024 Discussions
Dec 11, 2024: [N. Munia], VCoder: Versatile Vision Encoders for Multimodal Large Language Models
Dec 4, 2024: [K. Rifa], ZIGMA: A DiT-style Zigzag Mamba Diffusion Model
Nov 20, 2024: [T. Ward], Diffuse, Attend, and Segment: Unsupervised Zero-Shot Segmentation using Stable Diffusion
Nov 13, 2024: [N. Munia], What Matters in Transformers? Not All Attention is Needed
Nov 6, 2024: [M. Shah], Context-Aware Layout to Image Generation With Enhanced Object Appearance
Oct 30, 2024: [M. Massey], Charting New Territories: Exploring the Geographic and Geospatial Capabilities of Multimodal LLMs
Oct 23, 2024: [K. Rifa], Dimba: Transformer-Mamba Diffusion Models
Oct 16, 2024: [T. Ward], From SAM to CAMs: Exploring Segment Anything Model for Weakly Supervised Semantic Segmentation
Oct 2, 2024: [N. Munia], Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
Sep 18, 2024: [M. Shah], ObjBlur: A Curriculum Learning Approach With Progressive Object-Level Blurring for Improved Layout-to-Image Generation
Sep 11, 2024: [M. Massey], Bridging Remote Sensors with Multisensor Geospatial Foundation Models
Sep 4, 2024: [K. Rifa], Blind CT Image Quality Assessment Using DDPM-derived Content and Transformer-based Evaluator
Aug 28, 2024: [T. Ward], Medical SAM 2: Segment medical images as video via Segment Anything Model 2
Aug 20, 2024: [N. Munia], LLM4GEN: Leveraging Semantic Representation of LLMs for Text-to-Image Generation