MMstar
Emerging
9 papers using it
First seen: 2024
Papers using MMstar (9)
- ChatVLA: Unified Multimodal Understanding and Robot Control with Vision-Language-Action Model
- ACPO: Counteracting Likelihood Displacement in Vision-Language Alignment with Asymmetric Constraints
- Difference Feedback: Generating Multimodal Process-Level Supervision for VLM Reinforcement Learning
- Annotation-Free Visual Reasoning for High-Resolution Large Multimodal Models via Reinforcement Learning
- Multimodal Chain of Continuous Thought for Latent-Space Reasoning in Vision-Language Models
- Qianfan-vl: Domain-enhanced Universal Vision-language Models
- VISTA: Enhancing Vision-Text Alignment in MLLMs via Cross-Modal Mutual Information Maximization
- Are We on the Right Way for Evaluating Large Vision-Language Models?
- ViCToR: Improving Visual Comprehension via Token Reconstruction for Pretraining LMMs