Improve model card: Add metadata, structured paper link, project page, and code links

#1
by nielsr - opened

This PR significantly improves the model card for AdaptVision by enriching its content and metadata.

Key changes include:

  • Adding pipeline_tag: image-text-to-text to the metadata for better discoverability and categorization on the Hub, reflecting the model's Vision-Language capabilities.
  • Adding library_name: transformers to the metadata, since both config.json ("architectures": ["Qwen2_5_VLForConditionalGeneration"]) and the GitHub README (pip install transformers==4.51.0) indicate compatibility with the Transformers library. This enables the automated "how to use" widget on the model page (see the metadata sketch after this list).
  • Enhancing the existing arXiv link by formatting it as [AdaptVision: Efficient Vision-Language Models via Adaptive Visual Acquisition](https://arxiv.org/abs/2512.03794).
  • Adding direct links to the official project page and the GitHub repository for further details and code access.
  • Including a concise description of the AdaptVision model, derived from the paper's abstract.
  • Adding the BibTeX citation for proper academic referencing.

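For reference, a minimal sketch of how the two fields described above would sit in the model card's YAML front matter (only pipeline_tag and library_name come from this PR; nothing else is added here):

```yaml
---
# Added in this PR: lets the Hub categorize the model as a
# vision-language (image-text-to-text) model and enable the
# Transformers "how to use" widget on the model page.
pipeline_tag: image-text-to-text
library_name: transformers
---
```

With library_name: transformers set, the Hub can surface its standard Transformers loading snippet for the Qwen2_5_VLForConditionalGeneration architecture declared in config.json.
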
These updates aim to provide users with a more comprehensive and well-structured overview of the AdaptVision model.

