Highlights

  • Base model: google/vit-base-patch16-224
  • Parameters: ~97.2M
  • Training samples: 2,000,000 curated plant occurrences
  • Species coverage: ~14,000 unique species
  • Source data: GBIF (research-grade iNaturalist images)
  • Training method: End-to-end supervised fine-tuning
  • Use-case: Fast species-level classification from a single photo

Example Usage

from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image
import requests
import torch

model_id = "juppy44/plant-identification-2m-vit-b"

processor = AutoImageProcessor.from_pretrained(model_id)
model = AutoModelForImageClassification.from_pretrained(model_id)

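url = "https://example.com/plant.jpg"  # placeholder; replace with a real image URL
response = requests.get(url, stream=True, timeout=30)
response.raise_for_status()  # fail early on HTTP errors
image = Image.open(response.raw).convert("RGB")  # the processor expects 3-channel input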

inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

pred = logits.softmax(dim=-1)[0]
topk = torch.topk(pred, k=5)

for prob, idx in zip(topk.values, topk.indices):
    label = model.config.id2label[idx.item()]
    print(f"{label}: {prob.item():.4f}")

Intended Applications

  • Ecological surveys
  • Nursery and horticulture tools
  • Restoration and revegetation workflows
  • Field research and biodiversity monitoring
  • Citizen science and educational platforms
  • Image-based species tagging pipelines

Data & Training Details

Dataset Construction

  • Sourced from GBIF occurrences with valid species and image metadata.
  • Cleaned and deduplicated.
  • Species filtered to those with ≥ 20 images.
  • Maximum cap of 1,000 images per species to reduce class imbalance.
  • Final training dataset: 2,000,000 images across ~14k species.
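
The filtering and capping steps above can be sketched roughly as follows. This is a minimal illustration, not the actual pipeline: the `(species_name, image_url)` record shape, deduplication by image URL, and keeping the first images up to the cap (rather than sampling) are all assumptions made for the example.

```python
from collections import defaultdict

MIN_IMAGES = 20    # species with fewer images are dropped
MAX_IMAGES = 1000  # per-species cap to reduce class imbalance


def filter_and_cap(records, min_images=MIN_IMAGES, max_images=MAX_IMAGES):
    """Deduplicate records, drop rare species, and cap common ones.

    `records` is an iterable of (species_name, image_url) pairs.
    The record shape and the URL-based deduplication key are
    assumptions for this sketch.
    """
    by_species = defaultdict(list)
    seen_urls = set()
    for species, url in records:
        if url in seen_urls:  # skip exact duplicate images
            continue
        seen_urls.add(url)
        by_species[species].append(url)

    kept = {}
    for species, urls in by_species.items():
        if len(urls) < min_images:  # too few images to learn from
            continue
        kept[species] = urls[:max_images]  # keep at most `max_images`
    return kept
```

A real pipeline would likely sample randomly when applying the cap so that the retained images are not biased toward the earliest records.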

Training

  • ViT-Base fine-tuned for 1 epoch over 2M samples.
  • AdamW optimizer, standard ViT augmentations.
  • Mixed-precision training on GPU.

Limitations

  • Some species are visually indistinguishable without context (location, reproductive structures, etc.).
  • Performance varies for rare, morphologically similar, or poorly photographed species.
  • No location metadata is incorporated yet; predictions are purely image-based.
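
When species within a genus are hard to tell apart, one possible workaround is to report genus-level scores by summing the species probabilities. The sketch below assumes each label is a binomial such as "Quercus robur", so that the genus is the first word; `genus_scores` is a hypothetical helper, not part of the model or of transformers.

```python
from collections import defaultdict


def genus_scores(species_probs):
    """Sum species-level probabilities into genus-level scores.

    `species_probs` maps a species label to its probability.
    Assumes binomial labels, so the genus is the first
    whitespace-separated token of each label.
    """
    totals = defaultdict(float)
    for label, prob in species_probs.items():
        parts = label.split()
        if not parts:  # ignore empty labels
            continue
        totals[parts[0]] += prob
    # Highest-scoring genus first
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

The input can be built from the softmax output in the usage example, for instance by mapping each class index to its `id2label` entry.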

Labels

Species names follow the canonical GBIF taxonomy (species_name). Each class corresponds directly to a species.

You can inspect all labels via:

from transformers import AutoConfig
cfg = AutoConfig.from_pretrained("juppy44/plant-identification-2m-vit-b")
labels = cfg.id2label
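
With roughly 14,000 classes, it is often easier to search the label map than to read it. A small sketch, where `find_labels` is a hypothetical helper that works on any `id2label`-style dictionary:

```python
def find_labels(id2label, query):
    """Return (class_id, label) pairs whose label contains `query`.

    Matching is case-insensitive. Keys are cast to int so the helper
    also works on a raw config.json mapping, where keys are strings.
    """
    needle = query.lower()
    return sorted(
        (int(class_id), name)
        for class_id, name in id2label.items()
        if needle in name.lower()
    )
```

For example, `find_labels(labels, "eucalyptus")` would list every class whose name contains that genus, if any are present.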

Performance

No fixed metric is published yet because GBIF/iNat offers no clean evaluation split. In the author's practical testing the model performs strongly, and it is suitable for downstream adaptation or LoRA-based fine-tuning.

Formal benchmarks will be added when a standardized evaluation subset is released.
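
Until then, users can estimate accuracy on their own held-out images. The sketch below computes top-k accuracy from ranked predictions; `top_k_accuracy` is a hypothetical helper, and the ranked label lists are assumed to come from something like `torch.topk` in the usage example.

```python
def top_k_accuracy(ranked_predictions, true_labels, k=5):
    """Fraction of samples whose true label is among the top-k predictions.

    `ranked_predictions` holds one list of labels per sample,
    ordered from most to least likely.
    """
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    if not true_labels:
        return 0.0
    hits = sum(
        truth in preds[:k]
        for preds, truth in zip(ranked_predictions, true_labels)
    )
    return hits / len(true_labels)
```

Any such estimate only reflects the images it was computed on, so it should not be read as a benchmark for the model as a whole.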


Fine-Tuning & Adapters

You can further specialize the model using LoRA adapters for:

  • Regional subsets
  • Functional groups
  • Threatened species
  • Agricultural crops
  • Disease classification

The base model is trained broadly enough to support domain-specific adapter tuning with minimal compute.
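
As a rough illustration of why adapter tuning is cheap, the sketch below counts the parameters LoRA would add to the attention projections of a ViT-Base backbone (12 layers, hidden size 768) and shows one way to wrap the model with the PEFT library. The rank, alpha, dropout, and choice of target modules are illustrative assumptions, not settings recommended or tested by the author.

```python
def lora_param_count(hidden_size=768, num_layers=12, rank=16, modules_per_layer=2):
    """Parameters added by LoRA on square (hidden_size x hidden_size) projections.

    Each adapted matrix gains two low-rank factors:
    (hidden_size x rank) and (rank x hidden_size).
    """
    return num_layers * modules_per_layer * rank * (hidden_size + hidden_size)


def build_lora_model(model_id="juppy44/plant-identification-2m-vit-b", rank=16):
    """Wrap the classifier with LoRA adapters (requires `peft` and `transformers`).

    Hyperparameters here are assumptions chosen for illustration.
    """
    from peft import LoraConfig, get_peft_model
    from transformers import AutoModelForImageClassification

    model = AutoModelForImageClassification.from_pretrained(model_id)
    config = LoraConfig(
        r=rank,
        lora_alpha=16,
        target_modules=["query", "value"],  # ViT attention projections
        lora_dropout=0.1,
        bias="none",
        modules_to_save=["classifier"],  # keep the head fully trainable
    )
    return get_peft_model(model, config)


# Rank-16 adapters on query and value in all 12 layers
print(lora_param_count())  # 589824
```

Under these assumptions the adapters add about 0.6M parameters, compared with ~97.2M in the full model. Specializing to a different label set would additionally require replacing the classification head.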


License

Model weights follow the same license as the underlying ViT-Base model. Users are responsible for ensuring compliance with GBIF/iNaturalist usage terms for any downstream dataset creation.

