## Highlights

- Base model: google/vit-base-patch16-224
- Parameters: ~97.2M
- Training samples: 2,000,000 curated plant occurrences
- Species coverage: ~14,000 unique species
- Source data: GBIF (research-grade iNaturalist images)
- Training method: End-to-end supervised fine-tuning
- Use-case: Fast species-level classification from a single photo
## Example Usage

```python
from transformers import AutoImageProcessor, AutoModelForImageClassification
from PIL import Image
import requests
import torch

model_id = "juppy44/plant-identification-2m-vit-b"
processor = AutoImageProcessor.from_pretrained(model_id)
model = AutoModelForImageClassification.from_pretrained(model_id)

# Load an image from a URL (replace with your own photo).
url = "https://example.com/plant.jpg"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

# Preprocess and run inference without tracking gradients.
inputs = processor(images=image, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Report the five most likely species.
pred = logits.softmax(dim=-1)[0]
topk = torch.topk(pred, k=5)
for prob, idx in zip(topk.values, topk.indices):
    label = model.config.id2label[idx.item()]
    print(f"{label}: {prob.item():.4f}")
```
## Intended Applications
- Ecological surveys
- Nursery and horticulture tools
- Restoration and revegetation workflows
- Field research and biodiversity monitoring
- Citizen science and educational platforms
- Image-based species tagging pipelines (see the batch-tagging sketch below)
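As a sketch of such a tagging pipeline, the loop below batch-classifies a folder of photos; the `photos/` directory and batch size of 16 are placeholder choices, not part of this model card:

```python
from pathlib import Path

import torch
from PIL import Image
from transformers import AutoImageProcessor, AutoModelForImageClassification

model_id = "juppy44/plant-identification-2m-vit-b"
processor = AutoImageProcessor.from_pretrained(model_id)
model = AutoModelForImageClassification.from_pretrained(model_id).eval()

# Tag every JPEG in a directory, 16 images per forward pass.
paths = sorted(Path("photos").glob("*.jpg"))
for start in range(0, len(paths), 16):
    batch = paths[start : start + 16]
    images = [Image.open(p).convert("RGB") for p in batch]
    inputs = processor(images=images, return_tensors="pt")
    with torch.no_grad():
        probs = model(**inputs).logits.softmax(dim=-1)
    top = probs.argmax(dim=-1)
    for path, idx, prob in zip(batch, top, probs.max(dim=-1).values):
        print(f"{path.name}\t{model.config.id2label[idx.item()]}\t{prob.item():.3f}")
```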
Data & Training Details
Dataset Construction
- Sourced from GBIF occurrences with valid species and image metadata.
- Cleaned and deduplicated.
- Species filtered to those with ≥ 20 images.
- Maximum cap of 1,000 images per species to reduce class imbalance (see the sketch after this list).
- Final training dataset: 2,000,000 images across ~14k species.
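A minimal sketch of that balancing step, assuming a cleaned list of `(species, image_url)` pairs; the function and variable names are illustrative, not taken from the actual pipeline:

```python
import random
from collections import defaultdict

MIN_IMAGES, MAX_IMAGES = 20, 1000  # thresholds quoted above

def balance(occurrences, seed=0):
    # Group image URLs by species name.
    by_species = defaultdict(list)
    for species, url in occurrences:
        by_species[species].append(url)

    rng = random.Random(seed)
    kept = []
    for species, urls in by_species.items():
        if len(urls) < MIN_IMAGES:      # drop under-represented species
            continue
        if len(urls) > MAX_IMAGES:      # cap over-represented species
            urls = rng.sample(urls, MAX_IMAGES)
        kept.extend((species, u) for u in urls)
    return kept
```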
### Training
- ViT-Base fine-tuned for 1 epoch over 2M samples.
- AdamW optimizer, standard ViT augmentations.
- Mixed-precision training on GPU (a minimal setup sketch follows).
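A hypothetical reconstruction of that setup with the Hugging Face `Trainer` is sketched below; the label maps, batch size, and learning rate are placeholders, not the exact values used to produce this checkpoint:

```python
from transformers import (
    AutoModelForImageClassification,
    Trainer,
    TrainingArguments,
)

# Placeholder label maps; the real run covered ~14k GBIF species.
id2label = {0: "species_a", 1: "species_b"}
label2id = {v: k for k, v in id2label.items()}

model = AutoModelForImageClassification.from_pretrained(
    "google/vit-base-patch16-224",
    num_labels=len(id2label),
    id2label=id2label,
    label2id=label2id,
    ignore_mismatched_sizes=True,  # swap out the 1k-class ImageNet head
)

args = TrainingArguments(
    output_dir="vit-plants",
    num_train_epochs=1,              # single pass over the training set
    per_device_train_batch_size=64,
    learning_rate=5e-5,
    fp16=True,                       # mixed precision on GPU
    remove_unused_columns=False,
)

# Supply a dataset yielding {"pixel_values", "labels"} and call
# trainer.train(); AdamW is the Trainer's default optimizer.
trainer = Trainer(model=model, args=args, train_dataset=None)
```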
## Limitations

- Some species are visually indistinguishable without additional context (location, reproductive structures, etc.).
- Performance varies for rare, morphologically similar, or poorly photographed species.
- No location metadata is incorporated yet; predictions are purely image-based.
## Labels
Species names follow the canonical GBIF taxonomy (species_name).
Each class corresponds directly to a species.
You can inspect all labels via:

```python
from transformers import AutoConfig

# id2label maps class indices to canonical GBIF species names.
cfg = AutoConfig.from_pretrained("juppy44/plant-identification-2m-vit-b")
labels = cfg.id2label
```
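Continuing from the snippet above, the reverse mapping is available as well; the species name here is only a placeholder and may not be in the label set:

```python
# label2id maps canonical species names back to class indices.
idx = cfg.label2id.get("Acacia dealbata")
print(idx)  # class index, or None if the species is not a label
```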
## Performance

No formal metrics are published yet because GBIF/iNaturalist provides no clean held-out evaluation split. In informal testing the model performs strongly, and it is well suited to downstream adaptation or LoRA-based fine-tuning.
Formal benchmarks will be added when a standardized evaluation subset is released.
Fine-Tuning & Adapters
You can further specialize the model using LoRA adapters for:
- Regional subsets
- Functional groups
- Threatened species
- Agricultural crops
- Disease classification
The base model is trained broadly enough to support domain-specific adapter tuning with minimal compute; a minimal PEFT-based sketch follows.
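A minimal sketch with the PEFT library, assuming LoRA on the ViT attention projections; the rank, alpha, and target modules are illustrative defaults, not published settings for this model:

```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForImageClassification

model = AutoModelForImageClassification.from_pretrained(
    "juppy44/plant-identification-2m-vit-b"
)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["query", "value"],  # ViT attention projection layers
    lora_dropout=0.1,
    modules_to_save=["classifier"],     # keep the classification head trainable
)

# Wrap the base model so only the LoRA weights (and head) receive gradients.
peft_model = get_peft_model(model, lora_config)
peft_model.print_trainable_parameters()  # only a small fraction trains
```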
## License
Model weights follow the same license as the underlying ViT-Base model. Users are responsible for ensuring compliance with GBIF/iNaturalist usage terms for any downstream dataset creation.