---
language:
- en
pipeline_tag: video-classification
tags:
- birds
- swifts
- MViTv2
- Ballinrobe
license: other
license_name: bcs-lcs
license_link: LICENSE
base_model:
- timm/mvitv2_small.fb_in1k
library_name: transformers
datasets:
- odinglynn/swift-150
---

# SwiftViT-150

MViT-v2 fine-tuned on 150 videos for common swift feeding behavior classification.

## Model

Fine-tuned `mvit_v2_s` (Kinetics-400 pretrained) on single-camera nestbox footage. Achieves ~87% validation accuracy (in controlled settings) and demonstrates surprising cross-camera generalization despite training on a single viewpoint and on a minuscule dataset (150 samples).

## Usage

```python
import torch
import torchvision

# Build the architecture without pretrained weights; the checkpoint supplies them.
model = torchvision.models.video.mvit_v2_s(weights=None)
# Replace the default classification head with the custom 768→512→3 head.
model.head = torch.nn.Sequential(
    torch.nn.Dropout(0.5),
    torch.nn.Linear(768, 512),
    torch.nn.GELU(),
    torch.nn.Dropout(0.3),
    torch.nn.Linear(512, 3),
)

checkpoint = torch.load("swiftvit-150.pth", map_location="cpu")
model.load_state_dict(checkpoint["model_state_dict"])
model.eval()

# Inference
with torch.no_grad():
    video = load_video()  # Placeholder for your own loader; shape: [C, T, H, W]
    output = model(video.unsqueeze(0))  # Add batch dimension: [1, C, T, H, W]
    prediction = torch.argmax(output, dim=1)
    # 0: feeding, 1: possible_feeding, 2: not_feeding
```

## Architecture

- Base: MViT-v2 Small (~34.5M params in the torchvision implementation)
- Head: Custom 768→512→3 with dropout
- Input: 16 frames @ 224x224
- Classes: 3 (feeding, possible_feeding, not_feeding)

## Training

- 120 train / 30 val samples
- Batch size: 4
- Optimizer: AdamW (lr=1e-4, wd=0.05)
- Scheduler: CosineAnnealingWarmRestarts
- Mixed precision training on H100
- Early stopping: 40 epoch patience

## Performance

- Train accuracy: 100%
- Val accuracy: 87%
- Unexpected cross-camera generalization observed

## Dataset

Trained on [swift-150](https://huggingface.co/datasets/odinglynn/swift-150): 150 videos from GABLE nestbox camera (Ireland, 2020-2025).

## Context

Part of climate research correlating swift feeding patterns with weather data at terabyte scale.
Ballinrobe Community School entry for REDACTED.

## Citation

If you reference this work, cite:

```bibtex
@misc{swift150bcs,
  title={Swift-150: A Dataset for Common Swift Feeding Behavior Analysis},
  author={Odin Glynn-Martin and Culan O'Meara and Anas Rashid and Shayden D'Souza and Pádraig Foley and Mark Lally},
  year={2025},
  institution={Ballinrobe Community School},
  url={https://ballinrobecommunityschool.ie},
  note={REDACTED - Entry 2025}
}
```

## License

Proprietary. See LICENSE for restrictions.
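
The `load_video()` call in the Usage section is left to the reader. As one possible starting point, the sketch below picks 16 evenly spaced frame indices from a clip of arbitrary length, matching the 16-frame input listed under Architecture. The function name `sample_frame_indices` and the evenly spaced strategy are assumptions for illustration; this card does not document how frames were sampled during training.

```python
from typing import List


def sample_frame_indices(total_frames: int, num_samples: int = 16) -> List[int]:
    """Return num_samples frame indices spread evenly across a clip.

    Hypothetical helper: the sampling used for training is not documented.
    Clips shorter than num_samples repeat frames to reach the required length.
    """
    if total_frames <= 0:
        raise ValueError("total_frames must be positive")
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    # Split the clip into num_samples equal-width segments and take each centre.
    step = total_frames / num_samples
    return [min(total_frames - 1, int(step * (i + 0.5))) for i in range(num_samples)]
```

The returned indices can then be used to select frames from whatever decoder you prefer before resizing to 224x224 and stacking into a `[C, T, H, W]` tensor.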
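
The Training section lists early stopping with a patience of 40 epochs. A minimal sketch of that rule follows, assuming validation accuracy is the monitored metric (the card does not say which metric was used) and using a hypothetical `EarlyStopping` class rather than the authors' actual training code.

```python
class EarlyStopping:
    """Signal a stop once the metric has not improved for `patience` epochs.

    Hypothetical helper; assumes a higher metric is better (e.g. val accuracy).
    """

    def __init__(self, patience: int = 40) -> None:
        self.patience = patience
        self.best = float("-inf")
        self.epochs_without_improvement = 0

    def step(self, metric: float) -> bool:
        """Record one epoch's metric and return True if training should stop."""
        if metric > self.best:
            # New best value: remember it and reset the counter.
            self.best = metric
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience
```

In a training loop, this would be called once per epoch, for example `if stopper.step(val_accuracy): break`.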