🧬 vaccineStance-flan-t5-large

A fine-tuned FLAN-T5 Large model for stance classification of tweets related to COVID-19 vaccines. It outputs one of three stance categories:

  • in-favor
  • against
  • neutral-or-unclear

📌 Model Description

This model builds on FLAN-T5-Large using two-stage fine-tuning:

  1. Sentiment pre-finetuning on cardiffnlp/tweet_eval to teach emotional polarity.
  2. Stance-specific finetuning on a curated COVID-19 stance dataset (annotated .csv), augmented for balance and stratified across splits.

Instruction tuning and prompt-based generation were retained from the original FLAN-T5 formulation.
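The card does not include the split code, but the balanced, stratified 80:10:10 split mentioned in stage 2 can be sketched with the standard library alone (function name and toy data below are illustrative, not from the released training code):

```python
import random
from collections import defaultdict

def stratified_split(rows, label_of, ratios=(0.8, 0.1, 0.1), seed=42):
    """Split rows into train/test/val while preserving per-label proportions."""
    by_label = defaultdict(list)
    for row in rows:
        by_label[label_of(row)].append(row)
    rng = random.Random(seed)
    train, test, val = [], [], []
    for items in by_label.values():
        rng.shuffle(items)
        n_train = int(len(items) * ratios[0])
        n_test = int(len(items) * ratios[1])
        train.extend(items[:n_train])
        test.extend(items[n_train:n_train + n_test])
        val.extend(items[n_train + n_test:])  # remainder goes to validation
    return train, test, val

# Toy example: 100 tweets with an imbalanced label distribution.
rows = ([("tweet", "in-favor")] * 60
        + [("tweet", "against")] * 30
        + [("tweet", "neutral-or-unclear")] * 10)
train, test, val = stratified_split(rows, label_of=lambda r: r[1])
print(len(train), len(test), len(val))  # 80 10 10
```

Because each label is split independently, every split keeps roughly the same class mix as the full dataset.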

🧪 Evaluation Results

| Metric   | Score |
|----------|-------|
| Macro F1 | 0.93  |
| Micro F1 | 0.94  |

Evaluation was conducted on a 5,732-tweet dataset split 80:10:10 into train, test, and validation sets. The model generalized consistently, with balanced performance across all splits.
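For readers unfamiliar with the two reported metrics: macro F1 averages the per-class F1 scores (so minority classes count equally), while micro F1 pools true/false positives and negatives across classes. A from-scratch sketch on toy labels (the data below is invented for illustration):

```python
def f1_report(y_true, y_pred, labels):
    """Return (macro F1, micro F1) for a multiclass single-label task."""
    per_class = {}
    tp_total = fp_total = fn_total = 0
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        # F1 = 2*TP / (2*TP + FP + FN); zero when the class is never predicted correctly
        per_class[label] = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        tp_total += tp
        fp_total += fp
        fn_total += fn
    macro = sum(per_class.values()) / len(labels)        # unweighted mean over classes
    micro = 2 * tp_total / (2 * tp_total + fp_total + fn_total)  # pooled counts
    return macro, micro

labels = ["in-favor", "against", "neutral-or-unclear"]
y_true = ["in-favor", "in-favor", "against", "against",
          "neutral-or-unclear", "neutral-or-unclear"]
y_pred = ["in-favor", "against", "against", "against",
          "neutral-or-unclear", "in-favor"]
macro, micro = f1_report(y_true, y_pred, labels)
print(round(macro, 3), round(micro, 3))  # 0.656 0.667
```

The gap between the two numbers shrinks as class performance evens out, which is why the card's close macro (0.93) and micro (0.94) scores suggest balanced per-class behavior.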

🧠 Intended Use

  • Research in public health NLP and LLM alignment
  • Automated stance detection in social media monitoring systems
  • Baseline for multi-agent LLM stance alignment studies

📥 How to Use

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

model = T5ForConditionalGeneration.from_pretrained("DopplerEffect/vaccineStance-flan-t5-large")
tokenizer = T5Tokenizer.from_pretrained("google/flan-t5-large")

prompt = '''
You are a sentiment analyst tasked with understanding public opinion about COVID-19 on Twitter. Your job is to classify the sentiment of each tweet as one of the following categories:

- in-favor: The tweet expresses positive support or agreement regarding COVID-19 policies, vaccines, or public health advice.
- against: The tweet expresses opposition, criticism, or distrust of COVID-19-related efforts.
- neutral-or-unclear: The tweet neither clearly supports nor opposes, or the sentiment is ambiguous.

Tweet: "Vaccines saved so many lives!"
Sentiment:
'''

# Map sentiment-style outputs (from the pre-finetuning stage) to stance labels.
output_map = {
    "positive": "in-favor",
    "negative": "against",
    "neutral": "neutral-or-unclear",
}

input_ids = tokenizer(prompt, return_tensors="pt").input_ids  # the tweet is embedded in the prompt
output = model.generate(input_ids)
decoded = tokenizer.decode(output[0], skip_special_tokens=True).strip()
prediction = output_map.get(decoded, decoded)  # keep the raw label if it is already a stance

print(prediction)  # Output: in-favor
```
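Since the decoded generation may be either a sentiment-style label (from pre-finetuning) or a stance label, a small normalization helper avoids silently producing `None`. This helper is not part of the model card; the fallback to `neutral-or-unclear` for unexpected output is our own conservative choice:

```python
SENTIMENT_TO_STANCE = {
    "positive": "in-favor",
    "negative": "against",
    "neutral": "neutral-or-unclear",
}
STANCES = {"in-favor", "against", "neutral-or-unclear"}

def normalize_stance(decoded: str) -> str:
    """Map a decoded generation to one of the three stance labels."""
    text = decoded.strip().lower()
    if text in SENTIMENT_TO_STANCE:
        return SENTIMENT_TO_STANCE[text]
    if text in STANCES:
        return text
    return "neutral-or-unclear"  # conservative fallback for unexpected generations

print(normalize_stance("Positive"))  # in-favor
print(normalize_stance("in-favor"))  # in-favor
print(normalize_stance("??"))        # neutral-or-unclear
```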
Model size: 0.8B parameters (F32, Safetensors)


Evaluation results (self-reported)

  • Macro F1 on COVID-19 Vaccine Tweet Dataset (Custom): 0.930
  • Micro F1 on COVID-19 Vaccine Tweet Dataset (Custom): 0.940