# Model Card: Arxiv AI User Preference Classifier

## Application Link

Demo Space: Link to Arxiv AI App Demo Space (e.g., Hugging Face Space)

## Model Details
| Item | Description |
|---|---|
| Model ID | arxiv-preference-classifier-zzd-jch (Placeholder) |
| Model Type | Binary Text Classifier / Preference Model (e.g., fine-tuned BERT/RoBERTa) |
| Model Creators | Zachary Zdobinski and Je Choi |
| Intended Use | This classifier predicts user interest (Like/Dislike) in an Arxiv paper, which then activates the core recommendation engine. It is designed to rank or classify Arxiv paper abstracts/metadata based on user-specific preference data for a personalized AI application. |
| Base Model | [Information Needed: e.g., bert-base-uncased, RoBERTa-large, custom architecture] |
## Intended Uses

This classifier is the trigger mechanism for the personalized recommendation engine within the Arxiv AI application.

### Core Recommendation Engine Activation

Once the user has liked at least one paper, clicking the "Save Ratings and Get More Papers" button activates the core recommendation engine. On the Automated Bert Recommendation tab, the papers the classifier predicted as "Like" are analyzed via their vector embeddings to model the user's emerging interests. The engine then generates a new, refined list of ten papers based on a 70/30 split:
- 70% "Exploitation" recommendations: Chosen for their high cosine similarity to the user's selections and representing topics the system is confident the user will like.
- 30% "Exploration" recommendations: Purposefully selected from related fields to introduce novelty and prevent the user's search from becoming too narrow.
This new list is enhanced with transparent justifications; each paper card now includes a "Reason for Recommendation" tag, with messages like "Recommended because you liked 'Attention Is All You Need'" for exploitation picks, or "Exploratory pick from the related field of Computational Linguistics" for exploration ones. This entire process is a continuous loop, allowing the user to progressively refine their recommendations with each interaction.
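The 70/30 split above can be sketched roughly as follows. This is a minimal illustration, not the application's actual code: the mean-embedding user profile, the function names, and the random sampling of exploration slots are all assumptions (how the app picks candidates from "related fields" is not documented).

```python
import numpy as np

def recommend(liked_vecs, candidate_vecs, candidate_ids, k=10, exploit_ratio=0.7, seed=0):
    """Return a 70/30 exploitation/exploration split over candidate papers.

    liked_vecs:     (n_liked, d) embeddings of papers the user liked
    candidate_vecs: (n_cand, d) embeddings of unseen candidate papers
    """
    # User profile: mean embedding of the liked papers (an assumption;
    # the app may aggregate liked papers differently).
    profile = liked_vecs.mean(axis=0)

    # Cosine similarity of every candidate to the user profile.
    sims = candidate_vecs @ profile / (
        np.linalg.norm(candidate_vecs, axis=1) * np.linalg.norm(profile) + 1e-12
    )
    order = np.argsort(-sims)  # most similar first

    # Exploitation: the most similar candidates (70% of the list).
    n_exploit = round(k * exploit_ratio)
    exploit = [candidate_ids[i] for i in order[:n_exploit]]

    # Exploration: fill the remaining 30% from less similar candidates
    # to introduce novelty; random sampling stands in for the app's
    # "related field" selection.
    rng = np.random.default_rng(seed)
    explore_idx = rng.choice(order[n_exploit:], size=k - n_exploit, replace=False)
    explore = [candidate_ids[i] for i in explore_idx]
    return exploit, explore
```

With `k=10` this yields seven exploitation picks and three exploration picks per refresh.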
### Out-of-Scope Use

The model should not be used for general-purpose text classification outside the domain of Arxiv papers, for non-English text, or for making critical, high-stakes decisions.
## Training Data
The model was trained on a proprietary dataset of user preference data collected by Zachary Zdobinski and Je Choi for the Arxiv AI application.
| Statistic | Value |
|---|---|
| Total Examples | 603 |
| Training Examples | 480 |
| Validation Examples | 80 |
| Test Examples | 43 |
| Dataset Size | 916.86 KB |
### Data Fields

| Feature Name | Data Type | Description |
|---|---|---|
| `label` | int64 | The target variable: 1 for 'interested' (liked), 0 for 'not interested' (disliked/ignored). |
| `combined_text` | string | The input feature used for prediction: a concatenation of the Arxiv paper title, abstract, and user id. |
| `__index_level_0__` | int64 | Original index from the source data (not needed for training). |
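Building the `combined_text` input can be as simple as joining the three fields into one string. A sketch follows; the `" [SEP] "` delimiter and the field order are assumptions, since the exact training-time concatenation format is not documented.

```python
def build_combined_text(title, abstract, user_id, sep=" [SEP] "):
    # Join title, abstract, and user id into the single `combined_text`
    # string the classifier consumes. The separator is an assumption.
    return sep.join([title.strip(), abstract.strip(), str(user_id)])
```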
## Evaluation
Evaluation was performed on the test set of 43 examples at epoch 9.0.
| Metric | Result |
|---|---|
| Loss | 0.3741 |
| Accuracy | 0.7500 |
| F1 Score | 0.7423 |
| Precision | 0.7660 |
| Recall | 0.7200 |
| Runtime (seconds) | 6.025 |
| Epoch | 9.0 |
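As a quick consistency check, the reported F1 is the harmonic mean of the reported precision and recall:

```python
def f1(precision, recall):
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)
```

Plugging in the table's precision (0.7660) and recall (0.7200) recovers the reported F1 of 0.7423.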
### Area Under the Curve (AUC)
| Metric | Value | Interpretation |
|---|---|---|
| AUC Score | 0.84 | Good discriminatory skill: well above the 0.5 of a random guess and approaching the perfect 1.0. Equivalently, the model ranks a randomly chosen positive example above a randomly chosen negative example 84% of the time. |
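The rank interpretation in the table can be verified with a tiny pairwise implementation of ROC AUC (a sketch for illustration, not the evaluation code used here):

```python
def auc_score(labels, scores):
    """ROC AUC via its rank interpretation: the probability that a randomly
    chosen positive example is scored above a randomly chosen negative one
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```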
## Environmental Impact
| Item | Value |
|---|---|
| Hardware Type | [Information Needed] |
| Hours Used | [Information Needed] |
| Carbon Emitted | [Information Needed] |
## Bias, Risks, and Limitations
- Bias from Data: The model is inherently biased toward the specific preference patterns of the two data creators, Zachary Zdobinski and Je Choi. It will not generalize well to other users without further fine-tuning or a multi-user context.
- Domain Limitation: Performance is expected to degrade significantly on documents outside the topical distribution of the Arxiv papers used in the original data collection.
- Generalization: The training dataset is relatively small (603 total examples), which may lead to overfitting and poor generalization to truly novel Arxiv papers.
## Citation

```bibtex
@misc{arxiv_preference_model_zzd_jch,
  author       = {Zdobinski, Zachary and Choi, Je},
  title        = {Arxiv AI Application User Preference Data and Classification Model},
  howpublished = {Internal Project Documentation},
  year         = {[Information Needed: Year of Release/Creation]}
}
```