---
tags:
- text-classification
- gibberish
- detector
- spam
- distilbert
- nlp
- text-filter
- akto
language: en
widget:
- text: I love Machine Learning!
license: mit
library_name: transformers
base_model: distilbert-base-uncased
model-index:
- name: gibberish-detector
  results:
  - task:
      type: text-classification
      name: Gibberish Detection
    metrics:
    - type: accuracy
      value: 0.9736
      name: Accuracy
    - type: f1
      value: 0.9736
      name: F1 Score
---

# Gibberish Detector - Text Classification Model

**High-performance gibberish detection model** for identifying nonsensical text, spam, and incoherent input. Built on DistilBERT, it achieves **97.36% accuracy** on multi-class text classification.

This model is designed for production use with Akto's security frameworks and LLM protection systems.

## Quick Start

```python
from transformers import pipeline

# Initialize the gibberish detector
detector = pipeline("text-classification", model="TangoBeeAkto/gibberish-detector")

# Detect gibberish in text
result = detector("I love Machine Learning!")
print(result)
# Output: [{'label': 'clean', 'score': 0.99}]
```
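
By default the pipeline returns only the top label. If you want scores for all four classes, recent versions of `transformers` accept a `top_k` argument on the text-classification pipeline (older releases used `return_all_scores=True`); a quick sketch:

```python
# Assumes `detector` from the Quick Start above; top_k=None requests every class score.
all_scores = detector("I love Machine Learning!", top_k=None)
print(all_scores)
# e.g. a list of {'label': ..., 'score': ...} dicts, one per class
```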

## Key Features

- **97.36% Accuracy**: High-performance gibberish detection
- **Fast Inference**: Optimized DistilBERT architecture
- **Multi-Class Detection**: Noise, Word Salad, Mild Gibberish, and Clean text
- **Easy Integration**: Standard transformers pipeline
- **Production Ready**: Tested and validated for security applications
- **Efficient**: Low computational footprint

## Problem Description

The ability to process and understand user input is crucial for applications such as chatbots and downstream NLP tasks. A common challenge in such systems is gibberish or nonsensical input. This project develops a gibberish detector for English text.

The primary goal is to classify user input as either **gibberish** or **non-gibberish**, enabling more accurate and meaningful interactions with the system; the sketch below shows one way to collapse the model's four labels into that binary decision.
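
As a minimal sketch of that binary framing (assuming the pipeline's label strings match those used elsewhere in this card, e.g. `clean` for non-gibberish; check `model.config.id2label` for the exact spellings):

```python
from transformers import pipeline

detector = pipeline("text-classification", model="TangoBeeAkto/gibberish-detector")

def is_gibberish(text: str) -> bool:
    """Collapse the four-class output into a binary gibberish / non-gibberish decision."""
    result = detector(text)[0]
    # Anything not classified as clean text is treated as gibberish here.
    return result["label"] != "clean"

print(is_gibberish("I love this website"))          # expected: False
print(is_gibberish("dfdfer fgerfow2e0d qsqskdsd"))  # expected: True
```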

## Label Categories

The model classifies text into 4 categories:

1. **Clean (0)**: Proper, meaningful sentences
   - Example: `I love this website`

2. **Mild Gibberish (1)**: Sentences with grammatical or syntactical errors
   - Example: `I study in a teacher`

3. **Noise (2)**: Random character sequences with no meaningful words
   - Example: `dfdfer fgerfow2e0d qsqskdsd`

4. **Word Salad (3)**: Valid words without coherent meaning
   - Example: `apple banana car house randomly`
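
The exact label strings returned by the pipeline come from the model's config rather than from this card, so it is worth printing the mapping once before hard-coding any of them; a small check, assuming the ids follow the ordering listed above:

```python
from transformers import AutoConfig

# Inspect the id-to-label mapping shipped with the model checkpoint.
config = AutoConfig.from_pretrained("TangoBeeAkto/gibberish-detector")
print(config.id2label)
# Expected to map 0-3 onto the clean / mild gibberish / noise / word salad categories above.
```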

## Use Cases

### Input Validation for Security Systems

```python
def validate_user_input(text):
    result = detector(text)[0]
    # Label strings here should match model.config.id2label for this checkpoint.
    if result['label'] in ['noise', 'word_salad']:
        return "Invalid input detected. Please provide meaningful text."
    return process_query(text)  # process_query is your downstream handler
```

### Content Moderation

```python
def moderate_content(post):
    classification = detector(post)[0]
    if classification['label'] != 'clean':
        return f"Content flagged: {classification['label']}"
    return "Content approved"
```

### LLM Prompt Filtering

```python
def filter_prompt(prompt):
    result = detector(prompt)[0]
    # Only flag high-confidence noise or word-salad predictions.
    if result['label'] in ['noise', 'word_salad'] and result['score'] > 0.8:
        return "Potentially malicious or gibberish prompt detected"
    return "Prompt is valid"
```

## Installation & Usage

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained("TangoBeeAkto/gibberish-detector")
tokenizer = AutoTokenizer.from_pretrained("TangoBeeAkto/gibberish-detector")

def detect_gibberish(text):
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
    with torch.no_grad():
        outputs = model(**inputs)

    probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_label_id = probabilities.argmax().item()

    return model.config.id2label[predicted_label_id]

# Example usage
print(detect_gibberish("Hello world!"))  # Output: clean
print(detect_gibberish("asdkfj asdf"))   # Output: noise
```

## Model Details

- **Architecture**: DistilBERT for Sequence Classification
- **Base Model**: distilbert-base-uncased
- **Max Length**: 64 tokens
- **Vocab Size**: 30,522
- **Parameters**: ~67M
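
Because the model uses a 64-token maximum length, it can help to pass that limit to the tokenizer explicitly so long inputs are truncated consistently at inference time; a small sketch using the standard `transformers` tokenizer call and the `tokenizer` loaded above:

```python
# Truncate to the model's 64-token budget; pad so batched inputs line up.
inputs = tokenizer(
    ["Hello world!", "a much longer piece of user input ..."],
    return_tensors="pt",
    truncation=True,
    max_length=64,
    padding=True,
)
```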

## Performance Metrics

- **Accuracy**: 97.36%
- **F1 Score**: 97.36%
- **Precision**: 97.38%
- **Recall**: 97.36%

## ONNX Support

This model supports ONNX optimization for faster inference in production environments. Use it with an optimized runtime such as ONNX Runtime for best performance.
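
One way to obtain an ONNX version is via Hugging Face Optimum, which can export the PyTorch checkpoint on the fly and run it through ONNX Runtime; a sketch, assuming `optimum[onnxruntime]` is installed and that this checkpoint exports cleanly:

```python
from optimum.onnxruntime import ORTModelForSequenceClassification
from transformers import AutoTokenizer, pipeline

model_id = "TangoBeeAkto/gibberish-detector"
# export=True converts the PyTorch weights to ONNX at load time.
ort_model = ORTModelForSequenceClassification.from_pretrained(model_id, export=True)
tokenizer = AutoTokenizer.from_pretrained(model_id)

onnx_detector = pipeline("text-classification", model=ort_model, tokenizer=tokenizer)
print(onnx_detector("I love Machine Learning!"))
```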

## Integration with Akto Security Framework

This model is optimized for use with Akto's LLM security and protection systems. It provides real-time gibberish detection for:

- Prompt injection detection
- Input validation
- Content filtering
- Security monitoring

## License

This model is licensed under the MIT License.

---

**Developed by Akto for enterprise security applications**