Model Card for colorize-unet-pytorch
Model Details
Model Description
colorize-unet-pytorch is a PyTorch U‑Net model for automatic image colorization in the LAB color space.
It takes the L* (grayscale/lightness) channel as input and predicts the a* and b* chrominance channels, which are then combined with the original L* channel and converted back to RGB to produce a final colorized image.
- Developed by: Ammar Ahmed, Ahmad Naeem, Khurram Imran
- Model type: U‑Net (encoder–decoder CNN with skip connections)
- Task: Image colorization (grayscale → color)
- Input: Grayscale image / L* channel (1×H×W)
- Output: Predicted a*, b* channels (2×H×W)
- Training dataset: MS COCO 2017 (the val2017 split is used as the training source in this project)
- License: MIT
- Framework: PyTorch
Model Sources
- Repository: https://github.com/AmmarAhm3d/colorize-unet-pytorch
- Demo (Hugging Face Space): https://huggingface.co/spaces/AmmarAhm3d/colorize-unet-pytorch
Uses
Direct Use
This model is intended for:
- Colorizing grayscale images for demos, coursework, and experimentation.
- Running inference through:
  - the included Gradio UI (app.py), or
  - a simple Python inference script.
Example use cases:
- restoring approximate color for old black-and-white photos (best-effort)
- educational demonstrations of LAB-space colorization with U‑Net
Downstream Use
Possible downstream uses include:
- Fine-tuning on domain-specific grayscale→color datasets (e.g., historical photos, medical/industrial imagery).
- Using the model as a baseline for:
- GAN-based colorization
- reference-based / user-guided colorization pipelines
- perceptual-loss training
Out-of-Scope Use
- Color accuracy guarantees: The model does not guarantee historically accurate or “ground-truth” colors. Colorization is inherently ambiguous.
- Safety-critical interpretation: Do not use colorized outputs for medical diagnosis, forensics, or any task where inferred color could cause harm.
- High-resolution production pipelines without adaptation: The default training/inference setup is centered around ~256×256 processing; higher resolutions may need tiling and careful postprocessing.
Bias, Risks, and Limitations
- Dataset bias: Trained using MS COCO imagery; outputs may reflect dataset distributions (typical objects, scenes, colors).
- Multi-modal uncertainty: Many objects can have multiple plausible colors (e.g., clothing, cars). The model may choose a plausible but incorrect option.
- Desaturation risk: With L1 loss, outputs can trend toward “safe” average colors (less vibrant).
- Artifact risk: Some regions (thin structures, rare textures) may show color bleeding or inconsistent tones.
Recommendations
- Use the model for visual enhancement / demo purposes, not for factual color restoration.
- For better quality:
- fine-tune with domain-specific data
- consider perceptual losses or adversarial training
- add postprocessing or user-guided constraints
How to Get Started with the Model
Installation
pip install torch torchvision numpy pillow scikit-image gradio
Minimal inference outline (conceptual)
- Read image (RGB or grayscale)
- Convert to LAB
- Normalize L* to [-1, 1]
- Predict ab with the U‑Net
- Denormalize ab and reconstruct LAB
- Convert LAB → RGB
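A minimal Python sketch of this outline is shown below, assuming the repository's checkpoint loads into a standard torch.nn.Module; the import path (model.UNet), the checkpoint filename, and the ab denormalization factor of 128 are illustrative assumptions, not the repository's confirmed code.

```python
import numpy as np
import torch
from PIL import Image
from skimage.color import rgb2lab, lab2rgb

# Hypothetical import: the actual module/class names live in the repository.
from model import UNet

device = "cuda" if torch.cuda.is_available() else "cpu"
model = UNet().to(device).eval()
model.load_state_dict(torch.load("unet_colorizer.pth", map_location=device))  # filename is illustrative

# 1) Read image and resize to the training resolution.
img = Image.open("input.jpg").convert("RGB").resize((256, 256))

# 2) RGB -> LAB, keep only the L* channel.
lab = rgb2lab(np.asarray(img) / 255.0)          # L in [0, 100], a/b roughly [-128, 127]
L = lab[..., 0]

# 3) Normalize L* from [0, 100] to [-1, 1] and add batch/channel dims.
L_in = torch.from_numpy(L / 50.0 - 1.0).float()[None, None]   # shape (1, 1, H, W)

# 4) Predict the ab channels (Tanh output in [-1, 1]).
with torch.no_grad():
    ab_pred = model(L_in.to(device)).cpu()[0]   # shape (2, H, W)

# 5) Denormalize ab (factor of 128 assumed) and rebuild the LAB image.
ab = ab_pred.permute(1, 2, 0).numpy() * 128.0
lab_out = np.concatenate([L[..., None], ab], axis=-1)

# 6) LAB -> RGB.
rgb = np.clip(lab2rgb(lab_out), 0.0, 1.0)
Image.fromarray((rgb * 255).astype(np.uint8)).save("colorized.jpg")
```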
For a ready-to-run demo, use the Gradio app:
python app.py
Hosted demo: https://huggingface.co/spaces/AmmarAhm3d/colorize-unet-pytorch
Training Details
Training Data
- Dataset: MS COCO 2017 (val2017 subset; ~5k images referenced in the repository README)
- Preprocessing:
- RGB → LAB
- L normalized from [0, 100] → [-1, 1]
- a,b normalized from approximately [-128, 127] → [-1, 1]
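A hedged sketch of how a training sample could be prepared under these conventions; the dataset class, resize behavior, and the exact ab divisor of 128 are assumptions and may differ from the repository's data loading code.

```python
import numpy as np
import torch
from torch.utils.data import Dataset
from PIL import Image
from skimage.color import rgb2lab

class ColorizationDataset(Dataset):
    """Yields (L, ab) tensor pairs normalized to [-1, 1]; names and sizes are illustrative."""

    def __init__(self, image_paths, size=256):
        self.image_paths = image_paths
        self.size = size

    def __len__(self):
        return len(self.image_paths)

    def __getitem__(self, idx):
        img = Image.open(self.image_paths[idx]).convert("RGB").resize((self.size, self.size))
        lab = rgb2lab(np.asarray(img) / 255.0)

        L = lab[..., 0:1] / 50.0 - 1.0      # [0, 100]      -> [-1, 1]
        ab = lab[..., 1:] / 128.0           # ~[-128, 127]  -> ~[-1, 1] (assumed divisor)

        # HWC -> CHW float tensors
        L = torch.from_numpy(L).permute(2, 0, 1).float()
        ab = torch.from_numpy(ab).permute(2, 0, 1).float()
        return L, ab
```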
Training Procedure
- Objective: Predict chrominance (a*, b*) given luminance (L*) using supervised learning.
- Loss: L1 (Mean Absolute Error) between predicted and target ab channels.
- Optimizer: Adam
Training Hyperparameters
- Image size: 256×256
- Batch size: 16
- Learning rate: 2e-4
- Epochs: 50 (the loss report notes convergence around epoch 24)
- Training regime: fp32 (mixed precision is not reported in the repository)
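A condensed training-loop sketch using the hyperparameters above; UNet and ColorizationDataset refer to the illustrative classes sketched elsewhere in this card, not necessarily the repository's exact code.

```python
import torch
from torch.utils.data import DataLoader

device = "cuda" if torch.cuda.is_available() else "cpu"
model = UNet().to(device)                        # illustrative class from the architecture sketch
criterion = torch.nn.L1Loss()                    # L1 between predicted and target ab channels
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4)

train_paths = ["path/to/coco/img1.jpg", "path/to/coco/img2.jpg"]  # replace with your image paths
loader = DataLoader(ColorizationDataset(train_paths), batch_size=16, shuffle=True)

for epoch in range(50):
    running = 0.0
    for L, ab in loader:
        L, ab = L.to(device), ab.to(device)
        optimizer.zero_grad()
        loss = criterion(model(L), ab)
        loss.backward()
        optimizer.step()
        running += loss.item()
    print(f"epoch {epoch + 1}: mean L1 = {running / len(loader):.4f}")
```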
Speeds, Sizes, Times
- Reported training duration: ~2 hours on a Kaggle P100 GPU
Evaluation
This repository includes an evaluation script producing objective fidelity metrics:
- PSNR
- SSIM
- RMSE
- SNR
Testing Data
- Sampled images from the same COCO subset / dataset path as configured in the repository (see evaluation.py).
Metrics
- PSNR: image-level fidelity measure based on MSE; higher is better.
- SSIM: structural similarity measure; closer to 1 is better.
- RMSE: root mean squared error; lower is better.
- SNR: signal-to-noise ratio; higher is better.
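A sketch of how these four metrics could be computed per image with scikit-image and NumPy; the repository's evaluation.py may handle color conversion, data ranges, or averaging differently, so treat this as an assumption-laden outline rather than its exact logic.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def fidelity_metrics(reference, colorized):
    """Both inputs: float RGB arrays in [0, 1] with identical shapes."""
    psnr = peak_signal_noise_ratio(reference, colorized, data_range=1.0)
    ssim = structural_similarity(reference, colorized, channel_axis=-1, data_range=1.0)
    rmse = float(np.sqrt(np.mean((reference - colorized) ** 2)))
    # SNR: ratio of reference signal power to error power, in dB.
    noise_power = np.mean((reference - colorized) ** 2)
    snr = float(10 * np.log10(np.mean(reference ** 2) / (noise_power + 1e-12)))
    return {"PSNR": psnr, "SSIM": ssim, "RMSE": rmse, "SNR": snr}
```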
Results
The repository states that evaluation output is generated by evaluation.py and written to evaluation_results/metrics_summary.txt, along with per-image comparisons and metric distribution plots.
Final numeric results (mean/std and sample count) are not reproduced in this card; run evaluation.py and consult metrics_summary.txt for the current figures.
Model Architecture and Objective
Architecture
U‑Net encoder–decoder CNN with skip connections:
- Encoder: 6 downsampling blocks (Conv2d + BatchNorm + ReLU)
- Bottleneck: 512 channels
- Decoder: 6 upsampling blocks (ConvTranspose2d + BatchNorm + ReLU), skip concatenations
- Output: 2-channel (ab) with Tanh activation
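A compact PyTorch sketch consistent with this description; the exact channel widths, kernel sizes, and normalization placement are assumptions and may differ from the repository's implementation.

```python
import torch
import torch.nn as nn

def down_block(in_ch, out_ch):
    # Halves the spatial resolution with a stride-2 conv.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

def up_block(in_ch, out_ch):
    # Doubles the spatial resolution with a transposed conv.
    return nn.Sequential(
        nn.ConvTranspose2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class UNet(nn.Module):
    """L* (1xHxW) -> ab (2xHxW); channel widths are illustrative."""

    def __init__(self):
        super().__init__()
        enc_ch = [1, 64, 128, 256, 512, 512, 512]            # 6 downsampling blocks
        self.encoder = nn.ModuleList(
            [down_block(enc_ch[i], enc_ch[i + 1]) for i in range(6)]
        )
        # Each decoder block (after the first) sees the previous decoder output
        # concatenated with the matching encoder feature map.
        dec_in = [512, 1024, 1024, 512, 256, 128]            # widths after skip concatenation
        dec_out = [512, 512, 256, 128, 64, 64]
        self.decoder = nn.ModuleList(
            [up_block(dec_in[i], dec_out[i]) for i in range(6)]
        )
        self.head = nn.Sequential(nn.Conv2d(64, 2, kernel_size=3, padding=1), nn.Tanh())

    def forward(self, x):
        skips = []
        for down in self.encoder:
            x = down(x)
            skips.append(x)                                   # deepest feature last
        x = skips.pop()                                       # 512-channel bottleneck (4x4 at 256 input)
        for i, up in enumerate(self.decoder):
            if i > 0:
                x = torch.cat([x, skips.pop()], dim=1)        # skip connection
            x = up(x)
        return self.head(x)                                   # 2-channel ab in [-1, 1]
```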
Objective
Learn mapping: L* → (a*, b*) in LAB space.
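In illustrative notation, with f_θ denoting the U‑Net and (a, b) the normalized ground-truth chrominance channels, the L1 objective described above can be written as:

$$\min_\theta \; \mathbb{E}_{(L,\,a,\,b)}\big[\, \lVert f_\theta(L) - (a, b) \rVert_1 \,\big]$$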
Environmental Impact
Not reported.
(For completeness, you can estimate emissions using: https://mlco2.github.io/impact)
Citation
If you use this model in academic work, consider citing the repository:
Repository: https://github.com/AmmarAhm3d/colorize-unet-pytorch
Model Card Contact
- GitHub Issues: https://github.com/AmmarAhm3d/colorize-unet-pytorch/issues