---
language:
- en
license: apache-2.0
tags:
- mlx
- vision
- multimodal
base_model: janhq/Jan-v2-VL-high
---

# Jan-v2-VL-high 8-bit MLX

This is an 8-bit quantized MLX conversion of [janhq/Jan-v2-VL-high](https://huggingface.co/janhq/Jan-v2-VL-high).

## Model Description

Jan-v2-VL is an 8-billion-parameter vision-language model designed for long-horizon, multi-step tasks in real software environments. This "high" variant is optimized for deeper reasoning and complex task execution, providing the highest-quality outputs for agentic automation and UI-control tasks.

**Key Features:**

- Vision-language understanding for browser and desktop applications
- Screenshot grounding and tool-call capabilities
- Stable multi-step execution with minimal performance drift
- Error recovery and intermediate-state maintenance

## Quantization

This model was converted to MLX format with 8-bit quantization using [MLX-VLM](https://github.com/Blaizzy/mlx-vlm) by Prince Canuma.

**Conversion command:**

```bash
mlx_vlm.convert --hf-path janhq/Jan-v2-VL-high --quantize --q-bits 8 --mlx-path Jan-v2-VL-high-8bit-mlx
```

## Usage

### Installation

```bash
pip install mlx-vlm
```

### Python

```python
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

# Load the model
model_path = "mlx-community/Jan-v2-VL-high-8bit-mlx"
model, processor = load(model_path)
config = load_config(model_path)

# Prepare input
image = ["path/to/image.jpg"]
prompt = "Describe this image."

# Apply chat template
formatted_prompt = apply_chat_template(
    processor, config, prompt, num_images=len(image)
)

# Generate output
output = generate(model, processor, formatted_prompt, image, verbose=False)
print(output)
```

### Command Line

```bash
mlx_vlm.generate --model mlx-community/Jan-v2-VL-high-8bit-mlx --max-tokens 100 --prompt "Describe this image" --image path/to/image.jpg
```

## Intended Use

This model is designed for:

- Agentic automation and UI control
- Stepwise operation in browsers and desktop applications
- Screenshot grounding and tool calls
- Long-horizon, multi-step task execution

## License

This model is released under the Apache 2.0 license.

## Original Model

For more information, please refer to the original model: [janhq/Jan-v2-VL-high](https://huggingface.co/janhq/Jan-v2-VL-high)

## Acknowledgments

- Original model by [Jan](https://huggingface.co/janhq)
- [MLX](https://github.com/ml-explore/mlx) framework by Apple
- MLX conversion framework by [Prince Canuma](https://github.com/Blaizzy/mlx-vlm)
- Model conversion by [Incept5](https://incept5.ai)
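
## Multi-Image Example (sketch)

Since the Python example above already passes the image as a list, a multi-image prompt (for instance, comparing two UI screenshots across steps of a task) is a natural extension. The snippet below is a minimal sketch, assuming mlx-vlm's multi-image handling applies to this model; the screenshot paths are hypothetical placeholders, and the rest reuses the same `load`, `apply_chat_template`, and `generate` calls shown in the Usage section.

```python
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

# Load the model and its configuration
model_path = "mlx-community/Jan-v2-VL-high-8bit-mlx"
model, processor = load(model_path)
config = load_config(model_path)

# Hypothetical screenshot paths -- replace with your own files
images = ["screenshots/step_1.png", "screenshots/step_2.png"]
prompt = "Compare these two screenshots and describe what changed in the UI."

# num_images tells the chat template how many image placeholders to insert
formatted_prompt = apply_chat_template(
    processor, config, prompt, num_images=len(images)
)

# Generate a response conditioned on both images
output = generate(model, processor, formatted_prompt, images, verbose=False)
print(output)
```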