Mistral LLM Model Integration

Mistral is a family of high-performance open-source language models that can be used with OpenRegister through the Ollama or Hugging Face integrations.

Overview

Mistral models are available in multiple sizes and can be run locally using:

  • Ollama: Simple setup, native API
  • Hugging Face TGI/vLLM: OpenAI-compatible API, optimized for production

Model Variants

Model                 Size   Parameters    Use Case                      Memory Required
Mistral 7B            7B     7 billion     General purpose, RAG          16GB
Mistral 7B Instruct   7B     7 billion     Chat, instructions            16GB
Mixtral 8x7B          47B    47 billion    High quality, complex tasks   48GB+
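
Before pulling a model, it is worth confirming that the host can actually meet these memory requirements. A minimal check, assuming the Ollama container is named openregister-ollama as in the commands below:

# Check total and available host memory (Linux)
free -h

# Check whether a memory limit is set on the Ollama container
# (0 means no limit; the container may use all host memory)
docker inspect openregister-ollama --format '{{.HostConfig.Memory}}'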

Using Mistral with Ollama

Quick Start

# Pull Mistral model
docker exec openregister-ollama ollama pull mistral:7b

# Or Mistral Instruct (recommended for chat)
docker exec openregister-ollama ollama pull mistral:latest
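
Once the pull completes, you can smoke-test the model over Ollama's native HTTP API, assuming the default port 11434 is published on the host:

# Send a one-off prompt to the native /api/generate endpoint
curl http://localhost:11434/api/generate -d '{
  "model": "mistral:latest",
  "prompt": "Say hello in one sentence.",
  "stream": false
}'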

Configuration

  1. Navigate to Settings → OpenRegister → LLM Configuration
  2. Select Ollama as provider
  3. Configure:
    • Ollama URL: http://openregister-ollama:11434
    • Chat Model: mistral:latest or mistral:7b
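
To confirm that OpenRegister can reach Ollama at that URL, curl the /api/tags endpoint (which lists installed models) from inside the Docker network. The Nextcloud container name below is an assumption; substitute your own, and note that curl must be available in that image:

# List installed models from inside the Docker network
# (container name is an example; use your Nextcloud app container)
docker exec openregister-nextcloud curl -s http://openregister-ollama:11434/api/tags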

See Ollama Integration for detailed setup instructions.

Using Mistral with Hugging Face

Quick Start

# Start TGI with Mistral (using huggingface profile)
docker-compose -f docker-compose.dev.yml --profile huggingface up -d tgi-mistral

# Or start vLLM with Mistral (if configured)
docker-compose -f docker-compose.dev.yml --profile huggingface up -d vllm-mistral
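
TGI and vLLM download the model weights on first start, which can take several minutes. You can follow progress in the logs and probe readiness via the /health endpoint both servers expose (the host port below is an assumption; check the port mapping in your compose file):

# Follow the container logs until the model finishes loading
docker-compose -f docker-compose.dev.yml logs -f tgi-mistral

# Both TGI and vLLM return 200 from /health once the model is ready
curl -i http://localhost:8080/health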

Configuration

  1. Navigate to Settings → OpenRegister → LLM Configuration
  2. Select OpenAI as provider (TGI/vLLM are OpenAI-compatible)
  3. Configure:
    • Base URL: http://tgi-mistral:80 (TGI) or http://vllm-mistral:8000 (vLLM)
    • Model: mistral-7b-instruct
    • API Key: dummy (not used for local)
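
Because both servers speak the OpenAI chat completions protocol, you can test the endpoint directly with curl before wiring it into OpenRegister. A quick smoke test against the TGI service, run from a container on the same Docker network (swap in http://vllm-mistral:8000 for vLLM, or use the host-mapped port from outside the network):

# Minimal OpenAI-compatible chat completion request
curl http://tgi-mistral:80/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer dummy" \
  -d '{
    "model": "mistral-7b-instruct",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'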

See Hugging Face Integration for detailed setup instructions.

Use Cases

1. General Purpose Chat

Mistral excels at:

  • Conversational AI
  • Question answering
  • Text generation
  • Code generation

2. RAG (Retrieval Augmented Generation)

Use Mistral with OpenRegister's RAG features:

  • Answer questions using your data
  • Context-aware responses
  • Citation support

3. Function Calling

Mistral supports function calling for:

  • Object search
  • Object creation
  • Object updates
  • Register queries
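
Function calling uses the same OpenAI-compatible tools schema: the request includes a list of tool definitions, and the model responds with a tool call when it decides to invoke one. The sketch below is illustrative only; the tool name search_objects and its parameters are hypothetical stand-ins, not OpenRegister's actual tool definitions, and tool calling may need to be explicitly enabled on the inference server:

# Example tools request (tool name and parameters are hypothetical)
curl http://vllm-mistral:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral-7b-instruct",
    "messages": [{"role": "user", "content": "Find objects about permits"}],
    "tools": [{
      "type": "function",
      "function": {
        "name": "search_objects",
        "description": "Search objects in a register",
        "parameters": {
          "type": "object",
          "properties": {"query": {"type": "string"}},
          "required": ["query"]
        }
      }
    }]
  }'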

Performance Comparison

Setup    Speed             Quality      Ease of Use
Ollama   ⚡⚡⚡ Fast        ⭐⭐⭐⭐     ⭐⭐⭐⭐⭐ Easy
TGI      ⚡⚡ Fast          ⭐⭐⭐⭐⭐   ⭐⭐⭐ Medium
vLLM     ⚡⚡⚡ Very Fast   ⭐⭐⭐⭐⭐   ⭐⭐ Medium

For Development

Use Ollama with Mistral:

  • Easiest setup
  • Good performance
  • Native API

For Production

Use TGI or vLLM with Mistral:

  • Better throughput
  • OpenAI-compatible API
  • Optimized inference

Troubleshooting

Model Not Found (Ollama)

# List available models
docker exec openregister-ollama ollama list

# Pull Mistral if missing
docker exec openregister-ollama ollama pull mistral:latest

# Verify model name includes tag
docker exec openregister-ollama ollama show mistral:latest

Slow Performance

Solutions:

  1. Use GPU acceleration (10-100x faster)
  2. Use Mistral 7B instead of Mixtral 8x7B
  3. Ensure models are loaded in memory
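
The first and third points can be checked directly from the CLI: ollama ps shows which models are currently loaded and whether they run on GPU or CPU, and the OLLAMA_KEEP_ALIVE environment variable controls how long a model stays resident after the last request:

# Show models currently loaded in memory, with GPU/CPU placement
docker exec openregister-ollama ollama ps

# Keep models resident longer (example: 1 hour) by setting
# OLLAMA_KEEP_ALIVE on the Ollama container, e.g. in docker-compose:
#   environment:
#     - OLLAMA_KEEP_ALIVE=1h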

Further Reading

  • Ollama Integration
  • Hugging Face Integration

Support

For issues specific to:

  • Ollama setup: see the Ollama Integration documentation
  • TGI/vLLM setup: see the Hugging Face Integration documentation