Dolphin LLM Model Integration
Dolphin is an open-source document image parsing model (0.3B parameters) developed by ByteDance. It provides advanced document processing capabilities including OCR, layout analysis, and structured extraction.
Overview
Dolphin is a Vision-Language Model (VLM) that excels at:
- ✅ Advanced OCR: Better accuracy on scanned documents and images
- ✅ Layout Analysis: Understand document structure and reading order
- ✅ Complex Documents: Handle multi-column layouts, mixed content types
- ✅ Table Extraction: Parse complex tables with high accuracy
- ✅ Formula Recognition: Extract mathematical formulas from documents
Prerequisites
- Nextcloud 28+ with OpenRegister installed
- Docker and Docker Compose
- GPU recommended (4GB+ VRAM) for optimal performance
- At least 8GB RAM
Quick Start
Step 1: Deploy Dolphin Container
Dolphin is included in docker-compose.dev.yml with the huggingface profile:
# Start Dolphin service
docker-compose -f docker-compose.dev.yml --profile huggingface up -d dolphin-vlm
# Wait for model download and initialization
docker logs -f openregister-dolphin-vlm
Step 2: Configure OpenRegister
Once Dolphin is deployed, configure in OpenRegister:
PHP Configuration (config/text_extraction.php):
return [
'text_extraction' => [
'dolphin' => [
'enabled' => true,
'api_url' => 'http://dolphin-vlm:5000', // From Nextcloud container
'timeout' => 60, // Longer timeout for complex documents
'fallback_to_llphant' => true // Fallback if Dolphin unavailable
]
]
];
Step 3: Test Integration
# Test Dolphin health
curl http://localhost:8083/health
# Test document parsing
curl -X POST http://localhost:8083/parse \
-F "file=@document.pdf" \
-F "format=text"
Configuration Details
Dolphin Service Configuration
dolphin-vlm:
build:
context: ./docker/dolphin
dockerfile: Dockerfile
container_name: openregister-dolphin-vlm
restart: always
ports:
- "8083:5000"
volumes:
- ./docker/dolphin/models:/app/models
environment:
- MODEL_PATH=/app/models
- LOG_LEVEL=INFO
deploy:
resources:
limits:
memory: 8G
reservations:
memory: 4G
devices:
- driver: nvidia
count: all
capabilities: [gpu]
Accessing Dolphin
Important: Docker Container Communication
- ✅ From Nextcloud container:
http://dolphin-vlm:5000 - ✅ From host machine:
http://localhost:8083 - ❌ NOT:
http://localhost:5000(from Nextcloud container)
Use Cases
1. Scanned Document Processing
Dolphin excels at extracting text from scanned PDFs and images:
use OCA\OpenRegister\Service\TextExtractionService;
$textExtractionService = $this->container->get(TextExtractionService::class);
// Extract text from scanned PDF
$text = $textExtractionService->extractFromFile($fileId, [
'method' => 'dolphin', // Use Dolphin for OCR
'format' => 'text'
]);
2. Complex Layout Analysis
Parse documents with complex layouts:
- Multi-column layouts
- Mixed text and images
- Tables and forms
- Mathematical formulas
3. Table Extraction
Extract structured data from tables:
// Extract tables from document
$result = $textExtractionService->extractFromFile($fileId, [
'method' => 'dolphin',
'format' => 'structured', // Get structured output
'extract_tables' => true
]);
API Endpoints
Health Check
GET http://dolphin-vlm:5000/health
Response:
{
"status": "ok",
"service": "dolphin-api"
}
Parse Document
POST http://dolphin-vlm:5000/parse
Content-Type: multipart/form-data
file: <file>
format: text|structured|json
Parse PDF
POST http://dolphin-vlm:5000/parse/pdf
Content-Type: multipart/form-data
file: <pdf_file>
Integration with Text Extraction Pipeline
Dolphin integrates with OpenRegister's text extraction pipeline:
When to Use Dolphin
Use Dolphin When:
- ✅ Processing scanned documents
- ✅ Complex multi-column layouts
- ✅ Tables and structured data extraction
- ✅ Mathematical formulas
- ✅ High accuracy OCR required
Use LLPhant When:
- ✅ Simple text documents
- ✅ Privacy-first requirements (no external API)
- ✅ Fast processing needed
- ✅ No GPU available
Troubleshooting
Container Won't Start
# Check logs
docker logs openregister-dolphin-vlm
# Common issues:
# 1. Port 8083 already in use
sudo lsof -i :8083
# 2. Insufficient memory
docker stats openregister-dolphin-vlm
# 3. GPU not available
docker exec openregister-dolphin-vlm nvidia-smi
Model Download Fails
# Check internet connection
docker exec openregister-dolphin-vlm ping -c 3 huggingface.co
# Manual model download
docker exec openregister-dolphin-vlm \
huggingface-cli download ByteDance/Dolphin-1.5 \
--local-dir /app/models
Connection Errors from OpenRegister
Problem: OpenRegister can't connect to Dolphin.
Solutions:
- Verify API URL uses container name:
http://dolphin-vlm:5000 - Check containers are on same Docker network
- Test connection from Nextcloud container:
docker exec <nextcloud-container> curl http://dolphin-vlm:5000/health
Slow Processing
Solutions:
- Use GPU acceleration (10-100x faster)
- Process smaller documents
- Increase timeout settings
- Use batch processing for multiple files
Performance Optimization
GPU Acceleration
For best performance, use GPU:
deploy:
resources:
devices:
- driver: nvidia
count: all
capabilities: [gpu]
Performance Gain: 10-100x faster processing with GPU
Batch Processing
Process multiple documents efficiently:
// Process multiple files
$fileIds = [123, 124, 125];
foreach ($fileIds as $fileId) {
$textExtractionService->extractFromFile($fileId, [
'method' => 'dolphin',
'async' => true // Process asynchronously
]);
}
Maintenance
View Logs
docker logs -f openregister-dolphin-vlm
Restart Container
docker restart openregister-dolphin-vlm
Update Model
docker exec openregister-dolphin-vlm \
huggingface-cli download ByteDance/Dolphin-2.0 \
--local-dir /app/models
docker restart openregister-dolphin-vlm
Check GPU Usage
docker exec openregister-dolphin-vlm nvidia-smi
Further Reading
- Docker Setup - Docker setup with Hugging Face profile
- Text Extraction Implementation
- Dolphin GitHub Repository
- Hugging Face Dolphin Model
Support
For issues specific to:
- Dolphin setup: Check Docker Setup - Use
--profile huggingfaceto enable Dolphin - OpenRegister integration: OpenRegister GitHub issues
- Model issues: Check Dolphin GitHub