API Server¶
The AniSearch Model includes a powerful REST API server built with FastAPI that allows you to integrate the semantic search capabilities into your applications through HTTP requests.
Overview¶
The API server provides endpoints for searching anime and manga databases using the same cross-encoder models available in the command line interface. It's designed to be:
- Fast and Efficient: Built on the high-performance FastAPI and Uvicorn
- Production-Ready: Configurable CORS, multi-worker support, and route restrictions
- Developer-Friendly: Interactive OpenAPI documentation at `/docs`
- Flexible: Configurable for various deployment scenarios
Getting Started¶
Installation¶
Ensure you have all the necessary dependencies installed:
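For a typical source checkout this means installing from the project's requirements file (assuming the repository ships a `requirements.txt`):

```bash
pip install -r requirements.txt
```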
Starting the Server¶
To start the API server with default settings:
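```bash
python -m src.api
```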
This starts the server on `0.0.0.0:8000`.
Accessing the API¶
Once the server is running, you can:
- Visit `http://localhost:8000/docs` for interactive API documentation
- Make direct HTTP requests to the available endpoints
API Endpoints¶
| Endpoint | Method | Description |
|---|---|---|
| `/` | GET | Health check and CUDA availability |
| `/models` | GET | List available models and fine-tuned models |
| `/search/anime` | POST | Search for anime matching a description |
| `/search/manga` | POST | Search for manga matching a description |
Server Configuration¶
The API server provides numerous configuration options to customize its behavior:
```bash
# Basic configuration
python -m src.api --host=127.0.0.1 --port=8000

# Performance tuning
python -m src.api --workers=4 --limit-concurrency=100 --timeout=60

# CORS configuration
python -m src.api --cors-origins="https://myapp.com,https://admin.myapp.com" --cors-methods="GET,POST"

# Route restrictions for production
python -m src.api --enable-routes=search,health
```
Configuration Options¶
| Option | Description | Default |
|---|---|---|
| `--host` | Host to bind the server to | `0.0.0.0` |
| `--port` | Port to bind the server to | `8000` |
| `--workers` | Number of worker processes | Half of CPUs |
| `--limit-concurrency` | Maximum number of concurrent connections | `50` |
| `--timeout` | Timeout for keep-alive connections (seconds) | `30` |
| `--cors-origins` | Allowed origins for CORS | `*` (all) |
| `--cors-methods` | Allowed HTTP methods for CORS | `*` (all) |
| `--cors-headers` | Allowed HTTP headers for CORS | `*` (all) |
| `--enable-routes` | Comma-separated list of routes to enable | all |
Usage Examples¶
Health Check¶
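Check that the server is running and whether CUDA is available (the response fields shown below are illustrative; consult `/docs` for the exact schema):

```bash
curl http://localhost:8000/
```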
Response:
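```json
{
  "status": "ok",
  "cuda_available": true
}
```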
List Available Models¶
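List the models the server can load:

```bash
curl http://localhost:8000/models
```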
Response:
```json
{
  "models": {
    "Semantic Search": {
      "cross-encoder/ms-marco-MiniLM-L-6-v2": "Recommended for general search",
      "cross-encoder/ms-marco-MiniLM-L-12-v2": "More accurate but slower"
    }
  },
  "fine_tuned": {
    "anime-v1": "model/fine-tuned/anime-v1"
  }
}
```
Search for Anime¶
```bash
curl -X POST "http://localhost:8000/search/anime?model_name=cross-encoder/ms-marco-MiniLM-L-6-v2&device=cuda" \
  -H "Content-Type: application/json" \
  -d '{"query": "A story about robots and AI", "num_results": 3}'
```
Response:
```json
{
  "results": [
    {
      "id": 1023,
      "title": "Ghost in the Shell: Stand Alone Complex",
      "score": 0.892,
      "synopsis": "In the not so distant future, mankind has advanced..."
    },
    {
      "id": 43,
      "title": "Ghost in the Shell",
      "score": 0.857,
      "synopsis": "In the year 2029, the barriers of our world have been broken..."
    },
    {
      "id": 851,
      "title": "Ergo Proxy",
      "score": 0.813,
      "synopsis": "Within the domed city of Romdo lies one of the last human..."
    }
  ],
  "execution_time_ms": 156.32,
  "device_used": "cuda"
}
```
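The same search can be issued from any HTTP client; here is a minimal sketch using Python's `requests` library, mirroring the curl example above:

```python
import requests

# Model and device are query parameters; the search text goes in the JSON body.
response = requests.post(
    "http://localhost:8000/search/anime",
    params={"model_name": "cross-encoder/ms-marco-MiniLM-L-6-v2", "device": "cuda"},
    json={"query": "A story about robots and AI", "num_results": 3},
    timeout=60,
)
response.raise_for_status()
for result in response.json()["results"]:
    print(f"{result['score']:.3f}  {result['title']}")
```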
Search for Manga with Light Novels Included¶
```bash
curl -X POST "http://localhost:8000/search/manga?include_light_novels=true" \
  -H "Content-Type: application/json" \
  -d '{"query": "A fantasy adventure in a magical world", "num_results": 3, "batch_size": 64}'
```
Production Deployment¶
For production deployment, consider the following best practices:
- Restrict Routes: Use `--enable-routes=search,health` to expose only the endpoints you need
- Configure CORS: Set `--cors-origins` to your application domains
- Set Worker Count: Adjust `--workers` based on your server's CPU cores
- Use HTTPS: Deploy behind a reverse proxy such as Nginx with HTTPS
- Monitor Performance: Adjust concurrency limits based on server capabilities
Example Production Configuration¶
```bash
python -m src.api \
  --host=127.0.0.1 \
  --port=8000 \
  --workers=4 \
  --enable-routes=search,health \
  --cors-origins="https://yourdomain.com" \
  --limit-concurrency=200 \
  --timeout=60
```
Advanced Usage¶
Using GPU Acceleration¶
The API supports GPU acceleration for faster query processing. To use a GPU:
- Ensure you have PyTorch installed with CUDA support
- Specify `device=cuda` in your API requests
- Monitor GPU memory usage to optimize worker count
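A quick way to verify that PyTorch can see the GPU:

```python
import torch

# True means API requests with device=cuda can run on the GPU.
print(torch.cuda.is_available())
```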
Custom Models¶
You can use fine-tuned models by specifying the model path in the API request:
```bash
curl -X POST "http://localhost:8000/search/anime?model_name=model/fine-tuned/anime-v1" \
  -H "Content-Type: application/json" \
  -d '{"query": "A story about robots and AI"}'
```
API Architecture¶
The AniSearch API server implements a caching layer to avoid reloading models between requests:
```mermaid
flowchart TB
    Client[HTTP Client] --> API[FastAPI Server]
    API --> Cache{Model Cache}
    Cache -- "Cache Miss" --> ModelLoader[Load Model]
    ModelLoader --> Dataset[Load Dataset]
    ModelLoader --> CrossEncoder[Init Cross-Encoder]
    Cache -- "Cache Hit" --> Search[Perform Search]
    Search --> Results[Return Results]

    style Client fill:#f9f9f9,stroke:#333
    style API fill:#ffe0b2,stroke:#fb8c00
    style Cache fill:#e1bee7,stroke:#8e24aa
    style Search fill:#bbdefb,stroke:#1976d2
    style Results fill:#c8e6c9,stroke:#43a047
```
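Conceptually, the cache keys each loaded model by name and device, so subsequent requests skip the expensive load step. Below is a minimal sketch of the idea, assuming the `sentence-transformers` `CrossEncoder` class; the project's actual implementation may differ:

```python
from sentence_transformers import CrossEncoder

# Loaded models, keyed by (model name, device).
_model_cache: dict[tuple[str, str], CrossEncoder] = {}

def get_model(model_name: str, device: str = "cpu") -> CrossEncoder:
    """Return a cached cross-encoder, loading it only on a cache miss."""
    key = (model_name, device)
    if key not in _model_cache:
        _model_cache[key] = CrossEncoder(model_name, device=device)
    return _model_cache[key]
```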