API Server Implementation¶
This page provides a detailed technical reference for the AniSearch API Server implementation.
Module Overview¶
The `src.api` module implements a FastAPI server that exposes the AniSearch Model functionality through RESTful HTTP endpoints. It's designed to be scalable, configurable, and production-ready.
API Reference¶
AniSearch API Server¶
A FastAPI server that exposes the AniSearch functionality through HTTP endpoints.
This module provides a REST API for searching anime and manga datasets using cross-encoder models for semantic similarity. It allows clients to:
- Search for anime matching a description
- Search for manga matching a description
- List available models
- Get health check status
Features¶
- RESTful API: Clean, standards-compliant API design
- Interactive Documentation: Automatic OpenAPI/Swagger UI at `/docs`
- CORS Support: Configurable cross-origin resource sharing
- Multi-worker Architecture: Handles concurrent requests efficiently
- Model Caching: Avoids reloading models for each request
- Route Restrictions: Configurable endpoint enabling/disabling for production
- Custom Performance Settings: Configurable worker count and connection limits
API Endpoints¶
Endpoint | Method | Description |
---|---|---|
/ | GET | Health check and CUDA availability |
/models | GET | List available models and fine-tuned models |
/search/anime | POST | Search for anime matching a description |
/search/manga | POST | Search for manga matching a description |
Server Usage¶
```bash
# Basic usage
python -m src.api

# With custom settings
python -m src.api --host=127.0.0.1 --port=9000 --workers=4

# Production mode with restricted routes
python -m src.api --enable-routes=search --cors-origins="https://yourdomain.com"
```
GPU Acceleration¶
For optimal performance, especially with larger models, using a GPU is recommended. To enable GPU support, install PyTorch with CUDA. You can then specify `device=cuda` in your API requests to utilize GPU acceleration.
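A typical install invocation is shown below; the CUDA wheel index (`cu121` here) depends on your installed CUDA version, so check pytorch.org for the exact URL matching your system:

```shell
# Install PyTorch with CUDA support (cu121 shown as an example;
# choose the wheel index matching your CUDA version)
pip install torch --index-url https://download.pytorch.org/whl/cu121
```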
app module-attribute ¶

```python
app = FastAPI(
    title='AniSearch API',
    description='API for searching anime and manga using semantic similarity',
    version='1.0.0',
)
```
headers module-attribute ¶

methods module-attribute ¶

origins module-attribute ¶

restricted_app module-attribute ¶
HealthResponse ¶
Bases: BaseModel
Response model for the health check endpoint.
This model defines the structure of the response returned by the health check endpoint. It includes the overall API status, the status of each model type, and information about CUDA availability.
ATTRIBUTE | DESCRIPTION |
---|---|
status | Overall status of the API ('healthy' or 'degraded') |
models_loaded | Dictionary of model types and their loading status |
cuda_available | Whether CUDA is available on the system |
Example
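A representative response body, shaped by the attribute table above; the specific values shown are illustrative, not taken from a real deployment:

```python
# Illustrative health-check response body matching the HealthResponse schema;
# all values are hypothetical.
health_response = {
    "status": "healthy",
    "models_loaded": {"anime": True, "manga": True},
    "cuda_available": False,
}
```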
ModelsResponse ¶
Bases: BaseModel
Response model for the models endpoint.
This model defines the structure of the response returned by the models endpoint. It includes information about available pre-trained models and any fine-tuned models.
ATTRIBUTE | DESCRIPTION |
---|---|
models | Dictionary of model categories and available models |
fine_tuned | Dictionary of fine-tuned model names and their paths |
Example
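A representative response body, shaped by the attribute table above; the category name and model list are illustrative examples, not an exhaustive inventory:

```python
# Illustrative /models response body matching the ModelsResponse schema;
# the category and model names are examples only.
models_response = {
    "models": {
        "Semantic Search": ["cross-encoder/ms-marco-MiniLM-L-6-v2"],
    },
    "fine_tuned": {},
}
```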
SearchRequest ¶
Bases: BaseModel
Request model for anime and manga search endpoints.
This model defines the required and optional parameters for search requests. It includes validation rules to ensure the parameters are within acceptable ranges.
ATTRIBUTE | DESCRIPTION |
---|---|
query | The search query text describing the anime/manga to find |
num_results | Number of results to return (default: 5) |
batch_size | Batch size for processing the search in the model (default: 32) |
Example
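A representative request body; the defaults and the `num_results` upper bound of 100 are taken from the search endpoint documentation below:

```python
# Illustrative search request body matching the SearchRequest schema.
search_request = {
    "query": "A story about robots and AI",
    "num_results": 5,
    "batch_size": 32,
}
```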
SearchResponse ¶
Bases: BaseModel
Response model for anime and manga search endpoints.
This model defines the structure of the response returned by the search endpoints. It includes the search results, execution time, and device used for computation.
ATTRIBUTE | DESCRIPTION |
---|---|
results | List of search results sorted by relevance |
execution_time_ms | Total execution time of the search in milliseconds |
device_used | The device used for computation (e.g., 'cpu', 'cuda') |
Example
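A representative response body, shaped by the attribute table above; the result entry and timing values are hypothetical:

```python
# Illustrative search response body matching the SearchResponse schema;
# the entry, score, and timing are made-up example values.
search_response = {
    "results": [
        {"id": 1, "title": "Example Title", "score": 0.92,
         "synopsis": "A story about..."},
    ],
    "execution_time_ms": 123.4,
    "device_used": "cpu",
}
```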
device_used class-attribute instance-attribute ¶

execution_time_ms class-attribute instance-attribute ¶

results class-attribute instance-attribute ¶

```python
results: List[SearchResult] = Field(..., description='Search results')
```
SearchResult ¶
Bases: BaseModel
Individual search result item returned by the search endpoints.
This model represents a single anime or manga entry matched by the search. It includes the basic information needed to display the result to the user.
ATTRIBUTE | DESCRIPTION |
---|---|
id | Unique identifier for the entry (anime_id or manga_id) |
title | Title of the anime/manga |
score | Relevance score between 0.0 and 1.0 (higher is more relevant) |
synopsis | Partial synopsis text (may be truncated for display) |
Example
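A representative single result entry, shaped by the attribute table above; all field values are hypothetical:

```python
# Illustrative SearchResult entry; every value shown is made up.
search_result = {
    "id": 42,
    "title": "Example Manga",
    "score": 0.87,
    "synopsis": "A partial synopsis, possibly truncated...",
}
```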
get_available_models async ¶

```python
get_available_models() -> ModelsResponse
```
Get a list of available pre-trained and fine-tuned models.
This endpoint returns information about models that can be used with the search endpoints. It includes:
- Pre-trained models categorized by type (e.g., Semantic Search, Question Answering)
- Fine-tuned models specifically trained for anime/manga search
RETURNS | DESCRIPTION |
---|---|
ModelsResponse | Available models and their descriptions |
Note
Fine-tuned models are located in the `model/fine-tuned` directory. The API will only list models that have a valid configuration file.
Source code in src/api.py
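The discovery rule described in the note can be sketched as follows. The helper name and the assumption that the "valid configuration file" is a `config.json` are illustrative, not confirmed by the source:

```python
from pathlib import Path

def list_fine_tuned(base: str = "model/fine-tuned") -> dict:
    """Map fine-tuned model names to paths, skipping subdirectories
    that lack a configuration file (assumed here to be config.json)."""
    root = Path(base)
    if not root.is_dir():
        return {}
    return {
        d.name: str(d)
        for d in root.iterdir()
        if d.is_dir() and (d / "config.json").exists()
    }
```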
get_or_create_model ¶

```python
get_or_create_model(
    dataset_type: str,
    model_name: str,
    device: Optional[str] = None,
    include_light_novels: bool = False,
) -> BaseSearchModel
```
Get a cached model or create a new one if not already cached.
This function manages the model cache to avoid reloading models for each request. It handles device selection, CUDA availability checking, and model initialization.
PARAMETER | DESCRIPTION |
---|---|
dataset_type | The type of dataset to use ('anime' or 'manga') |
model_name | The name or path of the model to use |
device | Device to run the model on ('cpu', 'cuda', 'cuda:0', etc.). If None, automatically selects the best available device |
include_light_novels | Whether to include light novels in manga search results. Only relevant for the manga dataset_type |

RETURNS | DESCRIPTION |
---|---|
BaseSearchModel | An initialized search model ready for queries |

RAISES | DESCRIPTION |
---|---|
ValueError | If the model or dataset cannot be loaded |
RuntimeError | If there are issues initializing the model |
Note
If CUDA is requested but not available, it will automatically fall back to CPU with a warning.
Source code in src/api.py
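The caching and device-fallback behavior described above can be sketched as below. The names (`_model_cache`, `resolve_device`, `get_or_create_model_sketch`) and the explicit `cuda_available` flag are illustrative stand-ins for the real implementation:

```python
import warnings
from typing import Optional

_model_cache: dict = {}

def resolve_device(requested: Optional[str], cuda_available: bool) -> str:
    """Fall back to CPU with a warning when CUDA is requested but unavailable."""
    if requested is None:
        return "cuda" if cuda_available else "cpu"
    if requested.startswith("cuda") and not cuda_available:
        warnings.warn("CUDA requested but not available; falling back to CPU")
        return "cpu"
    return requested

def get_or_create_model_sketch(dataset_type: str, model_name: str,
                               device: Optional[str] = None,
                               cuda_available: bool = False):
    # Cache key covers every parameter that changes which model gets loaded.
    key = (dataset_type, model_name, resolve_device(device, cuda_available))
    if key not in _model_cache:
        _model_cache[key] = object()  # stand-in for real model initialization
    return _model_cache[key]
```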
health_check async ¶

```python
health_check() -> HealthResponse
```
Check if the API is running and ready to handle requests.
This endpoint verifies that the API server is operational and provides information about the status of different components:
- Whether the API server itself is running
- Whether each model type (anime, manga) can be loaded
- Whether CUDA is available for GPU acceleration
RETURNS | DESCRIPTION |
---|---|
HealthResponse | The health status of the API |
Note
This endpoint intentionally uses CPU for model loading checks to avoid GPU memory issues during health checking.
Source code in src/api.py
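The status aggregation implied by the 'healthy'/'degraded' values can be sketched as a one-liner. The rule that every model type must load for a 'healthy' status is an assumption consistent with the description above, not a confirmed detail:

```python
def overall_status(models_loaded: dict) -> str:
    # 'healthy' only when every model type loaded successfully (assumed rule).
    return "healthy" if all(models_loaded.values()) else "degraded"
```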
restricted_get_models async ¶
List available models endpoint for the restricted API mode.
This endpoint returns information about models that can be used with the search endpoints in restricted mode.
RETURNS | DESCRIPTION |
---|---|
ModelsResponse | Available models and their descriptions |
Source code in src/api.py
restricted_health_check async ¶
Health check endpoint for the restricted API mode.
This endpoint verifies that the API server is operational in restricted mode and provides information about the status of different components.
RETURNS | DESCRIPTION |
---|---|
HealthResponse | The health status of the API |
Source code in src/api.py
restricted_search_anime async ¶
Search for anime endpoint for the restricted API mode.
This endpoint performs semantic search against the anime dataset using the specified model in restricted mode.
Parameters are the same as the regular search_anime endpoint.
RETURNS | DESCRIPTION |
---|---|
SearchResponse | The search results with relevant anime matches |
Source code in src/api.py
restricted_search_manga async ¶
Search for manga endpoint for the restricted API mode.
This endpoint performs semantic search against the manga dataset using the specified model in restricted mode.
Parameters are the same as the regular search_manga endpoint.
RETURNS | DESCRIPTION |
---|---|
SearchResponse | The search results with relevant manga matches |
Source code in src/api.py
search_anime async ¶

```python
search_anime(
    request: SearchRequest,
    model_name: str = Query('cross-encoder/ms-marco-MiniLM-L-6-v2', description='Model name or path'),
    device: Optional[str] = Query(None, description="Device to run the model on ('cpu', 'cuda', 'cuda:0', etc.). If not specified, uses the best available device."),
) -> SearchResponse
```
Search for anime matching the provided description.
This endpoint performs semantic search against the anime dataset using the specified model, returning the most relevant matches sorted by score.
Parameters¶
- request: The search request body containing:
  - query: The search query text describing the anime
  - num_results: Number of results to return (default: 5, max: 100)
  - batch_size: Batch size for processing (default: 32)
- model_name: The model to use for search (query parameter)
  - Can be a pre-trained model name or path to a fine-tuned model
  - Default: "cross-encoder/ms-marco-MiniLM-L-6-v2"
- device: The device to run the model on (query parameter)
  - Options: 'cpu', 'cuda', 'cuda:0', etc.
  - If not specified, uses the best available device
Returns¶
- results: List of anime matching the query, sorted by relevance
- execution_time_ms: Time taken to execute the search in milliseconds
- device_used: The device used for computation (e.g., 'cpu', 'cuda')
Example¶

```bash
curl -X POST "http://localhost:8000/search/anime?device=cuda" \
  -H "Content-Type: application/json" \
  -d '{"query": "A story about robots and AI"}'
```
Notes¶
- For optimal performance on large queries, use GPU acceleration with `device=cuda`
- Model caching is used to avoid reloading models between requests
- Results include truncated synopses; full content is available in the dataset
Source code in src/api.py
search_manga async ¶

```python
search_manga(
    request: SearchRequest,
    model_name: str = Query('cross-encoder/ms-marco-MiniLM-L-6-v2', description='Model name or path'),
    include_light_novels: bool = Query(False, description='Whether to include light novels in search results'),
    device: Optional[str] = Query(None, description="Device to run the model on ('cpu', 'cuda', 'cuda:0', etc.). If not specified, uses the best available device."),
) -> SearchResponse
```
Search for manga matching the provided description.
This endpoint performs semantic search against the manga dataset using the specified model, returning the most relevant matches sorted by score.
Parameters¶
- request: The search request body containing:
  - query: The search query text describing the manga
  - num_results: Number of results to return (default: 5, max: 100)
  - batch_size: Batch size for processing (default: 32)
- model_name: The model to use for search (query parameter)
  - Can be a pre-trained model name or path to a fine-tuned model
  - Default: "cross-encoder/ms-marco-MiniLM-L-6-v2"
- include_light_novels: Whether to include light novels in results (query parameter)
  - Default: false
- device: The device to run the model on (query parameter)
  - Options: 'cpu', 'cuda', 'cuda:0', etc.
  - If not specified, uses the best available device
Returns¶
- results: List of manga matching the query, sorted by relevance
- execution_time_ms: Time taken to execute the search in milliseconds
- device_used: The device used for computation (e.g., 'cpu', 'cuda')
Example¶

```bash
curl -X POST "http://localhost:8000/search/manga?include_light_novels=true&device=cuda" \
  -H "Content-Type: application/json" \
  -d '{"query": "A fantasy adventure in a magical world", "num_results": 10}'
```
Notes¶
- Use `include_light_novels=true` to include light novels in search results
- For optimal performance on large queries, use GPU acceleration with `device=cuda`
- Model caching is used to avoid reloading models between requests
- Results include truncated synopses; full content is available in the dataset
Source code in src/api.py