ModelService

The `ModelService` class is the central utility for managing various models, including Large Language Models (LLMs), Large Vision Models (LVMs), and Text-to-Image (T2I) models. It abstracts the complexity of model initialization, request handling, and response generation, providing a unified interface for both text- and image-based tasks.
1. Quick Start
The `ModelService` class is included in the `trustgen` package. To get started:

```python
from trustgen.src.generation import ModelService

# Initialize ModelService
model_service = ModelService(
    request_type="t2i",
    handler_type="local",
    model_name="sd-3.5-large",
    config_path="/path/to/config.yaml"
)

# Process a single prompt
response = model_service.process("Generate an image of a futuristic city.")
```
2. Initialization
To create an instance of the `ModelService` class, you need to specify the following parameters:

Constructor Parameters

Parameter | Type | Description |
---|---|---|
`request_type` | `str` | The type of request: `"llm"` (Language Model), `"lvm"` (Vision Model), or `"t2i"` (Text-to-Image). |
`handler_type` | `str` | The handler type: `"api"` (remote API-based inference) or `"local"` (local inference). |
`model_name` | `str` | The name of the model to use, mapped internally. |
`config_path` | `str` | Path to the YAML configuration file. |
`**kwargs` | `dict` | Additional parameters to customize behavior (e.g., `temperature`). |
3. Supported Models
The `ModelService` supports a wide range of models for text, vision, and text-to-image tasks. Below is a comprehensive table of the supported models, categorized by request type (`llm`, `lvm`, `t2i`), model name, and whether the model uses `api` or `local` inference.
Request Type | Model Name | Handler Type |
---|---|---|
LLM | `gpt-4o` | api |
LLM | `gpt-4o-mini` | api |
LLM | `gpt-3.5-turbo` | api |
LLM | `text-embedding-ada-002` | api |
LLM | `glm-4` | api |
LLM | `glm-4-plus` | api |
LLM | `llama-3-8B` | api |
LLM | `llama-3.1-70B` | api |
LLM | `llama-3.1-8B` | api |
LLM | `qwen-2.5-72B` | api |
LLM | `mistral-7B` | api |
LLM | `mistral-8x7B` | api |
LLM | `claude-3.5-sonnet` | api |
LLM | `claude-3-haiku` | api |
LLM | `gemini-1.5-pro` | api |
LLM | `gemini-1.5-flash` | api |
LLM | `command-r-plus` | api |
LLM | `command-r` | api |
LLM | `gemma-2-27B` | api |
LLM | `deepseek-chat` | api |
LLM | `yi-lightning` | api |
LVM | `glm-4v` | api |
LVM | `glm-4v-plus` | api |
LVM | `llama-3.2-90B-V` | api |
LVM | `llama-3.2-11B-V` | api |
LVM | `qwen-vl-max-0809` | api |
LVM | `qwen-2-vl-72B` | api |
LVM | `internLM-72B` | api |
LVM | `claude-3-haiku` | api |
LVM | `gemini-1.5-pro` | api |
LVM | `gemini-1.5-flash` | api |
T2I | `dall-e-3` | api |
T2I | `flux-1.1-pro` | api |
T2I | `flux_schnell` | api |
T2I | `playgroundai/playground-v2.5-1024px-aesthetic` | local |
T2I | `cogview-3-plus` | api |
T2I | `sd-3.5-large` | local |
T2I | `sd-3.5-large-turbo` | local |
T2I | `HunyuanDiT` | local |
T2I | `kolors` | local |
T2I | `playground-v2.5` | local |
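Any row in the table can be instantiated the same way. For example, a vision-model service (again a sketch, with a placeholder config path):

```python
# A vision-model service built from one of the LVM rows above.
# The config path is a placeholder; replace it with your own.
lvm_service = ModelService(
    request_type="lvm",
    handler_type="api",
    model_name="gemini-1.5-pro",
    config_path="/path/to/config.yaml"
)
```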
4. Pipeline Initialization
The `_initialize_pipeline` method sets up the appropriate pipeline based on the provided parameters. It automatically configures the model, handler, and other runtime options.

Example: Initialize a Stable Diffusion Pipeline for Local Use

```python
model_service = ModelService(
    request_type="t2i",
    handler_type="local",
    model_name="sd-3.5-large",
    config_path="/path/to/config.yaml"
)
```
5. Methods
process

Definition:

```python
process(prompt: Union[str, List[str]], **kwargs) -> Any
```

Processes a single prompt or a list of prompts synchronously. It supports both one-off interactions and multi-turn conversations.

Parameters:
- `prompt` (`str` or `List[str]`): The input prompt(s).
- `**kwargs`: Additional parameters for model customization.

Returns:
Model-generated responses.

Example:

```python
# Single prompt
response = model_service.process("Generate an image of a peaceful forest.")

# Multi-turn interaction
prompts = [
    "What is the capital of France?",
    "What is the population of Paris?"
]
responses = model_service.process(prompts)
```
process_async

Definition:

```python
process_async(prompt: Union[str, List[str]], **kwargs) -> Awaitable[Any]
```

Handles requests asynchronously, enabling high concurrency for demanding applications.

Parameters:
- `prompt` (`str` or `List[str]`): The input prompt(s).
- `**kwargs`: Additional parameters for model customization.

Returns:
An awaitable object with the model's response.

Example:

```python
# Asynchronous prompt
response = await model_service.process_async("Describe a sunset over the ocean.")
```
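Because each call returns an awaitable, multiple requests can be dispatched concurrently with standard `asyncio` tooling. The sketch below assumes an LLM-configured `model_service` like the one in section 2; `asyncio.gather` is standard-library Python.

```python
import asyncio

# Sketch: fan several prompts out concurrently. Assumes an LLM-configured
# model_service; asyncio.gather awaits all requests in parallel.
async def main():
    prompts = [
        "What is the capital of France?",
        "What is the population of Paris?",
    ]
    return await asyncio.gather(
        *(model_service.process_async(p) for p in prompts)
    )

responses = asyncio.run(main())
```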
6. Configuration
The configuration file (`config.yaml`) specifies model settings, resource allocation, and runtime options. Below is an example configuration format:

Example Configuration (`config.yaml`):

```yaml
openai_sdk_llms:
  gpt-4o: OPENAI
  gpt-3.5-turbo: OPENAI
  glm-4: ZHIPU
  llama-3.1-70B: DEEPINFRA
local_models:
  sd-3.5-large: LOCAL
  HunyuanDiT: LOCAL
OPENAI:
  OPENAI_API_KEY: sk-XXXXXX
  OPENAI_BASE_URL: xxx
```

Replace the API keys and configuration paths with your own values.
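To illustrate the file's shape, the following sketch resolves a model's provider credentials by hand using PyYAML. It only mirrors the structure of the example above; `ModelService` performs its own loading internally and may differ.

```python
import yaml

# Sketch of how a config in this shape can be resolved by hand.
# ModelService loads the file internally; this only mirrors the structure.
with open("/path/to/config.yaml") as f:
    config = yaml.safe_load(f)

provider = config["openai_sdk_llms"]["gpt-4o"]  # -> "OPENAI"
credentials = config[provider]                  # the provider's credential block
print(credentials["OPENAI_BASE_URL"])
```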