Skip to content
Dashboard

Create Chat Completion

POST/chat/completions

Create Chat Completion

Body ParametersJSONExpand Collapse
messages: array of object { content, role, name } or object { content, role, name } or object { content, role, name } or 5 more
One of the following:
ChatCompletionDeveloperMessageParam = object { content, role, name }

Developer-provided instructions that the model should follow, regardless of messages sent by the user. With o1 models and newer, developer messages replace the previous system messages.

content: string or array of object { text, type }
One of the following:
UnionMember0 = string
UnionMember1 = array of object { text, type }
text: string
type: "text"
role: "developer"
name: optional string
ChatCompletionSystemMessageParam = object { content, role, name }

Developer-provided instructions that the model should follow, regardless of messages sent by the user. With o1 models and newer, use developer messages for this purpose instead.

content: string or array of object { text, type }
One of the following:
UnionMember0 = string
UnionMember1 = array of object { text, type }
text: string
type: "text"
role: "system"
name: optional string
ChatCompletionUserMessageParam = object { content, role, name }

Messages sent by an end user, containing prompts or additional context information.

content: string or array of object { text, type } or object { image_url, type } or object { input_audio, type } or object { file, type }
One of the following:
UnionMember0 = string
UnionMember1 = array of object { text, type } or object { image_url, type } or object { input_audio, type } or object { file, type }
One of the following:
ChatCompletionContentPartTextParam = object { text, type }

Learn about text inputs.

text: string
type: "text"
ChatCompletionContentPartImageParam = object { image_url, type }

Learn about image inputs.

image_url: object { url, detail }
url: string
detail: optional "auto" or "low" or "high"
One of the following:
"auto"
"low"
"high"
type: "image_url"
ChatCompletionContentPartInputAudioParam = object { input_audio, type }

Learn about audio inputs.

input_audio: object { data, format }
data: string
format: "wav" or "mp3"
One of the following:
"wav"
"mp3"
type: "input_audio"
File = object { file, type }

Learn about file inputs for text generation.

file: object { file_data, file_id, filename }
file_data: optional string
file_id: optional string
filename: optional string
type: "file"
role: "user"
name: optional string
ChatCompletionAssistantMessageParam = object { role, audio, content, 4 more }

Messages sent by the model in response to user messages.

role: "assistant"
audio: optional object { id }

Data about a previous audio response from the model. Learn more.

id: string
content: optional string or array of object { text, type } or object { refusal, type }
One of the following:
UnionMember0 = string
UnionMember1 = array of object { text, type } or object { refusal, type }
One of the following:
ChatCompletionContentPartTextParam = object { text, type }

Learn about text inputs.

text: string
type: "text"
ChatCompletionContentPartRefusalParam = object { refusal, type }
refusal: string
type: "refusal"
function_call: optional object { arguments, name }

Deprecated and replaced by tool_calls.

The name and arguments of a function that should be called, as generated by the model.

arguments: string
name: string
name: optional string
refusal: optional string
tool_calls: optional array of object { id, function, type } or object { id, custom, type }
One of the following:
ChatCompletionMessageFunctionToolCallParam = object { id, function, type }

A call to a function tool created by the model.

id: string
function: object { arguments, name }

The function that the model called.

arguments: string
name: string
type: "function"
ChatCompletionMessageCustomToolCallParam = object { id, custom, type }

A call to a custom tool created by the model.

id: string
custom: object { input, name }

The custom tool that the model called.

input: string
name: string
type: "custom"
ChatCompletionToolMessageParam = object { content, role, tool_call_id }
content: string or array of object { text, type }
One of the following:
UnionMember0 = string
UnionMember1 = array of object { text, type }
text: string
type: "text"
role: "tool"
tool_call_id: string
ChatCompletionFunctionMessageParam = object { content, name, role }
content: string
name: string
role: "function"
CustomChatCompletionMessageParam = object { role, content, name, 4 more }

Enables custom roles in the Chat Completion API.

role: string
content: optional string or array of object { text, type } or object { image_url, type } or object { input_audio, type } or 11 more
One of the following:
UnionMember0 = string
UnionMember1 = array of object { text, type } or object { image_url, type } or object { input_audio, type } or 11 more
One of the following:
ChatCompletionContentPartTextParam = object { text, type }

Learn about text inputs.

text: string
type: "text"
ChatCompletionContentPartImageParam = object { image_url, type }

Learn about image inputs.

image_url: object { url, detail }
url: string
detail: optional "auto" or "low" or "high"
One of the following:
"auto"
"low"
"high"
type: "image_url"
ChatCompletionContentPartInputAudioParam = object { input_audio, type }

Learn about audio inputs.

input_audio: object { data, format }
data: string
format: "wav" or "mp3"
One of the following:
"wav"
"mp3"
type: "input_audio"
File = object { file, type }

Learn about file inputs for text generation.

file: object { file_data, file_id, filename }
file_data: optional string
file_id: optional string
filename: optional string
type: "file"
ChatCompletionContentPartAudioParam = object { audio_url, type }
audio_url: object { url }
url: string
type: "audio_url"
ChatCompletionContentPartVideoParam = object { type, video_url }
type: "video_url"
video_url: object { url }
url: string
ChatCompletionContentPartRefusalParam = object { refusal, type }
refusal: string
type: "refusal"
CustomChatCompletionContentSimpleImageParam = object { image_url, uuid }

A simpler version of the param that only accepts a plain image_url. This is supported by OpenAI API, although it is not documented.

Example: { "image_url": "https://example.com/image.jpg" }

image_url: optional string
uuid: optional string
ChatCompletionContentPartImageEmbedsParam = object { type, image_embeds, uuid }
type: "image_embeds"
image_embeds: optional string or map[string]
One of the following:
UnionMember0 = string
UnionMember1 = map[string]
uuid: optional string
ChatCompletionContentPartAudioEmbedsParam = object { type, audio_embeds, uuid }
type: "audio_embeds"
audio_embeds: optional string or map[string]
One of the following:
UnionMember0 = string
UnionMember1 = map[string]
uuid: optional string
CustomChatCompletionContentSimpleAudioParam = object { audio_url }

A simpler version of the param that only accepts a plain audio_url.

Example: { "audio_url": "https://example.com/audio.mp3" }

audio_url: optional string
CustomChatCompletionContentSimpleVideoParam = object { uuid, video_url }

A simpler version of the param that only accepts a plain audio_url.

Example: { "video_url": "https://example.com/video.mp4" }

uuid: optional string
video_url: optional string
UnionMember12 = string
CustomThinkCompletionContentParam = object { thinking, type, closed }

A Think Completion Content Param that accepts a plain text and a boolean.

Example: { "thinking": "I am thinking about the answer", "closed": True, "type": "thinking" }

thinking: string
type: "thinking"
closed: optional boolean
name: optional string
reasoning: optional string
tool_call_id: optional string
tool_calls: optional array of object { id, function, type }
id: string
function: object { arguments, name }

The function that the model called.

arguments: string
name: string
type: "function"
tools: optional array of object { function, type }
function: object { name, description, parameters, strict }
name: string
description: optional string
parameters: optional map[unknown]
strict: optional boolean
type: "function"
Message = object { author, channel, content, 2 more }
author: object { role, name }
role: "user" or "assistant" or "system" or 2 more

The role of a message author (mirrors chat::Role).

One of the following:
"user"
"assistant"
"system"
"developer"
"tool"
name: optional string
channel: optional string
content: optional array of unknown
content_type: optional string
recipient: optional string
add_generation_prompt: optional boolean

If true, the generation prompt will be added to the chat template. This is a parameter used by chat template in tokenizer config of the model.

add_special_tokens: optional boolean

If true, special tokens (e.g. BOS) will be added to the prompt on top of what is added by the chat template. For most models, the chat template takes care of adding the special tokens so this should be set to false (as is the default).

allowed_token_ids: optional array of number
bad_words: optional array of string
cache_salt: optional string

If specified, the prefix cache will be salted with the provided string to prevent an attacker to guess prompts in multi-user environments. The salt should be random, protected from access by 3rd parties, and long enough to be unpredictable (e.g., 43 characters base64-encoded, corresponding to 256 bit).

chat_template: optional string

A Jinja template to use for this conversion. As of transformers v4.44, default chat template is no longer allowed, so you must provide a chat template if the tokenizer does not define one.

chat_template_kwargs: optional map[unknown]

Additional keyword args to pass to the template renderer. Will be accessible by the chat template.

continue_final_message: optional boolean

If this is set, the chat will be formatted so that the final message in the chat is open-ended, without any EOS tokens. The model will continue this message rather than starting a new one. This allows you to "prefill" part of the model's response for it. Cannot be used at the same time as add_generation_prompt.

documents: optional array of map[string]

A list of dicts representing documents that will be accessible to the model if it is performing RAG (retrieval-augmented generation). If the template does not support RAG, this argument will have no effect. We recommend that each document should be a dict containing "title" and "text" keys.

echo: optional boolean

If true, the new message will be prepended with the last message if they belong to the same role.

frequency_penalty: optional number
ignore_eos: optional boolean
include_reasoning: optional boolean
include_stop_str_in_output: optional boolean
kv_transfer_params: optional map[unknown]

KVTransfer parameters used for disaggregated serving.

length_penalty: optional number
logit_bias: optional map[number]
logits_processors: optional array of string or object { qualname, args, kwargs }

A list of either qualified names of logits processors, or constructor objects, to apply when sampling. A constructor is a JSON object with a required 'qualname' field specifying the qualified name of the processor class/factory, and optional 'args' and 'kwargs' fields containing positional and keyword arguments. For example: {'qualname': 'my_module.MyLogitsProcessor', 'args': [1, 2], 'kwargs': {'param': 'value'}}.

One of the following:
UnionMember0 = string
LogitsProcessorConstructor = object { qualname, args, kwargs }
qualname: string
args: optional array of unknown
kwargs: optional map[unknown]
logprobs: optional boolean
max_completion_tokens: optional number
Deprecatedmax_tokens: optional number
min_p: optional number
min_tokens: optional number
mm_processor_kwargs: optional map[unknown]

Additional kwargs to pass to the HF processor.

model: optional string
n: optional number
parallel_tool_calls: optional boolean
presence_penalty: optional number
priority: optional number

The priority of the request (lower means earlier handling; default: 0). Any priority other than 0 will raise an error if the served model does not use priority scheduling.

prompt_logprobs: optional number
reasoning_effort: optional "low" or "medium" or "high"
One of the following:
"low"
"medium"
"high"
repetition_penalty: optional number
request_id: optional string

The request_id related to this request. If the caller does not set it, a random_uuid will be generated. This id is used through out the inference process and return in response.

response_format: optional object { type, json_schema } or object { format, type } or object { structures, triggers, type }
One of the following:
ResponseFormat = object { type, json_schema }
type: "text" or "json_object" or "json_schema"
One of the following:
"text"
"json_object"
"json_schema"
json_schema: optional object { name, description, schema, strict }
name: string
description: optional string
schema: optional map[unknown]
strict: optional boolean
StructuralTagResponseFormat = object { format, type }
format: unknown
type: "structural_tag"
LegacyStructuralTagResponseFormat = object { structures, triggers, type }
structures: array of object { begin, end, schema }
begin: string
end: string
schema: optional map[unknown]
triggers: array of string
type: "structural_tag"
return_token_ids: optional boolean

If specified, the result will include token IDs alongside the generated text. In streaming mode, prompt_token_ids is included only in the first chunk, and token_ids contains the delta tokens for each chunk. This is useful for debugging or when you need to map generated text back to input tokens.

return_tokens_as_token_ids: optional boolean

If specified with 'logprobs', tokens are represented as strings of the form 'token_id:{token_id}' so that tokens that are not JSON-encodable can be identified.

seed: optional number
maximum9223372036854776000
minimum-9223372036854776000
skip_special_tokens: optional boolean
spaces_between_special_tokens: optional boolean
stop: optional string or array of string
One of the following:
UnionMember0 = string
UnionMember1 = array of string
stop_token_ids: optional array of number
stream: optional boolean
stream_options: optional object { continuous_usage_stats, include_usage }
continuous_usage_stats: optional boolean
include_usage: optional boolean
structured_outputs: optional object { _backend, _backend_was_auto, choice, 9 more }

Additional kwargs for structured outputs

_backend: optional string
_backend_was_auto: optional boolean
choice: optional array of string
disable_additional_properties: optional boolean
disable_any_whitespace: optional boolean
disable_fallback: optional boolean
grammar: optional string
json: optional string or map[unknown]
One of the following:
UnionMember0 = string
UnionMember1 = map[unknown]
json_object: optional boolean
regex: optional string
structural_tag: optional string
whitespace_pattern: optional string
temperature: optional number
tool_choice: optional "none" or "auto" or "required" or object { function, type }
One of the following:
UnionMember0 = "none" or "auto" or "required"
One of the following:
"none"
"auto"
"required"
ChatCompletionNamedToolChoiceParam = object { function, type }
function: object { name }
name: string
type: optional "function"
tools: optional array of object { function, type }
function: object { name, description, parameters }
name: string
description: optional string
parameters: optional map[unknown]
type: optional "function"
top_k: optional number
top_logprobs: optional number
top_p: optional number
truncate_prompt_tokens: optional number
maximum9223372036854776000
minimum-1
user: optional string
vllm_xargs: optional map[string or number or array of string or number]

Additional request parameters with (list of) string or numeric values, used by custom extensions.

One of the following:
UnionMember0 = string
UnionMember1 = number
UnionMember2 = array of string or number
One of the following:
UnionMember0 = string
UnionMember1 = number

Create Chat Completion

curl https://api.tzafon.ai/chat/completions \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $TZAFON_API_KEY" \
    -d '{
          "messages": [
            {
              "content": "string",
              "role": "developer"
            }
          ]
        }'
{}
Returns Examples
{}