MultimodalTextbox

gradio.MultimodalTextbox(···)

import gradio as gr

with gr.Blocks() as demo:
    gr.MultimodalTextbox(interactive=True)

demo.launch()

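Description

Creates a text area where the user can enter a string or where string output is displayed, with support for uploading multimedia files.

Behavior

As input component: Passes the text value and the list of files to the function as a `dict`.

Your function should accept one of these types: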

def predict(
	value: MultimodalValue | None
):
	...

As output component: Expects a `dict` with optional "text" and "files" keys. The "files" value is a list of file paths or URLs.

Your function should return one of these types:

def predict(···) -> MultimodalValue | None:
	...
	return value
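
As a rough illustration of the two directions described above, the sketch below defines a handler that reads the incoming dict and returns a dict in the output shape. The names `echo_message` and `build_demo` are made up for this example, and Gradio is imported lazily inside `build_demo` so the handler can be exercised on its own.

```python
from typing import Optional


def echo_message(value: Optional[dict]) -> Optional[dict]:
    """Read the input dict and return a dict in the output format."""
    if value is None:
        return None
    text = value.get("text") or ""
    files = list(value.get("files") or [])
    summary = f"{text} ({len(files)} file(s) attached)"
    # Output side: "text" and "files" are both optional keys.
    return {"text": summary, "files": files}


def build_demo():
    """Wire the handler to two MultimodalTextbox components (needs gradio)."""
    import gradio as gr

    with gr.Blocks() as demo:
        box = gr.MultimodalTextbox(interactive=True, file_count="multiple")
        out = gr.MultimodalTextbox(interactive=False)
        box.submit(echo_message, box, out)
    return demo
```

Calling `build_demo().launch()` would start the app; the handler itself is plain Python and can be unit-tested without a running server.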

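Initialization

value: str | dict[str, str | list] | Callable | None

Default = None

The default value to show in the MultimodalTextbox. Either a string or a dictionary of the form {"text": "sample text", "files": [{"path": "files/file.jpg", "orig_name": "file.jpg", "url": "http://image_url.jpg", "size": 100}]}. If a function is provided, it is called each time the app loads to set the initial value of this component.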

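sources: list[Literal['upload', 'microphone']] | Literal['upload', 'microphone'] | None

Default = None

A list of allowed sources. "upload" creates a button that the user can click, or drop files onto, to upload files; "microphone" creates a microphone input. If None, defaults to ["upload"].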

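file_types: list[str] | None

Default = None

List of file extensions or file types that may be uploaded (e.g. ['image', '.json', '.mp4']). "file" allows any file to be uploaded, "image" allows only image files, "audio" allows only audio files, "video" allows only video files, and "text" allows only text files.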

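file_count: Literal['single', 'multiple', 'directory']

Default = "single"

If "single", the user can upload one file. If "multiple", the user can upload multiple files. If "directory", the user uploads all files in the selected directory. For "multiple" or "directory", the return type is a list with one entry per file.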
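
The three upload-related parameters above (`sources`, `file_types`, `file_count`) are usually set together. The following is a minimal sketch of one possible configuration; the helper names `upload_kwargs` and `build_textbox` are hypothetical, and the particular file types chosen are only an example.

```python
def upload_kwargs(allow_microphone: bool = False) -> dict:
    """Collect constructor arguments for an image-and-JSON upload box."""
    sources = ["upload", "microphone"] if allow_microphone else ["upload"]
    return {
        "sources": sources,                # where files may come from
        "file_types": ["image", ".json"],  # a type name and an extension
        "file_count": "multiple",          # more than one file per message
    }


def build_textbox():
    """Create the component from the kwargs above (needs gradio)."""
    import gradio as gr

    return gr.MultimodalTextbox(**upload_kwargs(allow_microphone=True))
```

Keeping the arguments in a plain dict makes the configuration easy to inspect or reuse across several components.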

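lines: int

Default = 1

Minimum number of lines to provide in the text area.

max_lines: int

Default = 20

Maximum number of lines to provide in the text area.

placeholder: str | None

Default = None

Placeholder hint shown in the text area while it is empty.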

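label: str | I18nData | None

Default = None

The label for this component. It is displayed above the component if `show_label` is `True`, and is also used as the header if there is a table of examples for this component. If None and the component is used in a `gr.Interface`, the label is the name of the parameter this component corresponds to.

info: str | I18nData | None

Default = None

Additional component description, displayed below the label in a smaller font. Supports Markdown / HTML syntax.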

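every: Timer | float | None

Default = None

If `value` is a function, continuously calls `value` to recalculate it (has no effect otherwise). Can be a Timer whose tick resets `value`, or a float that sets the regular interval of the resetting Timer.

inputs: Component | list[Component] | set[Component] | None

Default = None

If `value` is a function, the components used as inputs to calculate `value` (has no effect otherwise). `value` is recalculated whenever the inputs change.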
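
To make the interplay of a callable `value` and `inputs` more concrete, here is a small sketch in which the textbox content is derived from another component. The function `greeting_value` and the `Name` textbox are invented for this example; only the `value` and `inputs` parameters come from the reference above.

```python
def greeting_value(name: str) -> dict:
    """Compute the textbox value from the current content of `name`."""
    return {"text": f"Hello, {name or 'stranger'}!", "files": []}


def build_demo():
    """Recompute the MultimodalTextbox whenever `name` changes (needs gradio)."""
    import gradio as gr

    with gr.Blocks() as demo:
        name = gr.Textbox(label="Name")
        gr.MultimodalTextbox(
            value=greeting_value,  # a function instead of a fixed value
            inputs=name,           # its argument is taken from this component
            interactive=False,
        )
    return demo
```

Passing `every=` in addition would, per the description above, re-run the function on a timer rather than only on input changes.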

show_label: bool | None

Default = None

If True, the label is displayed.

container: bool

Default = True

If True, places the component in a container, which adds some extra padding around the border.

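scale: int | None

Default = None

Relative size compared to adjacent components. For example, if components A and B are in a row, and A has scale=2 and B has scale=1, A will be twice as wide as B. Should be an integer. scale applies in rows, and to top-level components in `Blocks` where `fill_height=True`.

min_width: int

Default = 160

Minimum pixel width. The component wraps if there is not enough screen space to satisfy this value. If a certain scale value results in this component being narrower than `min_width`, the `min_width` parameter is respected first.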
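
The scale arithmetic can be worked through with a deliberately simplified model: each component gets a share of the row proportional to its `scale`, and a share below `min_width` signals that `min_width` would take over. This is only an approximation for intuition (it ignores gaps, padding, and the browser's actual layout algorithm), and both function names are hypothetical.

```python
from typing import List


def proportional_widths(row_width: float, scales: List[int]) -> List[float]:
    """Split a row among components in proportion to their scale values."""
    total = sum(scales)
    return [row_width * s / total for s in scales]


def below_min_width(widths: List[float], min_width: int = 160) -> List[bool]:
    """Flag components whose proportional share is narrower than min_width."""
    return [w < min_width for w in widths]


# A has scale=2 and B has scale=1 in a 900 px row: A is twice as wide as B.
widths = proportional_widths(900, [2, 1])
```

In a 300 px row the same scales give B a 100 px share, which is below the default `min_width` of 160, so `min_width` would win for B.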

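interactive: bool | None

Default = None

If True, renders as an editable textbox; if False, editing is disabled. If not provided, this is inferred from whether the component is used as an input or an output.

visible: bool

Default = True

If False, the component is hidden.

elem_id: str | None

Default = None

An optional string assigned as the id of this component in the HTML DOM. Can be used for targeting CSS styles.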

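autofocus: bool

Default = False

If True, focuses the textbox on page load. Use this carefully, as it can cause usability issues for sighted and non-sighted users.

autoscroll: bool

Default = True

If True, automatically scrolls to the bottom of the textbox when the value changes, unless the user has scrolled up. If False, does not scroll to the bottom of the textbox when the value changes.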

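elem_classes: list[str] | str | None

Default = None

An optional list of strings assigned as the classes of this component in the HTML DOM. Can be used for targeting CSS styles.

render: bool

Default = True

If False, the component is not rendered in the Blocks context. Use this if the intention is to assign event listeners now but render the component later.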

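key: int | str | tuple[int | str, ...] | None

Default = None

In a `gr.render`, components with the same key across re-renders are treated as the same component, not as a new component. Properties set in `preserved_by_key` are not reset across a re-render.

preserved_by_key: list[str] | str | None

Default = "value"

A list of parameters from this component's constructor. Inside a `gr.render()` function, if a component is re-rendered with the same key, these (and only these) parameters are preserved in the UI (if they have been changed by the user or an event listener) instead of being re-rendered based on the values provided in the constructor.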
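
A short sketch may help show how `key` and `preserved_by_key` fit together inside a `gr.render` function. The slider, the `box_keys` helper, and the key naming scheme are assumptions made for illustration; the intent is that text typed into a box survives a re-render because its key stays the same.

```python
from typing import List


def box_keys(count: int) -> List[str]:
    """Stable keys, so box i is the 'same' component on every re-render."""
    return [f"box-{i}" for i in range(count)]


def build_demo():
    """Render a variable number of keyed textboxes (needs gradio)."""
    import gradio as gr

    with gr.Blocks() as demo:
        count = gr.Slider(1, 5, value=2, step=1, label="Number of boxes")

        @gr.render(inputs=count)
        def render_boxes(n):
            for key in box_keys(int(n)):
                # Only "value" is kept across re-renders; other arguments
                # are re-applied from the constructor call below.
                gr.MultimodalTextbox(key=key, preserved_by_key=["value"], label=key)

    return demo
```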

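text_align: Literal['left', 'right'] | None

Default = None

How to align the text in the textbox: "left", "right", or None (default). If None, the alignment is left if `rtl` is False and right if `rtl` is True. Can only be changed if `type` is "text".

rtl: bool

Default = False

If True and `type` is "text", sets the direction of the text to right-to-left (the cursor appears on the left of the text). Default is False, which renders the cursor on the right.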

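submit_btn: str | bool | None

Default = True

If False, no submit button is shown. If a string, that string is used as the submit button text.

stop_btn: str | bool | None

Default = False

If True, a stop button is shown (useful for streaming demos). If a string, that string is used as the stop button text.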

max_plain_text_length: int

Default = 1000

Maximum length of plain text in the textbox. If the pasted text exceeds this length, it is pasted as a file instead. Default is 1000.
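
The rule for `max_plain_text_length` can be modeled in a few lines. This is a sketch of the decision only, not the component's actual frontend code: the function name `paste_mode` is hypothetical, and treating "exceeds" as strictly greater than the limit, counted in characters, is an assumption.

```python
def paste_mode(text: str, max_plain_text_length: int = 1000) -> str:
    """Return "file" when pasted text is over the limit, else "text"."""
    if len(text) > max_plain_text_length:
        return "file"
    return "text"
```

Under this model, a 1000-character paste stays in the textbox and a 1001-character paste becomes a file attachment.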

Shortcuts

Class	Interface String Shortcut	Initialization
`gradio.MultimodalTextbox`	"multimodaltextbox"	Uses default values
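
The string shortcut can stand in for a component instance when building a `gr.Interface`. A minimal sketch, assuming a made-up handler named `describe`:

```python
def describe(message: dict) -> str:
    """Summarize the dict that the multimodal textbox passes in."""
    text = message.get("text") or ""
    files = message.get("files") or []
    return f"text={text!r}, files={len(files)}"


def build_interface():
    """Use the "multimodaltextbox" shortcut as the input (needs gradio)."""
    import gradio as gr

    return gr.Interface(fn=describe, inputs="multimodaltextbox", outputs="text")
```

Because the shortcut uses default values, any customization (for example `file_count="multiple"`) requires passing a `gr.MultimodalTextbox(...)` instance instead of the string.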

Demos


import gradio as gr
import time

# Chatbot demo with multimodal input (text, markdown, LaTeX, code blocks, image, audio, & video). Plus shows support for streaming text.


def print_like_dislike(x: gr.LikeData):
    print(x.index, x.value, x.liked)


def add_message(history, message):
    for x in message["files"]:
        history.append({"role": "user", "content": {"path": x}})
    if message["text"] is not None:
        history.append({"role": "user", "content": message["text"]})
    return history, gr.MultimodalTextbox(value=None, interactive=False)


def bot(history: list):
    response = "**That's cool!**"
    history.append({"role": "assistant", "content": ""})
    for character in response:
        history[-1]["content"] += character
        time.sleep(0.05)
        yield history


with gr.Blocks() as demo:
    chatbot = gr.Chatbot(elem_id="chatbot", bubble_full_width=False, type="messages")

    chat_input = gr.MultimodalTextbox(
        interactive=True,
        file_count="multiple",
        placeholder="Enter message or upload file...",
        show_label=False,
        sources=["microphone", "upload"],
    )

    chat_msg = chat_input.submit(
        add_message, [chatbot, chat_input], [chatbot, chat_input]
    )
    bot_msg = chat_msg.then(bot, chatbot, chatbot, api_name="bot_response")
    bot_msg.then(lambda: gr.MultimodalTextbox(interactive=True), None, [chat_input])

    chatbot.like(print_like_dislike, None, None, like_user_message=True)

if __name__ == "__main__":
    demo.launch()

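Event Listeners

Description

Event listeners let you respond to user interactions with the UI components you have defined in a Gradio Blocks app. When a user interacts with an element, for example by changing a slider value or uploading an image, a function is called.

Supported Event Listeners

The MultimodalTextbox component supports the following event listeners. Each event listener takes the same parameters, which are listed in the Event Parameters table below.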

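Listener	Description
`MultimodalTextbox.change(fn, ···)`	Triggered when the value of the MultimodalTextbox changes, either because of user input (e.g. the user types in the textbox) or because of a function update (e.g. an image receives a value from the output of an event trigger). See `.input()` for a listener that is triggered only by user input.
`MultimodalTextbox.input(fn, ···)`	Triggered when the user changes the value of the MultimodalTextbox.
`MultimodalTextbox.select(fn, ···)`	Triggered when the user selects or deselects the MultimodalTextbox. Uses event data gradio.SelectData to carry `value`, referring to the label of the MultimodalTextbox, and `selected`, referring to the state of the MultimodalTextbox. See the EventData documentation for how to use this event data.
`MultimodalTextbox.submit(fn, ···)`	Triggered when the user presses the Enter key while the MultimodalTextbox is focused.
`MultimodalTextbox.focus(fn, ···)`	Triggered when the MultimodalTextbox gains focus.
`MultimodalTextbox.blur(fn, ···)`	Triggered when the MultimodalTextbox loses focus (is blurred).
`MultimodalTextbox.stop(fn, ···)`	Triggered when the user reaches the end of the media playing in the MultimodalTextbox.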
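
As a sketch of how two of the listeners in the table might be attached, the example below updates a status line on every `change` and echoes the message on `submit`. The handler names and the two output textboxes are invented for illustration.

```python
from typing import Optional


def on_change(value: Optional[dict]) -> str:
    """Runs on every value change, whether from typing or a function update."""
    text = (value or {}).get("text") or ""
    return f"{len(text)} characters typed"


def on_submit(value: Optional[dict]) -> str:
    """Runs when the user presses Enter while the textbox is focused."""
    value = value or {}
    return f"Submitted: {value.get('text') or ''} ({len(value.get('files') or [])} files)"


def build_demo():
    """Attach both listeners to one MultimodalTextbox (needs gradio)."""
    import gradio as gr

    with gr.Blocks() as demo:
        box = gr.MultimodalTextbox(interactive=True)
        status = gr.Textbox(label="Status")
        result = gr.Textbox(label="Result")
        box.change(on_change, box, status)
        box.submit(on_submit, box, result)
    return demo
```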

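Event Parameters

fn: Callable | None | Literal['decorator']

Default = "decorator"

The function to call when this event is triggered. Often a machine learning model's prediction function. Each parameter of the function corresponds to one input component, and the function should return a single value or a tuple of values, with each element in the tuple corresponding to one output component.

inputs: Component | BlockContext | list[Component | BlockContext] | Set[Component | BlockContext] | None

Default = None

List of gradio.components to use as inputs. If the function takes no inputs, this should be an empty list.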

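outputs: Component | BlockContext | list[Component | BlockContext] | Set[Component | BlockContext] | None

Default = None

List of gradio.components to use as outputs. If the function returns no outputs, this should be an empty list.

api_name: str | None | Literal[False]

Default = None

Defines how the endpoint appears in the API docs. Can be a string, None, or False. If set to a string, the endpoint is exposed in the API docs with the given name. If None (default), the name of the function is used as the API endpoint. If False, the endpoint is not exposed in the API docs and downstream apps (including those that `gr.load` this app) are not able to use this event.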

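scroll_to_output: bool

Default = False

If True, scrolls to the output component on completion.

show_progress: Literal['full', 'minimal', 'hidden']

Default = "full"

How to show the progress animation while the event is running: "full" shows a spinner that covers the output component area as well as a runtime display in the upper right corner, "minimal" shows only the runtime display, and "hidden" shows no progress animation at all.

show_progress_on: Component | list[Component] | None

Default = None

Component or list of components on which to show the progress animation. If None, the progress animation is shown on all of the output components.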

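queue: bool

Default = True

If True, places the request on the queue, if the queue has been enabled. If False, does not put this event on the queue, even if the queue has been enabled. If None, uses the queue setting of the Gradio app.

batch: bool

Default = False

If True, the function should process a batch of inputs, meaning that it should accept a list of input values for each parameter. The lists should be of equal length (and be up to length `max_batch_size`). The function is then *required* to return a tuple of lists (even if there is only one output component), with each list in the tuple corresponding to one output component.

max_batch_size: int

Default = 4

Maximum number of inputs to batch together if this is called from the queue (only relevant if batch=True).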
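
The batching contract described above (one list per parameter in, a tuple of lists out) is easy to get wrong, so a small sketch follows. The function name `shout_batch` and the choice of `max_batch_size=8` are illustrative assumptions.

```python
from typing import List, Tuple


def shout_batch(messages: List[dict]) -> Tuple[List[str]]:
    """Batched handler: one list in per parameter, a tuple of lists out."""
    results = [(m.get("text") or "").upper() for m in messages]
    # A tuple is required even though there is only one output component.
    return (results,)


def build_demo():
    """Register the batched handler on submit (needs gradio)."""
    import gradio as gr

    with gr.Blocks() as demo:
        box = gr.MultimodalTextbox()
        out = gr.Textbox()
        box.submit(shout_batch, box, out, batch=True, max_batch_size=8)
    return demo
```

Each position in the returned list corresponds to the request at the same position in the input list.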

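preprocess: bool

Default = True

If False, the component data is not preprocessed before running 'fn' (e.g. if this method is called with an `Image` component, the data is left as a base64 string).

postprocess: bool

Default = True

If False, the component data is not postprocessed before the 'fn' output is returned to the browser.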

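cancels: dict[str, Any] | list[dict[str, Any]] | None

Default = None

A list of other events to cancel when this listener is triggered. For example, setting cancels=[click_event] cancels click_event, where click_event is the return value of another component's .click method. Functions that have not yet run (or generators that are iterating) are cancelled, but functions that are currently running are allowed to finish.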
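
One way the `cancels` parameter might be used is sketched below: a generator streams progress after `submit`, and a separate button cancels that event. The generator `slow_count`, its `steps` and `delay` arguments, and the Stop button are all assumptions for this example.

```python
import time
from typing import Iterator, Optional


def slow_count(message: Optional[dict], steps: int = 3, delay: float = 0.5) -> Iterator[str]:
    """Generator handler; a cancelled event stops it between yields."""
    for i in range(1, steps + 1):
        time.sleep(delay)
        yield f"step {i} of {steps}"


def build_demo():
    """Cancel the running submit event from a button (needs gradio)."""
    import gradio as gr

    with gr.Blocks() as demo:
        box = gr.MultimodalTextbox()
        out = gr.Textbox()
        stop = gr.Button("Stop")
        run_event = box.submit(slow_count, box, out)
        # No function to run here: the click only cancels run_event.
        stop.click(fn=None, inputs=None, outputs=None, cancels=[run_event])
    return demo
```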

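trigger_mode: Literal['once', 'multiple', 'always_last'] | None

Default = None

If "once" (the default for all events except `.change()`), no new submissions are allowed while an event is pending. If set to "multiple", unlimited submissions are allowed while an event is pending, and "always_last" (the default for `.change()` and `.key_up()` events) allows a second submission after the pending event is complete.

js: str | Literal[True] | None

Default = None

Optional frontend JS method to run before running 'fn'. The input arguments of the JS method are the values of 'inputs' and 'outputs', and the return value should be a list of values for the output components.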

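concurrency_limit: int | None | Literal['default']

Default = "default"

If set, this is the maximum number of instances of this event that can run simultaneously. Can be set to None to mean no concurrency limit (any number of instances of this event can run simultaneously). Set to "default" to use the default concurrency limit (defined by the `default_concurrency_limit` parameter in `Blocks.queue()`, which itself is 1 by default).

concurrency_id: str | None

Default = None

If set, this is the id of the concurrency group. Events with the same concurrency_id are limited by the lowest set concurrency_limit.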
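
The "lowest set concurrency_limit" rule can be read as follows; this is a simplified interpretation of the two descriptions above, not Gradio's queue implementation. The helper `group_limit`, the handler `ack`, and the group id "shared" are hypothetical names.

```python
from typing import List, Optional, Union

Limit = Union[int, None, str]  # an int, None (no limit), or "default"


def group_limit(limits: List[Limit], default_limit: Optional[int] = 1) -> Optional[int]:
    """Resolve "default" entries, then take the lowest limit that is set."""
    resolved = [default_limit if l == "default" else l for l in limits]
    numeric = [l for l in resolved if l is not None]
    return min(numeric) if numeric else None


def build_demo():
    """Two events sharing one concurrency group (needs gradio)."""
    import gradio as gr

    def ack(message):
        return "received: " + ((message or {}).get("text") or "")

    with gr.Blocks() as demo:
        box_a = gr.MultimodalTextbox(label="A")
        box_b = gr.MultimodalTextbox(label="B")
        out = gr.Textbox()
        box_a.submit(ack, box_a, out, concurrency_limit=2, concurrency_id="shared")
        box_b.submit(ack, box_b, out, concurrency_limit=4, concurrency_id="shared")
    demo.queue(default_concurrency_limit=1)
    return demo
```

Under this reading, the two events above would together be capped at 2 simultaneous runs, since 2 is the lowest limit set in the "shared" group.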

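show_api: bool

Default = True

Whether to show this event on the "view API" page of the Gradio app, or in the ".view_api()" method of the Gradio clients. Unlike setting api_name to False, setting show_api to False still allows downstream apps and clients to use this event. If fn is None, show_api is automatically set to False.

time_limit: int | None

Default = None

stream_every: float

Default = 0.5

like_user_message: bool

Default = False

key: int | str | tuple[int | str, ...] | None

Default = None

A unique key for this event listener, for use in @gr.render(). If set, this value identifies an event as the same event across re-renders when the key is identical.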
