Dify x Open Audio: Expand Your AI with the Fish Audio Plugin — TTS and Voice Cloning Made Easy

Dify integrates Fish Audio from Open Audio, enabling AI apps with text-to-speech and voice cloning.

Evan Chen

Product Manager

Leilei

Product Marketing

Written on

Mar 26, 2025

We are thrilled to announce a new collaboration between Dify and Open Audio. The versatile Fish Audio toolset plugin from Open Audio is now available on the Dify Marketplace. This integration enables Dify users to seamlessly incorporate high-quality text-to-speech and voice cloning into their AI applications.

Core Functions of Fish Audio

Fish Audio excels in speech generation and processing, offering the following key capabilities:

Speech Generation (TTS): Fish Audio provides robust real-time text-to-speech conversion. It features a WebSocket API for streaming audio output, giving users control over parameters like speed and volume. It supports common audio formats including Opus, MP3, and WAV.

Voice Cloning: The tool also features excellent voice cloning abilities. Users can perform fast cloning with just 30-45 seconds of voice samples.
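As a rough illustration of the TTS parameters mentioned above, the Python sketch below assembles and validates a request body. The `TTSRequest` class, its field names, and the default values for `speed` and `volume` are assumptions made for illustration, not Fish Audio's documented schema; only the three format names come from this post. Check the official API reference before relying on any of it.

```python
from dataclasses import dataclass, asdict

# Formats named in this post; the real API may accept others.
SUPPORTED_FORMATS = {"opus", "mp3", "wav"}


@dataclass
class TTSRequest:
    """Hypothetical request shape; field names are illustrative only."""
    text: str
    format: str = "mp3"
    speed: float = 1.0   # assumed convention: 1.0 = normal speaking rate
    volume: float = 0.0  # assumed convention: 0.0 = no gain change

    def to_payload(self) -> dict:
        """Validate the fields and return a plain dict ready to serialize."""
        if not self.text.strip():
            raise ValueError("text must not be empty")
        fmt = self.format.lower()
        if fmt not in SUPPORTED_FORMATS:
            raise ValueError(f"unsupported format: {self.format!r}")
        if self.speed <= 0:
            raise ValueError("speed must be positive")
        payload = asdict(self)
        payload["format"] = fmt  # normalize to lowercase
        return payload


if __name__ == "__main__":
    print(TTSRequest(text="Hello from Dify", format="WAV", speed=1.2).to_payload())
```

Validating before sending keeps format typos and empty strings from turning into failed requests.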

Getting Started

To begin using Fish Audio tools in Dify, find and install the "Fish Audio" plugin from the Dify Marketplace.

Next, configure the plugin with your Fish Audio API key and endpoint URL, which you can obtain from Fish Audio. You'll also need to select the balance mode during this setup.

Using the Fish Audio TTS Tool in a Dify Chatflow

For example, you can build a Dify chatflow in which a large language model (LLM) generates text, then use the Fish Audio Text-to-Speech (TTS) tool node to convert that output into audio automatically.

To configure the Fish Audio TTS node within your workflow:

Input Text: Specify the text you want to convert to speech. In this case, you would link the text output from the LLM node to the input field of the TTS node.
Select Voice: Choose the desired voice by selecting the appropriate Voice ID.
Output Format: Set your preferred output audio file type.

This setup allows the workflow to seamlessly generate speech from the LLM's written response using the specific voice and format you've chosen.
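The data flow described above can be pictured with a small Python sketch. In Dify this wiring is done visually, so the code is only a mental model: `llm_node`, `tts_node_inputs`, and the dictionary keys are hypothetical names, not part of Dify or the Fish Audio plugin.

```python
from typing import Dict


def llm_node(query: str) -> Dict[str, str]:
    """Stand-in for the LLM node; a real chatflow would call a model here."""
    return {"text": f"Thanks for asking about {query}."}


def tts_node_inputs(
    llm_output: Dict[str, str], voice_id: str, output_format: str
) -> Dict[str, str]:
    """Map the LLM node's text output onto the three TTS node settings."""
    return {
        "input_text": llm_output["text"],  # linked from the LLM node
        "voice_id": voice_id,              # copied from the Fish Audio platform
        "output_format": output_format,    # for example "mp3"
    }


if __name__ == "__main__":
    reply = llm_node("shipping times")
    print(tts_node_inputs(reply, voice_id="YOUR_VOICE_ID", output_format="mp3"))
```

The point of the sketch is that the TTS node needs nothing beyond the upstream text, a Voice ID, and a format.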

Understanding Voice ID

A Voice ID is the unique identifier for a specific voice model on the Fish Audio platform. It essentially represents a distinct voice profile that you can select for text-to-speech generation.

Creating and Using Custom Voices

You aren't limited to the standard voices. You can train your own voice model using the "Build Voice" feature within Fish Audio. Once training is complete, your custom voice appears under "My Library". Copy its Voice ID from there to use it in your Dify workflows.

Real-World Use Cases

Multilingual Customer Support Scenarios: Using Fish Audio's voice cloning feature, businesses can create custom voice models based on recordings of their top customer service representatives. The system then automatically turns written customer service replies into natural-sounding audio using these custom voices. It can even switch to the appropriate voice and language automatically based on the customer's language. This whole process leverages Fish Audio's core capabilities: voice cloning, automatic speech recognition (ASR), and text-to-speech (TTS), leading to more natural and efficient customer interactions.
Creating Educational and Training Content: For education and training, Fish Audio helps quickly create standardized course materials. For instance, in language learning, it can clone the voices of native speakers to provide clear pronunciation examples, while also using ASR technology to give real-time feedback on a learner's pronunciation. Furthermore, TTS technology can generate consistent audio explanations for course content. This streamlines both the creation and delivery of educational materials, ensuring consistency.
Podcast and Media Content Creation: Fish Audio offers media creators a flexible solution for producing content. Creators can use samples of their own voice to create a personalized digital voice and then use this model to turn written scripts into audio recordings. In post-production, the ASR feature can quickly generate transcripts and subtitles, making the content more accessible. The platform also allows adjusting things like speaking speed and emotional tone to ensure the final audio fits their creative needs.
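The automatic voice switching mentioned in the customer support scenario could be approximated with a simple lookup from the customer's language to a Voice ID. This is a minimal sketch under stated assumptions: the Voice IDs are placeholders, `pick_voice` is a hypothetical helper, and language detection is assumed to happen upstream.

```python
# Placeholder Voice IDs; replace them with IDs from your own Fish Audio library.
VOICE_BY_LANGUAGE = {
    "en": "voice-id-english-agent",
    "ja": "voice-id-japanese-agent",
}
DEFAULT_LANGUAGE = "en"


def pick_voice(language_tag: str) -> str:
    """Return the Voice ID for a tag such as 'ja-JP', falling back to the default."""
    primary = language_tag.split("-")[0].lower()  # 'ja-JP' -> 'ja'
    return VOICE_BY_LANGUAGE.get(primary, VOICE_BY_LANGUAGE[DEFAULT_LANGUAGE])


if __name__ == "__main__":
    print(pick_voice("ja-JP"))
    print(pick_voice("fr"))  # no French voice configured, so the default is used
```

Falling back to a default voice means an unsupported language still produces audio rather than an error.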

About Open Audio

Open Audio is a research lab belonging to Hanabi AI Inc., dedicated to providing better audio-related projects for the open-source community. Its product, Fish Audio, offers audio synthesis and speech recognition capabilities that have reached industry-leading levels in both open-source and closed-source domains.


About Dify.AI

Dify.AI is revolutionizing AI-native application development by providing an open-source platform that simplifies the entire lifecycle of AI application creation, deployment, and management. With its extensible plugin ecosystem, Dify.AI enables developers and businesses to seamlessly integrate AI capabilities, customize workflows, and accelerate innovation. By lowering the barriers to AI adoption, Dify.AI empowers users to build intelligent applications with greater efficiency and flexibility.
