We’re excited to share that Dify’s Knowledge Pipeline now officially integrates with TiDB Vector, a high-performance distributed vector database, strengthening our data foundation for large-scale, intelligent retrieval.
A few weeks ago, we launched the Knowledge Pipeline to help developers build modular RAG workflows for knowledge ingestion, parsing, and embedding. With the integration of TiDB Vector, developers can seamlessly reuse the processed knowledge, such as parsed tables, extracted entities, and embedded text, across downstream AI applications like Agent Workflows or Chatflows.
TiDB Vector provides a unified, high-performance data layer that supports hybrid SQL + vector search, enabling developers to use SQL syntax to filter structured metadata before performing semantic retrieval. This ensures precise, context-aware responses while maintaining enterprise-grade scalability and stability through TiDB’s distributed architecture.
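To make the hybrid pattern concrete, here is a sketch of such a query. The table and column names are illustrative assumptions, not Dify's internal schema; `VEC_COSINE_DISTANCE` is TiDB's built-in vector distance function:

```sql
-- Hypothetical example: filter structured metadata with plain SQL,
-- then rank the remaining rows by semantic similarity.
-- Table and column names are illustrative, not Dify's actual schema.
SELECT id, text,
       VEC_COSINE_DISTANCE(embedding, '[0.12, 0.08, ...]') AS distance
FROM knowledge_chunks
WHERE JSON_EXTRACT(meta, '$.source') = 'product-manual'  -- SQL metadata filter
ORDER BY distance                                        -- semantic ranking
LIMIT 5;
```

Because the metadata filter and the vector ranking run in one statement, there is no need to shuttle candidate IDs between a relational store and a separate vector database.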
Quick Start: Configuring TiDB Vector with Dify
Prerequisites
Make sure that you have deployed Dify locally and created a TiDB Cloud account.
I. Create a TiDB Vector cluster on TiDB Cloud:
Create a cluster and configure its basic settings.
Initialize a schema:
create schema dify;
From the cluster's connection properties, note the HOST, PORT (4000 by default), USER, PASSWORD, and DATABASE values.

II. Configure the Dify environment:
In docker-compose.yaml, update both api and worker service environments using values from your TiDB cluster properties:
# Use TiDB Vector as the vector store
VECTOR_STORE: tidb_vector
# Fill in the relevant information
TIDB_VECTOR_HOST: gateway01.eu-central-1.prod.aws.tidbcloud.com
TIDB_VECTOR_PORT: 4000
TIDB_VECTOR_USER: <your_user>.root
TIDB_VECTOR_PASSWORD: <your_password>
TIDB_VECTOR_DATABASE: dify
III. Upload and Process Data in the Pipeline:
In Dify, you can create Knowledge Pipelines to process unstructured data before storing it in TiDB. The pipeline automatically handles extraction, chunking, embedding, and storage of vectors into TiDB Vector, according to your configuration.
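Under the hood, each processed chunk ends up as a row in a TiDB table with a vector column. As a rough sketch (the exact table Dify provisions will differ; the names and embedding dimension below are assumptions):

```sql
-- Illustrative only: Dify's actual table layout may differ.
CREATE TABLE knowledge_chunks (
    id        CHAR(36) PRIMARY KEY,  -- chunk identifier
    text      TEXT,                  -- the chunked document text
    meta      JSON,                  -- structured metadata (source, page, ...)
    embedding VECTOR(1536)           -- embedding vector; dimension depends on the embedding model
);
```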
To make setup easier, Dify provides several ready-to-use templates designed for common scenarios: efficient processing for general documents, advanced parent-child chunking for lengthy technical manuals, or structured Q&A extraction from tabular data.

When you finish creating the knowledge base and uploading your data, the processed content, including text, metadata, and generated embeddings, is securely stored in TiDB.

IV. Build a RAG Application Workflow:
Once your knowledge base is ready, you can use it as context to build an agentic workflow. The data stored in TiDB Vector can then be retrieved through the Knowledge Node as contextual knowledge, helping LLMs reason with greater accuracy.

The Knowledge Retrieval Node
Choose the knowledge base you previously created, which now uses TiDB Vector for vector storage, as the data source. When a user submits a query, this node automatically converts the question into a vector, runs a similarity search in TiDB, and retrieves the most relevant information.
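In TiDB terms, that similarity search is essentially a nearest-neighbor query ordered by vector distance. A sketch, assuming a hypothetical knowledge_chunks table with an embedding vector column (names are illustrative):

```sql
-- Sketch of the node's retrieval step: the user's question is embedded,
-- then the top-k closest chunks are fetched. Names are assumptions.
SELECT text,
       VEC_COSINE_DISTANCE(embedding, '[0.91, 0.05, ...]') AS distance
FROM knowledge_chunks
ORDER BY distance ASC
LIMIT 3;  -- top-k most relevant chunks
```

The retrieved rows are then passed downstream as context, so the node's top-k setting directly controls how much material the LLM sees.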

The LLM Node
The information retrieved from TiDB is passed to the LLM node, enabling the model to generate responses that are more accurate, context-aware, and grounded in real-world data.

With Dify’s visual workflow builder and TiDB’s distributed vector storage, teams can efficiently design, deploy, and scale AI assistants, document Q&A systems, and knowledge bots.
This modular and end-to-end approach streamlines every step of RAG development, from data ingestion and retrieval to generation, within a single unified workflow.
About TiDB
TiDB is an open-source, distributed SQL database designed to drive enterprise digital transformation. Its distributed architecture offers a scalable data infrastructure that supports a variety of business workloads. Traditionally, organizations needed multiple technology stacks to address different data processing needs, but TiDB consolidates these capabilities into a unified real-time HTAP platform, supporting both transactional and analytical tasks.
About Dify
Dify is an open-source platform for developing LLM applications. Its intuitive interface combines agentic AI workflows, RAG pipelines, agent capabilities, model management, observability features, and more—allowing you to quickly move from prototype to production.





