Today, we are excited to share that Tavily, a powerful web search API designed for AI applications, RAG, and agentic flows, is now available as a data source plugin in the Dify Marketplace.
A few weeks ago, we launched the Dify Knowledge Pipeline, a new starting point for users to orchestrate their RAG workflow. Every component is pluggable, from uploading and parsing data sources to chunking and embedding them. This flexibility lets users tailor the pipeline to their unstructured data, which helps keep LLM responses accurate and context-aware.

Below, we walk through one way to fit Tavily into your Knowledge Pipeline.
Getting Started with Tavily
Setup
Install the Tavily plugin from the Dify Marketplace.
Sign up for a free Tavily account to get your API key, then add the key in Dify under Settings > Data Source > Tavily.

Build Your Pipeline
Choose a template or build from scratch.
We've created several templates for common use cases: simple documents, long technical manuals, complex PDFs, and structured tables. Start with a template, customize it for your needs, and deploy.

Here's one way: use the LLM-generated Q&A template and select Tavily as your data source.
This pipeline uses an LLM to extract key information from Tavily's web crawl and generate Q&A pairs. The processed data is stored in your knowledge base for on-demand retrieval.
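
To make the Q&A step concrete, here is a minimal sketch of how LLM output could be split into question-and-answer records before they are stored. It is not the template's actual implementation: the "Q:" / "A:" output format, the QAPair type, and the parse_qa_pairs helper are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class QAPair:
    """One question-and-answer record (hypothetical shape, for illustration)."""
    question: str
    answer: str


# Assumed output format: each pair is written as "Q: ..." followed by "A: ...".
# An answer runs until the next "Q:" marker or the end of the text.
_QA_PATTERN = re.compile(r"Q:\s*(.+?)\s*A:\s*(.+?)(?=\s*Q:|\Z)", re.DOTALL)


def parse_qa_pairs(llm_output: str) -> list[QAPair]:
    """Split raw LLM text into QAPair records, trimming surrounding whitespace."""
    return [
        QAPair(question.strip(), answer.strip())
        for question, answer in _QA_PATTERN.findall(llm_output)
    ]


if __name__ == "__main__":
    sample = (
        "Q: What is Tavily?\n"
        "A: A web search API designed for AI applications.\n"
        "Q: Where is the API key added in Dify?\n"
        "A: Under Settings > Data Source > Tavily.\n"
    )
    for pair in parse_qa_pairs(sample):
        print(pair.question, "->", pair.answer)
```

A simple pattern like this breaks if an answer itself contains "Q:", so a structured output format such as JSON would be a sturdier choice in practice.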

A Few Tips:
1. Create a variable in settings for search query input.

2. Adjust crawl parameters (search depth, max results, topic filters, etc.) to fit your use case.

3. Configure your LLM node to use Tavily's output as context in the system prompt.
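
The tips above can be sketched in code. In Dify these settings are configured in the pipeline editor, so the snippet below is only an illustration of the idea, not the plugin's interface. The helper names (build_search_params, build_system_prompt) are hypothetical, and the parameter names, allowed values, and result fields (title, url, content) are assumptions modeled on Tavily's search options; check them against Tavily's documentation before relying on them.

```python
from typing import Any

# Assumed set of accepted search depths; verify against Tavily's documentation.
ALLOWED_DEPTHS = {"basic", "advanced"}


def build_search_params(
    query: str,
    search_depth: str = "basic",
    max_results: int = 5,
    topic: str = "general",
) -> dict[str, Any]:
    """Validate the user-supplied query and settings, then return them as a dict."""
    if not query.strip():
        raise ValueError("query must not be empty")
    if search_depth not in ALLOWED_DEPTHS:
        raise ValueError(f"search_depth must be one of {sorted(ALLOWED_DEPTHS)}")
    if max_results < 1:
        raise ValueError("max_results must be at least 1")
    return {
        "query": query.strip(),
        "search_depth": search_depth,
        "max_results": max_results,
        "topic": topic,
    }


def build_system_prompt(results: list[dict[str, str]]) -> str:
    """Format search results as numbered context blocks inside a system prompt."""
    context = "\n\n".join(
        f"[{index}] {item['title']} ({item['url']})\n{item['content']}"
        for index, item in enumerate(results, start=1)
    )
    return (
        "Generate question-and-answer pairs using only the context below.\n\n"
        f"Context:\n{context}"
    )


if __name__ == "__main__":
    print(build_search_params("Tavily rate limits", search_depth="advanced", max_results=3))
    print(build_system_prompt([
        {"title": "Example page", "url": "https://example.com", "content": "Example body text."},
    ]))
```

Numbering each source in the context makes it easier to ask the LLM to attribute an answer to the page it came from.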

Final step: configure the Knowledge Base node.
Set your index, choose an embedding model, and adjust retrieval settings. Publish the pipeline, and you're good to go.

Putting It to Work
As an example, we crawled Tavily's help documentation. Each article is processed into Q&A pairs and stored in the knowledge base.

Now this knowledge base is available across all your Dify applications: chatbots, agents, workflows, anywhere you need to build with context.
About Tavily
Search. Extract. Crawl. The web access stack built for builders, by builders. Tavily powers the next generation of agents with a suite of tools for real-time Search, structured data Extraction, and fully-rendered Crawling — everything agents need to access and reason over the live web. Purpose-built for RAG, autonomy, and production-grade agent systems.
About Dify
Dify is an open-source platform for developing LLM applications. Its intuitive interface combines agentic AI workflows, RAG pipelines, agent capabilities, model management, observability features, and more—allowing you to quickly move from prototype to production.