Inside Dify

Dify 0.3.13 Release: Effortlessly Leverage Top Open-Source LLMs like Llama2 and ChatGLM



Aug 15, 2023

Many attentive users noticed updates to our product over the weekend: version 0.3.13, a release many of you have been waiting for, integrates a number of new LLMs from model providers.

Dify already supported leading LLMs such as OpenAI's GPT series, Anthropic's Claude series, and the Azure OpenAI series. With this update, you can also use popular open-source LLMs including Llama2, ChatGLM, Baichuan, and Qwen-7B. For models hosted on Hugging Face and Replicate, you only need to enter your API key in Dify to access them.

Easily leverage leading open-source LLMs

Dify supports all models on Replicate and Hugging Face. You can plug these open-source LLMs into your projects and quickly build AI applications that perform well.

Let's see how to leverage the Llama2 open-source LLM hosted on Replicate in Dify!

Explore the potential of different models on Dify.AI

Quick switching lets you test different models. With the LLMs integrated in Dify, you can switch between models on the application page and evaluate how each one performs in a particular scenario within minutes. This helps you choose a model based on test results rather than guesswork.

Reduce the cost of comparing and selecting models. In the past, exploring the capability boundaries of different models meant studying each model, repeatedly adjusting parameters, and investing a lot of time and effort. With Dify, you just click in the application's model selector to switch models and compare their capabilities. In addition, Dify has tuned each model's configuration and preset an optimized system prompt, which simplifies complex parameter settings. You don't need to learn the usage details of each model; just select one in Dify and start from those defaults. This lowers the barrier to model selection and tuning, so you can build LLM applications more efficiently.
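To make the idea of side-by-side comparison concrete, here is a minimal Python sketch of what such a comparison amounts to outside the UI. It is not Dify's API: `compare_models`, `fake_llama2`, and `fake_chatglm` are hypothetical names, and the two "models" are local stand-ins so the example runs without any API key. In practice each callable would wrap a real model call.

```python
import time
from typing import Callable, Dict, List


def compare_models(prompt: str, models: Dict[str, Callable[[str], str]]) -> List[dict]:
    """Send the same prompt to every model and record its answer and latency."""
    results = []
    for name, generate in models.items():
        start = time.perf_counter()
        answer = generate(prompt)
        elapsed = time.perf_counter() - start
        results.append({"model": name, "answer": answer, "seconds": elapsed})
    return results


# Hypothetical stand-ins for real model calls (assumption: a real version
# would call a hosted model and return its text completion).
def fake_llama2(prompt: str) -> str:
    return "llama2: " + prompt.upper()


def fake_chatglm(prompt: str) -> str:
    return "chatglm: " + prompt[::-1]


if __name__ == "__main__":
    rows = compare_models("hello", {"llama2": fake_llama2, "chatglm": fake_chatglm})
    for row in rows:
        print(f'{row["model"]:>8} | {row["seconds"]:.6f}s | {row["answer"]}')
```

Keeping each model behind the same one-argument callable is what makes switching cheap: the prompt and the evaluation loop stay fixed, and only the dictionary entry changes.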

Which large language model to use?

New models are arriving at a dazzling pace this year, a true "hundred models bloom" era. Deployment cost, training, inference performance, and model capability all play a crucial role in model selection. Below is a summary of the strengths and capabilities of some popular open-source LLMs for reference, based on evaluations by professional institutions and official disclosures:

Llama2: Meta's original Llama is often called the "ancestor" of today's open-source models. Its capabilities even surpass GPT-3 on many benchmarks, but unfortunately its license did not allow commercial use. Meta has since released the Llama 2 series, whose license permits commercial use by enterprises and developers, with significant improvements in scale, training data, and architecture. Compared to Llama 1, Llama 2 was pretrained on 40% more data, about 2 trillion tokens. The context length also doubled to 4096 tokens, making Llama 2 better suited to long-sequence tasks.

ChatGLM: ChatGLM-6B is a Chinese-English bilingual conversational model with 6.2 billion parameters from Zhipu AI. It supports inference on consumer GPUs. Trained on about 1 trillion Chinese and English tokens, ChatGLM is further optimized through multiple techniques. Although its smaller size makes it weaker in memory and language capabilities, it lowers deployment barriers and improves efficiency. ChatGLM2-6B, the second-generation version available on Hugging Face, retains features like smooth conversation while improving performance, efficiency, and compatibility across the board.
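As a rough illustration of why a model of this size can run on a consumer GPU, the sketch below estimates the memory needed for the weights alone at different numeric precisions. The helper `weight_memory_gb` is a hypothetical name, and the estimate is an assumption-laden lower bound: it ignores activations, the attention cache, and framework overhead, so real usage is higher.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(num_params: float, precision: str = "fp16") -> float:
    """Weights-only memory estimate in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    chatglm_6b_params = 6.2e9  # ChatGLM-6B has roughly 6.2 billion parameters
    for precision in ("fp16", "int8", "int4"):
        gb = weight_memory_gb(chatglm_6b_params, precision)
        print(f"{precision}: about {gb:.1f} GB for weights")
```

At half precision the weights come to roughly 12 GB, and 4-bit quantization brings them down to about 3 GB, which is why quantized variants fit on mainstream graphics cards.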

Baichuan: Baichuan AI developed the Baichuan models and has open-sourced 7B and 13B versions. Baichuan-7B has 7 billion parameters and was trained on about 1.2 trillion tokens, while Baichuan-13B uses more parameters and more data, with more efficient pretraining and alignment. According to the company, Baichuan will launch a closed-source 53B model next month, along with an online platform and API.

Qwen-7B: Alibaba Cloud open-sourced Qwen-7B, part of its Qwen series. With 7 billion parameters, it became the second most popular text generation model on Hugging Face after Llama2. Qwen-7B is trained on diverse, large-scale pretraining data covering web text, books, code, and more. This gives Qwen-7B a solid knowledge base and makes it adaptable to tasks such as dialogue, text generation, and Q&A.

In addition to the above, Dify will continue to add support for models like Google's PaLM, Baichuan's 53B, and fine-tuned local models.

For more features and fixes please refer to:


If you like Dify, give it a Star ⭐️.