Dify.AI's New Dataset Feature Enhancements
Oct 12, 2023
In the rapidly changing realm of data management, staying updated with the latest features and functionalities is crucial for maintaining a competitive edge. With this in mind, we are excited to roll out a series of updates aimed at enhancing the usability and efficiency of our dataset management tools. These updates introduce refined features and a user-friendly interface, ensuring an effortless and efficient interaction with datasets. Below is a comprehensive overview of the new features, and how they can contribute to a streamlined data management experience.
Referencing Dataset Documentation
When you manually enable the "Citations and Attributions" feature in application orchestration, the output now displays the documentation sources it referenced, such as the names of the cited documents, and you can jump directly from a citation to the editing page of the corresponding dataset document. This makes documents easier to locate and simplifies later edits to their segments.
New Dataset API Features
The Dataset API service is a tool for managing and using data documentation efficiently. With the Dify Dataset API, you can upload datasets, update them in real time, and manage them programmatically. It is tightly integrated with large models, which improves both the user experience and efficiency. We also provide examples to help you understand the API quickly and get hands-on practice.
How to Use Dataset API Feature?
Navigate to the Dataset page, where you can switch to the API page from the navigation on the left. On this page, you can view the Dataset API documentation provided by Dify and manage the credentials for accessing the Dataset API in the API key section.
Examples of Dataset API Calls
Create Empty Dataset
This method allows for the creation of an empty dataset.
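As a minimal sketch of what such a call could look like, the snippet below builds (but does not send) an HTTP request for creating an empty dataset. The base URL `https://dify.example/v1`, the `/datasets` path, the `name` field, and the Bearer-token header are assumptions for illustration; the Dataset API page in your workspace is the authoritative reference.

```python
import json
import urllib.request


def build_create_dataset_request(base_url: str, api_key: str, name: str) -> urllib.request.Request:
    """Build (without sending) a request that would create an empty dataset.

    The "/datasets" path and the {"name": ...} body are assumptions;
    confirm both against the Dataset API documentation page.
    """
    body = json.dumps({"name": name}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/datasets",
        data=body,
        method="POST",
        headers={
            # The key comes from the API key section of the Dataset API page.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values; nothing is sent over the network here.
request = build_create_dataset_request("https://dify.example/v1", "YOUR_API_KEY", "product-faq")
```

Passing the request to `urllib.request.urlopen` would send it; building and sending are kept separate here so the request can be inspected first.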
Dataset List
Query the dataset list by specifying the page number and the number of results per page, which helps with dataset management and selection.
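The page number and result count could be passed as query parameters. The sketch below assumes they are named `page` and `limit` and that the list lives at a `/datasets` path; both names are illustrative, not confirmed by this post.

```python
from urllib.parse import urlencode


def build_list_datasets_url(base_url: str, page: int = 1, limit: int = 20) -> str:
    """Return a URL for one page of the dataset list.

    "page" and "limit" are assumed parameter names; check the Dataset API
    documentation page for the real ones.
    """
    if page < 1 or limit < 1:
        raise ValueError("page and limit must be positive integers")
    query = urlencode({"page": page, "limit": limit})
    return f"{base_url.rstrip('/')}/datasets?{query}"


# Placeholder base URL: second page, ten datasets per page.
url = build_list_datasets_url("https://dify.example/v1", page=2, limit=10)
```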
Create Document via Text
Easily import existing text data through a simple text upload interface.
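A text import typically comes down to posting a small JSON payload. The sketch below only assembles that payload; the field names (`name`, `text`, `indexing_technique`, `process_rule`) and their values are assumptions for illustration and should be checked against the Dataset API documentation page.

```python
import json


def build_text_document_payload(name: str, text: str, indexing_technique: str = "high_quality") -> dict:
    """Assemble a JSON-serializable payload for creating a document from raw text.

    All field names and values here are assumed, not taken from this post.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    return {
        "name": name,
        "text": text,
        "indexing_technique": indexing_technique,
        # Assumed setting: let the service choose its default segmentation rules.
        "process_rule": {"mode": "automatic"},
    }


payload = build_text_document_payload("faq.txt", "Q: What is Dify?\nA: An LLM application platform.")
encoded = json.dumps(payload)  # the body that would be sent with the request
```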
Create Document via File
The file upload feature supports a range of file formats, including markdown, md, pdf, html, htm, xlsx, docx, and csv, which significantly expands the import options.
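Before uploading, a client can check a file's extension against the formats listed above. This is a small local sketch; the helper name `is_supported_upload` is hypothetical, and the service may apply further checks, such as file size, that are not covered here.

```python
from pathlib import Path

# Extensions taken from the list of supported formats in this post.
SUPPORTED_EXTENSIONS = {"markdown", "md", "pdf", "html", "htm", "xlsx", "docx", "csv"}


def is_supported_upload(filename: str) -> bool:
    """Return True if the file extension is one of the listed formats."""
    extension = Path(filename).suffix.lower().lstrip(".")
    return extension in SUPPORTED_EXTENSIONS


candidates = ["notes.MD", "report.pdf", "archive.zip", "README"]
uploadable = [name for name in candidates if is_supported_upload(name)]
```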
Get Document Embedding Status (Progress)
View real-time data processing status, ensuring the timeliness and accuracy of data preparation.
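Checking progress usually means polling until processing finishes. The sketch below shows one way to do that with an injected `fetch_status` callable, so it stays independent of any HTTP client. The `indexing_status` field and its `completed` and `error` values are assumptions about the response shape, not details confirmed by this post.

```python
import time
from typing import Callable, Dict


def wait_for_embedding(
    fetch_status: Callable[[], Dict],
    interval_s: float = 1.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll fetch_status until the document reaches a terminal state.

    fetch_status stands in for a call to the status endpoint and must return
    a dict with an "indexing_status" key (an assumed field name).
    """
    for _ in range(max_polls):
        status = fetch_status()
        if status.get("indexing_status") in ("completed", "error"):
            return status
        sleep(interval_s)
    raise TimeoutError("embedding did not finish within the polling budget")


# Simulated responses in place of real API calls.
_fake_responses = iter([
    {"indexing_status": "indexing"},
    {"indexing_status": "completed"},
])
result = wait_for_embedding(lambda: next(_fake_responses), sleep=lambda _seconds: None)
```

Injecting `sleep` keeps the loop easy to test; in real use the default `time.sleep` spaces out the requests.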
Delete Document
Delete unwanted documents as needed to keep the dataset clean and effective.
Dataset Document List
Provides a quick query interface for viewing the basic information of all documents in the dataset.
Add Document Segmentation
The segmentation feature offers a flexible way to adjust the document structure, aiding in better organization and understanding of document content, thereby improving data usability and value.
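Adding segments can be pictured as sending a list of text chunks for one document. The sketch below only prepares such a list; the `segments`, `content`, and `keywords` field names are assumptions for illustration and should be verified on the Dataset API documentation page.

```python
from typing import Dict, List


def build_segments_payload(segments: List[Dict]) -> Dict:
    """Normalize raw segments into a payload, dropping empty ones.

    Each input dict needs a "content" string and may carry a "keywords" list.
    The output field names are assumed, not confirmed by this post.
    """
    cleaned = []
    for segment in segments:
        content = segment.get("content", "").strip()
        if not content:
            continue  # skip blank segments instead of sending them
        entry = {"content": content}
        if segment.get("keywords"):
            entry["keywords"] = list(segment["keywords"])
        cleaned.append(entry)
    return {"segments": cleaned}


payload = build_segments_payload([
    {"content": "  Refunds are processed within 7 days.  ", "keywords": ["refund"]},
    {"content": "   "},
    {"content": "Support is available by email."},
])
```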
In this update, we have also optimized several smaller features to provide a smoother user experience. Let's take a look!
More Available Embedding Models:
The cloud version of Dify now supports the integration of open-source Embedding models hosted on Hugging Face, offering a broader selection. By switching between models and comparing their performance, you can find the Embedding model that works best for a specific application scenario.
Integration of GPT-3.5-turbo-instruct:
Dify has integrated OpenAI's newly launched GPT-3.5-turbo-instruct model, a significant step forward in how users interact with the model. It has been trained to address issues present in older models and is better at understanding and following user instructions, so it provides clearer, more on-point answers and suits a broader range of applications.