Unlock the potential of RAG: using an AI gateway to help Dify applications "level up"

Wang Haoting, Zhao Yuanxiao


Aug 28, 2025


Dify is an open-source AI application development platform designed to help developers and non-technical users quickly build and operate applications based on generative AI. To date, Dify's open-source repository has surpassed 110,000 GitHub stars, and its large user base makes it one of the most popular choices for building generative AI applications.

However, as Dify has been adopted in real production environments, feedback from cloud customers and the community has surfaced problems with its built-in RAG engine: limited handling of complex document chunking, weak retrieval, and configuration that is neither simple nor intelligent enough. These issues directly degrade the recall quality of Dify's built-in RAG, which in turn hurts the accuracy and reliability of content generated by large models and makes it hard to meet enterprise production requirements for high-precision knowledge retrieval.

To address the concerns raised by cloud customers and community users, the Higress AI gateway acts as a key bridge, enabling Dify applications to conveniently integrate mature RAG engines from the industry. By combining Dify's efficient orchestration capabilities with the retrieval effectiveness of professional RAG engines through the AI gateway, companies can sidestep the limitations of Dify's built-in RAG while retaining their existing Dify application assets, significantly improving the performance of knowledge-driven AI applications in production.

The Limitations of Dify's Built-in RAG Engine

An analysis of issues reported in Dify's open-source community over recent months, together with internal production practice and feedback from cloud customers, shows that although Dify ships a plug-and-play built-in RAG engine, practical deployments still hit several challenges. The most commonly reported issues include:

  • Insufficient capability for complex document processing: parsing and chunking of unstructured documents containing images, charts, PDFs, and the like is fragile, and information extraction accuracy is limited.

  • Weak retrieval functionality: the built-in retrieval strategy performs poorly on complex queries or large knowledge bases, with inadequate recall and relevance ranking, so key information is omitted or incorrectly ordered.

  • Configuration that is neither simple nor intelligent: there are many options to tune, such as chunking strategy and parameters, with no adaptive optimization, so fine-tuning demands a high technical threshold and usability suffers.

These issues indicate that the built-in RAG engine of Dify still has considerable room for improvement compared to widely recognized high-quality RAG engines in the industry. From an open-source perspective, enhancing the built-in RAG capabilities of Dify is an ongoing process that requires continual iteration and optimization.

Helping Dify Applications "Level Up" through the AI Gateway

Numerous RAG engines are now emerging in the market, but their usability and effectiveness vary widely. Building an excellent RAG engine involves more than vectorization, vector storage, and vector matching; it also requires high-quality content understanding and processing algorithms, retrieval optimization strategies, and continual tuning.

Fortunately, many outstanding RAG engines have gradually matured and are winning recognition from developers. For example, Alibaba Cloud's Bailian knowledge base is known for its simple, plug-and-play configuration, and its ongoing low-level optimization has earned its RAG performance recognition from a growing number of enterprise users. RagFlow, an open-source professional RAG engine known for deep document understanding, is favored by users with data storage security and privacy requirements; highly available RagFlow instances can now be deployed with one click on Alibaba Cloud SAE, significantly reducing self-deployment and operational costs.

Therefore, quickly connecting to more professional, higher-quality RAG engines is currently the optimal solution.

To overcome the limitations of Dify's built-in knowledge base, the Higress AI gateway lets Dify applications quickly integrate external high-quality RAG engines, replacing the native functionality with higher-quality RAG capabilities. Users thus enjoy more professional text processing and information retrieval while still using Dify's powerful Workflow and Agent orchestration.

To match users' different needs for processing retrieval results, the Higress AI gateway offers two flexible integration solutions:

  • Solution One: RAG Retrieval Proxy. The Higress AI gateway only performs the retrieval and returns the results to the user, who independently processes them and writes them into the Context. This solution suits more complex scenarios where users have custom needs for information integration.

  • Solution Two: Automatic Retrieval Injection. In the LLM invocation pipeline, the Higress AI gateway automatically performs RAG retrieval and injects the results into the Context. This option suits simpler scenarios where users only care about the LLM's output and have no need to process the retrieved information themselves.

Solution One: RAG Retrieval Proxy

Building on Dify's external knowledge base extension capability, with the Higress AI gateway acting as a proxy, RagFlow and Bailian knowledge bases can be created and connected inside Dify's knowledge base. In a Workflow, a knowledge retrieval node can select the corresponding external knowledge base to obtain retrieval results; for an Agent, the external knowledge base can be selected directly. For a detailed introduction, please refer to AI RAG Retrieval Proxy.

This method is designed specifically for Dify applications and follows Dify's standard usage: external knowledge bases are accessed through Dify's knowledge base extension capability, and the input and output of knowledge retrieval nodes can be conveniently observed via Dify's built-in observability.
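Under this solution, the gateway's proxy route must speak Dify's external knowledge base API. As a rough sketch (field names follow Dify's published external knowledge base API convention; adjust for your Dify version), the request body Dify sends to the proxy's /retrieval endpoint can be built like this:

```python
import json

def build_dify_retrieval_request(knowledge_id: str, query: str,
                                 top_k: int = 4,
                                 score_threshold: float = 0.5) -> str:
    """Build the JSON body Dify sends to an external knowledge base API.

    Field names follow Dify's external knowledge base API convention
    (POST <endpoint>/retrieval); verify against your Dify version.
    """
    payload = {
        "knowledge_id": knowledge_id,  # ID of the backing knowledge base
        "query": query,                # user query to retrieve against
        "retrieval_setting": {
            "top_k": top_k,                       # max chunks to return
            "score_threshold": score_threshold,   # min relevance score
        },
    }
    return json.dumps(payload, ensure_ascii=False)

# Example: the body the gateway's /retrieval route receives
body = build_dify_retrieval_request("kb-example-id", "What is Higress?")
```

The gateway plugin translates this request into the backend engine's own retrieval API (Bailian or RagFlow) and maps the results back into Dify's expected `records` response format.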

Solution Two: Automatic Retrieval Injection

When a Dify application accesses an LLM through the Model API proxy of the Higress AI gateway, the gateway automatically performs RAG retrieval before invoking the LLM and writes the results into the model's Context. The Context can be written in two ways: appending a new system prompt, or inserting at a user-specified position in a Prompt template. For a detailed introduction, please refer to AI Retrieval Enhanced Generation (Advanced Version).

This method is transparent to application developers: there is no need to implement complex knowledge base retrieval steps manually; simply invoking the model provides plug-and-play RAG capabilities. It also applies to other platforms and frameworks such as Spring AI Alibaba and N8N.
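Conceptually, the "append new system prompt" injection mode behaves like the following sketch. The prompt wording and function name here are illustrative, not the plugin's actual template:

```python
from typing import Dict, List

def inject_context(messages: List[Dict[str, str]],
                   chunks: List[str]) -> List[Dict[str, str]]:
    """Prepend retrieved chunks as a system message, mimicking the
    gateway's "append new system prompt" injection mode.

    Illustrative only: the real gateway plugin performs this step
    inside the LLM invocation pipeline, invisible to the caller.
    """
    context = "\n\n".join(chunks)
    system_msg = {
        "role": "system",
        "content": f"Answer using the following reference material:\n{context}",
    }
    # The caller's original messages are left untouched; injection is transparent
    return [system_msg] + messages

msgs = inject_context(
    [{"role": "user", "content": "How do I deploy RagFlow?"}],
    ["RagFlow can be deployed on Alibaba Cloud SAE with one click."],
)
```

From the application's point of view, only the original user message was sent; the enriched Context exists solely between the gateway and the model.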

Operational Guide and Effect Display

Next, this article details the operational steps and actual effects of the two solutions, using the RAG retrieval proxy solution to connect Dify with the Bailian knowledge base, and the automatic retrieval injection solution to connect Dify with RagFlow, as examples.

Case Study One: Dify Knowledge Retrieval Node Accessing Bailian Knowledge Base

This section takes a Dify application connecting to an existing Bailian knowledge base as an example to introduce the operational steps and actual effects of the RAG retrieval proxy solution.

  1. Create the Bailian service and a custom Agent API route in the AI gateway, to serve as Dify's external knowledge base API.

a. Create the Bailian retrieval service in the AI gateway.

b. Create a custom Agent API. Click Agent API - Create Agent API; the domain name and Base Path can be customized as needed, and select the custom protocol.

c. Create the Agent API route. Enter the created Agent API, click to create a route, ensure the path suffix is /retrieval, and select the Bailian service created in the previous step.

  2. Configure AI gateway plugins.

    a. Obtain a Bailian API Key. Log in to the Alibaba Cloud Bailian platform and acquire an API Key from the API Key page.

    b. Configure the plugin. In the gateway instance console, click Plugins - Install Plugins - AI, select the AI RAG Retrieval Proxy plugin, click install, configure the rules, enable it, and click save for the plugin to take effect.

  3. Create the Bailian knowledge base in Dify.

a. Create an external knowledge base API in Dify. In the Dify console, click Knowledge Base - External Knowledge Base API - Add External Knowledge Base API.

b. Obtain the knowledge base ID. Go to the Bailian knowledge base console, select the knowledge base to be retrieved, and obtain its ID.

c. Configure the knowledge base information. In the Dify console, click Knowledge Base - Connect to External Knowledge Base.

  4. Verify retrieval connectivity. On the Dify knowledge base page, click the knowledge base created above and input sample text for a recall test; if text chunks are returned according to the recall settings, connectivity is successful. Dify applications can then access this knowledge base for Bailian RAG retrieval in Workflows and Agents.
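The connectivity check above can also be sanity-checked programmatically against the proxy's /retrieval response. A minimal sketch, assuming the response follows Dify's external knowledge base `records` format (verify the field names against your version):

```python
import json

def check_recall(response_body: str, min_records: int = 1) -> bool:
    """Validate a /retrieval response from the gateway proxy.

    Connectivity is considered successful if at least `min_records`
    text chunks came back. The `records` field follows Dify's external
    knowledge base response format; adjust if your version differs.
    """
    data = json.loads(response_body)
    records = data.get("records", [])
    return len(records) >= min_records

# Simulated response for a recall test against the Bailian knowledge base
sample = json.dumps({"records": [
    {"content": "Higress is a cloud-native API gateway.", "score": 0.92,
     "title": "intro.md", "metadata": {}},
]})
ok = check_recall(sample)  # True: at least one chunk was recalled
```

An empty `records` array usually indicates a routing, API Key, or knowledge base ID misconfiguration rather than a genuine zero-recall result.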

Case Study Two: Automatically Retrieve RagFlow before Model Invocation

Next, this section takes a Dify application connecting to an existing RagFlow knowledge base as an example to introduce the operational steps and actual effects of the automatic retrieval injection solution.

  1. Deploy RagFlow, create the knowledge base, and upload knowledge. For enterprise scenarios, one-click deployment of a highly available RagFlow service on Alibaba Cloud SAE is recommended to reduce deployment and operational costs; for details, please refer to RAGFlow Community Edition - Serverless Deployment.

  2. Create AI services and a text generation Model API in the AI gateway, allowing Dify applications to access models through this API. For proxying Dify model traffic through the AI gateway, please refer to Dify Performance Bottlenecks? Higress AI Gateway Injects the "Soul of High Availability"!.

  3. Configure AI gateway plugins.

    a. Create the RagFlow service in the AI gateway and obtain the FQDN and port of the RagFlow service.

b. Obtain a RagFlow API Key. Enter the RagFlow console, click the user avatar in the upper right corner, select API on the left, then API KEY, and obtain the API Key.

c. Get the RagFlow knowledge base ID. On the RagFlow knowledge base page, click the corresponding knowledge base; the id in the page URL is the knowledge base ID.

d. Configure the gateway plugin. In the gateway instance console, click Plugins - Install Plugins - AI, select the AI Retrieval Enhanced Generation (Advanced Version) plugin, click install, configure the rules, set the effective scope, and enter the parameters obtained in the previous steps into the designated positions.

  4. Debug and verify. In the AI gateway instance console, click the Model API where the plugin is effective and debug it to verify the model's responses after automatic retrieval is added. Once confirmed, access this API from Dify to use the model, gaining RAG capabilities backed by the RagFlow knowledge base.
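The knowledge base ID extraction step above can be scripted. This sketch assumes the ID appears as an `id` query parameter in the RagFlow console URL, which may differ across RagFlow versions:

```python
from urllib.parse import parse_qs, urlparse

def kb_id_from_url(url: str) -> str:
    """Extract the knowledge base ID from a RagFlow console URL.

    Assumes the ID appears as an `id` query parameter; check your
    RagFlow version's URL format if it differs.
    """
    qs = parse_qs(urlparse(url).query)
    return qs.get("id", [""])[0]  # empty string if no id parameter present

# Hypothetical console URL, for illustration only
kb = kb_id_from_url("http://ragflow.example.com/knowledge/dataset?id=abc123")
```

This ID is the value to paste into the plugin configuration in step d.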

Summary and Outlook

Because the effectiveness of Dify's built-in RAG engine is limited in production practice, many Dify application developers want an easy way to connect external knowledge bases and broaden their choice of RAG systems. The Higress AI gateway provides solutions for quickly connecting external RAG engines, combining Dify's efficient orchestration with the retrieval effectiveness of professional RAG engines. In helping Dify applications "level up," the main benefits are as follows:

  1. Performance leap: integrating professional engines such as RAGFlow and the Bailian knowledge base significantly improves the quality of knowledge chunking and retrieval accuracy.

  2. Seamless enhancement: advanced RAG capabilities are obtained through configuration alone, with no changes to Dify application code and zero development cost.

  3. Flexible adaptation: both privately deployed open-source RAG engines and SaaS services are supported, meeting diverse scenario requirements.

Currently, this capability has been launched in the Alibaba Cloud Native AI Gateway. In addition, the Higress AI gateway provides rich capabilities such as security and high-availability governance to enhance the security and availability of Dify applications. On the RAG front, the Higress AI gateway will continue to deepen its RAG capabilities and explore innovations in multi-modality, ecosystem expansion, and new scenarios, helping Dify and other AI applications evolve from "usable" to "high-precision, high-reliability" enterprise-level knowledge hubs.

Contact

Follow and engage with us through the following channels to stay updated on the latest developments from higress.ai.
