Higress v2.1.8: 30 engine updates + 4 console updates

Higress Community

Oct 12, 2025

01 Higress Engine Updates

Overview of This Release

This release includes 30 updates, covering various aspects such as feature enhancements, bug fixes, and performance optimizations.

Distribution of Updates

New Features: 13 items
Bug Fixes: 7 items
Refactoring and Optimization: 5 items
Documentation Updates: 4 items
Testing Improvements: 1 item

Key Focus Areas

Two updates in this release deserve special attention:

feat: add rag mcp server (#2930): Introduces a RAG MCP server that gives users a new way to manage and retrieve knowledge.
refactor(mcp): use ECDS for golang filter configuration to avoid connection drain (#2931): Delivers Golang filter configuration through ECDS instead of embedding it directly in the HTTP_FILTER patch. This avoids connection drain when the configuration changes and reduces unnecessary service interruptions.

For detailed information, please see the important feature descriptions below.

Important Feature Descriptions

Below are the detailed descriptions of important features and improvements included in this release:

1. feat: add rag mcp server

Related PR: #2930 | Contributor: @2456868764

Background

In modern applications, knowledge management and retrieval have become increasingly important. Many systems need to quickly and accurately extract and retrieve information from large amounts of text data. RAG (Retrieval-Augmented Generation) technology combines retrieval and generation models, greatly enhancing the efficiency and accuracy of knowledge management. This PR introduces a Model Context Protocol (MCP) server specifically for knowledge management and retrieval, addressing users' needs for efficient information processing. The target user group includes enterprises and developers who need to handle large amounts of text data, particularly in the fields of natural language processing (NLP) and machine learning.

Feature Description

This PR implements the RAG MCP server and adds modules for knowledge management, chunk management, search, and chat. Core functionality includes:

Knowledge Management: Creates knowledge chunks (blocks) from text.
Chunk Management: Lists and deletes knowledge chunks.
Search: Supports keyword-based search.
Chat: Lets users send chat messages and receive responses.

Technically, the server relies on several external libraries: github.com/dlclark/regexp2 for regular expression processing, github.com/milvus-io/milvus-sdk-go/v2 for vector database management, and github.com/pkoukk/tiktoken-go for text encoding (tokenization). Key code changes include a new HTTP client, configuration files, and multiple processing functions that keep the system flexible and configurable.

Usage Instructions

The steps to enable and configure the RAG MCP server are as follows:

In the higress-config configuration file, enable the MCP server and set the corresponding path and configuration items.
Configure the basic parameters of the RAG system, such as the splitter type, chunk size, and chunk overlap.
Configure the LLM (large language model) provider, its API key, and the model name.
Configure the embedding model provider, its API key, and the model name.
Configure the vector database provider and its connection information.

A sample configuration is as follows:

rag:
  splitter:
    type: "recursive"
    chunk_size: 500
    chunk_overlap: 50
  top_k: 5
  threshold: 0.5
llm:
  provider: "openai"
  api_key: "your-llm-api-key"
  model: "gpt-3.5-turbo"
embedding:
  provider: "openai"
  api_key: "your-embedding-api-key"
  model: "text-embedding-ada-002"
vectordb:
  provider: "milvus"
  host: "localhost"
  port: 19530
  collection: "test_collection"

Notes:

Ensure all configuration items are correct, especially the API keys and model names.
In production environments, it is recommended to appropriately adjust timeout parameters to accommodate different network conditions.
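
To make the chunk_size and chunk_overlap settings from the sample configuration more concrete, here is a minimal Go sketch of a sliding-window splitter. It is only an illustration of how size and overlap interact: the function name splitWithOverlap is hypothetical, and the logic is a plain fixed-size window, not the "recursive" splitter type named in the configuration.

```go
package main

import "fmt"

// splitWithOverlap cuts text into windows of at most chunkSize runes.
// Each window starts chunkSize-overlap runes after the previous one, so
// neighbouring chunks share `overlap` runes of context.
// Hypothetical helper for illustration; not taken from the Higress code base.
func splitWithOverlap(text string, chunkSize, overlap int) []string {
	if chunkSize <= 0 || overlap < 0 || overlap >= chunkSize {
		return nil // invalid parameters: the window could never advance
	}
	runes := []rune(text) // work on runes so multi-byte characters stay intact
	step := chunkSize - overlap
	var chunks []string
	for start := 0; start < len(runes); start += step {
		end := start + chunkSize
		if end > len(runes) {
			end = len(runes)
		}
		chunks = append(chunks, string(runes[start:end]))
		if end == len(runes) {
			break // the last window already reached the end of the text
		}
	}
	return chunks
}

func main() {
	// Tiny values instead of 500/50 so the overlap is easy to see.
	for i, c := range splitWithOverlap("abcdefghij", 4, 1) {
		fmt.Printf("chunk %d: %q\n", i, c)
	}
}
```

With a chunk size of 4 and an overlap of 1, the ten-character input yields three chunks, and each chunk begins with the last character of the previous one. A larger overlap preserves more context across chunk boundaries at the cost of storing more vectors.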

Feature Value

The RAG MCP server provides users with a complete knowledge management and retrieval solution, enhancing the system's intelligence and automation levels. Specific benefits include:

Improved Efficiency: Through integrated knowledge management and retrieval features, users can quickly process and retrieve large amounts of text data, saving time and resources.
Enhanced Accuracy: By combining RAG technology, the system can more accurately extract and retrieve information, reducing error rates.
Flexible Configuration: It offers rich configuration options, allowing users to make flexible adjustments based on actual needs and meet demands in different scenarios.
Strong Scalability: It supports various providers and models, making it easy for users to choose suitable components and tech stacks according to business needs.
Improved Stability: With detailed configuration validation and error handling mechanisms, it ensures the stability and robustness of the system.

2. refactor(mcp): use ECDS for golang filter configuration to avoid connection drain

Related PR: #2931 | Contributor: @johnlanni

Background

Before this change, Golang filter configurations were embedded directly in the HTTP_FILTER patch, which could cause connection drain whenever the configuration changed. There are two main causes: the non-deterministic ordering of Go maps in the map[string]any field, and the listener configuration changes triggered by HTTP_FILTER updates. This problem affects system stability and user experience. The target user group includes developers and operators who manage service meshes with Higress.
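
The map-ordering cause can be reproduced in isolation. The sketch below is not the Higress serialization path; it only shows the general Go behavior that ranging over a map yields keys in an unspecified order, so any output built that way can differ between runs even when the data is unchanged. The configuration keys are made up for the example.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// renderUnordered walks the map with range. Go does not define that order,
// so two calls on the same map may produce different strings.
func renderUnordered(cfg map[string]any) string {
	var parts []string
	for k, v := range cfg {
		parts = append(parts, fmt.Sprintf("%s=%v", k, v))
	}
	return strings.Join(parts, ";")
}

// renderSorted sorts the keys first, so the output is identical on every call.
func renderSorted(cfg map[string]any) string {
	keys := make([]string, 0, len(cfg))
	for k := range cfg {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	parts := make([]string, 0, len(keys))
	for _, k := range keys {
		parts = append(parts, fmt.Sprintf("%s=%v", k, cfg[k]))
	}
	return strings.Join(parts, ";")
}

func main() {
	// Hypothetical filter settings; the key names are illustrative only.
	cfg := map[string]any{"servers": 2, "match_list": "a", "redis": "r", "rate_limit": 10}

	seen := map[string]bool{}
	for i := 0; i < 50; i++ {
		seen[renderUnordered(cfg)] = true
	}
	fmt.Println("distinct unordered renderings:", len(seen))
	fmt.Println("sorted rendering:", renderSorted(cfg))
}
```

If a rendered configuration like this is compared against the previous one to decide whether a listener changed, an unstable ordering can look like a change even when nothing was edited.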

Feature Description

This PR separates configurations into two parts: the HTTP_FILTER only contains references to filters with config_discovery, while EXTENSION_CONFIG contains the actual Golang filter configurations. This way, configuration changes do not directly lead to connection drain. Specific implementations include updating the constructMcpSessionStruct and constructMcpServerStruct methods to return a format compatible with EXTENSION_CONFIG and updating unit tests to match the new configuration structure. The core technological innovation lies in using the ECDS mechanism to separate configurations, making configuration changes smoother.
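
The split described above can be sketched as two separate values: a small filter reference for the HTTP_FILTER patch and a full configuration for the EXTENSION_CONFIG patch. This is a simplified illustration and not the code from the PR: the function splitFilter, the filter name, and the plugin_config payload are assumptions, while the config_discovery, config_source, and type_urls field names follow Envoy's extension config discovery conventions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Type URL of Envoy's Golang HTTP filter configuration message.
const golangFilterTypeURL = "type.googleapis.com/envoy.extensions.filters.http.golang.v3alpha.Config"

// splitFilter returns two values for one filter:
//   - ref: what an HTTP_FILTER patch would carry (name plus config_discovery)
//   - ext: what an EXTENSION_CONFIG patch would carry (name plus typed_config)
//
// Hypothetical helper; the real methods in the PR have different signatures.
func splitFilter(name string, pluginConfig map[string]any) (ref, ext map[string]any) {
	ref = map[string]any{
		"name": name,
		"config_discovery": map[string]any{
			"config_source": map[string]any{"ads": map[string]any{}},
			"type_urls":     []string{golangFilterTypeURL},
		},
	}
	ext = map[string]any{
		"name": name,
		"typed_config": map[string]any{
			"@type": golangFilterTypeURL,
			// Simplified: the real message has more fields than this.
			"plugin_config": pluginConfig,
		},
	}
	return ref, ext
}

func main() {
	ref, ext := splitFilter("example-golang-filter", map[string]any{"servers": []string{"demo"}})
	refJSON, _ := json.MarshalIndent(ref, "", "  ")
	extJSON, _ := json.MarshalIndent(ext, "", "  ")
	fmt.Printf("HTTP_FILTER value:\n%s\n\nEXTENSION_CONFIG value:\n%s\n", refJSON, extJSON)
}
```

The point of the design is that the reference never contains the plugin settings. Editing those settings changes only the EXTENSION_CONFIG value, so the listener that holds the reference stays the same.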

Usage Instructions

This feature requires no additional action to enable or configure; it is handled automatically in the background. When Golang filters are configured in Higress, the system automatically splits them into HTTP_FILTER and EXTENSION_CONFIG, so users only need to configure Golang filters as usual. When upgrading to the new version, make sure all relevant configuration files are updated, and test thoroughly before rolling out to production to confirm that the configuration changes do not introduce other issues.

Feature Value

By separating configurations and using ECDS, this feature eliminates the connection drain issue during configuration changes, significantly improving system stability and user experience. Furthermore, this design makes configuration easier to manage and maintain, reducing potential problems caused by configuration changes. This improvement is particularly important for large-scale service mesh deployments, as it can reduce service interruptions due to configuration changes, thereby improving overall system reliability and availability.

Complete Change Log

New Features

Related PR: #2926 | Contributor: @rinfx
Change Log: This PR adds support for multimodal input, function calls, and reasoning in vertex-ai, introducing a regular expression library and improving the processing logic.
Feature Value: vertex-ai can better serve complex scenarios, such as multimodal data processing and more flexible function calling, which improves flexibility and usability.

Related PR: #2917 | Contributor: @Aias00
Change Log: This PR adds support for Fireworks AI to the AI proxy plugin, including the necessary configuration files and test code.
Feature Value: Users can now use the AI features provided by Fireworks AI, which broadens the range of AI services that can be integrated into applications.

Related PR: #2907 | Contributor: @Aias00
Change Log: This PR upgrades wasm-go to support the outputSchema feature, with dependency updates to the jsonrpc-converter and oidc plugins.
Feature Value: Supporting outputSchema makes the wasm-go plugin more flexible and lets users define and handle output data structures more conveniently.

Related PR: #2897 | Contributor: @rinfx
Change Log: This PR adds multimodal support and reasoning functionality to the ai-proxy Bedrock provider by extending the related code in bedrock.go.
Feature Value: The new multimodal and reasoning support enriches the feature set of ai-proxy and lets users handle more complex scenarios.

Related PR: #2891 | Contributor: @rinfx
Change Log: This PR lets the AI content security plugin configure a specific detection service for each consumer, so users can customize request and response checking rules as needed.
Feature Value: Independent detection services per consumer give users more precise control over the content review process and help meet diverse security policy needs.

Related PR: #2883 | Contributor: @Aias00
Change Log: This PR adds support for Meituan Longcat, including integration with the Longcat platform and related unit tests.
Feature Value: Support for Meituan Longcat extends the plugin to one more AI service provider, increasing application flexibility.

Related PR: #2867 | Contributor: @Aias00
Change Log: This PR adds support for Gzip configuration and updates the default settings. With the new gzip options in the Helm configuration file, users can customize compression parameters to optimize response performance.
Feature Value: Users can adjust the compression level of HTTP responses as needed, which reduces the volume of data transmitted and speeds up page loads.

Related PR: #2844 | Contributor: @Aias00
Change Log: This PR enhances the consistent hashing algorithm for load balancing by supporting useSourceIp. It modifies the related Go code files and adds a sample configuration file.
Feature Value: The new useSourceIp option lets users perform consistent hashing load balancing based on the source IP address, which helps improve service stability and reliability under specific network conditions.

Related PR: #2843 | Contributor: @erasernoob
Change Log: This PR adds NVIDIA Triton server support to the AI proxy plugin, including the relevant configuration instructions and code.
Feature Value: Support for the Triton server extends the AI proxy plugin and lets users take advantage of high-performance machine learning inference services.

Related PR: #2806 | Contributor: @C-zhaozhou
Change Log: This PR makes ai-security-guard compatible with the MultiModalGuard interface, adding support for multimodal APIs and updating the related documentation.
Feature Value: With multimodal API support, ai-security-guard can handle more complex content security scenarios.

Related PR: #2727 | Contributor: @Aias00
Change Log: This PR adds end-to-end testing support for OpenAI, including test cases for non-streaming and streaming requests.
Feature Value: The new end-to-end tests help ensure the system remains stable and accurate when handling different types of requests.

Related PR: #2593 | Contributor: @Xscaperrr
Change Log: This PR adds the WorkloadSelector field to limit the scope of the EnvoyFilter, so that it does not interfere with other components in the same namespace when open-source Istio is also present.
Feature Value: Limiting the EnvoyFilter to the Higress Gateway avoids interference with other Istio gateways and sidecars in the environment, improving the safety and isolation of the configuration.
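
As background for the useSourceIp option in #2844, the following Go sketch shows the general idea of consistent hashing keyed by the client's source IP. It is a conceptual illustration under simple assumptions (FNV-1a hashing, a fixed number of virtual nodes per backend) and does not reflect how Higress or Envoy implement it. All type and function names are hypothetical.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// ring is a minimal consistent-hash ring: sorted hash positions, each
// owned by one backend.
type ring struct {
	points   []uint32
	backends map[uint32]string
}

func hashKey(s string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(s))
	return h.Sum32()
}

// newRing places `replicas` virtual nodes per backend on the ring.
func newRing(backends []string, replicas int) *ring {
	r := &ring{backends: map[uint32]string{}}
	for _, b := range backends {
		for i := 0; i < replicas; i++ {
			p := hashKey(fmt.Sprintf("%s#%d", b, i))
			if _, taken := r.backends[p]; taken {
				continue // hash collision: keep the first owner
			}
			r.backends[p] = b
			r.points = append(r.points, p)
		}
	}
	sort.Slice(r.points, func(i, j int) bool { return r.points[i] < r.points[j] })
	return r
}

// pick returns the backend owning the first ring position at or after the
// hash of the source IP, wrapping around at the end of the ring.
func (r *ring) pick(sourceIP string) string {
	if len(r.points) == 0 {
		return ""
	}
	h := hashKey(sourceIP)
	i := sort.Search(len(r.points), func(i int) bool { return r.points[i] >= h })
	if i == len(r.points) {
		i = 0
	}
	return r.backends[r.points[i]]
}

func main() {
	r := newRing([]string{"10.0.0.1:8080", "10.0.0.2:8080", "10.0.0.3:8080"}, 50)
	for _, ip := range []string{"203.0.113.7", "203.0.113.7", "198.51.100.23"} {
		fmt.Printf("%s -> %s\n", ip, r.pick(ip))
	}
}
```

Requests from the same source IP always land on the same backend as long as the backend set is unchanged, which is the property that makes source-IP hashing useful for session affinity.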

Bug Fixes

Related PR: #2938 | Contributor: @wydream
Change Log: This PR resolves the issue of attack detection failure caused by the absence of AttackLevel field support in MultiModalGuard mode, ensuring all levels of attacks can be accurately identified.
Feature Value: By adding support for the AttackLevel field, system security is improved, preventing situations where high-risk level attack prompts are not intercepted, ensuring user experience and security.

Related PR: #2904 | Contributor: @johnlanni
Change Log: Fixed the issue where the original Authorization header might be overwritten while processing HTTP requests. The header is now saved and checked for a non-empty value before it is written to the context, which keeps the authentication information accurate and secure.
Feature Value: This fix enhances system security and stability, avoiding potential authentication failures or security vulnerabilities due to lost authentication information, thereby enhancing user experience and trust.

Related PR: #2899 | Contributor: @Jing-ze
Change Log: This PR optimizes the MCP server, including pre-parsing host patterns to reduce runtime overhead and removing the unused DomainList field. It also fixes the SSE message format issue, particularly the handling of extra line breaks.
Feature Value: By improving pattern matching efficiency and memory usage, and correcting errors in SSE messages, it enhances user experience and service stability, ensuring correct and complete data transmission.

Related PR: #2892 | Contributor: @johnlanni
Change Log: Corrected the JSON unmarshalling error when the Claude API returns the array format content and removed duplicate code structures, improving code quality and maintainability.
Feature Value: This resolves the issue of message parsing failure due to incorrect data types, enhancing system stability and user experience, ensuring smooth message processing workflows for users using arrays as content formats.

Related PR: #2882 | Contributor: @johnlanni
Change Log: Solved the SSE event chunking issue in the conversion logic of Claude stream responses, improving automatic protocol conversion and tool invocation state tracking.
Feature Value: Enhances the bidirectional conversion reliability between Claude and OpenAI-compatible providers, avoiding connection blocking and improving user experience.

Related PR: #2865 | Contributor: @Thomas-Eliot
Change Log: This PR resolves the issue where the SSE connection is blocked when the SSE events are split into multiple chunks. It does this by adding a caching mechanism in the proxy mcp server scenario to ensure continuity in data flow processing.
Feature Value: This fixes potential issues that could lead to interruptions in SSE connections, enhancing system stability and user experience. Users will no longer encounter incomplete data reception due to network conditions or server response methods.

Related PR: #2859 | Contributor: @lcfang
Change Log: This PR adds a new vport element in mcpbridge, solving the issue where routing configurations become ineffective if registered service instance ports are inconsistent. Major changes include updates to CRD definitions, protobuf files, and related generated code.
Feature Value: This functionality ensures that even if backend instance ports change, routing configurations for the service remain valid, thus improving system stability and compatibility, providing users with a more reliable service experience.
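
The SSE fixes in #2882 and #2865 both concern events that arrive split across several network chunks. A minimal Go sketch of the buffering idea follows: keep incomplete data until the event terminator arrives, then release whole events. It assumes events end with a blank line written as "\n\n" and is not the code from either PR. The type sseBuffer and its method are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// sseBuffer accumulates raw chunks and releases only complete events.
type sseBuffer struct {
	pending strings.Builder
}

// feed appends a chunk and returns every complete event now available.
// Anything after the last terminator stays buffered for the next call.
func (b *sseBuffer) feed(chunk string) []string {
	b.pending.WriteString(chunk)
	data := b.pending.String()

	var events []string
	for {
		idx := strings.Index(data, "\n\n")
		if idx < 0 {
			break // no complete event left in the buffer
		}
		events = append(events, data[:idx])
		data = data[idx+2:]
	}

	// Keep only the unfinished tail.
	b.pending.Reset()
	b.pending.WriteString(data)
	return events
}

func main() {
	buf := &sseBuffer{}
	// One event split across two chunks, followed by the start of another.
	for _, chunk := range []string{"data: hel", "lo\n\ndata: wor", "ld\n\n"} {
		fmt.Printf("chunk %q -> events %q\n", chunk, buf.feed(chunk))
	}
}
```

Without the buffer, a consumer that parses each chunk on its own would see a partial "data: hel" line and either drop it or wait indefinitely, which matches the blocked-connection symptom described in the change log.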

Refactoring

Related PR: #2933 | Contributor: @rinfx
Change Log: Removed duplicate think tags in bedrock and vertex, reducing redundant code and improving code readability and maintainability.
Feature Value: By removing unnecessary duplicate code, the overall quality and development efficiency of the project are improved, making the code structure clearer and facilitating future maintenance and expansion.

Related PR: #2927 | Contributor: @rinfx
Change Log: This PR modifies the API name extraction logic in the ai-statistics plugin, adjusting the check condition from a fixed length of 5 to at least 3 parts, improving flexibility and compatibility.
Feature Value: By loosening the restrictions on API string splitting, it enhances the system's ability to support different formats of API strings, improving adaptability and stability.

Related PR: #2922 | Contributor: @daixijun
Change Log: This PR upgrades the Higress SDK package name used in the project from github.com/alibaba/higress to github.com/alibaba/higress/v2 to ensure compatibility with the latest version.
Feature Value: By updating the package name, it ensures that the project can introduce and utilize the latest features and improvements of Higress, enhancing development efficiency and code quality.

Related PR: #2890 | Contributor: @johnlanni
Change Log: Refactored the matchDomain function, introducing the HostMatcher structure and matching types, replacing regular expressions with simple string operations to improve performance and implementing port stripping logic.
Feature Value: By optimizing host matching logic, it improves system performance and code maintainability, allowing for more accurate and efficient handling of host headers that include port numbers, enhancing user experience.
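
The host-matching refactor in #2890 replaces regular expressions with string operations and strips ports before matching. The Go sketch below illustrates that approach under assumed pattern rules ("*" matches anything, a leading "*" means suffix match, a trailing "*" means prefix match). The names hostMatcher, parsePattern, and stripPort are hypothetical and do not mirror the actual HostMatcher structure in Higress.

```go
package main

import (
	"fmt"
	"strings"
)

type matchType int

const (
	matchExact  matchType = iota // "example.com"
	matchPrefix                  // "api.*"         -> host starts with "api."
	matchSuffix                  // "*.example.com" -> host ends with ".example.com"
	matchAny                     // "*"
)

// hostMatcher is a pattern parsed once, so each request needs only a
// cheap string comparison instead of a regular expression.
type hostMatcher struct {
	kind  matchType
	value string
}

func parsePattern(p string) hostMatcher {
	switch {
	case p == "*":
		return hostMatcher{matchAny, ""}
	case strings.HasPrefix(p, "*"):
		return hostMatcher{matchSuffix, p[1:]}
	case strings.HasSuffix(p, "*"):
		return hostMatcher{matchPrefix, p[:len(p)-1]}
	default:
		return hostMatcher{matchExact, p}
	}
}

// stripPort removes a trailing ":port", leaving bracketed IPv6 literals intact.
func stripPort(host string) string {
	if strings.HasPrefix(host, "[") {
		if end := strings.Index(host, "]"); end >= 0 {
			return host[:end+1]
		}
		return host
	}
	if i := strings.LastIndex(host, ":"); i >= 0 {
		return host[:i]
	}
	return host
}

func (m hostMatcher) matches(host string) bool {
	h := stripPort(host)
	switch m.kind {
	case matchAny:
		return true
	case matchPrefix:
		return strings.HasPrefix(h, m.value)
	case matchSuffix:
		return strings.HasSuffix(h, m.value)
	default:
		return h == m.value
	}
}

func main() {
	m := parsePattern("*.example.com")
	for _, host := range []string{"api.example.com:8080", "example.org", "a.b.example.com"} {
		fmt.Printf("%-22s matches *.example.com: %v\n", host, m.matches(host))
	}
}
```

Parsing the pattern once and matching with HasPrefix, HasSuffix, or equality avoids running a regular expression on every request, which is the performance gain the change log refers to.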

Documentation Updates

Related PR: #2915 | Contributor: @a6d9a6m
Change Log: Fixed a broken link in README_JP.md and added missing parts in README.md to make multi-language documentation more consistent.
Feature Value: Improved the accuracy and consistency of the documentation, helping users find relevant information more easily, enhancing user experience.

Related PR: #2912 | Contributor: @hanxiantao
Change Log: Optimized the English and Chinese documentation of the hmac-auth-apisix plugin, adding more details on configuration instructions, enhancing clarity of the documentation.
Feature Value: By providing more detailed explanatory documentation, it helps developers better understand and use the hmac-auth-apisix plugin, improving user experience.

Related PR: #2880 | Contributor: @a6d9a6m
Change Log: This PR fixes grammatical errors in README.md, README_JP.md, and README_ZH.md, ensuring the accuracy and consistency of the documentation.
Feature Value: By correcting language errors in the documentation, it enhances the quality and readability of the documents, helping users better understand project information.

Related PR: #2873 | Contributor: @CH3CHO
Change Log: This PR adds methods for obtaining Higress runtime logs and configurations in the non-crashing security vulnerability issue template to assist in better investigating issues.
Feature Value: By providing more detailed logs and configuration information, users can more easily diagnose and resolve issues, improving the efficiency and accuracy of problem handling.

Testing Improvements

Related PR: #2928 | Contributor: @rinfx
Change Log: This PR updates the test code for the ai-security-guard component, adding new test cases and adjusting some existing testing logic.
Feature Value: By improving the test coverage and accuracy of ai-security-guard, it enhances the overall stability and reliability of the project, helping developers better understand and maintain the related functionalities.

Release Statistics

New Features: 13 items
Bug Fixes: 7 items
Refactoring and Optimization: 5 items
Documentation Updates: 4 items
Testing Improvements: 1 item

Total: 30 changes (including 2 important updates)

02 Higress Console Updates

Overview of This Release

This release includes 4 updates, covering feature enhancements, bug fixes, and documentation updates.

Distribution of Updates

New Features: 1 item
Bug Fixes: 2 items
Documentation Updates: 1 item

Key Focus Areas

One update in this release deserves special attention:

feat: Support using a known service in OpenAI LLM provider (#589): This feature allows users to leverage existing service resources in the OpenAI LLM provider, thus expanding the system's flexibility and availability, providing users with more options.

For detailed information, please see the important feature descriptions below.

Important Feature Descriptions

Below are the detailed descriptions of important features and improvements included in this release:

1. feat: Support using a known service in OpenAI LLM provider

Related PR: #589 | Contributor: @CH3CHO

Background

In many application scenarios, developers may want to use a custom OpenAI service instance instead of the default service. This could be due to specific security requirements, performance optimizations, or infrastructure limitations. This PR meets these needs by introducing support for known services. The target user group includes enterprise-level users and technology experts who require highly customized configurations.

Feature Description

This PR primarily implements the following functionalities:

1. Allows users to specify custom services when configuring the OpenAI LLM provider.
2. Modifies the OpenaiLlmProviderHandler class, adding buildServiceSource and buildUpstreamService methods to handle custom service logic.
3. Introduces a delete method with an internal parameter in the WasmPluginInstanceService interface to support more granular control.
4. Updates the front-end internationalization resource file to include prompts related to custom services.

The core technical point lies in extending the existing architecture, allowing the system to recognize and utilize user-provided custom services while maintaining backward compatibility.

Usage Instructions

Enabling and configuring this feature is straightforward. First, when creating or updating the LLM provider, select the “Custom OpenAI Service” option, and fill in the corresponding service host and service path. The system will automatically use these custom configurations to connect to the OpenAI service. Typical use cases include internally deployed OpenAI service instances or environments requiring specific security policies. Important notes include ensuring the entered URL is valid and that the service host and service path are correct. Best practices include thorough testing to ensure custom configurations work properly.
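
Since the note above stresses entering a valid URL with the correct service host and path, here is a small Go sketch of the kind of check that can be run on a custom service URL before saving it. It is independent of how the Console itself validates the input. The type serviceTarget, the function parseServiceURL, and the example hostname are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strconv"
)

// serviceTarget holds the pieces a custom service entry needs.
type serviceTarget struct {
	Host string
	Port int
	Path string
}

// parseServiceURL validates a custom service URL and splits it into host,
// port, and path. A missing port falls back to the scheme default.
func parseServiceURL(raw string) (serviceTarget, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return serviceTarget{}, fmt.Errorf("invalid URL: %w", err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return serviceTarget{}, errors.New("scheme must be http or https")
	}
	if u.Hostname() == "" {
		return serviceTarget{}, errors.New("service host is empty")
	}

	port := 80
	if u.Scheme == "https" {
		port = 443
	}
	if p := u.Port(); p != "" {
		port, err = strconv.Atoi(p)
		if err != nil {
			return serviceTarget{}, fmt.Errorf("invalid port: %w", err)
		}
	}

	path := u.Path
	if path == "" {
		path = "/"
	}
	return serviceTarget{Host: u.Hostname(), Port: port, Path: path}, nil
}

func main() {
	for _, raw := range []string{"https://llm.internal.example:8443/v1", "llm.internal.example/v1"} {
		target, err := parseServiceURL(raw)
		if err != nil {
			fmt.Printf("%s: rejected (%v)\n", raw, err)
			continue
		}
		fmt.Printf("%s: host=%s port=%d path=%s\n", raw, target.Host, target.Port, target.Path)
	}
}
```

The second input is rejected because it has no scheme, which is a common mistake when only a hostname and path are pasted into a URL field.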

Feature Value

This new functionality significantly enhances system flexibility and configurability, allowing users to choose the most suitable OpenAI service according to their needs. This flexibility is especially important for enterprise-level users requiring highly customized configurations. Furthermore, by supporting custom services, the system can better integrate into existing infrastructure, improving overall stability and performance. This is of paramount importance for the maintenance and expansion of large application systems. Overall, this feature not only enhances user experience but also brings greater scalability and reliability to the system.

Complete Change Log

Bug Fixes

Related PR: #591 | Contributor: @CH3CHO
Change Log: This PR fixes an issue where required fields were not properly validated when enabling the routing rewrite configuration, ensuring that both host and newPath.path must provide valid values to avoid configuration errors.
Feature Value: By correcting the validation logic for routing rewrites, it prevents potential errors due to incomplete configuration, enhancing system stability and user experience.

Related PR: #590 | Contributor: @CH3CHO
Change Log: Fixed an error in the handling logic of Route.customLabels, ensuring that built-in labels can be correctly excluded during updates.
Feature Value: Resolved conflicts between custom labels and built-in labels, ensuring users' flexibility and accuracy when updating routing settings.

Documentation Updates

Related PR: #595 | Contributor: @CH3CHO
Change Log: Removed project-irrelevant descriptions in README.md and added a code formatting guide, making the documentation more focused on the project itself.
Feature Value: By updating README.md, it allows users to better understand the project structure and code standards required, helping new contributors to get started quickly.