feat: add model capability preset picker for OpenAI Compatible provider #11902
roomote-v0[bot] wants to merge 6 commits into main
Conversation
Adds a searchable dropdown to the OpenAI Compatible provider settings that lets users select from all known model capabilities across every provider Roo supports (Anthropic, OpenAI, DeepSeek, Gemini, MiniMax, Mistral, Moonshot/Kimi, Qwen, SambaNova, xAI, ZAi/GLM). When a preset is selected, the model capability fields (context window, max tokens, image support, prompt caching, pricing, etc.) are automatically populated. Users can still choose "Custom" to configure everything manually as before.

Changes:
- packages/types: new all-model-capabilities.ts aggregating presets
- webview-ui: preset picker dropdown in OpenAICompatible.tsx
- i18n: English translation keys for the new UI
- Tests for both the preset data and the UI component

Addresses #11674
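The aggregate-then-populate flow described above could look roughly like the sketch below. The `ModelCapabilityPreset` field names, the sample Anthropic entry, and the `applyPreset` helper are illustrative assumptions, not the PR's actual types:

```typescript
// Sketch only: field names mirror the capabilities listed above
// (context window, max tokens, image support, prompt caching, pricing),
// but the real types in packages/types may differ.
interface ModelCapabilityPreset {
  contextWindow: number
  maxTokens: number
  supportsImages: boolean
  supportsPromptCache: boolean
  inputPrice?: number // USD per 1M input tokens
  outputPrice?: number // USD per 1M output tokens
}

// Presets grouped by provider, keyed by model ID (single sample entry).
const modelCapabilityPresets: Record<string, Record<string, ModelCapabilityPreset>> = {
  anthropic: {
    "claude-3-5-sonnet-20241022": {
      contextWindow: 200_000,
      maxTokens: 8192,
      supportsImages: true,
      supportsPromptCache: true,
    },
  },
}

// Selecting a preset shallow-merges its capability fields into the
// provider settings; "Custom" leaves the settings untouched.
function applyPreset(
  settings: Record<string, unknown>,
  preset: ModelCapabilityPreset | "custom",
): Record<string, unknown> {
  return preset === "custom" ? { ...settings } : { ...settings, ...preset }
}
```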
@roomote please ensure all translations are correctly defined thx
Added
It appears to work fine now <3. Please merge this (or something like it): I can finally use Kimi K2.5 on the NVIDIA NIM API, which is really useful, and it would close the issue about wrong model usage on OpenAI compatible endpoints. Is that what this PR solves, and why should we merge it? Thanks in advance <3
Forget that... it appears the model is still locked in a thinking loop. Kimi K2.5 can loop with thinking only, because of its internal thinking loop, like the official documentation explains. But Roo Code won't let the model only think; Roo absolutely wants content or a tool_call, which is not aligned with Kimi K2.5's usage. @roomote
Added
No, you don't get it... I get the "[ERROR] You did not use a tool in your previous response! ..." error message. That means the "error" case at ChatRow.js line 1290 has a really bad condition, causing an error on the internal thinking loop. @roomote
Hi @bozoweed -- I investigated this. If you're consistently hitting this error with a specific model, it's likely a model behavior issue (the model isn't calling tools when expected). I'd suggest opening a separate issue with details about which model/provider you're using so the team can investigate the agent loop behavior.
@roomote thanks for the advice, but when I did that nobody wanted to work on it; everyone said the same thing you just said... BUT you were wrong. The NVIDIA NIM API replicates the same way of using models as the original provider: for example, Kimi K2.5 on NVIDIA NIM works EXACTLY like on the Moonshot provider (only the endpoint is OpenAI-like; the underlying behavior is the SAME). So yes, it is related to OpenAI Compatible, and yes, you have to fix it. I said this for Kimi K2.5, but it appears to be the same for other open models like the DeepSeek ones, etc. So please follow the official model usage instead of trying to break it by changing the parsing. Kimi K2.5, Kimi K2 Thinking, etc. have specific use cases that are not the same as other basic models, so yes, you have to do something on OpenAI Compatible, because many other providers serve models with their specific use cases through an OpenAI endpoint. Please read the docs before answering.
Fixed the reported issues. When selecting a preset with preserveReasoning, the R1 format setting is now enabled automatically.
…t picker

When selecting a model preset with preserveReasoning (e.g. Kimi K2.5, Kimi K2 Thinking), the R1 format setting is now automatically enabled so reasoning/thinking blocks work correctly via OpenAI Compatible endpoints. Also adds a visual indicator showing the applied capability flags after selecting a preset, and new tests covering the auto-enable behavior.
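The auto-enable rule described in this commit can be sketched as below. The `preserveReasoning` and applied-flags behavior come from the commit message; the function shapes and flag names are assumptions:

```typescript
// Sketch under assumed names: presets that preserve reasoning
// (e.g. Kimi K2.5, Kimi K2 Thinking) auto-enable the R1 format so
// thinking blocks parse via OpenAI Compatible endpoints; other
// presets leave it off.
interface PresetFlags {
  preserveReasoning?: boolean
  supportsImages?: boolean
  supportsPromptCache?: boolean
}

function r1FormatForPreset(preset: PresetFlags): boolean {
  return preset.preserveReasoning === true
}

// Capability flags collected for the visual indicator shown after a
// preset is applied.
function appliedFlags(preset: PresetFlags): string[] {
  const flags: string[] = []
  if (preset.preserveReasoning) flags.push("preserveReasoning")
  if (preset.supportsImages) flags.push("images")
  if (preset.supportsPromptCache) flags.push("promptCache")
  return flags
}
```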
@roomote I'm not sure that is the real issue. We can already enable R1 using the checkbox, right? Please read the docs, buddy. Quoting the Kimi K2 Thinking documentation (code samples trimmed, as they did not paste cleanly):

- Model Usage: once the local inference service is up, you can interact with it through the chat endpoint. The recommended temperature for Kimi K2 Thinking is temperature = 1.0; if no special instructions are required, the suggested system prompt is a good default.
- Kimi K2 Thinking has the same tool calling settings as Kimi K2 Instruct: pass the list of available tools in each request, and the model autonomously decides when and how to invoke them. The pipeline from user query to tool execution requires the inference engine to support Kimi K2 Thinking's native tool-parsing logic (see the Tool Calling Guide).
- With the Kimi API you can switch models simply via the model parameter (curl https://api.moonshot.ai/v1/chat/completions). For the kimi-k2.5 model, thinking is enabled by default, no need to specify it manually; see the Disable Thinking Capability example to turn it off.
- In API responses for kimi-k2-thinking or kimi-k2.5 (with thinking enabled), the reasoning_content field is the carrier for the model's reasoning. Both models are designed to perform deep reasoning across multiple tool calls, enabling them to tackle highly complex tasks.
- To get reliable results, always keep the reasoning_content in the context: it ensures the model maintains reasoning continuity in multi-step scenarios, especially when calling tools. The server automatically handles these fields; users do not need to manage them manually.
- reasoning_content counts towards your input/output token quota. For detailed pricing, refer to MoonshotAI's pricing documentation.

The docs also walk through a complete "Daily News Report Generation" example of using thinking models with official tools for multi-step tool calls and extended reasoning: the model sequentially calls tools like date (to get the date) and web_search (to search today's news), with the detailed reasoning steps (reasoning_content) preserved in the context to ensure accurate tool use at every stage.
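The reasoning_content behavior quoted above is visible from any OpenAI-compatible client. This sketch (assumed message shapes, not Roo Code's actual parser) shows why a client must tolerate a turn whose content is empty while reasoning_content is not:

```typescript
// A chat completion message as returned by Moonshot-style endpoints:
// thinking models carry their reasoning in reasoning_content, and a
// turn may contain reasoning with no regular content at all.
interface ChatMessage {
  content?: string | null
  reasoning_content?: string | null
}

function extractMessage(message: ChatMessage) {
  return {
    reasoning: message.reasoning_content ?? "",
    content: message.content ?? "",
    // A "thinking-only" turn: valid per the docs quoted above, but a
    // client that demands content or a tool call on every turn will
    // raise an error here.
    isThinkingOnly: !message.content && !!message.reasoning_content,
  }
}
```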
Fixed the preset picker to comprehensively apply model-specific settings: it now auto-sets modelTemperature from the preset's defaultTemperature and resets the R1 format flag for non-reasoning models.
…cale translations

- Auto-set modelTemperature when selecting a preset with defaultTemperature (e.g. Kimi K2 models require temp=1.0)
- Reset openAiR1FormatEnabled to false when selecting non-reasoning models (not just on "custom")
- Reset modelTemperature to null when selecting the "custom" preset
- Add defaultTemp flag to the capability display showing the auto-applied temperature
- Add missing appliedFlags/flags translation keys to all non-EN locales
- Add tests for temperature auto-apply and R1 reset behavior
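Taken together, the selection rules in this commit could reduce to one function. The field names (modelTemperature, defaultTemperature, openAiR1FormatEnabled) come from the commit message; the reducer shape is an assumption:

```typescript
// Sketch of the selection rules listed above:
// - a preset applies its defaultTemperature (Kimi K2 models need 1.0)
//   and enables R1 format only when it preserves reasoning;
// - "custom" resets both the R1 flag and the temperature.
interface CapabilityPreset {
  preserveReasoning?: boolean
  defaultTemperature?: number
}

interface ProviderSettings {
  openAiR1FormatEnabled: boolean
  modelTemperature: number | null
}

function selectPreset(settings: ProviderSettings, preset: CapabilityPreset | "custom"): ProviderSettings {
  if (preset === "custom") {
    return { ...settings, openAiR1FormatEnabled: false, modelTemperature: null }
  }
  return {
    ...settings,
    openAiR1FormatEnabled: preset.preserveReasoning === true,
    // Presets without a default temperature leave the user's value alone.
    modelTemperature: preset.defaultTemperature ?? settings.modelTemperature,
  }
}
```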
Related GitHub Issue
Closes: #11674
Description
This PR attempts to address Issue #11674 by adding a Model Capability Preset dropdown to the OpenAI Compatible provider settings.
When users select the OpenAI Compatible provider, they now see a searchable dropdown that lists all known models across every provider Roo supports. Selecting a model automatically populates the capability fields (context window, max tokens, image support, prompt caching, pricing, etc.) so users running local models via llama.cpp, exo, NVIDIA NIM, or similar tools get the same model-specific behavior as native provider users.
Key implementation details:
- modelCapabilityPresets aggregation in packages/types that collects model definitions from Anthropic, OpenAI, DeepSeek, Gemini, MiniMax, Mistral, Moonshot/Kimi, Qwen, SambaNova, xAI, and ZAi/GLM
- Preset picker in OpenAICompatible.tsx using the existing Command/Popover components, grouped by provider

Trade-offs:
Test Procedure
- Tests for the modelCapabilityPresets data (6 tests in packages/types) and for the picker component (webview-ui)
- Run with cd packages/types && npx vitest run and cd webview-ui && npx vitest run src/components/settings/providers/__tests__/OpenAICompatible.spec.tsx

Pre-Submission Checklist
Documentation Updates
Additional Notes
Feedback and guidance are welcome.