feat(search): Semantic Tool Search by shashi-stackone · Pull Request #149 · StackOneHQ/stackone-ai-python

shashi-stackone · 2026-02-19T13:36:55Z

Problem

Following up from #142

StackOne has more than 10,000 actions across all connectors, and the number is growing; some connectors alone have 2,000+ actions. Keyword matching breaks
down when someone searches for "onboard new hire" but the action is called hris_create_employee. The SDK already supports keyword-based
search, and we need to add semantic search using the action search service.

Implementation Details

- SemanticSearchClient that calls StackOne's /actions/search API for natural language tool discovery
- Three ways to use it:
  1. search_tools(): search by intent and get a Tools collection ready for OpenAI, LangChain, or any framework
  2. search_action_names(): lightweight lookup returning action names and scores without full tool definitions
  3. Utility tools: pass a SemanticSearchClient to utility_tools() and the tool_search tool becomes semantic-aware inside agent loops
- Per-connector parallel search so results are scoped to only the connectors the user has linked
- Automatic fallback to local BM25+TF-IDF hybrid search when the semantic API is unavailable
- Action name normalization that strips version prefixes (e.g. bamboohr_1.0.0_bamboohr_create_employee_global → bamboohr_create_employee)
- Connector helpers (StackOneTool.connector, Tools.get_connectors()) for connector-aware filtering
- Benchmark suite with 94 evaluation tasks across 8 categories: semantic search achieves 76.6% Hit@5 vs 66.0% for local search (+10.6 percentage points)
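
To make the action name normalization concrete, here is a minimal sketch. It assumes versioned names have the shape `<connector>_<semver>_<action>_global`, which is inferred only from the single example above (where the trailing `_global` is dropped as well); the function name and regex are illustrative and not taken from the SDK.

```python
import re

# Assumed shape: <connector>_<semver>_<action>[_global]. This pattern is an
# illustration inferred from the one example in the PR description.
_VERSIONED_NAME = re.compile(
    r"^[a-z0-9]+_\d+(?:\.\d+)+_(?P<action>.+?)(?:_global)?$"
)


def normalize_action_name(name: str) -> str:
    """Strip a leading '<connector>_<version>_' prefix and a trailing '_global'."""
    match = _VERSIONED_NAME.match(name)
    if match is None:
        # Names without a version prefix pass through unchanged.
        return name
    return match.group("action")
```

Names that are already normalized, such as hris_create_employee, are returned as-is, so the function is safe to apply to mixed input.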

Summary by cubic

Adds semantic tool search so users can find and execute actions with natural language. Searches are scoped to connectors in the fetched tools (and optional project_ids), with a local BM25+TF‑IDF fallback.

New Features
- StackOneToolSet.search_tools() and a callable SearchTool (via get_search_tool()) for agent loops; fully replaces Utility Tools (module removed).
- SemanticSearchClient with search_action_names(); per‑connector parallel search; respects server ranking/min_similarity unless top_k is set; supports optional project_ids; can be passed into StackOneToolSet (otherwise created lazily).
- Local keyword fallback moved to local_search.ToolIndex.
- StackOneTool is directly callable; README and examples updated (search_tool_example.py, semantic_search_example.py).
Bug Fixes
- Scoped searches to available connectors so agents don’t discover tools they can’t execute.
- Fixed CI and lint issues; restored lazy semantic client creation and cleaned up outdated docs references.

Written for commit f6920c8. Summary will update on new commits.
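
The per-connector parallel search with a local fallback described in this summary could look roughly like the sketch below. Both search callables are hypothetical stand-ins (one for the semantic API, one for the local BM25+TF-IDF index), and the "fall back on any exception, per connector" policy is an assumption rather than the SDK's confirmed behavior.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable

# A result is (action_name, score). Both callables are hypothetical stand-ins
# for the semantic API call and the local BM25+TF-IDF index.
Result = tuple[str, float]
SearchFn = Callable[[str, str], list[Result]]


def search_by_connector(
    query: str,
    connectors: Iterable[str],
    semantic_search: SearchFn,
    local_search: SearchFn,
    top_k: int = 5,
) -> list[Result]:
    """Query each connector in parallel, falling back to local search on error."""

    def search_one(connector: str) -> list[Result]:
        try:
            return semantic_search(query, connector)
        except Exception:
            # Assumed policy: any failure of the remote call triggers the fallback.
            return local_search(query, connector)

    connector_list = sorted(set(connectors))
    if not connector_list:
        return []  # nothing linked, so nothing to discover

    with ThreadPoolExecutor(max_workers=min(8, len(connector_list))) as pool:
        per_connector = list(pool.map(search_one, connector_list))

    # Merge the per-connector hits and keep the highest-scoring ones.
    merged = [hit for hits in per_connector for hit in hits]
    merged.sort(key=lambda hit: hit[1], reverse=True)
    return merged[:top_k]
```

Because each call is scoped to one linked connector, the merged list cannot contain actions from connectors the user has not linked.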

When utility_tools(semantic_client=...) is used, tool_search now searches only the connectors available in the fetched tools collection instead of the full StackOne catalog. This prevents agents from discovering tools they cannot execute.

- Add available_connectors param to create_semantic_tool_search
- Pass connectors from Tools.utility_tools() to scope searches
- Update docs, examples, and README to reflect scoping
- Add 4 new tests for scoping behavior

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

cubic-dev-ai

2 issues found across 3 files (changes from recent commits).

Prompt for AI agents (all issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="stackone_ai/utility_tools.py">

<violation number="1" location="stackone_ai/utility_tools.py:342">
P2: An empty `available_connectors` set now falls back to a full-catalog search, which can surface tools the user doesn’t have access to. This contradicts the scoping behavior (“only the user’s own connectors are searched”) and likely returns incorrect results for accounts with no connectors.</violation>
</file>

<file name="stackone_ai/toolset.py">

<violation number="1" location="stackone_ai/toolset.py:407">
P2: Passing top_k directly to the local tool_search limits results before connector filtering, so fallback can return fewer than requested even when matching tools exist for the allowed connectors. Consider keeping an expanded fallback limit (e.g., top_k * N or a safe default) before filtering.</violation>
</file>
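
Both issues above can be illustrated with one small sketch: treat an empty connector set as "no results" rather than "search everything", and over-fetch from the local index before filtering by connector. The `local_search` signature, the `overfetch` multiplier, and the "connector is the prefix before the first underscore" rule are assumptions made for this example.

```python
from typing import Callable

# Hypothetical local search signature: (query, limit) -> ranked action names.
LocalSearch = Callable[[str, int], list[str]]


def connector_of(action_name: str) -> str:
    """Assume the connector is the prefix before the first underscore."""
    return action_name.split("_", 1)[0]


def scoped_fallback(
    query: str,
    top_k: int,
    available_connectors: set[str],
    local_search: LocalSearch,
    overfetch: int = 5,
) -> list[str]:
    if not available_connectors:
        # An empty scope means "no tools", not "search the whole catalog".
        return []
    # Ask for more candidates than needed so filtering does not starve the result.
    candidates = local_search(query, top_k * overfetch)
    allowed = [n for n in candidates if connector_of(n) in available_connectors]
    return allowed[:top_k]
```

With `overfetch=1` the function reproduces the reported problem: the limit is applied before the connector filter, so matching tools ranked below the cut-off are lost.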

stackone_ai/utility_tools.py

stackone_ai/toolset.py

willleeney · 2026-02-23T11:21:38Z

examples/semantic_search_example.py

+    utility = tools.utility_tools(semantic_client=toolset.semantic_client)
+
+    search_tool = utility.get_tool("tool_search")


"It feels like the nicer experience is like my_search_tool = tools.search(client=semantic_search) or my_search_tool = tools.search(client=local_search) and the my_execute_tool = tools.execute()"

what was the reason for not implementing it like this?

Yes, that approach is clean, and I agree. We can implement a UtilityTools(Tools) subclass, which appears to be fully backwards compatible with the existing BM25 local search. I was not confident enough to make this change as part of this PR, though: since it is a subclass of Tools nothing should break, but should I make the switch in this PR or in another one?

Let me give it a go.

@willleeney Thanks for the suggestion. I made this change now rather than later: I added a UtilityTools subclass with typed search_tool and execute_tool property accessors.
The pattern is now:

    utility = tools.utility_tools(semantic_client=toolset.semantic_client)
    result = utility.search_tool.call(query="onboard new hire")

However, I kept utility_tools() as the configuration point rather than tools.search(client=...) to separate
concerns: backend choice (semantic vs. local) is a one-time decision, while tool access
happens repeatedly. This avoids passing the client every time you access the tool.

I also adopted min_similarity from the server and removed the client-side parsing.

examples/semantic_search_example.py

willleeney · 2026-02-23T17:54:33Z

stackone_ai/models.py

        """Return utility tools for tool discovery and execution

-        Utility tools enable dynamic tool discovery and execution based on natural language queries
-        using hybrid BM25 + TF-IDF search.
+        Utility tools enable dynamic tool discovery and execution based on natural language queries.
+        By default, uses local hybrid BM25 + TF-IDF search. When a semantic_client is provided,
+        uses cloud-based semantic search for higher accuracy on natural language queries.

        Args:
-            hybrid_alpha: Weight for BM25 in hybrid search (0-1). If not provided, uses
-                ToolIndex.DEFAULT_HYBRID_ALPHA (0.2), which gives more weight to BM25 scoring
-                and has been shown to provide better tool discovery accuracy
-                (10.8% improvement in validation testing).
+            hybrid_alpha: Weight for BM25 in hybrid search (0-1). Only used when
+                semantic_client is not provided. If not provided, uses DEFAULT_HYBRID_ALPHA (0.2),
+                which gives more weight to BM25 scoring.
+            semantic_client: Optional SemanticSearchClient instance. Pass
+                toolset.semantic_client to enable cloud-based semantic search.

        Returns:
-            Tools collection containing tool_search and tool_execute
+            UtilityTools collection with search_tool and execute_tool accessors

        Note:
            This feature is in beta and may change in future versions
+
+        Example:
+            # Semantic search (pass semantic_client explicitly)
+            toolset = StackOneToolSet()
+            tools = toolset.fetch_tools()
+            utility = tools.utility_tools(semantic_client=toolset.semantic_client)
+            result = utility.search_tool.call(query="onboard new hire")
+
+            # Local BM25+TF-IDF search (default, no semantic_client)
+            utility = tools.utility_tools()
+            result = utility.search_tool.call(query="onboard new hire")
        """
-        from stackone_ai.utility_tools import (
-            ToolIndex,
-            create_tool_execute,
-            create_tool_search,
-        )
+        from stackone_ai.utility_tools import create_tool_execute

-        # Create search index with hybrid search
-        index = ToolIndex(self.tools, hybrid_alpha=hybrid_alpha)
+        if semantic_client is not None:
+            from stackone_ai.utility_tools import create_semantic_tool_search
+
+            search_tool = create_semantic_tool_search(
+                semantic_client, available_connectors=self.get_connectors()
+            )
+            execute_tool = create_tool_execute(self)
+            return UtilityTools([search_tool, execute_tool])

-        # Create utility tools
+        # Default: local BM25+TF-IDF search
+        from stackone_ai.utility_tools import ToolIndex, create_tool_search
+
+        index = ToolIndex(self.tools, hybrid_alpha=hybrid_alpha)
        filter_tool = create_tool_search(index)
        execute_tool = create_tool_execute(self)

-        return Tools([filter_tool, execute_tool])
+        return UtilityTools([filter_tool, execute_tool])


I think that we should have the argument that is passed to utility_tools be either search_method: str or search_client: SearchClient. I think that search_method: str = "bm25" works best. We know that if they pass "semantic" then we can just create the semantic search client here or inside the create semantic search tool function

That is a great idea. I have added search_method with the values bm25 and semantic. It makes things much simpler.
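
A string-valued search_method argument like the one discussed here is typically resolved through a small dispatch table. The sketch below is illustrative only: `pick_search_backend` and the backend callables are hypothetical names, not functions from the SDK.

```python
from typing import Callable

# A backend takes a query and returns ranked action names (hypothetical shape).
SearchBackend = Callable[[str], list[str]]


def pick_search_backend(
    search_method: str, backends: dict[str, SearchBackend]
) -> SearchBackend:
    """Return the backend registered for search_method, or fail loudly."""
    try:
        return backends[search_method]
    except KeyError:
        valid = ", ".join(sorted(backends))
        raise ValueError(
            f"unknown search_method {search_method!r}; expected one of: {valid}"
        ) from None
```

Failing with an explicit error on an unknown value catches misspellings of the method name early instead of silently using a default backend.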

willleeney · 2026-02-23T17:55:17Z

stackone_ai/models.py

        filter_tool = create_tool_search(index)
        execute_tool = create_tool_execute(self)

-        return Tools([filter_tool, execute_tool])
+        return UtilityTools([filter_tool, execute_tool])


filter_tool should be search_tool to match the semantic definition

To stay in sync with the BM25 code I used filter_tool, but I think it is the right time to move to search_tool since it is becoming the standard.

willleeney · 2026-02-24T13:10:43Z

examples/utility_tools_example.py

-        # Search for employee management tools
-        result = filter_tool.call(query="manage employees create update list", limit=5, minScore=0.0)
+    # Search for employee management tools
+    result = utility_tools.search_tool.call(query="manage employees create update list", limit=5)


why do we need search_tool.call() not search_tool()?

StackOneTool did not have __call__, but I added it since it makes sense and makes the code read much better. I updated the examples to use tool(query="...") instead of tool.call(query="..."); .call() and .execute()
still work as before.
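
As a rough illustration of how a directly callable tool can coexist with .call() and .execute(), consider the sketch below. `CallableTool` is a hypothetical stand-in, not the SDK's StackOneTool, and its execute() simply echoes its input instead of making a request.

```python
from typing import Any


class CallableTool:
    """Hypothetical stand-in for a tool class; not the SDK's StackOneTool."""

    def __init__(self, name: str) -> None:
        self.name = name

    def execute(self, arguments: dict[str, Any]) -> dict[str, Any]:
        # A real tool would perform a request here; this one echoes its input.
        return {"tool": self.name, "arguments": arguments}

    def call(self, **kwargs: Any) -> dict[str, Any]:
        return self.execute(kwargs)

    def __call__(self, **kwargs: Any) -> dict[str, Any]:
        # Delegating keeps tool(...), tool.call(...) and tool.execute(...) in sync.
        return self.call(**kwargs)
```

Since `__call__` only delegates, existing code that uses .call() or .execute() keeps working unchanged.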

willleeney · 2026-02-24T13:13:48Z

stackone_ai/models.py

+    """Utility tools collection with typed accessors for search and execute tools."""
+
+    @property
+    def search_tool(self) -> StackOneTool:


With regard to the last comment, which says search_tool() should make the call:

Should we make this get_search_tool(), which would remove the tool = self.get_tool("tool_search") call and put the logic in here?

Done. Renamed the search_tool property to a get_search_tool() method with an inline lookup. Did the same for execute_tool to align the naming.

willleeney · 2026-02-24T13:14:15Z

stackone_ai/models.py

+        """
+        tool = self.get_tool("tool_execute")
+        if tool is None:
+            raise StackOneError("tool_execute not found in this UtilityTools collection")
+        return tool


same here as with search

Done as well

willleeney · 2026-02-24T13:15:48Z

stackone_ai/utility_tools.py

+                "default": 5,
+                "nullable": True,
+            },
+            "minSimilarity": {


Should this parameter not be min_similarity to match the API? The same goes for top_k.

Good catch. I renamed all tool parameters to snake_case to match the API: minSimilarity to min_similarity, minScore to min_score, and limit to top_k. I also updated both the existing BM25 and semantic variants so that they remain consistent.

willleeney · 2026-02-24T13:18:20Z

tests/test_utility_tools.py


        # Search for tools
-        result = filter_tool.execute(
+        result = search_tool.execute(


search_tool.execute() or search_tool.search() ?

I am not entirely sure about this. .execute() is the standard StackOneTool interface and all tools use it. We just added __call__, which lets users write get_search_tool()(query="..."), so users won't interact with .execute() directly.

I think .execute is the standard for executing a specific tool.

cubic-dev-ai

3 issues found across 8 files (changes from recent commits).

Prompt for AI agents (all issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="stackone_ai/utility_tools.py">

<violation number="1" location="stackone_ai/utility_tools.py:228">
P2: The new argument parsing drops backward compatibility for `limit`/`minScore`, so existing callers will silently get default values instead of their requested limits/scores. Consider accepting the legacy keys as fallbacks.</violation>

<violation number="2" location="stackone_ai/utility_tools.py:344">
P2: The semantic tool search now ignores legacy `limit`/`minSimilarity` arguments, so existing integrations will fall back to defaults. Add backward-compatible fallbacks to preserve behavior.</violation>
</file>
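
One way to accept the legacy keys as fallbacks, as both violations suggest, is a small lookup helper. The key pairs come from the rename described in the review thread; the helper name `read_arg` and its precedence rules are assumptions for illustration.

```python
from typing import Any

# New snake_case key -> legacy key, as named in the review comments.
_LEGACY_KEYS = {
    "top_k": "limit",
    "min_score": "minScore",
    "min_similarity": "minSimilarity",
}


def read_arg(arguments: dict[str, Any], key: str, default: Any = None) -> Any:
    """Prefer the snake_case key, then the legacy alias, then the default."""
    if arguments.get(key) is not None:
        return arguments[key]
    legacy = _LEGACY_KEYS.get(key)
    if legacy is not None and arguments.get(legacy) is not None:
        return arguments[legacy]
    return default
```

The checks compare against None rather than truthiness, so an explicit value such as a minimum score of 0.0 is respected instead of being replaced by the default.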

<file name="stackone_ai/models.py">

<violation number="1" location="stackone_ai/models.py:642">
P2: Removing the `search_tool`/`execute_tool` properties is a breaking API change: `utility.search_tool` now returns a method rather than a StackOneTool, so existing integrations calling `.search_tool.call(...)` will fail. Consider keeping property aliases that return `get_search_tool()`/`get_execute_tool()` to preserve compatibility.</violation>
</file>
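
The compatibility alias suggested above could be sketched as follows. `ToolCollection` is a hypothetical class whose method names mirror the review thread; it is not the SDK's UtilityTools implementation, and it raises LookupError where the diff shown earlier uses StackOneError.

```python
class ToolCollection:
    """Hypothetical collection; names mirror the review thread, not the SDK source."""

    def __init__(self, tools: dict[str, object]) -> None:
        self._tools = tools

    def get_search_tool(self) -> object:
        tool = self._tools.get("tool_search")
        if tool is None:
            raise LookupError("tool_search not found in this collection")
        return tool

    @property
    def search_tool(self) -> object:
        # Compatibility alias so existing `collection.search_tool` access still works.
        return self.get_search_tool()
```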

stackone_ai/utility_tools.py

stackone_ai/models.py

shashi-stackone · 2026-02-25T11:32:48Z

Thanks @willleeney for the great feedback on this PR and for suggesting some fundamental changes to the SDK's API that align with its future direction. I think it is a good time to make these changes. I am also making a note to update usage in the integrations where the SDK is already integrated (or will be in the future), e.g. ADK, Pydantic, and other places.

cubic-dev-ai

1 issue found across 1 file (changes from recent commits).

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="stackone_ai/toolset.py">

<violation number="1" location="stackone_ai/toolset.py:632">
P2: Passing the uninitialized `_semantic_client` breaks semantic utility search after `fetch_tools()`. `utility_tools(search_method="semantic")` now raises because `_semantic_client` remains `None` unless the property was accessed earlier. Restore lazy initialization here so Tools gets a valid client.</violation>
</file>
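
The lazy initialization this issue asks to restore follows a common pattern: create the client on first access through a property, unless one was injected up front. The class and attribute names below are hypothetical and only show the pattern, not the toolset's actual code.

```python
from typing import Callable, Optional


class LazyClientHolder:
    """Hypothetical holder showing lazy creation of an expensive client."""

    def __init__(
        self, factory: Callable[[], object], client: Optional[object] = None
    ) -> None:
        self._factory = factory
        self._client = client  # may be injected up front, otherwise None

    @property
    def client(self) -> object:
        # Create the client on first access and reuse it afterwards.
        if self._client is None:
            self._client = self._factory()
        return self._client
```

Code that needs the client should go through the property rather than the private attribute, so the attribute is never handed on while it is still None.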

stackone_ai/toolset.py

willleeney · 2026-02-26T10:41:41Z

tests/test_utility_tools.py


        # Search for tools
-        result = filter_tool.execute(
+        result = search_tool.execute(


willleeney · 2026-02-26T10:56:29Z

README.md

+from stackone_ai import StackOneToolSet
+
+toolset = StackOneToolSet()
+
+# Search by intent — returns Tools collection ready for any framework
+tools = toolset.search_tools("manage employee records", account_ids=["your-account-id"], top_k=5)
+openai_tools = tools.to_openai()
+
+# Lightweight: inspect results without fetching full tool definitions
+results = toolset.search_action_names("time off requests", top_k=5)


This is good. This is the abstraction that we want.

I like that we can inject the semantic client at toolset = StackOneToolSet(semantic_client).
I also think it's good that we have account_ids as a parameter for search_tools.

My last area of confusion is that the examples show we can also do it like this (and this is what the examples suggest):

    toolset = StackOneToolSet()
    all_tools = toolset.fetch_tools(account_ids=_account_ids)
    utility = all_tools.utility_tools(search_method="semantic")
    search_result = utility.get_search_tool()(query="list all employees", top_k=1)
    tools_found = search_result.get("tools", [])

This feels chaotic. We have tools, which are the available tools given the account IDs (this makes sense). But then we have a separate abstraction for utility_tools and for search and execute. Can we just get rid of the utility_tools layer and instead do all_tools.get_search_tool(search_method='semantic')?

Also, why is search_result here not the same as the tools returned from toolset.search_tools(), so that we then have to call get("tools") on the search result?

My suggestion is to have this instead:

    all_tools = toolset.fetch_tools(account_ids=_account_ids)
    search_tool = toolset.get_search_tool(semantic_client=...)
    tools = search_tool("manage employee records")

This removes the extra utility tools abstraction, which doesn't feel needed on top of the StackOneToolSet abstraction, and aligns search_tool.call() with toolset.search_tools().

shashi-stackone · 2026-02-27T12:16:02Z

Hey @willleeney, thanks for suggesting the change above. After removing the utility_tools abstraction the SDK looks so clean. Please have a look at the latest changes and let me know if there is still room for improvement.

As we are planning further refactoring, I am also thinking of including the following, either as part of this PR or in further PRs:

- Adding the tool parameter schema needed for each agent framework (I can explain later if needed)
- Adding more conversions to the SDK (to_pydantic, to_adk, to_dspy, etc.) natively so that conversion becomes a one-liner
- Exploring more enhancements as we go

shashi-stackone and others added 30 commits February 18, 2026 09:51

Senamtic Search on action in Python AI SDK

c7ad71f

Filter tools based on the SDK auth config and connector

0210c1f

Use the local benchmark from the ai-generations

b1105fa

Add Semantinc search bench mark with local benchmarks

d49f52b

Fix CI lint errors

680fa8e

Fix the lint in the benchmark file

1ee842b

Formalise the docs and code

d6fba69

Keep semantic search minimal in the README

3eb0641

Remove the old benchmark data

fd37d93

implement PR feedback suggestions from cubic

f5ef955

fix nullable in the semantic tool schema

b7b522f

limit override

e9c6b86

handle per connector calls to avoid the guesswork

34e1ca6

simplify utility_tools API by inferring semantic search from client p…

82082cb

…resence

Benchmark update and PR suggestions

8a74517

update the README gst

85b0395

Note on the fetch tools for actions that user expect to discover

79c762a

Update examples and improve the semantic seach

6ee1adf

Fix ruff issues

7a65367

Document the semantic search feature in the python files and example

64a0a60

Respect the backend results unless top_k specified explicitly, add py…

4083642

…thon only crewAI example

move the crewAI tools conversation back in the example

b926db1

CI Trigger

d2dd2f5

Fix unit tests with updated top_k behavior

719b391

Update PR with correct approach mentioned in the PR comments

b360b00

Update example and remove unwated crewai examples

7b77f33

Remove the crewai reference from the README

bab931b

Fix the Ruff CI issue

5eaa3c5

Add back creai intefration and test integration

173121d

Cubic suggestions

c9c0358

cubic-dev-ai bot reviewed Feb 19, 2026

Optinally support project_ids in the SDK search

71457af

Shashikant86 mentioned this pull request Feb 22, 2026

feat(search): add semantic search for AI-powered tool discovery StackOneHQ/stackone-ai-node#321

Closed

4 tasks

willleeney reviewed Feb 23, 2026

shashi-stackone added 2 commits February 23, 2026 13:36

Update the client to use PR suggested client and use min_similarity f…

5bf1cc6

…rom the server

CI Fix

9fe1e40

willleeney reviewed Feb 23, 2026

willleeney reviewed Feb 23, 2026

shashi-stackone added 2 commits February 24, 2026 11:27

Implement PR sugggestions and use the search and execute tools as sta…

010a275

…ndard way

update example docs

8137538

willleeney reviewed Feb 24, 2026

Update SDK as per PR suggestions

90d8aa3

cubic-dev-ai bot reviewed Feb 25, 2026

Fix available connector early return

3d6000f

Fix semantic search creation in fetch tools

a0dd833

cubic-dev-ai bot reviewed Feb 26, 2026

Fix semantic search creation in fetch tools revert back to lazy

86c9c64

willleeney requested changes Feb 26, 2026

shashi-stackone added 4 commits February 26, 2026 17:59

get rid of the utility tools completely as discussed

ce3443c

Remove the reference of the semantic search

53828e5

Fix CI and lint issues

b8b331a

Pass semantic Client to the toolset

f6920c8
