
Ask any question to your OpenCTI data with our Chatbot

Oct 14, 2025

Cyber Threat Intelligence is now abundant, and like any abundant information, it can be overwhelming and impossible to process in full. OpenCTI acts as a unique point of centralization and automatic ingestion of CTI feeds, but once stored, this data requires specific knowledge to be digested. It is like standing in a library, able to read any book, but without understanding the classification acronyms and codes needed to find the specific genre we are looking for.

Because human beings, and therefore OpenCTI users, prefer working in natural language, we previously built NLQ, a feature that lets anyone translate a natural language query into a set of filters on the platform. Now the idea is to go a step further and bring the data into a conversation, just as you would discuss the latest cyber attack with a colleague.

Except here, the colleague is actual software with a very peculiar way of retrieving data, the GraphQL API, now given a voice and a name: ArianeAi, your AI assistant. And you may have neither the time nor the motivation to learn to speak this esoteric language. So the great ambition boils down to a simply stated problem: is it possible to translate any question asked to the platform into a GraphQL query, and generate a relevant, user-friendly answer from the result?
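To make the problem concrete, here is a deliberately naive, illustrative sketch of what "question in, GraphQL query out" means. The real chatbot uses an LLM agent, not keyword matching, and the entity field names below mirror common OpenCTI types but are our assumptions for the example:

```python
# Illustrative only: a toy keyword-based "translator" from a natural language
# question to a GraphQL query string. The actual chatbot delegates this to an
# LLM agent; the field names (malwares, intrusionSets, ...) are assumptions.

def question_to_graphql(question: str) -> str:
    """Map a question to a hypothetical OpenCTI-style GraphQL query string."""
    keyword_to_field = {
        "malware": "malwares",
        "intrusion set": "intrusionSets",
        "vulnerability": "vulnerabilities",
    }
    q = question.lower()
    for keyword, field in keyword_to_field.items():
        if keyword in q:
            # Fetch the five most recently modified entities of that type.
            return (
                f"query {{ {field}(first: 5, orderBy: modified, "
                f"orderMode: desc) {{ edges {{ node {{ name }} }} }} }}"
            )
    return "query { about { version } }"  # harmless fallback

print(question_to_graphql("What is the latest malware observed?"))
```

Even this toy version shows why a hand-written mapping does not scale: every new phrasing, entity type, filter, or sort order multiplies the cases to cover.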


TL;DR

  • What it is:
    • A conversational capability within OpenCTI frontend
    • Chatbot AI answers are based on data within the platform
  • What it is not:
    • A CTI or Cybersecurity buddy that answers general questions
    • Another means of updating or creating data in OpenCTI

Boarding the Agentic AI hype train

Thanks to the NLQ feature, we already know that extracting relevant information from a natural language query is feasible, and that mapping it to a given structure (say, the filters on an OpenCTI page) can be done with a thorough chain of steps. The same "chain of steps" approach could be applied to the problem at hand, but it would mean long and cumbersome work spelling out all the possible combinations, taking into account both GraphQL grammar and OpenCTI's specific implementation. Given the latest results, it felt natural to look toward the agentic paradigm and turn the problem upside down. Instead of describing atomic tasks and ordering them into a rigid workflow, we decided to describe all the capabilities needed to perform the task and feed them to an "agent" that autonomously selects and uses the relevant tool at each step of its process.
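The contrast between the two paradigms can be sketched in a few lines. In this minimal sketch (all names are ours, and the "agent" is a stub policy where, in practice, an LLM makes the choice), tools are registered with a description the agent can read, and the agent decides which one to call next:

```python
# Minimal sketch of the agentic idea: instead of a fixed workflow, the agent
# is handed tool descriptions and picks the next tool itself. The selection
# policy here is a hard-coded stub standing in for an LLM's decision.

from typing import Callable

TOOLS: dict[str, tuple[str, Callable[[dict], dict]]] = {}

def tool(name: str, description: str):
    """Register a capability together with a description the agent can read."""
    def register(fn):
        TOOLS[name] = (description, fn)
        return fn
    return register

@tool("inspect_schema", "List queryable entity types in the GraphQL schema")
def inspect_schema(state):
    state["types"] = ["Malware", "IntrusionSet", "Report"]
    return state

@tool("build_query", "Build a GraphQL query for the selected entity type")
def build_query(state):
    state["query"] = f"query {{ {state['types'][0].lower()}s {{ name }} }}"
    return state

def agent(state: dict) -> dict:
    # Stub policy: keep choosing a tool until a query has been produced.
    while "query" not in state:
        name = "inspect_schema" if "types" not in state else "build_query"
        _, fn = TOOLS[name]
        state = fn(state)
    return state

print(agent({})["query"])
```

The appeal is that adding a capability means registering one more tool, not rewriting the workflow.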

Building the foundation: Model Context Protocol (MCP)

To start with, those tools need to be described in a standard, common framework, to make sure they can be used by any agentic implementation. The leading standard at the time of writing is the Model Context Protocol (MCP), with a wide variety of open-source servers, some dealing with the problem at hand. However, we quickly figured out that there was no out-of-the-box solution we could use, so we implemented our own, true to Filigran's open-source DNA: you can have a look here.

As you can see in the source code, we came up with tools to understand a given GraphQL schema, generate a working query, and execute it in an OpenCTI environment. The next step was then to connect an agent to those tools and hope for magic to happen. But as Uncle Vernon once said, "there is no such thing as magic": such a naive implementation proved to underperform, both in processing time (answer generation was long) and in relevance (answers were not always on point, especially for irrelevant questions). To gain finer control over query generation, we realized we could still use agents, but within a more granular flow.
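The three capabilities mentioned above can be sketched against a toy in-memory "schema" and "datastore". This is a hedged illustration only: the real tools speak to OpenCTI's GraphQL schema over MCP, and the type and field names below are assumptions:

```python
# Hedged sketch of the three tool capabilities: understand the schema,
# generate a query that is valid against it, and execute it. The schema and
# data are toy stand-ins, not OpenCTI's actual structures.

TOY_SCHEMA = {"malwares": {"name", "created"}}
TOY_DATA = {"malwares": [{"name": "Emotet", "created": "2024-01-01"}]}

def understand_schema() -> dict:
    """Tool 1: expose which types and fields can be queried."""
    return TOY_SCHEMA

def generate_query(entity_type: str, fields: list[str]) -> dict:
    """Tool 2: produce a query, refusing fields the schema does not define."""
    allowed = understand_schema().get(entity_type, set())
    bad = [f for f in fields if f not in allowed]
    if bad:
        raise ValueError(f"unknown fields for {entity_type}: {bad}")
    return {"type": entity_type, "fields": fields}

def execute_query(query: dict) -> list[dict]:
    """Tool 3: run the validated query against the datastore."""
    rows = TOY_DATA[query["type"]]
    return [{f: row[f] for f in query["fields"]} for row in rows]

print(execute_query(generate_query("malwares", ["name"])))
```

Validating the query against the schema before execution is what "generate a working query" buys: malformed requests fail fast, before ever reaching the platform.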

A Granular Flow

To address the challenges described earlier, we have implemented a logical AI workflow that guides the agent step-by-step in answering user questions. This granular approach breaks down the process as follows:

1: Process the question

  • Determine whether the question is relevant
  • Identify if any entities are explicitly named, and verify their presence in the data
  • Classify the type of entity involved (e.g., malware, attack pattern)

2: Build and execute the query

  • Select the relevant entity types based on the question (such as malware or vulnerability)
  • Retrieve field definitions, filters, and sorting options using MCP tools to construct the query
  • Execute the query

3: Interpret the results and write an answer

Each of these steps is orchestrated by ArianeAi, ensuring the question flows logically from interpretation to execution.
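The three steps above can be sketched as explicit, independently testable functions. The logic is deliberately simplified (a lookup table stands in for entity resolution, and a stub replaces the MCP-backed query construction), but the shape of the flow is the point:

```python
# Sketch of the granular flow: each step is a plain function, so each can be
# optimized, swapped, or routed to a different model independently.

KNOWN_ENTITIES = {"emotet": "Malware"}  # stand-in for a platform lookup

def process_question(question: str):
    """Step 1: relevance check and named-entity identification."""
    q = question.lower()
    for name, entity_type in KNOWN_ENTITIES.items():
        if name in q:
            return {"entity": name, "type": entity_type}
    return None  # irrelevant question: stop early, never build a query

def build_and_execute(ctx: dict) -> list[dict]:
    """Step 2: build the query from the classified entity and run it."""
    # Stand-in for MCP-backed query construction and execution.
    return [{"name": ctx["entity"].title(), "type": ctx["type"]}]

def write_answer(rows: list[dict]) -> str:
    """Step 3: interpret the results into a user-facing answer."""
    return "I found: " + ", ".join(r["name"] for r in rows)

ctx = process_question("Tell me about Emotet")
answer = write_answer(build_and_execute(ctx)) if ctx else "Out of scope."
print(answer)
```

Notice that an irrelevant question exits at step 1, which is exactly the fine-grained control the single naive agent lacked.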

The following chart provides a visual overview of how the agent constructs answers through this multi-step process.

OpenCTI chatbot execution flow

Why granularity matters

This modular flow offers significant flexibility. It enables combining AI components (agents, MCP tools) with customized logic like authentication, rate limiting, and URL generation tailored to the OpenCTI chatbot’s needs. Moreover, depending on the complexity of each step, different model sizes can be employed—smaller, faster LLMs for more routine tasks, and larger, more powerful models for complex interpretation.
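Routing steps to differently sized models can be as simple as a lookup. The model names below are placeholders, not the models actually deployed:

```python
# Sketch of per-step model routing: routine classification goes to a small,
# fast model; schema reasoning and interpretation go to a larger one.
# Model identifiers are placeholders.

STEP_TO_MODEL = {
    "relevance_check": "small-fast-llm",
    "entity_extraction": "small-fast-llm",
    "query_generation": "large-llm",
    "answer_writing": "large-llm",
}

def pick_model(step: str) -> str:
    """Return the model assigned to a step, defaulting to the cheap one."""
    return STEP_TO_MODEL.get(step, "small-fast-llm")

print(pick_model("query_generation"))
```

Because each step is its own function, swapping its model is a one-line configuration change rather than a prompt rewrite.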

According to our internal benchmarks, this approach delivers better and faster results compared to relying on a single agent attempting to directly use the appropriate MCP tool(s). By decomposing the problem, each step can be optimized independently, improving efficiency and accuracy.

Where we stand, and what’s next?

This first iteration is only the beginning, and we are just scratching the surface of what Agentic AI can achieve within Filigran’s XTM suite. For now, the chatbot operates using OpenAI’s gpt-4.1-mini, but our goal is to run it on our own infrastructure. Early experiments with open-weight models like Qwen3 and gpt-oss have shown encouraging results.

Current limitations

Large language models, however, come with inherent limitations. For example, they do not count well. A well-known illustration is that many models (even large ones) fail at simple questions like "how many Rs are there in Strawberry?". Hence, OpenCTI's chatbot struggles to count elements in very long lists and will not do well with questions such as "what are the top 5 targeted countries?" or "what is the most active threat actor?", not to mention the limitations imposed by the LLM's context window. We have identified ways to tackle this issue and will implement them in upcoming iterations.
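One way around the counting weakness (a sketch of the kind of mitigation we have in mind, not a description of what is shipped) is to do the aggregation in plain code and hand the LLM only the already-computed figures. The data below is made up:

```python
# Aggregate in code, not in the model: counting and ranking are exact here,
# and only the final figures need to enter the LLM's context window.

from collections import Counter

# Hypothetical list of targeted country codes returned by a query.
targets = ["FR", "US", "DE", "FR", "US", "FR", "UA", "US", "US", "DE"]

top5 = Counter(targets).most_common(5)
print(top5)  # exact counts, regardless of list length
```

This also sidesteps the context-window problem: a list of ten thousand hits collapses into five tuples before the model ever sees it.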

The underlying data structure in OpenCTI adds another layer of complexity, with numerous entity types and fields potentially relevant to each user query. Sometimes, it is hard for the model to determine exactly what needs to be queried and how to interpret returned results for nuanced questions.

Future capabilities

Currently, the chatbot is limited to reading data; it cannot create, modify, or delete information. Before expanding these capabilities, our priority with ArianeAi is to ensure the model consistently interprets and answers questions correctly. Upcoming releases will add the ability to create reports with the chatbot.

Conclusion

Integrating Agentic AI into OpenCTI marks a significant step toward making cyber threat intelligence more accessible and actionable. By leveraging a granular agentic workflow, we've enabled natural language questions to be translated into precise GraphQL queries, delivering human-readable insights through a chatbot built right into the OpenCTI application, available 24/7.

The adoption of the Model Context Protocol (MCP) and custom tools has enhanced the chatbot's ability to navigate OpenCTI's complex data structure, despite challenges like LLM limitations in counting or handling nuanced queries. While the current iteration, powered by gpt-4.1-mini, excels at reading and interpreting data, future releases, such as OpenCTI 6.9, will introduce capabilities like report creation. Ongoing experiments with open-weight models like Qwen3 and gpt-oss signal a move toward self-hosted infrastructure, promising greater scalability and flexibility. This is just the beginning, as we continue to refine and expand the chatbot's potential within Filigran's XTM suite. Come and give it a try, we'd be thrilled to hear your feedback!

Enjoy, and feel free to ask any questions about it on our Slack community channel!
