At the beginning of 2025, EclecticIQ, Łukasiewicz-AI and NRD Cyber Security launched the CTI-AI project to enhance the cyber resilience of critical infrastructures. The project aims to optimize the scalability of cybersecurity resources and advance the maturity of SOC and CTI operations. It is also expected to support the AI transition of National and Sectoral SOCs, safeguard critical infrastructures, and bolster national security frameworks.
The project also shares valuable CTI and AI content and insights with the wider SOC community in Europe through specialized events, blog posts, and updates. Almost a year has passed since the kick-off session, during which we focused on developing functionality across three aspects of the CTI lifecycle:
1. Intelligence structuring & automation
- Automatic MITRE ATT&CK TTP tagging: We aim to reduce manual effort and improve consistency in tagging CTI reports. The work carried out involved testing multiple model architectures and various embeddings, culminating in a Proof-of-Concept (PoC) application that uses a similarity-based approach to extract MITRE TTPs from CTI reports.
- STIX 2.1 relationship mapping: The goal was to convert unstructured CTI reports into structured STIX 2.1 intelligence by automatically identifying relationships such as uses, targets, and indicates. The effort included implementing both multistage and single-stage LLM mapping approaches, while identifying annotation inconsistency and the lack of ground truth as the primary challenges. This resulted in an annotation application that speeds up analysts’ work and provides high-quality data for AI model training and assessment.
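The similarity-based TTP tagging described above can be sketched as follows. This is a minimal, stdlib-only illustration: the technique descriptions, the bag-of-words "embedding," and the threshold are all placeholders, whereas the actual PoC tests multiple model architectures and dense embeddings over the full ATT&CK corpus.

```python
import math
import re
from collections import Counter

# Illustrative mini-catalog; a real system would load the full ATT&CK corpus.
TECHNIQUES = {
    "T1566": "phishing spearphishing email attachment malicious link",
    "T1059": "command scripting interpreter powershell execution",
    "T1486": "data encrypted for impact ransomware encryption",
}

def embed(text: str) -> Counter:
    """Bag-of-words stand-in for the PoC's dense sentence embeddings."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def tag_report(report: str, threshold: float = 0.2) -> list[str]:
    """Return technique IDs whose description is similar enough to the report."""
    vec = embed(report)
    scores = {tid: cosine(vec, embed(desc)) for tid, desc in TECHNIQUES.items()}
    return sorted((t for t, s in scores.items() if s >= threshold),
                  key=lambda t: -scores[t])

print(tag_report("The actor sent a spearphishing email with a malicious attachment"))
# → ['T1566']
```

The same retrieve-by-similarity pattern generalizes to candidate generation for relationship mapping, where an LLM then decides the relationship type between the extracted entities.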
2. AI-driven insights and retrieval
- AI trend & interest digest: Designed to provide personalized, periodical digests for different roles in the organization, the work focused on designing and creating a workflow for CTI report scoring and personalization. This involved implementing ranking methods and validating a user-profiling approach for tailoring recommendations. A prototype application was created that successfully recommends the most relevant reports for each user based on their profile.
- Retrieval agent: The core objective was to improve the relevance and speed of CTI retrieval and reduce LLM token usage through a hybrid search system. The work successfully built a pipeline featuring entity extraction, ontology mapping, and query expansion, implementing a hybrid retrieval model (semantic + lexical + keyword), and validating its performance on a newly created evaluation dataset.
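A hybrid retrieval model like the one above ultimately has to fuse scores from several retrievers into one ranking. The sketch below shows one common fusion strategy, weighted combination after per-retriever min-max normalization; the weights and the score dictionaries are illustrative assumptions, not the project's actual configuration.

```python
def min_max(scores: dict[str, float]) -> dict[str, float]:
    """Normalize one retriever's scores to [0, 1] so weights are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    span = hi - lo
    return {doc: (s - lo) / span if span else 1.0 for doc, s in scores.items()}

def hybrid_rank(semantic: dict[str, float],
                lexical: dict[str, float],
                keyword: dict[str, float],
                weights: tuple[float, float, float] = (0.5, 0.3, 0.2)
                ) -> list[tuple[str, float]]:
    """Weighted fusion of three retrievers' scores; weights are illustrative."""
    fused: dict[str, float] = {}
    for w, scores in zip(weights, map(min_max, (semantic, lexical, keyword))):
        for doc, s in scores.items():
            fused[doc] = fused.get(doc, 0.0) + w * s
    return sorted(fused.items(), key=lambda kv: -kv[1])

# Toy example: doc IDs and raw scores are made up.
print(hybrid_rank({"r1": 0.9, "r2": 0.2},   # semantic (e.g. embedding cosine)
                  {"r2": 5.0, "r3": 1.0},   # lexical (e.g. BM25)
                  {"r1": 2.0}))             # exact keyword matches
```

Rank-based fusion such as reciprocal rank fusion (RRF) is a common alternative when the retrievers' raw score scales are hard to normalize.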
3. Prioritization & operational enablement
- Attack-likelihood scoring: To support prioritization, proactive defense, and early threat awareness, we aimed to build a system that assigns probabilistic risk scores to threat objects. This requires defining and integrating multiple data sources (TIP + external feeds) and designing a complete ETL pipeline (ingestion, normalization, enrichment, feature engineering) to build an initial model incorporating temporal dynamics and an explainability layer. In the first cycle, our focus was on analyzing available scientific research, designing an appropriate approach, and conducting preliminary data analysis.
- Detection-rule generation: To accelerate the creation of operational detection content from available intelligence, this feature focused on the automated generation of detection rules. In cycle 1, we focused on developing the concept by defining the TIP input schema, researching deterministic field-to-rule mapping templates, and designing an end-to-end workflow encompassing data extraction, normalization, mapping, and essential syntactic/functional validation.
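To make the idea of deterministic field-to-rule mapping concrete, the sketch below renders a Sigma-style rule from a TIP indicator. The field map, the TIP schema, and the template are hypothetical stand-ins; the project's actual input schema and mapping templates were still being defined in cycle 1.

```python
# Hypothetical TIP field -> Sigma detection-field mapping (illustrative only).
FIELD_MAP = {
    "file_hash_sha256": "Hashes|contains",
    "domain": "DestinationHostname",
    "url": "Url",
}

SIGMA_TEMPLATE = """\
title: {title}
status: experimental
logsource:
  category: {category}
detection:
  selection:
{selection}
  condition: selection
"""

def generate_rule(indicator: dict, title: str, category: str = "proxy") -> str:
    """Deterministically map TIP indicator fields into a Sigma-style rule."""
    lines = []
    for field, value in indicator.items():
        sigma_field = FIELD_MAP.get(field)
        if sigma_field is None:
            continue  # unmapped fields are skipped; a real pipeline would log them
        lines.append(f"    {sigma_field}: '{value}'")
    if not lines:
        raise ValueError("no mappable fields in indicator")
    return SIGMA_TEMPLATE.format(title=title, category=category,
                                 selection="\n".join(lines))

print(generate_rule({"domain": "evil.example.com"}, title="C2 domain from TIP"))
```

In the planned workflow, the generated text would then pass through the syntactic/functional validation step before being shipped as detection content.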
More on the CTI-AI project: https://www.nrdcs.eu/cti-ai