DEVCON#23, 2024 Edition (14 November 2024): Security & Code Quality
The RAG approach for cybersecurity
David Pilato (@dadoonet)

DEVCON#23: thanks to the partners of DEVCON and of the security special issue. Thanks to our special hardware partner.

Elasticsearch You Know, for Search

These are not the droids you are looking for.

GET /_analyze
{
  "char_filter": [ "html_strip" ],
  "tokenizer": "standard",
  "filter": [ "lowercase", "stop", "snowball" ],
  "text": "These are <em>not</em> the droids you are looking for."
}

These are <em>not</em> the droids you are looking for.

{
  "tokens": [
    { "token": "droid", "start_offset": 27, "end_offset": 33, "type": "<ALPHANUM>", "position": 4 },
    { "token": "you", "start_offset": 34, "end_offset": 37, "type": "<ALPHANUM>", "position": 5 },
    { "token": "look", "start_offset": 42, "end_offset": 49, "type": "<ALPHANUM>", "position": 7 }
  ]
}
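To make the analysis chain above concrete, here is a minimal Python sketch that imitates the four stages (HTML stripping, tokenizing, lowercasing, stop-word removal, stemming) on the same sentence. It is a toy, not Elasticsearch's implementation: `toy_analyze` and `toy_stem` are hypothetical names, the stop list is an assumed small English list, and the suffix stripping only stands in for the real Snowball stemmer.

```python
import re

# Assumed small English stop list for this sketch (not read from a cluster).
STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
    "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that",
    "the", "their", "then", "there", "these", "they", "this", "to", "was",
    "will", "with",
}

def toy_stem(word):
    """Naive suffix stripping; a stand-in for Snowball, not the real algorithm."""
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def toy_analyze(text):
    """Skip HTML tags, split into words, lowercase, drop stop words, stem."""
    tokens = []
    position = 0
    # Match either a whole HTML tag or a run of word characters.
    for match in re.finditer(r"<[^>]+>|\w+", text):
        if match.group().startswith("<"):
            continue  # tags are ignored but offsets still refer to the original text
        term = match.group().lower()
        if term not in STOP_WORDS:
            tokens.append({
                "token": toy_stem(term),
                "start_offset": match.start(),
                "end_offset": match.end(),
                "position": position,
            })
        position += 1  # removed stop words still consume a position
    return tokens

tokens = toy_analyze("These are <em>not</em> the droids you are looking for.")
for token in tokens:
    print(token)
```

The gaps in `position` (4, 5, 7) come from the removed stop words, which is why the response above skips position 6.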


Elasticsearch You Know, for Vector Search

Embeddings represent your data
Example: 1-dimensional vector
[Figure: characters placed on a single Realistic / Cartoon axis; example vector [ 1 ].]

Multiple dimensions represent different data aspects
[Figure: characters placed on two axes, Realistic / Cartoon and Human / Machine; example vectors [ 1, 1 ] and [ 1, 0 ].]

Similar data is grouped together
[Figure: same two axes; example vectors [ 1.0, 1.0 ], [ 1.0, 0.0 ], [ 1.0, 0.8 ], [ 1.0, 1.0 ], [ 1.0, 1.0 ].]

Vector search ranks objects by similarity (~relevance) to the query
[Figure: a query point on the Realistic / Cartoon and Human / Machine axes; the result lists the nearest objects by rank, 1 to 5.]
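As a rough illustration of ranking by similarity, the sketch below scores a handful of 2-dimensional vectors against a query with cosine similarity and sorts them. It is a brute-force toy, not how Elasticsearch executes vector search; the document names, the vectors, and the `rank` helper are all hypothetical.

```python
import math

def cosine(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank(query, documents):
    """Return (name, similarity) pairs, most similar first."""
    scored = [(name, cosine(query, vector)) for name, vector in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical embeddings on two made-up axes.
documents = {
    "doc_a": [1.0, 1.0],
    "doc_b": [1.0, 0.8],
    "doc_c": [1.0, 0.0],
    "doc_d": [0.0, 1.0],
}
query = [1.0, 0.2]

ranking = rank(query, documents)
for position, (name, similarity) in enumerate(ranking, start=1):
    print(position, name, round(similarity, 3))
```

With this query, `doc_c` ranks first and `doc_d` last, because the query points mostly along the first axis.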

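Similarity
[Figure: query vector q and document vectors d1 and d2 on the Human / Realistic axes, with θ the angle between q and a document vector d.]

$$\cos(\theta) = \frac{\vec{q} \cdot \vec{d}}{|\vec{q}| \times |\vec{d}|} \qquad \mathrm{\_score} = \frac{1 + \cos(\theta)}{2}$$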

Similarity: cosine (cosine)
Similar vectors: θ close to 0, cos(θ) close to 1, _score = (1 + 1) / 2 = 1
Orthogonal vectors: θ close to 90°, cos(θ) close to 0, _score = (1 + 0) / 2 = 0.5
Opposite vectors: θ close to 180°, cos(θ) close to -1, _score = (1 − 1) / 2 = 0
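The three cases can be checked numerically. This is a small sketch of the `_score = (1 + cos(θ)) / 2` mapping from the slide, written in plain Python; the function names and sample vectors are illustrative assumptions.

```python
import math

def cosine(q, d):
    """cos(theta) = (q . d) / (|q| * |d|)"""
    dot = sum(qi * di for qi, di in zip(q, d))
    norm_q = math.sqrt(sum(qi * qi for qi in q))
    norm_d = math.sqrt(sum(di * di for di in d))
    return dot / (norm_q * norm_d)

def score(q, d):
    """Map cos(theta) from [-1, 1] to a score in [0, 1]."""
    return (1 + cosine(q, d)) / 2

similar = score([1.0, 0.0], [2.0, 0.0])      # same direction
orthogonal = score([1.0, 0.0], [0.0, 1.0])   # 90 degrees apart
opposite = score([1.0, 0.0], [-1.0, 0.0])    # 180 degrees apart

print(similar, orthogonal, opposite)  # 1.0 0.5 0.0
```

Note that the length of the vectors does not matter here: `[2.0, 0.0]` scores the same as `[1.0, 0.0]` would, since only the angle is compared.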

LLM opportunities and limits
your question → GAI / LLM (public internet data) → one answer

Retrieval Augmented Generation
your question → your question + context window → GAI / LLM (public internet data) → the right answer
Context window sources: your business data (documents, images, audio)
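A minimal sketch of the "your question + context window" step, assuming a tiny in-memory list of business documents. Retrieval here is plain keyword overlap as a stand-in for a real search engine, the LLM call itself is left out, and every name and sample document is hypothetical.

```python
import re

def tokenize(text):
    """Lowercased set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))

def retrieve(question, documents, k=2):
    """Return the k documents sharing the most words with the question."""
    question_terms = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_terms & tokenize(doc)),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(question, context_docs):
    """Combine the retrieved context and the question into a single prompt."""
    context = "\n".join(f"- {doc}" for doc in context_docs)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

# Hypothetical business data, invented for this example.
business_data = [
    "Host web-01 raised 12 failed SSH logins from 203.0.113.7 last night.",
    "The cafeteria menu changes every Monday.",
    "Host db-02 was patched for the OpenSSL advisory on Tuesday.",
]

question = "Which host saw failed SSH logins?"
prompt = build_prompt(question, retrieve(question, business_data, k=1))
print(prompt)  # this string is what would be sent to the LLM
```

The model only sees the retrieved document, so its answer can rely on data it was never trained on.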

Attack Discovery
Inputs: 100s of alerts (Elastic Detections, Other Detections)
Steps: Configurable Anonymization; Prompt + Alert Context; Summary
Output: Handful of Discoveries mapped across MITRE ATT&CK
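To give an idea of what a configurable anonymization step could look like before alerts are placed in a prompt, here is a hedged sketch. It is not Elastic's implementation: the field names, the placeholder format, and the `anonymize` / `deanonymize` helpers are assumptions made for illustration.

```python
def anonymize(alerts, fields):
    """Replace the values of the configured fields with stable placeholders."""
    forward = {}   # (field, original value) -> placeholder
    reverse = {}   # placeholder -> original value
    cleaned = []
    for alert in alerts:
        copy = dict(alert)
        for field in fields:
            if field in copy:
                key = (field, copy[field])
                if key not in forward:
                    placeholder = f"<{field}_{len(forward) + 1}>"
                    forward[key] = placeholder
                    reverse[placeholder] = copy[field]
                copy[field] = forward[key]
        cleaned.append(copy)
    return cleaned, reverse

def deanonymize(text, reverse):
    """Put the original values back into text returned by the model."""
    for placeholder, original in reverse.items():
        text = text.replace(placeholder, original)
    return text

# Hypothetical alerts, invented for this example.
alerts = [
    {"host.name": "web-01", "user.name": "alice", "rule": "SSH brute force"},
    {"host.name": "web-01", "user.name": "bob", "rule": "Suspicious sudo"},
]

cleaned, reverse = anonymize(alerts, fields=["host.name", "user.name"])
alert_context = "\n".join(str(alert) for alert in cleaned)
prompt = "Summarize the attack described by these alerts:\n" + alert_context
print(prompt)
```

Because the same value always maps to the same placeholder, the model can still correlate alerts on one host without seeing its real name, and `deanonymize` restores the names in the response.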

Elastic AI Assistant
Prebuilt/Custom Prompt → Prebuilt/Custom Prompt + Context Window → Response
Context Window sources: Knowledge Base / User Data (User data, Alerts, Elastic Provided Content)

Demo Attack Discovery

New in 8.16


Elastic Security Labs https://www.elastic.co/security-labs



Retrieval Augmented Generation
your question → your question + context window → Locally hosted LLM → the right answer
Context window sources: your business data (documents, images, audio)

