Search in the Age of AI

A presentation at DevQuest in June 2025 in 79000 Niort, France by David Pilato

Slide 1

Search: a new era
David Pilato, @dadoonet, @pilato.fr

Slide 2

Elasticsearch You Know, for Search

Slide 3

Slide 4

Slide 5

These are not the droids you are looking for.

Slide 6

GET /_analyze
{
  "char_filter": [ "html_strip" ],
  "tokenizer": "standard",
  "filter": [ "lowercase", "stop", "snowball" ],
  "text": "These are <em>not</em> the droids you are looking for."
}

Slide 7

"char_filter": "html_strip"
Input: These are <em>not</em> the droids you are looking for.
Output: These are not the droids you are looking for.

Slide 8

"tokenizer": "standard"
Input: These are not the droids you are looking for.
Tokens: These, are, not, the, droids, you, are, looking, for

Slide 9

"filter": "lowercase"
Tokens: These, are, not, the, droids, you, are, looking, for
After lowercase: these, are, not, the, droids, you, are, looking, for

Slide 10

"filter": "stop"
Tokens: These, are, not, the, droids, you, are, looking, for
After lowercase: these, are, not, the, droids, you, are, looking, for
After stop: droids, you, looking

Slide 11

"filter": "snowball"
Tokens: These, are, not, the, droids, you, are, looking, for
After lowercase: these, are, not, the, droids, you, are, looking, for
After stop: droids, you, looking
After snowball: droid, you, look

Slide 12

Input: These are <em>not</em> the droids you are looking for.

{
  "tokens": [
    { "token": "droid", "start_offset": 27, "end_offset": 33, "type": "<ALPHANUM>", "position": 4 },
    { "token": "you", "start_offset": 34, "end_offset": 37, "type": "<ALPHANUM>", "position": 5 },
    { "token": "look", "start_offset": 42, "end_offset": 49, "type": "<ALPHANUM>", "position": 7 }
  ]
}
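The analysis chain from the preceding slides (html_strip, standard tokenizer, lowercase, stop, snowball) can be approximated in a few lines of Python. This is only a rough sketch: the stop list is a hand-picked subset of the English stop words, and `toy_stem` is a hypothetical stand-in for the real snowball stemmer, so it will not match Elasticsearch on arbitrary input.

```python
import re

# Assumption: a small subset of the English stop list, enough for this sentence.
STOP_WORDS = {"these", "are", "not", "the", "for"}

def toy_stem(token):
    # Toy stand-in for snowball: strips a trailing "ing" or "s" only.
    if token.endswith("ing") and len(token) > 5:
        return token[:-3]
    if token.endswith("s") and len(token) > 3:
        return token[:-1]
    return token

def analyze(text):
    text = re.sub(r"<[^>]+>", "", text)                   # char_filter: html_strip
    tokens = re.findall(r"\w+", text)                     # tokenizer: standard (approximation)
    tokens = [t.lower() for t in tokens]                  # filter: lowercase
    tokens = [t for t in tokens if t not in STOP_WORDS]   # filter: stop
    return [toy_stem(t) for t in tokens]                  # filter: snowball (toy version)

tokens = analyze("These are <em>not</em> the droids you are looking for.")
```

On the slide's sentence, `tokens` holds the same three terms the `_analyze` response lists: droid, you, look.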

Slide 13

Semantic search ≠ Literal matches

Slide 14

Elasticsearch You Know, for Search

Slide 15

Elasticsearch You Know, for Vector Search

Slide 16

What is a vector?

Slide 17

Embeddings represent your data
Example: 1-dimensional vector
[Figure: characters placed along a single Realistic-to-Cartoon axis, each with a Character Vector such as [ 1 ].]

Slide 18

Multiple dimensions represent different data aspects
[Figure: characters placed on two axes, Realistic to Cartoon and Human to Machine, with Character Vectors such as [ 1, 1 ] and [ 1, 0 ].]

Slide 19

Similar data is grouped together
[Figure: the same Realistic-to-Cartoon and Human-to-Machine axes; characters with close Character Vectors (values such as 1.0, 0.8 and 0.0 per dimension) sit next to each other.]

Slide 20

Vector search ranks objects by similarity (~relevance) to the query
[Figure: a Query point on the Realistic-to-Cartoon and Human-to-Machine axes; the Result column lists the characters at Rank 1 to 5.]
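As a minimal sketch of what "ranking by similarity" means, the snippet below orders a few two-dimensional vectors by their Euclidean distance to a query. The character names, coordinates and the choice of Euclidean distance are all assumptions made for illustration; they are not taken from the slides.

```python
import math

# Hypothetical 2-D character vectors: (realistic..cartoon, human..machine).
characters = {
    "character_a": [1.0, 1.0],
    "character_b": [1.0, 0.8],
    "character_c": [1.0, 0.0],
    "character_d": [0.0, 1.0],
}

def rank_by_similarity(query, vectors):
    """Return the names ordered from nearest to farthest from the query."""
    return sorted(vectors, key=lambda name: math.dist(query, vectors[name]))

ranking = rank_by_similarity([1.0, 0.95], characters)
```

The nearest vector comes first, so the position in `ranking` plays the role of the Rank column in the figure.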

Slide 21

How do you index vectors?

Slide 22

Architecture of Vector Search

Slide 23

Choice of Embedding Model

Start with Off-the-Shelf Models
● Text data: Hugging Face (like Microsoft’s E5)
● Images: OpenAI’s CLIP

Extend to Higher Relevance
● Apply hybrid scoring
● Bring Your Own Model: requires expertise + labeled data

Slide 24

Problem: training data vs. actual use case

Slide 25

dense_vector field type

PUT ecommerce
{
  "mappings": {
    "properties": {
      "description": { "type": "text" },
      "desc_embedding": { "type": "dense_vector" }
    }
  }
}

Slide 26

Data Ingestion and Embedding Generation

Source data:
{
  "_id": "product-1234",
  "product_name": "Summer Dress",
  "description": "Our best-selling…",
  "Price": 118,
  "color": "blue",
  "fabric": "cotton"
}

POST /ecommerce/_doc
{
  "_id": "product-1234",
  "product_name": "Summer Dress",
  "description": "Our best-selling…",
  "Price": 118,
  "color": "blue",
  "fabric": "cotton",
  "desc_embedding": [0.452, 0.3242, …],
  "img_embedding": [0.012, 0.0, …]
}
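When embeddings are generated outside Elasticsearch, the client enriches each document before sending it. The sketch below shows only that enrichment step. `with_embedding` and `fake_embed` are hypothetical helpers, and `fake_embed` is a placeholder that returns simple text statistics, not the output of a real embedding model.

```python
def with_embedding(doc, embed, source_field="description", target_field="desc_embedding"):
    """Return a copy of doc with an embedding of source_field stored in target_field."""
    enriched = dict(doc)
    enriched[target_field] = embed(doc[source_field])
    return enriched

def fake_embed(text):
    # Placeholder, NOT a real model: [character count, space count].
    return [float(len(text)), float(text.count(" "))]

# Hypothetical source document, for illustration only.
doc = {"product_name": "Summer Dress", "description": "A light cotton dress", "color": "blue"}
enriched = with_embedding(doc, fake_embed)
```

In practice `embed` would call whichever embedding model was chosen, and `enriched` is what gets posted to the index.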

Slide 27

With Elastic ML (commercial)

Source data:
{
  "_id": "product-1234",
  "product_name": "Summer Dress",
  "description": "Our best-selling…",
  "Price": 118,
  "color": "blue",
  "fabric": "cotton"
}

POST /ecommerce/_doc

Indexed document:
{
  "_id": "product-1234",
  "product_name": "Summer Dress",
  "description": "Our best-selling…",
  "Price": 118,
  "color": "blue",
  "fabric": "cotton",
  "desc_embedding": [0.452, 0.3242, …]
}

Slide 28

How do you search vectors?

Slide 29

Architecture of Vector Search

Slide 30

knn query

GET /ecommerce/_search
{
  "query": {
    "bool": {
      "must": [{
        "knn": {
          "field": "desc_embedding",
          "query_vector": [0.123, 0.244, …]
        }
      }],
      "filter": {
        "term": { "department": "women" }
      }
    }
  },
  "size": 10
}
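Conceptually, this request combines a term filter with a nearest-neighbor ranking. The sketch below imitates that on an in-memory list. It applies the filter before ranking, which is a simplification and may not reflect how Elasticsearch combines the two clauses internally. The documents, the `filtered_knn` helper and the use of cosine similarity are assumptions for illustration.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def filtered_knn(docs, query_vector, department, k):
    """Keep docs matching the term filter, then return the ids of the k most similar."""
    candidates = [d for d in docs if d["department"] == department]
    candidates.sort(key=lambda d: cosine(query_vector, d["desc_embedding"]), reverse=True)
    return [d["id"] for d in candidates[:k]]

# Hypothetical documents with 2-D embeddings.
docs = [
    {"id": "p1", "department": "women", "desc_embedding": [0.9, 0.1]},
    {"id": "p2", "department": "men", "desc_embedding": [1.0, 0.0]},
    {"id": "p3", "department": "women", "desc_embedding": [0.1, 0.9]},
]
top = filtered_knn(docs, [1.0, 0.0], "women", k=1)
```

Although `p2` is the closest vector overall, it is excluded by the department filter, so `p1` is returned.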

Slide 31

knn query (with Elastic ML, commercial)

GET /ecommerce/_search
{
  "query": {
    "bool": {
      "must": [{
        "knn": {
          "field": "desc_embedding",
          "query_vector_builder": {
            "text_embedding": {
              "model_text": "summer clothes",
              "model_id": <text-embedding-model>
            }
          }
        }
      }],
      "filter": {
        "term": { "department": "women" }
      }
    }
  },
  "size": 10
}

<text-embedding-model>: Transformer model

Slide 32

semantic_text field type (new, from 8.15)

PUT ecommerce
{
  "mappings": {
    "properties": {
      "description": {
        "type": "text",
        "copy_to": [ "desc_embedding" ]
      },
      "desc_embedding": { "type": "semantic_text" }
    }
  }
}

POST ecommerce/_doc
{
  "description": "Our best-selling…"
}

GET ecommerce/_search
{
  "query": {
    "semantic": {
      "field": "desc_embedding",
      "query": "I'm looking for a red dress for a DJ party"
    }
  }
}

Slide 33

Architecture of Vector Search

Slide 34

But how does it really work?

Slide 35

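Similarity

[Figure: document vectors d1 and d2 and the query vector q on the Human and Realistic axes, with the angle θ between q and a document vector.]

$$\cos(\theta) = \frac{\vec{q} \cdot \vec{d}}{|\vec{q}| \times |\vec{d}|}$$

$$\mathrm{\_score} = \frac{1 + \cos(\theta)}{2}$$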

Slide 36

Similarity: cosine (cosine)

Similar vectors: θ close to 0, cos(θ) close to 1, $\mathrm{\_score} = \frac{1 + 1}{2} = 1$
Orthogonal vectors: θ close to 90°, cos(θ) close to 0, $\mathrm{\_score} = \frac{1 + 0}{2} = 0.5$
Opposite vectors: θ close to 180°, cos(θ) close to -1, $\mathrm{\_score} = \frac{1 - 1}{2} = 0$
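The three cases can be checked numerically. The sketch below computes cos(θ) from two vectors and rescales it to the 0..1 range with (1 + cos(θ)) / 2, as on the slide. The function name `cosine_score` and the sample vectors are chosen here for illustration.

```python
import math

def cosine_score(q, d):
    """(1 + cos(theta)) / 2, where theta is the angle between q and d."""
    dot = sum(x * y for x, y in zip(q, d))
    cos_theta = dot / (math.hypot(*q) * math.hypot(*d))
    return (1 + cos_theta) / 2

similar = cosine_score([1.0, 0.0], [2.0, 0.0])     # same direction
orthogonal = cosine_score([1.0, 0.0], [0.0, 3.0])  # 90 degrees apart
opposite = cosine_score([1.0, 0.0], [-1.0, 0.0])   # 180 degrees apart
```

Vector length has no effect here: `[2.0, 0.0]` scores the same as `[1.0, 0.0]` would, because only the angle matters.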

Slide 37

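Similarity: Dot Product (dot_product or max_inner_product)

[Figure: vectors q and d separated by the angle θ, with the length |q| × cos(θ) marked along d.]

$$\vec{q} \cdot \vec{d} = |\vec{q}| \times \cos(\theta) \times |\vec{d}|$$

$$\mathrm{\_score}_{float} = \frac{1 + dot\_product(q, d)}{2}$$

$$\mathrm{\_score}_{byte} = 0.5 + \frac{dot\_product(q, d)}{32768 \times dims}$$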
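The two score formulas on this slide translate directly into code. In this sketch the float variant assumes unit-length vectors and the byte variant assumes components in the signed byte range. Both assumptions, and the function names, are stated here for illustration rather than taken from the slide.

```python
def dot(q, d):
    return sum(x * y for x, y in zip(q, d))

def score_float(q, d):
    # Assumes q and d are unit-length float vectors, so dot(q, d) lies in [-1, 1].
    return (1 + dot(q, d)) / 2

def score_byte(q, d):
    # Assumes byte vectors with components in [-128, 127].
    dims = len(q)
    return 0.5 + dot(q, d) / (32768 * dims)

float_same = score_float([0.6, 0.8], [0.6, 0.8])  # identical unit vectors
byte_same = score_byte([127, 0], [127, 0])        # identical byte vectors
```

The divisor 32768 × dims in the byte formula keeps the score between 0 and 1 for byte components.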

Slide 38

Similarity: Euclidean distance (l2_norm)

[Figure: vectors q and d plotted in the plane with coordinates x1, x2 and y1, y2.]

$$l2\_norm_{q,d} = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$

$$\mathrm{\_score} = \frac{1}{1 + (l2\_norm_{q,d})^2}$$
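A short numeric check of the l2_norm score, as a sketch: the function name `l2_score` and the sample points are chosen here for illustration.

```python
import math

def l2_score(q, d):
    """1 / (1 + l2_norm(q, d)^2): 1.0 for identical vectors, approaching 0 as they move apart."""
    return 1 / (1 + math.dist(q, d) ** 2)

identical = l2_score([1.0, 2.0], [1.0, 2.0])
three_four_five = l2_score([0.0, 0.0], [3.0, 4.0])  # distance 5, so 1 / (1 + 25)
```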

Slide 39

Brute Force

Slide 40

Hierarchical Navigable Small Worlds (HNSW)
One popular approach

HNSW: a layered approach that simplifies access to the nearest neighbor
Tiered: from coarse to fine approximation over a few steps
Balance: bartering a little accuracy for a lot of scalability
Speed: excellent query latency on large-scale indices
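The core move inside each HNSW layer is a greedy walk over a proximity graph. The sketch below shows that walk on a single layer only. It is a deliberate simplification: real HNSW keeps several layers and a list of candidates rather than a single current node. The graph, vectors and `greedy_search` helper are hypothetical.

```python
import math

def greedy_search(graph, vectors, query, entry):
    """Walk the graph, always moving to the neighbor closest to the query."""
    current = entry
    while True:
        best = min(graph[current], key=lambda n: math.dist(vectors[n], query))
        if math.dist(vectors[best], query) >= math.dist(vectors[current], query):
            return current  # no neighbor is closer: stop here
        current = best

# Hypothetical 1-D vectors linked in a chain a - b - c - d.
vectors = {"a": [0.0], "b": [1.0], "c": [2.0], "d": [3.0]}
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
nearest = greedy_search(graph, vectors, query=[2.2], entry="a")
```

The walk visits a, b, then c, and stops because neither neighbor of c is closer to the query. Only a few distances are computed instead of one per vector, which is where the accuracy-for-scalability trade comes from: a greedy walk can stop at a local minimum.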

Slide 41

Scalar Quantization
(Slide badge: Elasticsearch 8.14 default)

float32: Recall: High. Precision: High. Rescore: likely not needed. Full RAM required.
int8: Recall: Good. Precision: Good. Oversampling: moderate. Rescore: reasonable. 4X RAM savings.
int4: Recall: Low. Precision: Low. Oversampling: needed. Rescore: may be slower. 8X RAM savings.
bit: Recall: Bad. Precision: Bad. Oversampling: needed. Rescore: expensive and limiting. 32X RAM savings.
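To make the idea of scalar quantization concrete, here is a generic 8-bit sketch: each float is clamped to a range and mapped to one of 256 integer levels. This is a textbook linear quantizer under assumed bounds `lo` and `hi`. It is not a description of how Elasticsearch picks its bounds or stores the codes.

```python
def quantize_8bit(vec, lo, hi):
    """Map floats in [lo, hi] to integer codes in [0, 255]; values outside are clamped."""
    step = (hi - lo) / 255
    return [round((min(max(x, lo), hi) - lo) / step) for x in vec]

def dequantize_8bit(codes, lo, hi):
    """Approximate the original floats from the integer codes."""
    step = (hi - lo) / 255
    return [lo + c * step for c in codes]

vec = [-1.0, 0.0, 0.25, 1.0]
codes = quantize_8bit(vec, -1.0, 1.0)
restored = dequantize_8bit(codes, -1.0, 1.0)
```

Each restored value is within half a quantization step of the original. That small loss is why lower-precision types pair with oversampling and rescoring.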

Slide 42

BBQ aka Better Binary Quantization

float32, int8, int4, bit: BBQ

32X RAM savings. Faster & more accurate than Product Quantization.

Slide 43

Memory required 100M vectors? Only 12GB!?! One single node.
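A back-of-the-envelope calculation shows how the precision of each dimension drives memory. The slide does not state the vector dimension, so `DIMS = 1024` below is purely an assumption. The estimate counts raw vector bytes only and ignores graph structures and any per-vector metadata.

```python
def vector_memory_gb(num_vectors, dims, bits_per_dim):
    """Raw vector storage in GB (10^9 bytes), ignoring all index overhead."""
    return num_vectors * dims * bits_per_dim / 8 / 1e9

NUM_VECTORS = 100_000_000
DIMS = 1024  # assumption: the slide does not give the dimension count

for name, bits in [("float32", 32), ("int8", 8), ("int4", 4), ("bit", 1)]:
    print(f"{name}: {vector_memory_gb(NUM_VECTORS, DIMS, bits):.1f} GB")
```

Under that assumption, one bit per dimension needs roughly 12.8 GB for 100M vectors, the same order of magnitude as the figure on the slide, against about 410 GB for float32.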

Slide 44

Benchmarketing

Slide 45

https://djdadoo.pilato.fr/

Slide 46

https://github.com/dadoonet/music-search/

Slide 47

Elasticsearch You Know, for Hybrid Search

Slide 48

Hybrid scoring

Term-based score + Vector similarity score
Combine: linear combination (manual boosting)

Slide 49

Manual boosting

GET ecommerce/_search
{
  "query": {
    "bool": {
      "must": [{
        "match": {
          "description": { "query": "summer clothes" }
        }
      },{
        "semantic": {
          "field": "desc_embedding",
          "query": "summer clothes",
          "boost": 100.0
        }
      }]
    }
  }
}
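A bool query adds up the scores of its matching clauses, so boosting one clause amounts to a weighted sum. The sketch below illustrates why a boost is needed when the two scores live on different scales. The scores in `hits` are invented for illustration, and `combined_score` is a hypothetical helper.

```python
def combined_score(term_score, vector_score, vector_boost=100.0):
    """Term-based score plus the boosted vector similarity score."""
    return term_score + vector_boost * vector_score

# Hypothetical (term score, vector score) pairs.
hits = {"doc1": (12.0, 0.60), "doc2": (3.0, 0.95)}
ranked = sorted(hits, key=lambda d: combined_score(*hits[d]), reverse=True)
unboosted = sorted(hits, key=lambda d: combined_score(*hits[d], vector_boost=1.0), reverse=True)
```

With a boost of 100 the semantically closer `doc2` wins. Without it, the unbounded term score dominates and `doc1` comes first. The right boost depends on the data, which is the weakness of manual tuning.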

Slide 50

PUT starwars
{
  "mappings": {
    "properties": {
      "text.tokens": { "type": "sparse_vector" }
    }
  }
}

Example texts:
"These are not the droids you are looking for."
"Obi-Wan never told you what happened to your father."

GET starwars/_search
{
  "query": {
    "sparse_vector": {
      "field": "text.tokens",
      "query_vector": {
        "lucas": 0.50047517,
        "ship": 0.29860738,
        "dragon": 0.5300422,
        "quest": 0.5974301,
        …
      }
    }
  }
}
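A sparse vector is a map from tokens to weights, and matching a query against a document comes down to the tokens they share. The sketch below scores that overlap with a plain dot product. The token weights are invented, and the exact scoring Elasticsearch applies may differ in detail.

```python
def sparse_dot(query_weights, doc_weights):
    """Sum the weight products over the tokens present in both sparse vectors."""
    return sum(w * doc_weights[t] for t, w in query_weights.items() if t in doc_weights)

# Hypothetical token weights.
query = {"droid": 0.5, "robot": 0.25}
doc = {"droid": 2.0, "look": 1.0}
score = sparse_dot(query, doc)
```

Only `droid` appears on both sides, so the score is 0.5 × 2.0. Because the lookup is by token, the weights can be stored in an inverted index just like terms.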

Slide 51

ELSER: Elastic Learned Sparse EncodER (commercial)

sparse_vector
Not BM25 or (dense) vector
Sparse vector like BM25
Stored as inverted index

Slide 52

Hybrid ranking

Term-based score: ranking 1
Dense vector score: ranking 2
Sparse vector score: ranking 3
Combine: Reciprocal Rank Fusion (RRF), blend multiple ranking methods

Slide 53

Reciprocal Rank Fusion (RRF)

D: set of docs
R: set of rankings as permutation on 1..|D|
k: typically set to 60 by default

$$\mathrm{RRFscore}(d \in D) = \sum_{r \in R} \frac{1}{k + r(d)}$$

Dense Vector ranking:
Doc   Score   r(d)   k+r(d)
A     1       1      61
B     0.7     2      62
C     0.5     3      63
D     0.2     4      64
E     0.01    5      65

BM25 ranking:
Doc   Score   r(d)   k+r(d)
C     1,341   1      61
A     739     2      62
F     732     3      63
G     192     4      64
H     183     5      65

Combined:
Doc   Dense Vector   BM25   RRF Score
A     1/61           1/62   0.0325
C     1/63           1/61   0.0323
B     1/62           -      0.0161
F     -              1/63   0.0159
D     1/64           -      0.0156
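The RRF scores on this slide can be reproduced with a few lines of Python. This is a minimal sketch of the standard formula, summing 1 / (k + rank) over every ranking a document appears in, using the slide's dense vector and BM25 orderings as input. The function name `rrf` is chosen here for illustration.

```python
def rrf(rankings, k=60):
    """Fuse ranked lists of doc ids: score(d) = sum over rankings of 1 / (k + rank of d)."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    # Python's sort is stable, so tied docs keep their first-seen order.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

dense = ["A", "B", "C", "D", "E"]
bm25 = ["C", "A", "F", "G", "H"]
fused = rrf([dense, bm25])
```

Only the rank positions are used, never the raw scores, so the very different scales of BM25 and vector similarity need no normalization. D and G tie at 1/64 here, and the stable sort keeps D first.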

Slide 54

GET index/_search
{
  "retriever": {
    "rrf": {
      "retrievers": [
        { "standard": { "query": { "match": {…} } } },
        { "standard": { "query": { "sparse_vector": {…} } } },
        { "knn": { … } }
      ]
    }
  }
}

Hybrid Ranking: BM25f + Sparse Vector + Dense Vector (commercial)

Slide 55

ChatGPT Elastic and LLM

Slide 56

Gen AI Search engines

Slide 57

LLM opportunities and limits
[Figure: your question goes to the GAI / LLM, which knows public internet data, and one answer comes back.]

Slide 58

Slide 59

Retrieval Augmented Generation
[Figure: your question is first matched against your business data (documents, images, audio); your question plus a context window then goes to the GAI / LLM (public internet data), which returns the right answer.]
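The "question + context window" step can be sketched as simple prompt assembly: passages retrieved from the search engine are placed in front of the user's question before the LLM is called. The prompt wording, the `build_prompt` helper and the sample passages are assumptions made for illustration. The retrieval and the LLM call themselves are left out.

```python
def build_prompt(question, retrieved_docs, max_docs=3):
    """Put the top retrieved passages in front of the question."""
    context = "\n".join(f"- {doc}" for doc in retrieved_docs[:max_docs])
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

# Hypothetical passages, as if returned by a search over business data.
passages = ["Returns are accepted within 30 days.", "Shipping is free."]
prompt = build_prompt("What is the return policy?", passages)
```

The quality of the answer then depends on the retrieval step, which is where the search techniques from the earlier slides come in.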

Slide 60

Demo: Elastic Playground

Slide 61

Slide 62

Elasticsearch You Know, for Semantic Search

Slide 63

Search: a new era
David Pilato, @dadoonet, @pilato.fr