Elasticsearch

A presentation at BBL Ekino (private event) in December 2024 by David Pilato

Slide 1

BBL at

Slide 2

$ curl -XPOST https://localhost:9200/speaker/_doc -d '{
  "name" : "David Pilato",
  "jobs" : [
    { "name" : "SRA Europe (SSII)", "date" : "1995" },
    { "name" : "SFR", "date" : "1997" },
    { "name" : "e-Brands / Vivendi", "date": "2000" },
    { "name" : "DGDDI (douane)", "date" : "2005" },
    { "name" : "elastic", "date" : "2013" }
  ],
  "motivations" : [ "family", "job", "deejay" ],
  "blog" : "https://david.pilato.fr/",
  "twitter" : [ "@dadoonet", "@elasticfr" ],
  "bluesky" : [ "@pilato.fr" ],
  "email" : "david@elastic.co"
}' -H 'Content-Type: application/json'

Slide 3

One Search AI Platform
Two Out-of-the-Box Solutions
The Freedom to Build Anything

Out-of-the-Box Solutions: Elastic Observability, Elastic Security
Build Your Own: Elastic Search
The Elastic Search AI Platform: Ingest, Secure & Scalable Storage, AI / ML, Search, Visualization, Workflow Automation

Slide 4

start-local

Slide 5

elastic cloud

Slide 6

serverless

Slide 7

A typical search implementation…

CREATE TABLE user (
  name VARCHAR(100),
  comments VARCHAR(1000)
);
INSERT INTO user VALUES ('David Pilato', 'Developer at elastic');
INSERT INTO user VALUES ('Malloum Laya', 'Worked with David at french customs service');
INSERT INTO user VALUES ('David Gageot', 'Engineer at Doctolib');
INSERT INTO user VALUES ('David David', 'Who is that guy?');

Search: David

Slide 8

Search on term

INSERT INTO user VALUES ('David Pilato', 'Developer at elastic');
INSERT INTO user VALUES ('Malloum Laya', 'Worked with David at french customs service');
INSERT INTO user VALUES ('David Gageot', 'Engineer at Doctolib');
INSERT INTO user VALUES ('David David', 'Who is that guy?');

SELECT * FROM user WHERE name="David";
Empty set (0,00 sec)

Search: David

Slide 9

Search like

INSERT INTO user VALUES ('David Pilato', 'Developer at elastic');
INSERT INTO user VALUES ('Malloum Laya', 'Worked with David at french customs service');
INSERT INTO user VALUES ('David Gageot', 'Engineer at Doctolib');
INSERT INTO user VALUES ('David David', 'Who is that guy?');

SELECT * FROM user WHERE name LIKE "%David%";
+--------------+----------------------+
| name         | comments             |
+--------------+----------------------+
| David Pilato | Developer at elastic |
| David Gageot | Engineer at Doctolib |
| David David  | Who is that guy?     |
+--------------+----------------------+

Search: David

Slide 10

Search for terms

INSERT INTO user VALUES ('David Pilato', 'Developer at elastic');
INSERT INTO user VALUES ('Malloum Laya', 'Worked with David at french customs service');
INSERT INTO user VALUES ('David Gageot', 'Engineer at Doctolib');
INSERT INTO user VALUES ('David David', 'Who is that guy?');

SELECT * FROM user WHERE name LIKE "%David Pilato%";
+--------------+----------------------+
| name         | comments             |
+--------------+----------------------+
| David Pilato | Developer at elastic |
+--------------+----------------------+

Search: David Pilato

Slide 11

Search with inverted terms

INSERT INTO user VALUES ('David Pilato', 'Developer at elastic');
INSERT INTO user VALUES ('Malloum Laya', 'Worked with David at french customs service');
INSERT INTO user VALUES ('David Gageot', 'Engineer at Doctolib');
INSERT INTO user VALUES ('David David', 'Who is that guy?');

SELECT * FROM user WHERE name LIKE "%Pilato David%";
Empty set (0,00 sec)

SELECT * FROM user WHERE name LIKE "%Pilato%David%";
Empty set (0,00 sec)

Search: Pilato David

Slide 12

Search for terms

INSERT INTO user VALUES ('David Pilato', 'Developer at elastic');
INSERT INTO user VALUES ('Malloum Laya', 'Worked with David at french customs service');
INSERT INTO user VALUES ('David Gageot', 'Engineer at Doctolib');
INSERT INTO user VALUES ('David David', 'Who is that guy?');

SELECT * FROM user WHERE name LIKE "%David%" AND name LIKE "%Pilato%";
+--------------+----------------------+
| name         | comments             |
+--------------+----------------------+
| David Pilato | Developer at elastic |
+--------------+----------------------+

Search: Pilato David

Slide 13

Search in two fields

INSERT INTO user VALUES ('David Pilato', 'Developer at elastic');
INSERT INTO user VALUES ('Malloum Laya', 'Worked with David at french customs service');
INSERT INTO user VALUES ('David Gageot', 'Engineer at Doctolib');
INSERT INTO user VALUES ('David David', 'Who is that guy?');

SELECT * FROM user WHERE name LIKE "%David%" OR comments LIKE "%David%";
+--------------+---------------------------------------------+
| name         | comments                                    |
+--------------+---------------------------------------------+
| David Pilato | Developer at elastic                        |
| Malloum Laya | Worked with David at french customs service |
| David Gageot | Engineer at Doctolib                        |
| David David  | Who is that guy?                            |
+--------------+---------------------------------------------+

Search: David

Slide 14

Slide 15

Search with typos

INSERT INTO user VALUES ('David Pilato', 'Developer at elastic');
INSERT INTO user VALUES ('Malloum Laya', 'Worked with David at french customs service');
INSERT INTO user VALUES ('David Gageot', 'Engineer at Doctolib');
INSERT INTO user VALUES ('David David', 'Who is that guy?');

SELECT * FROM user WHERE name LIKE "%Dadid%";
Empty set (0,00 sec)

Search: Dadid

Slide 16

Search with typos

INSERT INTO user VALUES ('David Pilato', 'Developer at elastic');
INSERT INTO user VALUES ('Malloum Laya', 'Worked with David at french customs service');
INSERT INTO user VALUES ('David Gageot', 'Engineer at Doctolib');
INSERT INTO user VALUES ('David David', 'Who is that guy?');

SELECT * FROM user WHERE name LIKE "%adid%"
  OR name LIKE "%D_did%"
  OR name LIKE "%Da_id%"
  OR name LIKE "%Dad_d%"
  OR name LIKE "%Dadi%";
+--------------+----------------------+
| name         | comments             |
+--------------+----------------------+
| David Pilato | Developer at elastic |
| David Gageot | Engineer at Doctolib |
| David David  | Who is that guy?     |
+--------------+----------------------+

Search: Dadid
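The hand-written LIKE patterns on this slide enumerate every way one character of "Dadid" could be wrong. The same idea can be expressed as an edit-distance check, sketched here in Python. This is only an illustration: `edit_distance`, `fuzzy_search`, and the `USERS` list are hypothetical names, and the code is not how Elasticsearch implements fuzzy matching internally.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions, or substitutions turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution (or match)
            ))
        previous = current
    return previous[-1]


# Same rows as the SQL table in the slides (name, comments).
USERS = [
    ("David Pilato", "Developer at elastic"),
    ("Malloum Laya", "Worked with David at french customs service"),
    ("David Gageot", "Engineer at Doctolib"),
    ("David David", "Who is that guy?"),
]


def fuzzy_search(rows, term, max_edits=1):
    """Return the names containing a word within max_edits of term."""
    term = term.lower()
    return [
        name
        for name, _comments in rows
        if any(edit_distance(term, word.lower()) <= max_edits
               for word in name.split())
    ]


print(fuzzy_search(USERS, "Dadid"))
```

With `max_edits=1`, "Dadid" is one substitution away from "David", so the three rows whose name contains "David" are returned, matching the result table of the SQL query.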

Slide 17

Slide 18

User Interface

Slide 19

What is a search engine?
● Index engine (indexing documents)
● Search engine (within the created indices)
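The two halves named on this slide, indexing and searching, can be sketched with a tiny inverted index in Python. This is a minimal illustration under simplifying assumptions (whitespace tokenization, lowercase only, all terms required); `build_index`, `search`, and `DOCS` are hypothetical names and do not reflect Lucene's actual data structures.

```python
from collections import defaultdict

# Names from the SQL example in the slides, keyed by a made-up document id.
DOCS = {
    1: "David Pilato",
    2: "Malloum Laya",
    3: "David Gageot",
    4: "David David",
}


def build_index(docs):
    """Index engine: map each lowercase token to the ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search(index, query):
    """Search engine: return ids of documents containing every query token."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result


index = build_index(DOCS)
print(search(index, "David"))         # documents 1, 3 and 4
print(search(index, "Pilato David"))  # document 1, regardless of term order
```

Because the lookup is per token, the inverted query "Pilato David" finds the same document as "David Pilato", which is the case the `LIKE "%Pilato David%"` query could not handle.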

Slide 20

Demo time!

Slide 21

Elasticsearch You Know, for Search

Slide 22

GET /_analyze
{
  "char_filter": [ "html_strip" ],
  "tokenizer": "standard",
  "filter": [ "lowercase", "stop", "snowball" ],
  "text": "These are <em>not</em> the droids you are looking for."
}

Slide 23

"char_filter": "html_strip"
These are <em>not</em> the droids you are looking for.
These are not the droids you are looking for.

Slide 24

"tokenizer": "standard"
These are not the droids you are looking for.
These are not the droids you are looking for

Slide 25

"filter": "lowercase"
These are not the droids you are looking for
these are not the droids you are looking for

Slide 26

"filter": "stop"
These are not the droids you are looking for
these are not the droids you are looking for
droids you looking

Slide 27

"filter": "snowball"
These are not the droids you are looking for
these are not the droids you are looking for
droids you looking
droid you look

Slide 28

These are <em>not</em> the droids you are looking for.

{
  "tokens": [
    { "token": "droid", "start_offset": 27, "end_offset": 33, "type": "<ALPHANUM>", "position": 4 },
    { "token": "you", "start_offset": 34, "end_offset": 37, "type": "<ALPHANUM>", "position": 5 },
    { "token": "look", "start_offset": 42, "end_offset": 49, "type": "<ALPHANUM>", "position": 7 }
  ]
}
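The analysis chain walked through on the previous slides (character filter, tokenizer, then token filters) can be imitated in a few lines of Python. This is a rough sketch, not Elasticsearch's implementation: the regular expressions only approximate `html_strip` and the `standard` tokenizer, `STOP_WORDS` is assumed to mirror the default English stop list, and `toy_stem` is a deliberately naive stand-in for the Snowball stemmer that happens to work on this sentence.

```python
import re

# Assumed to mirror the default English stop word list.
STOP_WORDS = {
    "a", "an", "and", "are", "as", "at", "be", "but", "by", "for", "if",
    "in", "into", "is", "it", "no", "not", "of", "on", "or", "such", "that",
    "the", "their", "then", "there", "these", "they", "this", "to", "was",
    "will", "with",
}


def html_strip(text):
    """Character filter: drop anything that looks like an HTML tag."""
    return re.sub(r"<[^>]+>", "", text)


def standard_tokenize(text):
    """Tokenizer: split on non-word characters (a rough approximation)."""
    return re.findall(r"\w+", text)


def toy_stem(token):
    """Naive suffix stripping; NOT the real Snowball algorithm."""
    if token.endswith("ing") and len(token) > 5:
        return token[:-3]
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]
    return token


def analyze(text):
    """Run char filter, tokenizer, then lowercase, stop and stem filters."""
    tokens = standard_tokenize(html_strip(text))
    tokens = [token.lower() for token in tokens]
    tokens = [token for token in tokens if token not in STOP_WORDS]
    return [toy_stem(token) for token in tokens]


print(analyze("These are <em>not</em> the droids you are looking for."))
```

For the sentence used in the slides, this prints `['droid', 'you', 'look']`, the same three tokens returned by the `_analyze` call. Unlike the real API, the sketch does not track offsets or positions.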

Slide 29

Elasticsearch You Know, for Vector Search

Slide 30

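Embeddings represent your data
Example: 1-dimensional vector
[Figure: characters placed along a single axis running from Realistic to Cartoon, each labelled with a one-element Character Vector such as [ 1 ]]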

Slide 31

Multiple dimensions represent different data aspects
[Figure: characters placed on two axes, Human to Machine and Realistic to Cartoon, with Character Vectors [ 1, 1 ] and [ 1, 0 ]]

Slide 32

Similar data is grouped together
[Figure: the same two axes, Human to Machine and Realistic to Cartoon, with Character Vectors [ 1.0, 1.0 ], [ 1.0, 0.0 ] and [ 1.0, 0.8 ]]

Slide 33

Vector search ranks objects by similarity (~relevance) to the query
[Figure: a Query point and Result points ranked 1 to 5 on the Human to Machine and Realistic to Cartoon axes]
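Ranking by similarity to a query vector can be sketched as a brute-force nearest-neighbour search in Python. This is an illustration only: the document names and the query vector are hypothetical, the three vectors reuse the values shown on the earlier slide, and a real engine would normally use an approximate index rather than comparing against every vector.

```python
import math


def cosine(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn(query, vectors, k):
    """Return the names of the k vectors most similar to the query, best first."""
    ranked = sorted(
        vectors.items(),
        key=lambda item: cosine(query, item[1]),
        reverse=True,
    )
    return [name for name, _vector in ranked[:k]]


# Hypothetical documents; the vector values come from the slides.
VECTORS = {
    "doc_a": [1.0, 1.0],
    "doc_b": [1.0, 0.0],
    "doc_c": [1.0, 0.8],
}

# Hypothetical query vector, closest in direction to doc_c.
print(knn([1.0, 0.7], VECTORS, k=2))
```

Here `doc_c` ranks first because its direction is closest to the query, followed by `doc_a`; `doc_b` is cut off by `k=2`.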

Slide 34

Data Ingestion and Embedding Generation

Source data:

POST /_doc
{
  "_id": "product-1234",
  "product_name": "Summer Dress",
  "description": "Our best-selling…",
  "Price": 118,
  "color": "blue",
  "fabric": "cotton"
}

After embedding generation:

POST /_doc
{
  "_id": "product-1234",
  "product_name": "Summer Dress",
  "description": "Our best-selling…",
  "Price": 118,
  "color": "blue",
  "fabric": "cotton",
  "desc_embedding": [0.452,0.3242,…],
  "img_embedding": [0.012,0.0,…]
}

Slide 35

Vector Query

GET product-catalog/_search
{
  "query" : {
    "bool": {
      "must": [{
        "knn": {
          "field": "desc_embedding",
          "num_candidates": 50,
          "query_vector": [0.123, 0.244,…]
        }
      }],
      "filter": {
        "term": {
          "department": "women"
        }
      }
    }
  },
  "size": 10
}
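When the query vector comes from application code, the request body above is usually assembled programmatically. The following Python sketch only builds and serializes the JSON body. It does not call Elasticsearch, and `build_knn_query` with its parameters is a hypothetical helper, not part of any client library. The default field name follows the document shown on the previous slide, and the two-element vector is a placeholder, since real embeddings have many more dimensions.

```python
import json


def build_knn_query(query_vector, department, field="desc_embedding",
                    num_candidates=50, size=10):
    """Build a search body combining a knn clause with a term filter."""
    return {
        "query": {
            "bool": {
                "must": [{
                    "knn": {
                        "field": field,
                        "num_candidates": num_candidates,
                        "query_vector": list(query_vector),
                    }
                }],
                "filter": {"term": {"department": department}},
            }
        },
        "size": size,
    }


# Placeholder vector for illustration only.
body = build_knn_query([0.123, 0.244], department="women")
print(json.dumps(body, indent=2))
```

Keeping the body in one function makes it harder to misspell the field name or misplace `size`, which belongs at the top level of the body, next to `query`.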

Slide 36

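Similarity
[Figure: query vector q and document vectors d1 and d2 plotted against the Human and Realistic axes, with the angle θ between q and d]
$\cos(\theta) = \frac{\vec{q} \cdot \vec{d}}{|\vec{q}| \times |\vec{d}|}$
$\text{\_score} = \frac{1 + \cos(\theta)}{2}$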

Slide 37

Similarity: cosine (cosine)
Similar vectors: θ close to 0, cos(θ) close to 1, $\text{\_score} = \frac{1 + 1}{2} = 1$
Orthogonal vectors: θ close to 90°, cos(θ) close to 0, $\text{\_score} = \frac{1 + 0}{2} = 0.5$
Opposite vectors: θ close to 180°, cos(θ) close to -1, $\text{\_score} = \frac{1 - 1}{2} = 0$
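The three cases on this slide can be checked numerically with a short Python sketch of the formula `_score = (1 + cos(θ)) / 2`. The function names and the example vectors are hypothetical, chosen so that the angles are exactly 0°, 90° and 180°.

```python
import math


def cosine(q, d):
    """cos(θ) = (q · d) / (|q| × |d|) for two non-zero vectors."""
    dot = sum(x * y for x, y in zip(q, d))
    norm_q = math.sqrt(sum(x * x for x in q))
    norm_d = math.sqrt(sum(y * y for y in d))
    return dot / (norm_q * norm_d)


def score(q, d):
    """Map cos(θ) from [-1, 1] to a score in [0, 1]."""
    return (1 + cosine(q, d)) / 2


q = [1.0, 0.0]
print(score(q, [2.0, 0.0]))   # similar vectors (θ = 0°)
print(score(q, [0.0, 3.0]))   # orthogonal vectors (θ = 90°)
print(score(q, [-1.0, 0.0]))  # opposite vectors (θ = 180°)
```

The three calls print 1.0, 0.5 and 0.0. Note that the vector lengths do not matter: only the angle between the vectors affects the score.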

Slide 38

https://djdadoo.pilato.fr/

Slide 39

https://github.com/dadoonet/music-search/

Slide 40

One Search AI Platform
Two Out-of-the-Box Solutions
The Freedom to Build Anything

Out-of-the-Box Solutions: Elastic Observability, Elastic Security
Build Your Own: Elastic Search
The Elastic Search AI Platform: Ingest, Secure & Scalable Storage, AI / ML, Search, Visualization, Workflow Automation

Slide 41

www.meetup.com/ElasticFR @elasticfr discuss.elastic.co

Slide 42

Thank You