🎹🎻🎸 What if we searched for music tracks 🎼🎶?

A presentation at Montreal JUG in February 2024 in Montreal, QC, Canada by David Pilato

Slide 1

Searching for similar music tracks 🎹🎻🎸 David Pilato | @dadoonet

Slide 2

Elasticsearch: You Know, for Vector Search

Slide 3

Embeddings represent your data

Example: 1-dimensional vector

[Figure: characters placed along a single Realistic to Cartoon axis, each labeled with a one-element vector such as [ 1 ]]

Slide 4

Multiple dimensions represent different data aspects

[Figure: characters placed on two axes, Realistic to Cartoon and Human to Machine, labeled with two-element vectors such as [ 1, 1 ] and [ 1, 0 ]]

Slide 5

Similar data is grouped together

[Figure: characters on the Realistic to Cartoon and Human to Machine axes, labeled with the vectors [ 1.0, 1.0 ], [ 1.0, 0.0 ], and [ 1.0, 0.8 ]]

Slide 6

Vector search ranks objects by similarity (~relevance) to the query

[Figure: a query point on the Realistic to Cartoon and Human to Machine axes, with the results ranked 1 to 5]
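To make the ranking idea concrete, here is a minimal brute-force sketch in Python. It reuses the three two-dimensional vectors from the previous slide; the document ids, the query vector, and the choice of squared Euclidean distance as the closeness measure are assumptions made for illustration, not something the slides prescribe.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def rank(query, documents):
    """Return document ids ordered from closest to farthest from the query."""
    return sorted(documents, key=lambda doc_id: squared_distance(query, documents[doc_id]))


# Hypothetical ids attached to the vectors shown on the slide.
documents = {
    "doc-a": [1.0, 1.0],
    "doc-b": [1.0, 0.0],
    "doc-c": [1.0, 0.8],
}

ranking = rank([1.0, 0.7], documents)
print(ranking)  # ['doc-c', 'doc-a', 'doc-b']
```

A real engine avoids comparing the query against every stored vector; this exhaustive version only shows what "ranked by similarity" means.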

Slide 7

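Similarity: cosine (cosine)

$$\cos(\theta) = \frac{\vec{q} \cdot \vec{d}}{|\vec{q}| \times |\vec{d}|}$$

$$\text{\_score} = \frac{1 + \cos(\theta)}{2}$$

[Figure: query vector q and document vectors d1 and d2 on the Human and Realistic axes, separated by the angle θ]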

Slide 8

Similarity: cosine (cosine)

$$\text{\_score} = \frac{1 + 1}{2} = 1$$

$$\text{\_score} = \frac{1 + 0}{2} = 0.5$$

$$\text{\_score} = \frac{1 - 1}{2} = 0$$
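The three values on this slide correspond to vectors pointing in the same direction, at a right angle, and in opposite directions. The Python sketch below reproduces them; the function name and the sample vectors are illustrative choices, and the code is a plain re-implementation of the formula, not Elasticsearch's own code.

```python
import math


def cosine_score(q, d):
    """Map cosine similarity from [-1, 1] onto a score in [0, 1]."""
    dot = sum(a * b for a, b in zip(q, d))
    norm_q = math.sqrt(sum(a * a for a in q))
    norm_d = math.sqrt(sum(b * b for b in d))
    cos_theta = dot / (norm_q * norm_d)  # undefined if either vector is all zeros
    return (1 + cos_theta) / 2


same = cosine_score([1.0, 0.0], [1.0, 0.0])        # same direction
orthogonal = cosine_score([1.0, 0.0], [0.0, 1.0])  # right angle
opposite = cosine_score([1.0, 0.0], [-1.0, 0.0])   # opposite direction
print(same, orthogonal, opposite)  # 1.0 0.5 0.0
```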

Slide 9

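Similarity: Dot Product (dot_product)

$$\vec{q} \cdot \vec{d} = |\vec{q}| \times \cos(\theta) \times |\vec{d}|$$

$$\text{\_score}_{float} = \frac{1 + \text{dot\_product}(q, d)}{2}$$

$$\text{\_score}_{byte} = 0.5 + \frac{\text{dot\_product}(q, d)}{32768 \times dims}$$

[Figure: vectors q and d separated by the angle θ, with the projection |q| × cos(θ) of q onto d]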
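As a rough illustration of the two scoring variants, the Python sketch below computes both. It assumes that float vectors are already normalized to unit length, so the dot product stays in [-1, 1], and that byte vectors hold values between -128 and 127. The byte formula is read as 0.5 plus the scaled dot product, which is how the Elasticsearch documentation states it. Function names and sample vectors are illustrative.

```python
def dot_product(q, d):
    """Sum of the element-wise products of two equal-length vectors."""
    return sum(a * b for a, b in zip(q, d))


def dot_product_score_float(q, d):
    """Score for float vectors; assumes q and d have unit length."""
    return (1 + dot_product(q, d)) / 2


def dot_product_score_byte(q, d):
    """Score for byte vectors; assumes every element is in [-128, 127]."""
    dims = len(q)
    return 0.5 + dot_product(q, d) / (32768 * dims)


float_same = dot_product_score_float([1.0, 0.0], [1.0, 0.0])
float_orthogonal = dot_product_score_float([1.0, 0.0], [0.0, 1.0])
byte_same = dot_product_score_byte([127, 0], [127, 0])
byte_opposite = dot_product_score_byte([127, 0], [-128, 0])
print(float_same, float_orthogonal)  # 1.0 0.5
print(byte_same > 0.5 > byte_opposite)  # True
```

With unit-length vectors the float variant gives the same score as cosine similarity while skipping the norm computation at query time.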

Slide 10

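Similarity: Euclidean distance (l2_norm)

$$\text{l2\_norm}_{q,d} = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$

$$\text{\_score} = \frac{1}{1 + (\text{l2\_norm}_{q,d})^2}$$

[Figure: points q and d in the plane, with tick labels x1, x2, y1, y2 and the distance between them labeled l2_norm]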
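A small Python sketch of the same computation follows. It takes x and y to be the components of q and d, which is an interpretation of the slide's notation; the function names and sample points are illustrative.

```python
import math


def l2_norm(q, d):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(q, d)))


def l2_norm_score(q, d):
    """Turn a distance in [0, inf) into a score in (0, 1]; closer means higher."""
    return 1 / (1 + l2_norm(q, d) ** 2)


distance = l2_norm([0.0, 0.0], [3.0, 4.0])
print(distance)  # 5.0
print(l2_norm_score([1.0, 2.0], [1.0, 2.0]))  # 1.0
```

Unlike the cosine score, this one depends on vector magnitude: two vectors pointing the same way but with different lengths score below 1.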

Slide 11

How do you index vectors?

Slide 12

Data Ingestion and Embedding Generation

Source data:

    POST /_doc
    {
      "_id": "product-1234",
      "product_name": "Summer Dress",
      "description": "Our best-selling…",
      "Price": 118,
      "color": "blue",
      "fabric": "cotton"
    }

After embedding generation:

    POST /_doc
    {
      "_id": "product-1234",
      "product_name": "Summer Dress",
      "description": "Our best-selling…",
      "Price": 118,
      "color": "blue",
      "fabric": "cotton",
      "desc_embedding": [0.452, 0.3242, …],
      "img_embedding": [0.012, 0.0, …]
    }
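For the embedding fields to be searchable as vectors, the index needs a mapping that declares them as `dense_vector`. The slide does not show that mapping, so the Python sketch below is one plausible version: the field names come from the slide, while the dimension counts (384 and 512), the `cosine` similarity, and the function name are assumptions for illustration. The resulting dictionary is meant to be sent as the body of an index-creation request.

```python
import json


def product_catalog_mapping(desc_dims, img_dims, similarity="cosine"):
    """Build an index-creation body with two indexed dense_vector fields."""

    def vector_field(dims):
        return {
            "type": "dense_vector",
            "dims": dims,              # must match the embedding model's output size
            "index": True,             # enable approximate kNN search on this field
            "similarity": similarity,  # cosine, dot_product or l2_norm
        }

    return {
        "mappings": {
            "properties": {
                "product_name": {"type": "text"},
                "description": {"type": "text"},
                "desc_embedding": vector_field(desc_dims),
                "img_embedding": vector_field(img_dims),
            }
        }
    }


# Dimension counts are placeholders; use the sizes your models actually produce.
mapping = product_catalog_mapping(desc_dims=384, img_dims=512)
print(json.dumps(mapping, indent=2))
```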

Slide 13

How do you search vectors?

Slide 14

Vector Query

    GET product-catalog/_search
    {
      "knn": {
        "field": "desc_embedding",
        "k": 5,
        "num_candidates": 50,
        "query_vector": [0.123, 0.244, …],
        "filter": {
          "term": {
            "department": "women"
          }
        }
      },
      "size": 10
    }
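The same request body can be assembled programmatically. The Python sketch below builds it as a dictionary and checks that `num_candidates` is not smaller than `k`, a constraint Elasticsearch enforces. The function name, its defaults, and the short sample vector are illustrative assumptions; the field name follows the ingestion slide.

```python
import json


def knn_search_body(query_vector, k=5, num_candidates=50, department=None, size=10):
    """Build a kNN search body; the optional term filter is applied during the vector search."""
    if num_candidates < k:
        raise ValueError("num_candidates must be at least k")
    knn = {
        "field": "desc_embedding",
        "k": k,                            # nearest neighbours to return
        "num_candidates": num_candidates,  # candidates examined per shard
        "query_vector": list(query_vector),
    }
    if department is not None:
        knn["filter"] = {"term": {"department": department}}
    return {"knn": knn, "size": size}


# A two-element vector keeps the example short; real embeddings are much longer.
body = knn_search_body([0.123, 0.244], department="women")
print(json.dumps(body, indent=2))
```

Raising `num_candidates` generally improves recall at the cost of slower queries, since more vectors are examined on each shard before the top `k` are kept.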

Slide 15

https://djdadoo.pilato.fr/

Slide 16

https://github.com/dadoonet/music-search/

Slide 17

Searching for similar music tracks 🎹🎻🎸 David Pilato | @dadoonet