Elasticsearch Query Language: ES|QL

A presentation at BBL Infrabel (private event) in November 2024 in Brussels, Belgium by David Pilato

Slide 1

Elasticsearch Query Language: ES|QL
David Pilato - @dadoonet
Developer | Evangelist

Slide 2

Elastic and Kibana support a number of query languages: Lucene, Painless, DSL, Canvas, KQL, EQL, Vega, SQL

Slide 3

A brief history of Elasticsearch’s analytical capabilities
• 2010: Elasticsearch 0.9: Facets
• 2013: Elasticsearch < 0.90: Facet terms-stats
• 2014: Elasticsearch 1.0: Aggregations
• 2015: Elasticsearch 2.0: Pipeline aggregations
• 2023: Elasticsearch 8.11: ES|QL

Slide 4

ES|QL • Language • Engine • Visualization

Slide 5

ES|QL the language

Slide 6

ES|QL Features • Unstructured and structured data • Piped query language • SQL-like filtering and data manipulation • Lookups

Slide 7

ES|QL commands
• Source (From, Row)
• Filter (Where)
• Processing (Eval)
• Aggregation (Stats)
• TopN (Sort + Limit)
• Expansion (Enrich, MV_Expand)
• Extraction (Dissect, Grok)

75 functions:
• 10 aggregate
• 20 math
• 10 string
• 7 date-time
• 15 conversion
• 4 conditionals
• 12 multi-value / mv_
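
To make the command categories above concrete, here is a small Java sketch that holds one piped query as a string and lists the command each stage starts with. The index pattern `logs-*` and the fields `network.bytes` and `agent.name` also appear in the named-parameters slide of this deck; the `kb` and `avg_kb` columns, the class name, and the naive split on `|` are assumptions made for illustration and are not part of any Elastic client.

```java
import java.util.ArrayList;
import java.util.List;

class EsqlPipelineSketch {
    // Hypothetical query: one stage per command category from the slide.
    static final String QUERY = String.join("\n",
        "FROM logs-*",                              // Source
        "| WHERE network.bytes > 0",                // Filter
        "| EVAL kb = network.bytes / 1024",         // Processing
        "| STATS avg_kb = AVG(kb) BY agent.name",   // Aggregation
        "| SORT avg_kb DESC",                       // TopN (Sort ...
        "| LIMIT 10");                              // ... + Limit)

    // Naive split on '|': only safe because this query has no pipe inside a string literal.
    static List<String> commandNames(String query) {
        List<String> names = new ArrayList<>();
        for (String stage : query.split("\\|")) {
            String trimmed = stage.trim();
            if (!trimmed.isEmpty()) {
                // The first whitespace-separated token of a stage is its command keyword.
                names.add(trimmed.split("\\s+")[0].toUpperCase());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(commandNames(QUERY));
    }
}
```

Each stage consumes the table produced by the previous one, which is what "piped query language" refers to.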

Slide 8

ES|QL the engine

Slide 9

The new ES|QL execution engine was designed with performance in mind — it operates on blocks at a time instead of per row, targets vectorization and cache locality, and embraces specialization and multi-threading. It is a separate component from the existing Elasticsearch aggregation framework with different performance characteristics.
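
The idea of operating on blocks rather than rows can be sketched in plain Java. This is only an illustration of the general technique, not the Elasticsearch implementation: the class name, the block size of 1024, and the use of parallel streams are all assumptions. Each block is processed by a tight loop over primitive arrays (the kind of loop a JIT compiler may auto-vectorize), and independent blocks can be handed to different threads.

```java
import java.util.stream.IntStream;

class BlockAtATimeSketch {
    static final int BLOCK_SIZE = 1024; // assumed block length, chosen for illustration

    // One block: a tight loop over contiguous primitive values, friendly to caches and SIMD.
    static void addBlock(long[] a, long[] b, long[] out, int from, int to) {
        for (int i = from; i < to; i++) {
            out[i] = a[i] + b[i];
        }
    }

    // Sequential version: walk the two columns block by block.
    static long[] addColumns(long[] a, long[] b) {
        long[] out = new long[a.length];
        for (int start = 0; start < a.length; start += BLOCK_SIZE) {
            addBlock(a, b, out, start, Math.min(start + BLOCK_SIZE, a.length));
        }
        return out;
    }

    // Parallel version: blocks write to disjoint ranges, so they can run on several threads.
    static long[] addColumnsParallel(long[] a, long[] b) {
        long[] out = new long[a.length];
        int blocks = (a.length + BLOCK_SIZE - 1) / BLOCK_SIZE;
        IntStream.range(0, blocks).parallel().forEach(block -> {
            int start = block * BLOCK_SIZE;
            addBlock(a, b, out, start, Math.min(start + BLOCK_SIZE, a.length));
        });
        return out;
    }

    public static void main(String[] args) {
        long[] a = {1, 2, 3};
        long[] b = {10, 20, 30};
        System.out.println(java.util.Arrays.toString(addColumns(a, b)));
    }
}
```

Both methods compute the same column; the parallel one only changes how the blocks are scheduled.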

Slide 10

Query planner
✓ Flexible distributed execution
✓ Allow multiple roundtrips
ES|QL Query -> Parsing -> Unresolved AST -> Analysis -> Resolved/Logical Plan -> Optimized Plan -> Planning -> Physical Plan -> Local Replanning -> Execution -> Results

Slide 11

Compute engine ✓ Tabular data representation ✓ From 1 thread per shard to many ✓ Spilling to disk if needed ✓ Streaming of data across nodes

Slide 12

Vectorization
“convert from a scalar implementation, which processes a single pair of operands at a time, to a vector implementation, which processes one operation on multiple pairs of operands at once.”
    for (i = 0; i < n; i++)
        c[i] = a[i] + b[i];
Source: https://en.wikipedia.org/wiki/Automatic_vectorization

Slide 13

Benchmarks https://elasticsearch-benchmarks.elastic.co/#tracks/esql/nightly/default/30d

Slide 14

ES|QL in action: https://github.com/dadoonet/esql-demo

Slide 15

Ways to consume ES|QL results

RESULT DATA: Users can consume raw data directly from the server output in one of several formats.
• Text: Human-readable format ideal for interactive work, CLIs, etc.
• CSV: Raw CSV data to load directly into spreadsheets and ETL processes
• JSON: Structured response containing metadata and data in a 2D value array
• Apache Arrow: Dataframe IPC format

PROJECTIONS: Each language client will offer a selection of projections relevant to that language ecosystem.
• DataFrame: For data science and analytics; integration with frameworks like Pandas
• Object / Dict: For mapping domain objects within a client application
• Cursor: For incremental consumption of results, with implicit pagination
• Bring your own: Custom projections built atop raw server output
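
As a rough illustration of a "bring your own" projection over the JSON result, the Java sketch below turns column names plus a 2D value array into one map per row, similar in spirit to an Object / Dict projection. It assumes the response has already been parsed into plain Java lists; the class name, the method name, and the sample rows are hypothetical, and no Elasticsearch client is involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RowProjectionSketch {
    // Pair each value in a row with the column name at the same position.
    static List<Map<String, Object>> toRows(List<String> columns, List<List<Object>> values) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (List<Object> value : values) {
            if (value.size() != columns.size()) {
                throw new IllegalArgumentException("row width does not match column count");
            }
            Map<String, Object> row = new LinkedHashMap<>(); // keeps column order
            for (int i = 0; i < columns.size(); i++) {
                row.put(columns.get(i), value.get(i));
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        // Made-up sample data standing in for the metadata and the 2D value array.
        List<String> columns = Arrays.asList("name", "id");
        List<List<Object>> values = Arrays.asList(
            Arrays.<Object>asList("David", 1),
            Arrays.<Object>asList("Ana", 2));
        for (Map<String, Object> row : toRows(columns, values)) {
            System.out.println(row);
        }
    }
}
```

A language client can offer richer projections on top of the same raw output, as the Object and ResultSet examples on the following slides show.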

Slide 16

Object API
https://github.com/dadoonet/elasticsearch-java-client-demo

    String query = """
        FROM persons
        | WHERE name == "David"
        | KEEP name
        | LIMIT 1
        """;
    // Map each result row to a Person object
    Iterable<Person> persons = client.esql()
        .query(ObjectsEsqlAdapter.of(Person.class), query);
    for (Person person : persons) {
        // Only "name" is kept by the query, so the id is not populated
        assertNull(person.getId());
        assertNotNull(person.getName());
    }

Slide 17

ResultSet JDBC API
https://github.com/dadoonet/elasticsearch-java-client-demo

    String query = """
        FROM persons
        | WHERE name == "David"
        | KEEP name
        | LIMIT 1
        """;
    // Read the same result through a JDBC ResultSet
    try (ResultSet resultSet = client.esql()
            .query(ResultSetEsqlAdapter.INSTANCE, query)) {
        assertTrue(resultSet.next());
        // Column 1 is "name", the only column kept by the query
        assertEquals("David", resultSet.getString(1));
    }

Slide 18

A better dashboard experience with named parameters (8.16)

    POST /_query
    {
      "query": """
        from logs-*
        | stats x = ?function(?field) by ?breakdownField
        | where x >= ?value
      """,
      "params": [
        {"function" : {"identifier" : "avg"}},
        {"field" : {"identifier" : "network.bytes"}},
        {"breakdownField" : {"identifier" : "agent.name"}},
        {"value": 1000}
      ]
    }

Slide 19

TBD

Slide 20

TBD

Slide 21

TBD

Slide 22

Coming next (8.17)
    WHERE MATCH(actors, "Marlon*")
    WHERE QSTR("bytes:[1024 TO 2048]")

Slide 23

Coming next (8.18)
    WHERE KQL("bytes>=1024")

Slide 24

JOINS! (TBD)

    joinType JOIN indexName (AS qualifier)? condition?
    joinType: LOOKUP | LEFT | RIGHT | INNER
    condition: ON identifier == identifier | USING identifier

    FROM employees
    | SORT emp_no
    | LOOKUP JOIN languages_lookup ON language_code
    | KEEP emp_no, language_name

    INLINESTATS total_visits = COUNT()

● No need to create an enrich policy
● A drag and drop experience in the UI

Slide 25

Elasticsearch Query Language: ES|QL
David Pilato - @dadoonet
Developer | Evangelist