ELASTIC ENTERPRISE SEARCH
Search everything, anywhere
Easily implement powerful, modern search experiences across your website, app, or digital workplace. Search it all, simply.
Slide 6
ELASTIC OBSERVABILITY
Unified visibility across your entire ecosystem
Bring your logs, metrics, and traces together into a single stack so you can monitor, detect, and react to events with speed.
Slide 7
ELASTIC SECURITY
Security how it should be: open
Elastic Security integrates endpoint security and SIEM to give you prevention, collection, detection, and response capabilities for unified protection across your infrastructure.
Slide 8
Slide 9
Slide 10
Parsing a stream and getting content and metadata
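import java.io.InputStream;

import org.apache.tika.metadata.Metadata;
import org.apache.tika.metadata.TikaCoreProperties;
import org.apache.tika.parser.DefaultParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.sax.BodyContentHandler;

static void extractTextAndMetadata(InputStream stream) throws Exception {
    // Collects the text content of the document body
    BodyContentHandler handler = new BodyContentHandler();
    // Filled in by the parser with the document properties
    Metadata metadata = new Metadata();
    // try-with-resources on an existing variable (Java 9+): closes the stream after parsing
    try (stream) {
        new DefaultParser().parse(stream, handler, metadata, new ParseContext());
        String extractedText = handler.toString();
        String title = metadata.get(TikaCoreProperties.TITLE);
        String keywords = metadata.get(TikaCoreProperties.KEYWORDS);
        String author = metadata.get(TikaCoreProperties.CREATOR);
    }
}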
Slide 11
ingest-attachment plugin extracting from BASE64 or CBOR
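The Base64 case can be sketched with the JDK alone: the binary file is encoded and placed in a JSON field that the attachment processor later reads. This is a minimal illustration, not the plugin's own code; the class name `AttachmentDocument` and the field name `data` are assumptions chosen for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class AttachmentDocument {

    // Wraps raw file bytes in a JSON document with a single Base64 field.
    // The field name "data" is an assumption; it has to match the "field"
    // setting of the attachment processor in the ingest pipeline.
    static String toJson(byte[] fileContent) {
        String encoded = Base64.getEncoder().encodeToString(fileContent);
        // Standard Base64 output only uses A-Z, a-z, 0-9, '+', '/' and '=',
        // so the value needs no JSON escaping.
        return "{\"data\":\"" + encoded + "\"}";
    }

    public static void main(String[] args) {
        byte[] content = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(toJson(content)); // prints {"data":"aGVsbG8="}
    }
}
```

In practice the bytes would come from a real file, for example via `java.nio.file.Files.readAllBytes`.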
Slide 12
An ingest pipeline
Slide 13
ingest-attachment processor plugin using Tika behind the scenes
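A pipeline with a single attachment processor, and a document indexed through it, could look like the following sketch. It only builds the HTTP requests with the JDK's `java.net.http` API and does not send them. The class name, the pipeline id `attachment`, the index name `docs`, the field name `data`, and the `localhost` URL are all assumptions for illustration; authentication is left out.

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class IngestPipelineSketch {

    // Pipeline definition with one attachment processor that reads the
    // Base64 content from the (assumed) "data" field.
    static final String PIPELINE_BODY =
        "{\"description\":\"Extract attachment information\","
        + "\"processors\":[{\"attachment\":{\"field\":\"data\"}}]}";

    // Builds, but does not send, the request that registers the pipeline.
    static HttpRequest putPipeline(String baseUrl, String pipelineId) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/_ingest/pipeline/" + pipelineId))
            .header("Content-Type", "application/json")
            .PUT(HttpRequest.BodyPublishers.ofString(PIPELINE_BODY))
            .build();
    }

    // Builds the request that indexes a JSON document through the pipeline.
    static HttpRequest indexDocument(String baseUrl, String index, String pipelineId, String json) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index + "/_doc?pipeline=" + pipelineId))
            .header("Content-Type", "application/json")
            .POST(HttpRequest.BodyPublishers.ofString(json))
            .build();
    }

    public static void main(String[] args) {
        HttpRequest put = putPipeline("http://localhost:9200", "attachment");
        HttpRequest index = indexDocument("http://localhost:9200", "docs", "attachment",
            "{\"data\":\"aGVsbG8=\"}");
        System.out.println(put.method() + " " + put.uri());
        System.out.println(index.method() + " " + index.uri());
    }
}
```

To actually run the requests, pass them to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` against a cluster that has the attachment processor available.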
Slide 14
Demo
https://cloud.elastic.co
Slide 15
FSCrawler You know, for files…
Slide 16
Slide 17
Disclaimer
This project is a community project. It is not officially supported by Elastic. Support is provided only by the FSCrawler community on Discuss and Stack Overflow.
http://discuss.elastic.co/
https://stackoverflow.com/questions/tagged/fscrawler
Slide 18
FSCrawler Architecture
Inputs: Local Dir, Mount Point, SSH / SCP / FTP, HTTP Rest
Filters: JSON (noop), XML, Apache Tika
Outputs: ES 6/7/8
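The inputs, filters, outputs flow of the diagram can be sketched as a tiny pipeline. Everything here is hypothetical and for illustration only: `RawFile`, `run`, and `plainTextFilter` are not FSCrawler's real types, and the filter simply decodes UTF-8 text where a real one would hand the bytes to Apache Tika.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;

public class CrawlerPipelineSketch {

    // Hypothetical stand-in for a file read by an input (local dir, SSH, HTTP...).
    static final class RawFile {
        final String name;
        final byte[] content;

        RawFile(String name, byte[] content) {
            this.name = name;
            this.content = content;
        }
    }

    // Pushes every input file through the filter and hands the result to the output.
    static void run(List<RawFile> input,
                    Function<RawFile, Map<String, String>> filter,
                    Consumer<Map<String, String>> output) {
        for (RawFile file : input) {
            output.accept(filter.apply(file));
        }
    }

    // Trivial filter: treats the content as UTF-8 text.
    // A real filter would extract text and metadata with Apache Tika.
    static Map<String, String> plainTextFilter(RawFile file) {
        return Map.of(
            "filename", file.name,
            "content", new String(file.content, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        // The output here is an in-memory list; a real output would index into Elasticsearch.
        List<Map<String, String>> indexed = new ArrayList<>();
        List<RawFile> input = List.of(new RawFile("a.txt", "hello".getBytes(StandardCharsets.UTF_8)));
        run(input, CrawlerPipelineSketch::plainTextFilter, indexed::add);
        System.out.println(indexed.get(0).get("filename") + ": " + indexed.get(0).get("content"));
    }
}
```

Keeping the three stages behind separate function types is what lets one stage be swapped (for example a different input or output) without touching the others.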
Slide 19
FSCrawler Key Features
• Many more formats than the ingest attachment plugin
• OCR (Tesseract)
• Much more metadata than the ingest attachment plugin (see https://fscrawler.readthedocs.io/en/latest/admin/fs/elasticsearch.html#generated-fields)
• Extraction of non-standard metadata
Slide 20
Demo
https://cloud.elastic.co
Slide 21
FSCrawler even better with a UI
Slide 22
FSCrawler Architecture
Inputs: Local Dir, Mount Point, SSH / SCP / FTP, HTTP Rest
Filters: JSON (noop), XML, Apache Tika
Outputs: ES 6/7/8, WP 7/8
Slide 23
Demo
https://cloud.elastic.co
Slide 24
Beta 8.2
Network drives connector package for Enterprise Search
https://github.com/elastic/enterprise-search-network-drives-connector/
Slide 25
Thank you! PRs are warmly welcome!
https://github.com/dadoonet/fscrawler