Julien Nioche
Director
Bristol, United Kingdom
My expertise is in document engineering with a strong focus on open-source tools. I have designed and implemented solutions for Information Retrieval, Text Analysis, Information Extraction, Machine Learning, Web Crawling and Big Data in general for DigitalPebble's clients.
I have been involved in open-source software for more than 20 years and am the author of projects which are used by numerous organizations worldwide.
Crawl the web on a large scale with StormCrawler and Elasticsearch
StormCrawler is a popular and mature open-source web crawler. It is written in Java and is both lightweight and scalable, thanks to its distribution layer based on Apache Storm. Part of the crawler's appeal is that it is extensible, modular, and versatile. In this presentation, we will take a closer look at the Elasticsearch module of StormCrawler and see how various organizations use it in production, sometimes on a very large scale.