Elasticsearch Evaluation White Paper Released: Promising for Big Data

We believe that Elasticsearch is a product that everyone working in the field of big data will want to take a look at.

There are many new technologies emerging around search, and we’ve been investigating several of them for our clients. Search has never been “easy” but Elasticsearch attempts to make it at least easier. Elasticsearch is billed to be “built for the cloud,” and with so many companies moving into the cloud, it seems like a natural that search would move there too.  This paper is designed to show you just how Elasticsearch works by setting up a cluster and feeding it data.  We also let you know what tools we use so you can test out the technology and we include a rough sketch of code as well. Finally, we make conclusions about how Elasticsearch can help with problems like Big Data and other search related uses.

Elasticsearch is an open source technology developed by one developer, Shay Bannon. This paper is simply a first look at elasticsearch and is not associated with an additonal product or variation of elaticsearch. The appeal for big data is due to elasticsearch’s wonderful ability to scale with growing content, which has largely been associated with the “big data problem” we all keep hearing about. It’s very easy to add new nodes and it handles the balancing of your data across the available nodes. It handles the failure of nodes in a graceful way that is important in a cloud environment. And lastly, we simply evaluate and test the technology. We really don’t believe there is a one size fits all technology in the realm of enterprise search, it is really highly dependent upon your systems, how many documents you have, how much unstructured data you have, and how you want your site to function. But that said– in terms of storing big data, it is as capable as any Lucene based product; it can handle a much larger load that the current Solr release as the notion of breaking the index up into smaller chunks is “baked in” to the product.

Here is an except from the paper:

“Products like Elasticsearch that lack a document processing component entirely become more attractive. In fact, most projects that involve a data set large enough to qualify as “big data”³² are building their own document processing stages anyway as part of their ETL cycle.”

If you are interested in downloading this free White Paper, sign up with us here.

If you would like help using Elasticsearch with your search project, contact us.