Elasticsearch – Data Processing and Exploring the Cluster

In simple words, batch processing means adding many documents in a single request. Batch processing uses a special format for the request body, known as the bulk API. The bulk API is not limited to adding documents; it can also be used to update or delete them.

Because we can perform multiple operations within the same request using the bulk API, we must specify what kind of operation we want to perform for each document. A line such as { "index": { "_id": "103" } } is the action part, and the line that follows it is the document itself. Notice that within the results there is an items property containing an array of objects. Each object in this array is similar to what we get in return when running a single request: it's a status object indicating whether the operation executed successfully or not.
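As a minimal sketch of the action/document format (the "products" index name and the document fields are placeholders, not from the original text), we can build a bulk payload as newline-delimited JSON, one action line followed by its document:

```shell
# Each operation in a bulk request is two NDJSON lines:
# an action line, then the document it applies to.
cat > bulk.ndjson <<'EOF'
{ "index": { "_id": "103" } }
{ "name": "Espresso Machine", "price": 199 }
{ "index": { "_id": "104" } }
{ "name": "Milk Frother", "price": 49 }
EOF

# Sending it requires a running cluster; note the Content-Type header,
# which must be application/x-ndjson for the bulk API:
# curl -H "Content-Type: application/x-ndjson" \
#      -XPOST "localhost:9200/products/_bulk" --data-binary @bulk.ndjson
```

The response's items array will contain one status object per action line in this file.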

We can also mix operations in a single bulk request. For example, one request can update a document and delete another.
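A sketch of such a mixed request, again assuming a hypothetical "products" index and document IDs:

```shell
# One bulk request containing an update and a delete.
# An update's document line wraps the changed fields in "doc";
# a delete has no document line at all.
cat > bulk-ops.ndjson <<'EOF'
{ "update": { "_id": "103" } }
{ "doc": { "price": 179 } }
{ "delete": { "_id": "104" } }
EOF

# Requires a running cluster:
# curl -H "Content-Type: application/x-ndjson" \
#      -XPOST "localhost:9200/products/_bulk" --data-binary @bulk-ops.ndjson
```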

Importing data using curl
Use curl to import data from the JSON file. Run the command from the directory that contains the file.
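A sketch of the import command; the file name products-bulk.json and the products index are placeholders for your own data (this needs a cluster running on localhost:9200):

```shell
# Import a bulk-formatted JSON file from the current directory.
# --data-binary preserves the newlines the bulk API requires.
curl -H "Content-Type: application/x-ndjson" \
     -XPOST "localhost:9200/products/_bulk" \
     --data-binary "@products-bulk.json"
```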

Exploring the Cluster
We will be using the _cat API, which provides human-readable information about the cluster. To check how many shards are running:
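For example (assuming a local cluster on the default port; the v parameter adds a header row):

```shell
# List every shard in the cluster, one per line, with column headers.
curl -XGET "localhost:9200/_cat/shards?v"

# Piping the headerless output through wc -l gives the shard count.
curl -s "localhost:9200/_cat/shards" | wc -l
```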

Check the health of the cluster, i.e. overall information such as the cluster status and how many nodes are running:
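A sketch, again assuming a local cluster:

```shell
# One-line cluster summary: status (green/yellow/red), node.total,
# shard counts, and the percentage of active shards.
curl -XGET "localhost:9200/_cat/health?v"
```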

Information about the nodes, specifically their uptime; we can remove the h=uptime part to get more general information:
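For instance:

```shell
# Show only the node name and uptime columns.
curl -XGET "localhost:9200/_cat/nodes?v&h=name,uptime"

# Without h=..., the default columns (ip, heap, load, roles, etc.) appear.
curl -XGET "localhost:9200/_cat/nodes?v"
```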

To check how the shards are allocated across the nodes:
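A sketch using the allocation endpoint, which summarizes shards per node along with disk usage:

```shell
# One row per node: shard count plus disk used, available, and total.
curl -XGET "localhost:9200/_cat/allocation?v"
```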