Upload bulk CSV data to ElasticSearch using Python
This post shows how to upload data from a CSV file to ElasticSearch using the Python ElasticSearch Client - Bulk helpers.
It assumes that you have already set up ElasticSearch and have a Python environment and an IDE ready; if not, the link below might help you.
If you would like to upload a JSON file instead of a CSV file, then the below post might help you.
Python ElasticSearch Client
This requires the Python Elasticsearch Client. Follow the instructions at Python Elasticsearch Client Installation, or just run the command below from a terminal.
pip install elasticsearch
Uploading bulk data from .CSV file to ElasticSearch using Python code
These are the steps I followed:
- Read the data from the .CSV file into a pandas DataFrame.
- Create a JSON string from the DataFrame by iterating through all the rows and columns.
- Convert the JSON string to a JSON object.
- Upload the JSON object using the Python ElasticSearch Client - bulk helpers.
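The steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the script used in this post: it uses the standard-library csv module in place of pandas so that it runs without extra packages, it builds the documents as dictionaries directly instead of going through a JSON string, and the names csv_to_actions and upload_csv, the sample rows, and the column names are hypothetical.

```python
import csv
import io


def csv_to_actions(csv_file, index_name):
    """Yield one bulk action per CSV row, using the header row as field names."""
    reader = csv.DictReader(csv_file)
    for row_number, row in enumerate(reader):
        yield {
            "_index": index_name,
            "_id": row_number,      # row position used as the document id
            "_source": dict(row),   # column name -> cell value
        }


def upload_csv(es, csv_path, index_name):
    """Send every row of the CSV file to Elasticsearch through the bulk helper."""
    # Imported here so csv_to_actions can be tried without the client installed
    from elasticsearch import helpers

    with open(csv_path, newline="", encoding="utf-8") as csv_file:
        return helpers.bulk(es, csv_to_actions(csv_file, index_name))


# Hypothetical sample data, standing in for a real .CSV file
sample = io.StringIO(
    "speaker,text_entry\n"
    "KING HENRY IV,So shaken as we are\n"
    "WESTMORELAND,My liege\n"
)
actions = list(csv_to_actions(sample, "shakespeare"))
print(actions[0])
```

Calling upload_csv with a connected client and a file path would send the same actions to the cluster; building the actions in a separate generator makes it possible to inspect them before anything is uploaded.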
Below is the Python script
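from elasticsearch import Elasticsearch

# Connect to the local node (host and port arguments as accepted by 6.x/7.x clients)
es = Elasticsearch(
    ['localhost'],
    port=9200
)

# Raw string so the backslashes in the Windows path are not read as escapes
with open(r"C:\ElasticSearch\shakespeare_6.0.json", 'r') as f:
    ClearData = f.read().splitlines(True)

i = 0
json_str = ""
docs = {}
for line in ClearData:
    # Trim only the line ends, so spaces inside field values are preserved
    line = line.strip()
    if line != "},":
        # Still inside a document: keep accumulating
        json_str = json_str + line
    else:
        # A "}," line closes the current document: index it and start the next one
        docs[i] = json_str + "}"
        json_str = ""
        print(docs[i])
        # doc_type is accepted by older clients only; Elasticsearch 8 removed mapping types
        es.index(index='shakespeare', doc_type='Blog', id=i, body=docs[i])
        i = i + 1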
Screenshot: Output of the command running in Python
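To make the parsing loop in the script above easier to follow, here is a small sketch that applies the same idea to an in-memory string instead of a file. The function name split_documents and the sample text are hypothetical, and the sketch trims only the ends of each line so that spaces inside values are kept.

```python
import json


def split_documents(text):
    """Return the JSON objects in text, where each object is closed by a '},' line."""
    documents = []
    buffer = ""
    for line in text.splitlines():
        line = line.strip()  # trim the ends only, keep spaces inside values
        if line != "},":
            buffer += line   # still inside the current object
        else:
            # The '},' line ends the object: close it and parse it
            documents.append(json.loads(buffer + "}"))
            buffer = ""
    return documents


# Hypothetical sample in the layout the loop expects
sample = """{
"speaker": "KING HENRY IV",
"text_entry": "So shaken as we are"
},
{
"speaker": "WESTMORELAND",
"text_entry": "My liege"
},
"""
docs = split_documents(sample)
print(docs[0]["text_entry"])
```

Note that an object is only emitted when a line containing exactly "}," is reached, so a final object that ends with a plain "}" would be left in the buffer.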
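The data can also be uploaded in one call with the bulk helper and then checked from Python, using the code below.
import json
from elasticsearch import Elasticsearch, helpers

# Same index and type as in the script above
ES_INDEX = 'shakespeare'
ES_TYPE = 'Blog'
es = Elasticsearch(['localhost'], port=9200)

# Assumes the file holds a single JSON array of documents
with open(r"C:\ElasticSearch\shakespeare_6.0.json") as json_file:
    json_docs = json.load(json_file)

# Wrap each document in a bulk action and send them through the bulk helper
actions = [{"_index": ES_INDEX, "_type": ES_TYPE, "_source": doc} for doc in json_docs]
helpers.bulk(es, actions)

# Refresh the index so the new documents are visible, then count them
es.indices.refresh(index=ES_INDEX)
print(es.count(index=ES_INDEX)['count'])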
Screenshot: Output of the command running in Python
It can also be verified from the Kibana Dev Tools console (if Kibana is already installed).
Kibana GET command
Screenshot: With Kibana GET command and output in the right side
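For readers without Kibana, a similar check can be sketched in Python. This is only a sketch: the function name fetch_sample_documents is hypothetical, and it assumes a client version that accepts the body argument on search, like the 6.x/7.x style used elsewhere in this post.

```python
def fetch_sample_documents(es, index_name, size=5):
    """Return the _source of up to `size` documents from index_name."""
    response = es.search(
        index=index_name,
        body={"query": {"match_all": {}}, "size": size},
    )
    return [hit["_source"] for hit in response["hits"]["hits"]]


# Usage against a local node (needs a running cluster, so it is left commented out):
# from elasticsearch import Elasticsearch
# es = Elasticsearch(['localhost'], port=9200)
# for doc in fetch_sample_documents(es, 'shakespeare'):
#     print(doc)
```

The match_all query returns documents in no particular order, which is enough to confirm that the upload produced documents with the expected fields.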
I hope this post helped you. Please comment and let me know your thoughts!