Using the Python Elasticsearch DSL

I. Basic Elasticsearch concepts

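  • Index: the logical area in which Elasticsearch stores data, comparable to a database in a relational system. An index can be spread across one or more shards, and each shard can have several replicas.
  • Document: the unit of data stored in Elasticsearch, comparable to one row of a table in a relational database. A document is made up of fields, and fields with the same name in different documents must have the same type. A field may occur more than once in a document, in which case it holds several values (it is multivalued).
  • Document type: an index may hold several kinds of documents to serve different queries; each kind is a document type, comparable to a table in a relational database. Note that fields with the same name in different documents must still be of the same type.
  • Mapping: comparable to a schema definition in a relational database. It stores the mapping information for each field, and every document type has its own mapping.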
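The rule that same-named fields must share a type can be made concrete with a few lines of plain Python. This is only a toy model, not Elasticsearch code: the nested dictionary, the `comment` document type, and the `conflicting_fields` helper are all hypothetical names made up for illustration.

```python
# Hypothetical in-memory model: index -> document type -> field name -> field type
mappings = {
    "blog": {
        "article": {"title": "text", "tags": "keyword", "lines": "integer"},
        "comment": {"title": "text", "author": "keyword"},
    }
}


def conflicting_fields(index_mappings):
    """Return the field names declared with different types
    in different document types of one index."""
    seen = {}          # field name -> first type seen for it
    conflicts = set()
    for fields in index_mappings.values():
        for name, field_type in fields.items():
            if seen.setdefault(name, field_type) != field_type:
                conflicts.add(name)
    return sorted(conflicts)


# Both types declare "title" as text, so nothing conflicts.
ok = conflicting_fields(mappings["blog"])

# Redeclaring "title" as an integer in the second type breaks the rule.
bad = dict(mappings["blog"], comment={"title": "integer"})
clash = conflicting_fields(bad)
```

Here `ok` comes back empty, while `clash` names the offending field.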

II. A quick tour of the Python Elasticsearch DSL

1. Installation

<code>$ pip install elasticsearch-dsl</code>

2. Creating the index and the document

<code>from datetime import datetime

from elasticsearch_dsl import DocType, Date, Integer, Keyword, Text
from elasticsearch_dsl.connections import connections

# Define a default Elasticsearch client
connections.create_connection(hosts=['localhost'])


class Article(DocType):
    title = Text(analyzer='snowball', fields={'raw': Keyword()})
    body = Text(analyzer='snowball')
    tags = Keyword()
    published_from = Date()
    lines = Integer()

    class Meta:
        index = 'blog'

    def save(self, **kwargs):
        # Despite the field name, this stores the number of
        # whitespace-separated words in the body
        self.lines = len(self.body.split())
        return super(Article, self).save(**kwargs)

    def is_published(self):
        return datetime.now() >= self.published_from


# Create the mappings in Elasticsearch
Article.init()</code>

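This creates an index named blog holding documents of type article, roughly what a database and a table would be in a relational system. You must call Article.init() so that Elasticsearch generates the mapping from your DocType. If you skip it, Elasticsearch infers a mapping from the content of the first document indexed into that index and type.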

We can now check the result through the Elasticsearch RESTful API:

<code>http GET http://127.0.0.1:9200/blog/_mapping/
{
    "blog": {
        "mappings": {
            "article": {
                "properties": {
                    "body": {"type": "text", "analyzer": "snowball"},
                    "lines": {"type": "integer"},
                    "published_from": {"type": "date"},
                    "tags": {"type": "keyword"},
                    "title": {"type": "text", "fields": {"raw": {"type": "keyword"}}, "analyzer": "snowball"}
                }
            }
        }
    }
}</code>
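If you would rather check the mapping from Python than by eye, the response body can be parsed with the standard `json` module. A minimal sketch, assuming the response text is already at hand as a string: it is pasted in below rather than fetched, so nothing here contacts a server, and `field_types` is a hypothetical helper name.

```python
import json

# Response body copied from the mapping request above (not fetched live).
response_text = '''
{"blog": {"mappings": {"article": {"properties": {
    "body": {"type": "text", "analyzer": "snowball"},
    "lines": {"type": "integer"},
    "published_from": {"type": "date"},
    "tags": {"type": "keyword"},
    "title": {"type": "text",
              "fields": {"raw": {"type": "keyword"}},
              "analyzer": "snowball"}
}}}}}
'''


def field_types(mapping_response, index, doc_type):
    """Map each field name to the type declared for it in the mapping."""
    props = mapping_response[index]["mappings"][doc_type]["properties"]
    return {name: spec["type"] for name, spec in props.items()}


types = field_types(json.loads(response_text), "blog", "article")
print(types)
```

Comparing `types` with the field declarations on the `Article` class is a quick way to confirm that `Article.init()` did what was expected.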

III. CRUD operations with Elasticsearch

1. Create an article

<code># Create and save an article
article = Article(meta={'id': 1}, title='Hello elasticsearch!', tags=['elasticsearch'])

article.body = ''' looong text '''
article.published_from = datetime.now()
article.save()</code>

=> RESTful API

<code>http POST http://127.0.0.1:9200/blog/article/1 title="hello elasticsearch" tags:='["elasticsearch"]'
HTTP/1.1 201 Created
Content-Length: 73
Content-Type: application/json; charset=UTF-8

{
    "_id": "1",
    "_index": "blog",
    "_type": "article",
    "_version": 1,
    "created": true
}</code>
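The httpie call above packs the fields into a JSON body and posts it to a URL of the form index/type/id. The same request can be assembled with the standard library. This is a minimal sketch, assuming the local node address used throughout this article; `build_index_request` is a hypothetical helper, and the request is only built, not sent, so no server is needed to run it.

```python
import json
from urllib.request import Request

# Assumed local node, the same address as in the examples of this article.
BASE_URL = "http://127.0.0.1:9200"


def build_index_request(index, doc_type, doc_id, source):
    """Build (but do not send) the HTTP request that indexes one document."""
    url = "{}/{}/{}/{}".format(BASE_URL, index, doc_type, doc_id)
    body = json.dumps(source).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


req = build_index_request(
    "blog", "article", 1,
    {"title": "hello elasticsearch", "tags": ["elasticsearch"]},
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually send it (requires a running node):
#     urllib.request.urlopen(req)
```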

2. Get an article

<code>article = Article.get(id=1)
# A missing id raises NotFoundError by default; pass ignore=404 to get None instead
a = Article.get(id='no-in-es', ignore=404)
a is None
# Several articles can be fetched in one call
articles = Article.mget([1, 2, 3])</code>

=> RESTful API

<code>http GET http://127.0.0.1:9200/blog/article/1
HTTP/1.1 200 OK
Content-Length: 141
Content-Type: application/json; charset=UTF-8

{
    "_id": "1",
    "_index": "blog",
    "_source": {
        "tags": [
            "elasticsearch"
        ],
        "title": "hello elasticsearch"
    },
    "_type": "article",
    "_version": 1,
    "found": true
}</code>
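The `found` flag in the response is what separates a hit from a miss. Below is a minimal sketch of reading it with the standard `json` module. The hit is the response shown above; the miss body is an assumption about what a node returns for an unknown id, and `source_or_none` is a hypothetical helper that mirrors the idea of getting None back for a missing article.

```python
import json

# The response shown above, condensed.
hit = '''{"_id": "1", "_index": "blog", "_type": "article", "_version": 1,
          "found": true,
          "_source": {"tags": ["elasticsearch"],
                      "title": "hello elasticsearch"}}'''

# Assumed shape of a miss; this body is not taken from the article.
miss = '{"_id": "no-in-es", "_index": "blog", "_type": "article", "found": false}'


def source_or_none(response_text):
    """Return the stored document, or None when the id was not found."""
    doc = json.loads(response_text)
    return doc["_source"] if doc.get("found") else None


article_source = source_or_none(hit)
missing = source_or_none(miss)
```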

3. Update an article

<code>article = Article.get(id=1)
article.tags = ['elasticsearch', 'hello']
article.save()
# or
article.update(body='Today is good day!', published_by='me')</code>

=> RESTful API

<code>http PUT http://127.0.0.1:9200/blog/article/1 title="hello elasticsearch" tags:='["elasticsearch", "hello"]'
HTTP/1.1 200 OK
Content-Length: 74
Content-Type: application/json; charset=UTF-8

{
    "_id": "1",
    "_index": "blog",
    "_type": "article",
    "_version": 2,
    "created": false
}</code>
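The two update styles differ in what they send: save() re-indexes the whole document, while update() sends only the named fields, which are merged into the stored document. The plain-dictionary sketch below illustrates that difference without involving Elasticsearch at all; `full_reindex` and `partial_update` are hypothetical helpers standing in for the server-side behaviour.

```python
def full_reindex(stored, new_source):
    """Like save(): the stored document is replaced wholesale."""
    return dict(new_source)


def partial_update(stored, **fields):
    """Like update(): only the given fields change, the rest is kept."""
    merged = dict(stored)
    merged.update(fields)
    return merged


stored = {"title": "hello elasticsearch", "tags": ["elasticsearch"]}

# save() after changing tags: the complete document is sent again.
after_save = full_reindex(
    stored,
    {"title": "hello elasticsearch", "tags": ["elasticsearch", "hello"]},
)

# update(): two fields are sent and merged into what is already stored.
after_update = partial_update(
    after_save, body="Today is good day!", published_by="me"
)
```

One practical consequence: with the whole-document style, any field left out of the new source is gone afterwards, whereas the partial style leaves untouched fields in place.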

4. Delete an article

<code>article = Article.get(id=1)
article.delete()</code>

=> RESTful API

<code>http DELETE http://127.0.0.1:9200/blog/article/1
HTTP/1.1 200 OK
Content-Length: 71
Content-Type: application/json; charset=UTF-8

{
    "_id": "1",
    "_index": "blog",
    "_type": "article",
    "_version": 4,
    "found": true
}

http HEAD http://127.0.0.1:9200/blog/article/1
HTTP/1.1 404 Not Found
Content-Length: 0
Content-Type: text/plain; charset=UTF-8</code>

