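Preface
In the previous posts I set up the ElasticSearch service and migrated the data from MySQL to ElasticSearch;
The whole point of using ElasticSearch is searching large data sets, so this post introduces the various query methods ElasticSearch offers;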
I. Exact Query (termQuery)
A term query does not analyze (tokenize) the query condition. It is similar to select * from table where field='xx' in MySQL;
GET hotel/_search
{
"query": {
"term": {
"brand": {
"value": "萬豪"
}
}
},
"from": 0,
"size": 20
}
HotelServiceImpl.java
// Exact (term) query by brand
@Override
public Map<String, Object> brandTermQuery(int current, int size, Map<String, Object> searchParam) {
    // 1. Read the parameter sent by the front end
    String brand = (String) searchParam.get("brand");
    // Map returned to the front end
    Map<String, Object> resultMap = new HashMap<>();
    // 2. Build the search request and its body
    SearchRequest hotelSearchRequest = new SearchRequest("hotel");
    SearchSourceBuilder hotelSearchSourceBuilder = new SearchSourceBuilder();
    // If no brand is given, no query is set and all documents are returned
    if (StringUtils.hasText(brand)) {
        // Request body: query part
        TermQueryBuilder hotelTermQueryBuilder = QueryBuilders.termQuery("brand", brand);
        hotelSearchSourceBuilder.query(hotelTermQueryBuilder);
    }
    // Request body: paging part
    hotelSearchSourceBuilder.from((current - 1) * size);
    hotelSearchSourceBuilder.size(size);
    // Attach the body to the request
    hotelSearchRequest.source(hotelSearchSourceBuilder);
    // 3. Execute the search
    try {
        SearchResponse hotelSearchResponse = restHighLevelClient.search(hotelSearchRequest, RequestOptions.DEFAULT);
        // 4. Process the result set
        SearchHits hotelSearchResponseHits = hotelSearchResponse.getHits();
        // Total number of hits
        long totalHotelHits = hotelSearchResponseHits.getTotalHits().value;
        // The individual hits on this page
        SearchHit[] hotelHits = hotelSearchResponseHits.getHits();
        List<HotelEntity> hotelEntityList = new ArrayList<>();
        // Both conditions must hold, otherwise a null array would be dereferenced
        if (hotelHits != null && hotelHits.length > 0) {
            for (SearchHit hotelHit : hotelHits) {
                String sourceAsString = hotelHit.getSourceAsString();
                // Convert the JSON string into a Java object
                HotelEntity hotelEntity = JSON.parseObject(sourceAsString, HotelEntity.class);
                hotelEntityList.add(hotelEntity);
            }
        }
        // Data shown by the front end
        resultMap.put("list", hotelEntityList);
        resultMap.put("totalResultSize", totalHotelHits);
        // Paging information
        resultMap.put("current", current);
        resultMap.put("totalPage", (totalHotelHits + size - 1) / size);
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
    return resultMap;
}
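The paging arithmetic used in the service method is easy to get wrong at the boundaries, so here is a minimal sketch of just that part. The class PageMath and its method names are hypothetical helpers introduced for illustration; they are not part of the original service.

```java
// Hypothetical helper for illustration only; not part of HotelServiceImpl.
final class PageMath {
    private PageMath() {}

    // Offset passed to "from": pages are 1-based, offsets are 0-based.
    static int offset(int current, int size) {
        if (current < 1 || size < 1) {
            throw new IllegalArgumentException("current and size must be >= 1");
        }
        return (current - 1) * size;
    }

    // Ceiling division: number of pages needed to show totalHits results.
    static long totalPages(long totalHits, int size) {
        if (totalHits < 0 || size < 1) {
            throw new IllegalArgumentException("totalHits must be >= 0 and size >= 1");
        }
        return (totalHits + size - 1) / size;
    }

    public static void main(String[] args) {
        System.out.println(offset(3, 20));       // third page of 20 starts at offset 40
        System.out.println(totalPages(101, 20)); // 101 hits need 6 pages of 20
    }
}
```

Adding size - 1 before dividing rounds up without floating-point math, so a partially filled last page still counts as a page.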
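II. Chinese Analyzers
If a field's type is set to keyword, term queries can be run against that field directly;
If a field's type is set to text,
then when a user runs a term query, ES treats the whole query condition as a single term and matches it against the terms in the inverted index.
If a match is found, the data is returned; if the term does not exist in the inverted index, nothing is returned.
So how can we run term queries against text fields?
This requires a Chinese analyzer that segments the document content into Chinese words and rebuilds the ES inverted index, so that each document is broken into a number of Chinese terms.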
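To make the role of the inverted index concrete, here is a toy sketch in plain Java. It is not how ES is implemented; InvertedIndexDemo, index, termLookup and singleChars are hypothetical names, and singleChars only imitates an analyzer that emits one token per character.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy inverted index: term -> ids of the documents that contain it.
final class InvertedIndexDemo {
    private final Map<String, Set<Integer>> postings = new HashMap<>();

    // Index a document that has already been split into terms.
    void index(int docId, List<String> terms) {
        for (String term : terms) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // A term query looks the condition up as-is, without analyzing it.
    Set<Integer> termLookup(String term) {
        return postings.getOrDefault(term, Collections.emptySet());
    }

    // Imitates an analyzer that emits one token per character.
    static List<String> singleChars(String text) {
        List<String> out = new ArrayList<>();
        text.codePoints().forEach(cp -> out.add(new String(Character.toChars(cp))));
        return out;
    }
}
```

If 北京市東城區萬豪酒店 is indexed through singleChars, termLookup("萬豪") returns an empty set, because only single characters exist as terms. If the same document is indexed with the hand-written term list 北京市, 東城區, 萬豪, 酒店 (an assumed segmentation, not actual analyzer output), the same lookup finds the document.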
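1. Built-in ElasticSearch analyzers
ElasticSearch ships with several built-in analyzers:
- Simple Analyzer - splits on non-letter characters (symbols are dropped)
- Whitespace Analyzer - splits on whitespace, does not lowercase
- Keyword Analyzer - no tokenization, the input is emitted unchanged as a single term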
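The differences between the three analyzers listed above can be imitated in a few lines of plain Java. This is only a rough sketch of the behaviour described in the list, not the actual Lucene implementation; ToyAnalyzers and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Rough imitations of three built-in analyzers, for illustration only.
final class ToyAnalyzers {
    private ToyAnalyzers() {}

    // Simple: break on every non-letter character and lowercase the terms.
    static List<String> simple(String text) {
        List<String> out = new ArrayList<>();
        for (String token : text.split("[^\\p{L}]+")) {
            if (!token.isEmpty()) {
                out.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return out;
    }

    // Whitespace: break on whitespace only, keep case and punctuation.
    static List<String> whitespace(String text) {
        List<String> out = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                out.add(token);
            }
        }
        return out;
    }

    // Keyword: the whole input becomes one term.
    static List<String> keyword(String text) {
        return Collections.singletonList(text);
    }

    public static void main(String[] args) {
        String text = "Hello, World-2 ES";
        System.out.println(simple(text));
        System.out.println(whitespace(text));
        System.out.println(keyword(text));
    }
}
```

None of these three rules knows anything about Chinese word boundaries, which is why a dedicated Chinese analyzer is needed.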
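2. The default analyzer cannot segment Chinese
By default ES uses the Standard Analyzer to tokenize document content. It splits Chinese text into single characters rather than words, as the following request shows;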
GET _analyze
{
"text": "北京市東城區萬豪酒店"
}
3. Installing the IK analyzer
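# The plugin directory was already mounted when the ES container was started:
#   inside the container: /usr/share/elasticsearch/plugins
#   on the host:          /mydata/elasticsearch/plugins
# so the files only need to be copied into /mydata/elasticsearch/plugins on the host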
docker restart elasticsearch
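3.3. Testing
GET /_analyze
{
  "analyzer": "ik_max_word",
  "text": "北京市東城區萬豪酒店"
}
The IK analyzer has two segmentation modes: ik_max_word, which produces the finest-grained split, and ik_smart, which produces the coarsest-grained split.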
4.3. Modifying the IK analyzer configuration file
vim IKAnalyzer.cfg.xml
# Edit the configuration file. Be careful here not to corrupt the file's encoding!
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <!-- Configure your own extension dictionary here -->
    <entry key="ext_dict">my.dic</entry>
    <!-- Configure your own extension stop-word dictionary here -->
    <entry key="ext_stopwords">extra_stopword.dic</entry>
    <!-- Configure a remote extension dictionary here -->
    <entry key="remote_ext_dict">http://106.75.109.43:28888/remote.dic</entry>
    <!-- Configure a remote extension stop-word dictionary here -->
    <!-- <entry key="remote_ext_stopwords">http://ip-address:port/dictionary-file</entry> -->
</properties>
PUT hotel_2
{
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "ik_max_word",
        "search_analyzer": "ik_smart"
      },
      "address": {
        "type": "text",
        "analyzer": "ik_max_word",
        "search_analyzer": "ik_smart"
      },
      "brand": {
        "type": "keyword"
      },
      "type": {
        "type": "keyword"
      },
      "price": {
        "type": "integer"
      },
      "specs": {
        "type": "keyword"
      },
      "salesVolume": {
        "type": "integer"
      },
      "area": {
        "type": "text",
        "analyzer": "ik_max_word",
        "search_analyzer": "ik_smart"
      },
      "imageUrl": {
        "type": "text"
      },
      "synopsis": {
        "type": "text",
        "analyzer": "ik_max_word",
        "search_analyzer": "ik_smart"
      },
      "createTime": {
        "type": "date",
        "format": "yyyy-MM-dd"
      },
      "isAd": {
        "type": "integer"
      }
    }
  }
}

# Rebuild the index: run asynchronously and throttled so the rebuild is smooth
POST _reindex?wait_for_completion=false&requests_per_second=2000
{
  "source": {
    "index": "source-index-name"
  },
  "dest": {
    "index": "destination-index-name"
  }
}

# Check whether the task has finished
GET _tasks/task-id

# Re-create the alias association
# Detach the alias from the old index
POST _aliases
{
  "actions": [
    {
      "remove": {
        "index": "hotel_1",
        "alias": "hotel"
      }
    }
  ]
}

# Delete the old index
DELETE hotel_1

# Attach the alias to hotel_2
POST _aliases
{
  "actions": [
    {
      "add": {
        "index": "hotel_2",
        "alias": "hotel"
      }
    }
  ]
}
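4.5. Checking that the documents were segmented
At this point the documents have been segmented by the Chinese analyzer at index time and stored, so we can use termQuery to test the segmentation result;
Because termQuery does not analyze the query condition, I query with a term taken from the segmentation output; if segmentation succeeded, the query returns results from the text field;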
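III. Analyzed Query (matchQuery)
The term query above only works when the condition is exactly one of the terms produced by segmentation;
But users do not know how your documents were segmented, so the user's query condition has to be segmented as well;
1. Match query in Kibana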
GET hotel/_search
{
"query": {
"match": {
"name":"北京市東城區瑞麟灣"
}
}
}
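matchQuery analyzes the query condition and matches each resulting term against ES one by one; by default the final result is the union of the individual matches.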
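The "union by default" behaviour can be sketched with plain sets. This is a simplified model under stated assumptions: the postings map stands in for the inverted index, the query is assumed to be already segmented into terms, and MatchDemo with its requireAll flag is a hypothetical name. Scoring is ignored entirely.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Simplified model of how a match query combines per-term results.
final class MatchDemo {
    private MatchDemo() {}

    // requireAll = false -> union (OR), requireAll = true -> intersection (AND).
    static Set<Integer> match(Map<String, Set<Integer>> postings,
                              List<String> queryTerms,
                              boolean requireAll) {
        Set<Integer> result = null;
        for (String term : queryTerms) {
            Set<Integer> docs = postings.getOrDefault(term, Collections.emptySet());
            if (result == null) {
                result = new TreeSet<>(docs);
            } else if (requireAll) {
                result.retainAll(docs);
            } else {
                result.addAll(docs);
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    public static void main(String[] args) {
        // ASCII stand-ins for three segmented Chinese terms.
        Map<String, Set<Integer>> postings = new HashMap<>();
        postings.put("beijing", new TreeSet<>(Arrays.asList(1, 2)));
        postings.put("dongcheng", new TreeSet<>(Arrays.asList(1)));
        postings.put("ruilinwan", new TreeSet<>(Arrays.asList(3)));
        List<String> query = Arrays.asList("beijing", "dongcheng", "ruilinwan");
        System.out.println(match(postings, query, false)); // union of all term hits
        System.out.println(match(postings, query, true));  // only docs containing every term
    }
}
```

In ES itself the stricter behaviour is selected with the match query's operator parameter set to and; in the Java client that corresponds to calling operator(Operator.AND) on the MatchQueryBuilder.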
2. Match query with the Java API
HotelServiceImpl.java
// Match query on the hotel name
@Override
public Map<String, Object> nameMatchQuery(Integer current, Integer size, Map<String, Object> searchParam) {
    // Set up the search request
    SearchRequest searchRequest = new SearchRequest("hotel");
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    Map<String, Object> map = new HashMap<>();
    // Read the name parameter
    String name = (String) searchParam.get("name");
    if (StringUtils.hasText(name)) {
        // Build the query object
        MatchQueryBuilder nameMatchQueryBuilder = QueryBuilders.matchQuery("name", name);
        searchSourceBuilder.query(nameMatchQueryBuilder);
    }
    // Paging
    searchSourceBuilder.from((current - 1) * size);
    searchSourceBuilder.size(size);
    searchRequest.source(searchSourceBuilder);
    // Execute the search and process the result
    try {
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
        SearchHits hits = searchResponse.getHits();
        long totalHits = hits.getTotalHits().value;
        SearchHit[] searchHits = hits.getHits();
        List<HotelEntity> list = new ArrayList<>();
        for (SearchHit searchHit : searchHits) {
            String sourceAsString = searchHit.getSourceAsString();
            list.add(JSON.parseObject(sourceAsString, HotelEntity.class));
        }
        map.put("list", list);
        map.put("totalResultSize", totalHits);
        map.put("current", current);
        // Total number of pages
        map.put("totalPage", (totalHits + size - 1) / size);
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
    return map;
}
IV. Wildcard Query (wildcardQuery)
1. Wildcard query in Kibana
GET hotel/_search
{
"query": {
"wildcard": {
"brand": {
"value": "美*"
}
}
}
}
2. Wildcard query with the Java API
HotelServiceImpl.java
// Wildcard (fuzzy) query on the hotel brand
@Override
public Map<String, Object> nameWildcardQuery(Integer current, Integer size, Map<String, Object> searchParam) {
    // Set up the search request
    SearchRequest searchRequest = new SearchRequest("hotel");
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    // 1. Read the front-end parameter (sent as "name", matched against the brand field)
    String name = (String) searchParam.get("name");
    // 2. Build the query object: prefix followed by any characters
    if (StringUtils.hasText(name)) {
        WildcardQueryBuilder brandWildcardQuery = QueryBuilders.wildcardQuery("brand", name + "*");
        searchSourceBuilder.query(brandWildcardQuery);
    }
    // Paging
    searchSourceBuilder.from((current - 1) * size);
    searchSourceBuilder.size(size);
    searchRequest.source(searchSourceBuilder);
    Map<String, Object> map = new HashMap<>();
    try {
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
        SearchHits hits = searchResponse.getHits();
        long totalHits = hits.getTotalHits().value;
        SearchHit[] searchHits = hits.getHits();
        List<HotelEntity> list = new ArrayList<>();
        for (SearchHit searchHit : searchHits) {
            String sourceAsString = searchHit.getSourceAsString();
            list.add(JSON.parseObject(sourceAsString, HotelEntity.class));
        }
        map.put("list", list);
        map.put("totalResultSize", totalHits);
        map.put("current", current);
        // Total number of pages
        map.put("totalPage", (totalHits + size - 1) / size);
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
    return map;
}
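To show what a wildcard pattern means when it is compared with a single indexed term, here is a small sketch that translates the pattern into a java.util.regex pattern. It only illustrates the matching rule (* for any run of characters, ? for exactly one character) and ignores escape sequences; WildcardDemo is a hypothetical name, and ES does not evaluate wildcards this way internally.

```java
import java.util.regex.Pattern;

// Illustrates wildcard semantics against one term; not how ES evaluates it.
final class WildcardDemo {
    private WildcardDemo() {}

    // Translate a wildcard pattern into an equivalent regular expression.
    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*' || c == '?') {
                // Flush the literal run collected so far, quoted so that
                // regex metacharacters in it are taken literally.
                if (literal.length() > 0) {
                    regex.append(Pattern.quote(literal.toString()));
                    literal.setLength(0);
                }
                regex.append(c == '*' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    // The pattern must cover the whole term, not just a part of it.
    static boolean matches(String wildcard, String term) {
        return toRegex(wildcard).matcher(term).matches();
    }
}
```

With this sketch, matches("美*", "美豪") is true and matches("美*", "萬豪") is false. Because the comparison is made against whole terms, the outcome on a text field depends on how the analyzer split the text, whereas on a keyword field such as brand the pattern is compared with the complete value.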
V. Multi-field Query (queryStringQuery)
GET hotel/_search
{
"query": {
"query_string": {
"fields": ["name","brand","address","synopsis"],
"query": "萬豪 OR 北京 OR 上海"
}
}
}
HotelServiceImpl.java
// Multi-field query over name, synopsis, area and address
@Override
public Map<String, Object> searchQueryStringQuery(Integer current, Integer size, Map<String, Object> searchParam) {
    // Set up the search request
    SearchRequest searchRequest = new SearchRequest("hotel");
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    Map<String, Object> map = new HashMap<>();
    // Read the search condition
    String condition = (String) searchParam.get("condition");
    // Build the query object
    if (StringUtils.hasText(condition)) {
        QueryStringQueryBuilder queryStringQueryBuilder = QueryBuilders.queryStringQuery(condition)
                .field("name")
                .field("address")
                .field("synopsis")
                .field("area")
                .defaultOperator(Operator.OR);
        searchSourceBuilder.query(queryStringQueryBuilder);
    }
    // Paging
    searchSourceBuilder.from((current - 1) * size);
    searchSourceBuilder.size(size);
    searchRequest.source(searchSourceBuilder);
    try {
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
        SearchHits hits = searchResponse.getHits();
        long totalHits = hits.getTotalHits().value;
        SearchHit[] searchHits = hits.getHits();
        List<HotelEntity> list = new ArrayList<>();
        for (SearchHit searchHit : searchHits) {
            String sourceAsString = searchHit.getSourceAsString();
            list.add(JSON.parseObject(sourceAsString, HotelEntity.class));
        }
        map.put("list", list);
        map.put("totalResultSize", totalHits);
        map.put("current", current);
        // Total number of pages
        map.put("totalPage", (totalHits + size - 1) / size);
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
    return map;
}
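VI. Sort Query (sort)
When searching, users sometimes care about figures such as a product's sales volume or number of reviews and want the results sorted by them, for example to find the products with the highest sales or the most reviews.
# Sort query: sorting by multiple fields is supported
GET hotel/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "price": {
        "order": "desc"
      }
    },
    {
      "salesVolume": {
        "order": "asc"
      }
    }
  ]
}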
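The order of the entries in the sort array matters: the second field is only consulted when two documents tie on the first. The following sketch reproduces that ordering in memory with a Comparator. SortDemo and its nested Hotel class are hypothetical and carry only the two fields used by the request above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// In-memory illustration of "price desc, then salesVolume asc".
final class SortDemo {
    private SortDemo() {}

    static final class Hotel {
        final String name;
        final int price;
        final int salesVolume;

        Hotel(String name, int price, int salesVolume) {
            this.name = name;
            this.price = price;
            this.salesVolume = salesVolume;
        }
    }

    // Returns a sorted copy; the input list is left untouched.
    static List<Hotel> sort(List<Hotel> hotels) {
        List<Hotel> copy = new ArrayList<>(hotels);
        copy.sort(Comparator.comparingInt((Hotel h) -> h.price).reversed()
                .thenComparingInt(h -> h.salesVolume));
        return copy;
    }

    public static void main(String[] args) {
        List<Hotel> hotels = Arrays.asList(
                new Hotel("A", 300, 50),
                new Hotel("B", 500, 10),
                new Hotel("C", 300, 20));
        for (Hotel h : sort(hotels)) {
            System.out.println(h.name + " " + h.price + " " + h.salesVolume);
        }
    }
}
```

Hotels A and C share the same price, so their relative order is decided by salesVolume in ascending order, which places C before A.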
VII. Range Query (range)