주소로 고객 검색 서비스 구축하기 (feat. Elastic Search)
안녕하세요. 클스 입니다.
* 목표
- Elastic Search & Kibana 설치 (1탄) - 보기
- Elastic Search Client for Python 설치 및 프로그램 (2탄) - 보기
- 주소 데이터 검색 구조 설계 및 bulk 생성/업로드 (3탄) - 보기
- 주소 데이터 검색 구조 설계 및 bulk 생성/업로드 (3.5탄) - Polars vs Pandas
- 대량으로 검색 하기 (4탄) - 보기
- 데이터 백업 및 복구 <유지관리> (5탄)
* 개요
업무를 하다보면 가장 중심이 되는 것은 고객이다.
서비스 가입시 고객은 도로명 주소를 시스템에서 조회해서 ~~ 로 까지는 거의 표준화 되어있지만
시스템이 만들어 진지 오래됐거나, 상세 주소 부분은 여러 가지 형태로 입력한다.
예를 들어 000 빌라 가동 203호 -->
000 빌라 가-203
000 빌라 가동 203호
000 빌라 가, 203
000 빌라 가 203
000 빌라 가203호
모든 택배 회사의 경우 지역을 잘아는 사람들이 배송을 하기 때문에 대충 적어도 잘 배송한다.
그렇지만 위와 같이 입력된 주소의 고객을 RDB에서 Fulltext search 나 여러가지 기능을 이용해도 고객을
찾기란 쉽지 않다. 그래서 elastic search를 검토하게 되었습니다. 검토과정을 기록한 글입니다.
* 설치 하기
* Elastic Search 가 실행되기 위해서는 Cluster를 설치하고, 그 안에 서비를 위한 노드를 만들어야 합니다.
여기서는 docker compose 를 이용해서 진행합니다.
* 요구사항
1) Elastic Search 를 설치한다. <한글 형태소 분리를 위한 플러그인 설치 필요 시>
2) 주소 데이터를 입력한다. API로 검색할 수 있어야 한다.
3) 주소를 대충 입력해도 유사한 고객의 ID 를 회신 해야 한다.
* 진행 1: 개별 설치 해보기, 복잡하네요~
1) docker directory 작성
$ mkdir -p ~/docker/es
$ cd ~/docker/es
2) Elastic Search 설치 docker <Docker Desktop 이나 관련 데몬 실행 필요>
공식 docker hub : https://hub.docker.com/_/elasticsearch 원하는 버전 선택
$ docker pull docker.elastic.co/elasticsearch/elasticsearch:8.6.2docker 로 elastic search 실행하기 7.x 와 8.x 는 명령이 좀 다르다.
7.x
$ docker run -d -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" --name \
elasticsearch docker.elastic.co/elasticsearch/elasticsearch:7.9.1
docker.elastic.co/elasticsearch/elasticsearch:8.6.2
/usr/share/elasticsearch/bin/elasticsearch-reset-password
# Password for the 'elastic' user (at least 6 characters) ELASTIC_PASSWORD= # Password for the 'kibana_system' user (at least 6 characters) KIBANA_PASSWORD= # Version of Elastic products STACK_VERSION=8.6.2 # Set the cluster name CLUSTER_NAME=docker-cluster # Set to 'basic' or 'trial' to automatically start the 30-day trial LICENSE=basic #LICENSE=trial # Port to expose Elasticsearch HTTP API to the host ES_PORT=9200 #ES_PORT=127.0.0.1:9200 # Port to expose Kibana to the host KIBANA_PORT=5601 #KIBANA_PORT=80 # Increase or decrease based on the available host memory (in bytes) MEM_LIMIT=1073741824 # Project namespace (defaults to the current folder name if not set) #COMPOSE_PROJECT_NAME=myproject
version: "2.2" services: setup: image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION} volumes: - certs:/usr/share/elasticsearch/config/certs user: "0" command: > bash -c ' if [ x${ELASTIC_PASSWORD} == x ]; then echo "Set the ELASTIC_PASSWORD environment variable in the .env file"; exit 1; elif [ x${KIBANA_PASSWORD} == x ]; then echo "Set the KIBANA_PASSWORD environment variable in the .env file"; exit 1; fi; if [ ! -f config/certs/ca.zip ]; then echo "Creating CA"; bin/elasticsearch-certutil ca --silent --pem -out config/certs/ca.zip; unzip config/certs/ca.zip -d config/certs; fi; if [ ! -f config/certs/certs.zip ]; then echo "Creating certs"; echo -ne \ "instances:\n"\ " - name: es01\n"\ " dns:\n"\ " - es01\n"\ " - localhost\n"\ " ip:\n"\ " - 127.0.0.1\n"\ " - name: es02\n"\ " dns:\n"\ " - es02\n"\ " - localhost\n"\ " ip:\n"\ " - 127.0.0.1\n"\ " - name: es03\n"\ " dns:\n"\ " - es03\n"\ " - localhost\n"\ " ip:\n"\ " - 127.0.0.1\n"\ > config/certs/instances.yml; bin/elasticsearch-certutil cert --silent --pem -out config/certs/certs.zip --in config/certs/instances.yml --ca-cert config/certs/ca/ca.crt --ca-key config/certs/ca/ca.key; unzip config/certs/certs.zip -d config/certs; fi; echo "Setting file permissions" chown -R root:root config/certs; find . -type d -exec chmod 750 \{\} \;; find . -type f -exec chmod 640 \{\} \;; echo "Waiting for Elasticsearch availability"; until curl -s --cacert config/certs/ca/ca.crt https://es01:9200 | grep -q "missing authentication credentials"; do sleep 30; done; echo "Setting kibana_system password"; until curl -s -X POST --cacert config/certs/ca/ca.crt -u "elastic:${ELASTIC_PASSWORD}" -H "Content-Type: application/json" https://es01:9200/_security/user/kibana_system/_password -d "{\"password\":\"${KIBANA_PASSWORD}\"}" | grep -q "^{}"; do sleep 10; done; echo "All done!"; ' healthcheck: test: ["CMD-SHELL", "[ -f config/certs/es01/es01.crt ]"] interval: 1s timeout: 5s retries: 120 es01: depends_on: setup: condition: service_healthy image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION} volumes: - certs:/usr/share/elasticsearch/config/certs - esdata01:/usr/share/elasticsearch/data ports: - ${ES_PORT}:9200 environment: - node.name=es01 - cluster.name=${CLUSTER_NAME} - cluster.initial_master_nodes=es01,es02,es03 - discovery.seed_hosts=es02,es03 - ELASTIC_PASSWORD=${ELASTIC_PASSWORD} - bootstrap.memory_lock=true - xpack.security.enabled=true - xpack.security.http.ssl.enabled=true - xpack.security.http.ssl.key=certs/es01/es01.key - xpack.security.http.ssl.certificate=certs/es01/es01.crt - xpack.security.http.ssl.certificate_authorities=certs/ca/ca.crt - xpack.security.transport.ssl.enabled=true - xpack.security.transport.ssl.key=certs/es01/es01.key - xpack.security.transport.ssl.certificate=certs/es01/es01.crt - xpack.security.transport.ssl.certificate_authorities=certs/ca/ca.crt - xpack.security.transport.ssl.verification_mode=certificate - xpack.license.self_generated.type=${LICENSE} mem_limit: ${MEM_LIMIT} ulimits: memlock: soft: -1 hard: -1 healthcheck: test: [ "CMD-SHELL", "curl -s --cacert config/certs/ca/ca.crt https://localhost:9200 | grep -q 'missing authentication credentials'", ] interval: 10s timeout: 10s retries: 120 es02: depends_on: - es01 image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION} volumes: - certs:/usr/share/elasticsearch/config/certs - esdata02:/usr/share/elasticsearch/data environment: - node.name=es02 - cluster.name=${CLUSTER_NAME} - cluster.initial_master_nodes=es01,es02,es03 - discovery.seed_hosts=es01,es03 - bootstrap.memory_lock=true - xpack.security.enabled=true - xpack.security.http.ssl.enabled=true - xpack.security.http.ssl.key=certs/es02/es02.key - xpack.security.http.ssl.certificate=certs/es02/es02.crt - xpack.security.http.ssl.certificate_authorities=certs/ca/ca.crt - xpack.security.transport.ssl.enabled=true - xpack.security.transport.ssl.key=certs/es02/es02.key - xpack.security.transport.ssl.certificate=certs/es02/es02.crt - xpack.security.transport.ssl.certificate_authorities=certs/ca/ca.crt - xpack.security.transport.ssl.verification_mode=certificate - xpack.license.self_generated.type=${LICENSE} mem_limit: ${MEM_LIMIT} ulimits: memlock: soft: -1 hard: -1 healthcheck: test: [ "CMD-SHELL", "curl -s --cacert config/certs/ca/ca.crt https://localhost:9200 | grep -q 'missing authentication credentials'", ] interval: 10s timeout: 10s retries: 120 es03: depends_on: - es02 image: docker.elastic.co/elasticsearch/elasticsearch:${STACK_VERSION} volumes: - certs:/usr/share/elasticsearch/config/certs - esdata03:/usr/share/elasticsearch/data environment: - node.name=es03 - cluster.name=${CLUSTER_NAME} - cluster.initial_master_nodes=es01,es02,es03 - discovery.seed_hosts=es01,es02 - bootstrap.memory_lock=true - xpack.security.enabled=true - xpack.security.http.ssl.enabled=true - xpack.security.http.ssl.key=certs/es03/es03.key - xpack.security.http.ssl.certificate=certs/es03/es03.crt - xpack.security.http.ssl.certificate_authorities=certs/ca/ca.crt - xpack.security.transport.ssl.enabled=true - xpack.security.transport.ssl.key=certs/es03/es03.key - xpack.security.transport.ssl.certificate=certs/es03/es03.crt - xpack.security.transport.ssl.certificate_authorities=certs/ca/ca.crt - xpack.security.transport.ssl.verification_mode=certificate - xpack.license.self_generated.type=${LICENSE} mem_limit: ${MEM_LIMIT} ulimits: memlock: soft: -1 hard: -1 healthcheck: test: [ "CMD-SHELL", "curl -s --cacert config/certs/ca/ca.crt https://localhost:9200 | grep -q 'missing authentication credentials'", ] interval: 10s timeout: 10s retries: 120 kibana: depends_on: es01: condition: service_healthy es02: condition: service_healthy es03: condition: service_healthy image: docker.elastic.co/kibana/kibana:${STACK_VERSION} volumes: - certs:/usr/share/kibana/config/certs - kibanadata:/usr/share/kibana/data ports: - ${KIBANA_PORT}:5601 environment: - SERVERNAME=kibana - ELASTICSEARCH_HOSTS=https://es01:9200 - ELASTICSEARCH_USERNAME=kibana_system - ELASTICSEARCH_PASSWORD=${KIBANA_PASSWORD} - ELASTICSEARCH_SSL_CERTIFICATEAUTHORITIES=config/certs/ca/ca.crt mem_limit: ${MEM_LIMIT} healthcheck: test: [ "CMD-SHELL", "curl -s -I http://localhost:5601 | grep -q 'HTTP/1.1 302 Found'", ] interval: 10s timeout: 10s retries: 120 volumes: certs: driver: local esdata01: driver: local esdata02: driver: local esdata03: driver: local kibanadata: driver: local
댓글
댓글 쓰기