Migrating MySql Data Into Elasticsearch Using Logstash

In many cases we want to use input from databases that are not natively supported by Elasticsearch. In this post we show how to migrate data from a MySQL database to Elasticsearch via Logstash.

JDBC Plugin

The JDBC plugin available for Logstash makes sure that data from any database with a JDBC interface can be taken into Logstash as input. The plugin also supports scheduling Logstash runs and selecting only the data of interest by using queries. In these kinds of databases we have the concept of rows and columns: each row is considered a single event, and the columns in each row (event) are considered fields of that event.

The following block diagram explains the role of the JDBC connector plugin in migrating data from a JDBC-supported database:


[Block diagram: Logstash JDBC input plugin querying the database and indexing the results into Elasticsearch]

In the diagram, Logstash runs the configuration file, which fires the predefined query we have set (to collect the data of interest) against the relational database. Once the query is passed to the JDBC plugin, the plugin sends it to the database, collects the result, and hands the data over to Logstash.

According to our requirements, we can process the data into the desired form; after processing, the data is indexed into Elasticsearch. We show a detailed application of this in the example in the coming sections.

Inserting MySql Data to Elasticsearch

Let us move on to migrating data from a relational database such as MySQL to Elasticsearch with the help of Logstash. We require the corresponding JDBC driver for MySQL. You can download it here. Now let us create a database named "testdb" in MySQL using the following command:

create database testdb;

The database is now created and we just make sure we are using the same database for our purposes:

show databases;
use testdb;

Create a table named "testtable" under the database "testdb" with the following schema:

create table testtable (PersonID int, LastName varchar(255), FirstName varchar(255), City varchar(255), Date datetime(6));

Now insert some test data into the above table:

INSERT INTO testtable (PersonID, LastName, FirstName, City, Date) VALUES ('4005','Kallis','Jaques','Cape Town','2016-05-23 16:12:03.568810');
INSERT INTO testtable (PersonID, LastName, FirstName, City, Date) VALUES ('4004','Baron','Richard','Cape Town','2016-05-23 16:12:04.370460');
INSERT INTO testtable (PersonID, LastName, FirstName, City, Date) VALUES ('4003','McWell','Sharon','Cape Town','2016-05-23 16:12:06.874801');

We have created a table with the details of 3 employees. You can display all of the table's contents by issuing the query:

select * from testtable

The resulting table will look like:


+----------+----------+-----------+-----------+----------------------------+
| PersonID | LastName | FirstName | City      | Date                       |
+----------+----------+-----------+-----------+----------------------------+
|     4005 | Kallis   | Jaques    | Cape Town | 2016-05-23 16:12:03.568810 |
|     4004 | Baron    | Richard   | Cape Town | 2016-05-23 16:12:04.370460 |
|     4003 | McWell   | Sharon    | Cape Town | 2016-05-23 16:12:06.874801 |
+----------+----------+-----------+-----------+----------------------------+

Logstash Configuration

Now that we have created a MySQL table with the contents shown in the above section, let us look at how Logstash must be configured. In the Logstash folder we have a logstash.conf file, which is the one to be configured and run to obtain the necessary results. The initial configuration is shown in the following screenshot:


[Screenshot: initial logstash.conf configuration]
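
As a rough sketch of what such a configuration typically looks like: the driver path, MySQL credentials, and index name below are placeholders, not values from the original screenshot:

input {
  jdbc {
    # Path to the MySQL JDBC driver jar (adjust to your local path)
    jdbc_driver_library => "/path/to/mysql-connector-java.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    # Connection string for the "testdb" database created above
    jdbc_connection_string => "jdbc:mysql://localhost:3306/testdb"
    jdbc_user => "mysql_user"
    jdbc_password => "mysql_password"
    # Query whose rows become Logstash events
    statement => "SELECT * FROM testtable"
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "test-migrate"
    document_type => "data"
  }
  stdout { codec => json_lines }
}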

In the above configuration file we have mentioned several parameters, such as: which database the JDBC connector should check for data, the location of the JDBC driver, the username and password for MySQL access, and the query statement. After the above settings are applied to the "logstash.conf" file, run Logstash by typing the command below:

bin/logstash -f logstash.conf

As shown in the block diagram in the JDBC section, the Logstash configuration file passes the query to the JDBC driver along with the user credentials. The driver fetches the data from the database and returns it to Logstash, which formats it as JSON and indexes it into Elasticsearch. Query the index "test-migrate" as shown below:

curl -XPOST 'http://localhost:9200/test-migrate/_search?pretty=true' -d '{}'

The above query lists each row as a separate document with the columns as fields. An example:

{ "_index": "test-migrate", "_type": "data", "_id": "4004", "_score": 1, "_source": { "personid": 4004, "lastname": "Baron", "firstname": "Richard", "city": "Cape Town", "date": "2016-05-23T10:42:04.370Z", "@version": "1", "@timestamp": "2016-07-10T10:36:43.685Z" } } More Configurations

In this section we show various use case scenarios. Add another row of data to the above MySQL table as shown below:

INSERT INTO testtable (PersonID, LastName, FirstName, City, Date) VALUES ('4002','Cheers','Joan','Cape Town','2016-05-23 16:12:07.163681');

Also, update an existing row in the same table as shown below:

UPDATE testtable SET FirstName='James' WHERE PersonID=4005;

1. Duplication Issue

After the above procedure, run the Logstash configuration file again. We expect a total of 4 documents, including the new row and the updated row. However, when we check the index again, that is not the case: instead we have a total of 7 documents. This happens because the initial documents are left untouched in the Elasticsearch index, since we did not assign each document a specific ID taken from the table itself. When we run the Logstash configuration again, the entire contents of "testtable" are indexed once more.

How can we solve this duplication? We have to give each document a unique ID, and the ID must be the same for a given document on every run, which prevents the duplication issue. This can be done by editing the output section of the conf file as given below:


[Screenshot: modified output section of logstash.conf]
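
A minimal sketch of such an output section, assuming the "personid" column is used as the unique ID (host and index names carried over from the example above):

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "test-migrate"
    document_type => "data"
    # Use the table's primary key as the Elasticsearch document ID,
    # so re-running the pipeline updates documents instead of duplicating them
    document_id => "%{personid}"
  }
}

This is why the sample document shown earlier has "_id": "4004", matching its "personid" value.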
2. Mutate Operation

Another important need we encounter is changing field names and values as we index into Elasticsearch. We add a requirement to our current example to demonstrate this: for all the documents that have "Cape Town" as the value of the "city" field, the field should be renamed to "country" and its value replaced with "South Africa".

In order to achieve this requirement, use the "filter" section to manipulate the ingested data before it reaches Elasticsearch, and the "mutate" plugin inside "filter" to execute the required change. With the above settings for the "input" and "output" sections, we need to add the following "filter" section to the logstash.conf file:

filter {
  if [city] == "Cape Town" {
    mutate {
      rename => { "city" => "country" }
      replace => [ "country", "South Africa" ]
    }
  }
}

The ingested data is checked for the value "Cape Town" in each event's "city" field. If a match is found, the "city" field is renamed to "country" and its value is replaced with "South Africa".

3. Scheduled &Incremental Updating

What if the data is continuously being updated in the MySQL database and we need to index it incrementally and periodically? For periodically fetching the data, add the "schedule" property to the input section. The "schedule" property, when given a value, makes the configuration run at periodic intervals, using a cron-like syntax.
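
A minimal sketch of an input section with scheduling and incremental fetching, assuming the "Date" column is used to track new rows (the schedule expression, tracking settings, and credentials below are illustrative, not from the original post):

input {
  jdbc {
    jdbc_driver_library => "/path/to/mysql-connector-java.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/testdb"
    jdbc_user => "mysql_user"
    jdbc_password => "mysql_password"
    # Run the query every minute (cron-like syntax)
    schedule => "* * * * *"
    # Track the Date column so each run only fetches rows newer than the last run
    use_column_value => true
    tracking_column => "date"
    tracking_column_type => "timestamp"
    statement => "SELECT * FROM testtable WHERE Date > :sql_last_value"
  }
}

Combined with the document_id setting shown earlier, this keeps the Elasticsearch index in step with the MySQL table without duplicating documents.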
