Wiki Page: Streaming Kafka Messages to MySQL Database

Written by Deepak Vohra

Consider the use case for an open source messaging system in which the messages produced are also to be stored in a relational database. While Apache Kafka produces and consumes messages, it does not stream messages to a database. Starting with version 1.6, Apache Flume added support for Kafka as a source, channel, and sink. In an earlier tutorial we discussed streaming Kafka messages to Oracle Database and developing a Stream Data Platform for data integration and real-time processing of data streams. In this tutorial we shall produce messages at a Kafka producer and stream the messages to MySQL Database using Apache Flume.

The following sequence is used to stream Kafka messages to MySQL Database.

1. Start MySQL Database.
2. Create a MySQL Database table to receive Kafka messages.
3. Start Kafka ZooKeeper and the Kafka server.
4. Create a Kafka topic to send messages.
5. Create another Kafka topic for an Apache Flume channel of type Kafka.
6. Start a Kafka producer.
7. Configure an Apache Flume agent with a source of type Kafka, a channel of type Kafka, and a sink of type JDBC (MySQL Database).
8. Start the Apache Flume agent.
9. Send messages from the Kafka producer.
10. Kafka messages get streamed to the MySQL Database table.

The sequence involved in streaming messages from the Kafka producer to MySQL Database is shown in the following illustration.

This tutorial has the following sections.

- Setting the Environment
- Creating a MySQL Database Table
- Starting Kafka
- Configuring an Apache Flume Agent
- Starting the Flume Agent
- Sending Messages at the Kafka Producer
- Querying the MySQL Database Table

Setting the Environment

The following software is required for this tutorial.

- MySQL Database
- Apache Flume 1.6
- Apache Kafka
- Stratio JDBC Sink
- MySQL JDBC Driver Jar
- Jooq
- Apache Maven
- Java 7

Create a directory /flume to install the software (except MySQL Database) and set its permissions to global (777).

mkdir /flume
chmod -R 777 /flume
cd /flume

Download and extract the Apache Kafka tar file.
wget http://apache.mirror.iweb.ca/kafka/0.8.2.1/kafka_2.10-0.8.2.1.tgz
tar -xvf kafka_2.10-0.8.2.1.tgz

Download and extract the Apache Flume tar file.

wget http://archive.apache.org/dist/flume/stable/apache-flume-1.6.0-bin.tar.gz
tar -xvf apache-flume-1.6.0-bin.tar.gz

Copy the Kafka jars to the Flume lib directory, which is in the Flume classpath.

cp /flume/kafka_2.10-0.8.2.1/libs/* /flume/apache-flume-1.6.0-bin/lib

Set the environment variables for MySQL Database, Flume, Kafka, Maven, and Java.

vi ~/.bashrc

export MAVEN_HOME=/flume/apache-maven-3.3.3-bin
export FLUME_HOME=/flume/apache-flume-1.6.0-bin
export KAFKA_HOME=/flume/kafka_2.10-0.8.2.1
export MYSQL_HOME=/mysql/mysql-5.6.19-linux-glibc2.5-i686
export FLUME_CONF=$FLUME_HOME/conf
export JAVA_HOME=/flume/jdk1.7.0_55
export PATH=/usr/lib/qt-3.3/bin:/usr/local/sbin:/usr/sbin:/sbin:/usr/local/bin:/usr/bin:/bin:$FLUME_HOME/bin:$JAVA_HOME/bin:$MAVEN_HOME/bin:$MYSQL_HOME/bin:$KAFKA_HOME/bin
export CLASSPATH=$FLUME_HOME/lib/*

Download, compile, and package the Stratio Flume Ingestion Maven project.

git clone https://github.com/Stratio/flume-ingestion.git
mvn compile
mvn package

Copy the generated stratio-jdbc-sink-0.5.0-SNAPSHOT.jar file to the Flume lib directory.

cp stratio-jdbc-sink-0.5.0-SNAPSHOT.jar $FLUME_HOME/lib

Copy the MySQL JDBC driver jar to the Flume lib directory.

cp mysql-connector-java-5.1.31-bin.jar $FLUME_HOME/lib

Copy the Jooq jar to the Flume lib directory.

cp jooq-3.6.2.jar $FLUME_HOME/lib

Start the MySQL server with the following command.

bin/mysqld_safe --user=mysql &

The MySQL server gets started at localhost:3306.

Creating a MySQL Database Table

Start the MySQL CLI shell with the following command.

bin/mysql -u root

The MySQL CLI shell gets started. Set the database to "test" with the use test command. Create a MySQL Database table to store the Kafka messages. Run the following SQL script to create a table called kafkamsg.

CREATE TABLE kafkamsg(msg VARCHAR(4000));

The MySQL Database table gets created.
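The JDBC sink configured later in this tutorial will turn each Kafka message body into one row of this table via an INSERT statement. As a minimal sketch of that mapping, the following uses Python's built-in sqlite3 module as a stand-in for MySQL, so it runs without a database server; the table and column names match the tutorial, but the snippet is illustrative only, not part of the Flume pipeline.

```python
import sqlite3

# In-memory SQLite stands in for the MySQL "test" database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kafkamsg(msg VARCHAR(4000))")

# Each Kafka message body becomes one row, as the sink's
# INSERT INTO kafkamsg(msg) VALUES(${body:varchar}) template will do.
messages = ["message1", "message2", "message3"]
conn.executemany("INSERT INTO kafkamsg(msg) VALUES(?)", [(m,) for m in messages])
conn.commit()

rows = [r[0] for r in conn.execute("SELECT msg FROM kafkamsg")]
print(rows)  # ['message1', 'message2', 'message3']
```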
Starting Kafka

To start Kafka to produce messages we need to start the following components.

- ZooKeeper server
- Kafka server
- Kafka topic/s
- Kafka producer

A Kafka consumer is not required to be started.

Start the Kafka ZooKeeper with the following command.

cd /flume/kafka_2.10-0.8.2.1
zookeeper-server-start.sh config/zookeeper.properties

The ZooKeeper server gets started on localhost:2181.

Start the Kafka server with the following command.

cd /flume/kafka_2.10-0.8.2.1
kafka-server-start.sh config/server.properties

The Kafka server gets started at localhost:9092.

We need to create two Kafka topics.

- Topic kafka-mysql, to produce messages to be streamed to MySQL Database
- Topic kafkachannel, for the Flume channel of type Kafka

Use the kafkachannel topic created for the Oracle Database tutorial. Run the following command to create the kafka-mysql Kafka topic.

kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1 --partitions 1 --topic kafka-mysql

The kafka-mysql topic gets created. Start the Kafka producer with the following command, in which the topic is set to kafka-mysql.

kafka-console-producer.sh --broker-list localhost:9092 --topic kafka-mysql

The Kafka producer gets started.

Configuring an Apache Flume Agent

Create an Apache Flume configuration file flume.conf in which to configure the following Flume components: the Flume source, channel, and sink.

- Flume source of type Kafka
- Flume channel of type Kafka
- Flume sink of type JDBC

The Flume configuration properties are discussed in the following table.

Configuration Property | Description | Value
agent.sources | Sets the Flume source | kafkaSrc
agent.channels | Sets the Flume channel | channel1
agent.sinks | Sets the Flume sink | jdbcSink
agent.channels.channel1.type | Sets the channel type | org.apache.flume.channel.kafka.KafkaChannel
agent.channels.channel1.brokerList | Sets the channel broker list | localhost:9092
agent.channels.channel1.topic | Sets the Kafka channel topic | kafkachannel
agent.channels.channel1.zookeeperConnect | Sets the Kafka channel ZooKeeper host:port | localhost:2181
agent.channels.channel1.capacity | Sets the channel capacity | 10000
agent.channels.channel1.transactionCapacity | Sets the channel transaction capacity | 1000
agent.sources.kafkaSrc.type | Sets the source type | org.apache.flume.source.kafka.KafkaSource
agent.sources.kafkaSrc.channels | Sets the channel on the source | channel1
agent.sources.kafkaSrc.zookeeperConnect | Sets the source ZooKeeper host:port | localhost:2181
agent.sources.kafkaSrc.topic | Sets the Kafka source topic | kafka-mysql
agent.sinks.jdbcSink.type | Sets the sink type | com.stratio.ingestion.sink.jdbc.JDBCSink
agent.sinks.jdbcSink.connectionString | Sets the connection URI for MySQL Database | jdbc:mysql://127.0.0.1:3306/test
agent.sinks.jdbcSink.username | Sets the MySQL Database username | root
agent.sinks.jdbcSink.password | Sets the MySQL Database password | (empty)
agent.sinks.jdbcSink.batchSize | Sets the batch size | 10
agent.sinks.jdbcSink.channel | Sets the channel on the sink | channel1
agent.sinks.jdbcSink.sqlDialect | Sets the SQL dialect | MYSQL
agent.sinks.jdbcSink.driver | Sets the MySQL Database JDBC driver class | com.mysql.jdbc.Driver
agent.sinks.jdbcSink.sql | Sets the custom SQL to add data to MySQL Database | INSERT INTO kafkamsg(msg) VALUES(${body:varchar})

The flume.conf is listed:

agent.sources=kafkaSrc
agent.channels=channel1
agent.sinks=jdbcSink

agent.channels.channel1.type=org.apache.flume.channel.kafka.KafkaChannel
agent.channels.channel1.brokerList=localhost:9092
agent.channels.channel1.topic=kafkachannel
agent.channels.channel1.zookeeperConnect=localhost:2181
agent.channels.channel1.capacity=10000
agent.channels.channel1.transactionCapacity=1000

agent.sources.kafkaSrc.type = org.apache.flume.source.kafka.KafkaSource
agent.sources.kafkaSrc.channels = channel1
agent.sources.kafkaSrc.zookeeperConnect = localhost:2181
agent.sources.kafkaSrc.topic = kafka-mysql

agent.sinks.jdbcSink.type = com.stratio.ingestion.sink.jdbc.JDBCSink
agent.sinks.jdbcSink.connectionString = jdbc:mysql://127.0.0.1:3306/test
agent.sinks.jdbcSink.username=root
agent.sinks.jdbcSink.password=
agent.sinks.jdbcSink.batchSize = 10
agent.sinks.jdbcSink.channel = channel1
agent.sinks.jdbcSink.sqlDialect=MYSQL
agent.sinks.jdbcSink.driver=com.mysql.jdbc.Driver
agent.sinks.jdbcSink.sql=INSERT INTO kafkamsg(msg) VALUES(${body:varchar})

Copy the Flume configuration file to the Flume conf directory.

cp flume.conf $FLUME_HOME/conf/flume.conf

Starting the Flume Agent

Start the Flume agent with the following command.

flume-ng agent --classpath --conf $FLUME_CONF/ -f $FLUME_CONF/flume.conf -n agent -Dflume.root.logger=INFO,console

The Flume agent gets started; the Flume source, channel, and sink get started.

Sending Messages at the Kafka Producer

Having started the Kafka producer, type a message at the producer console and press Enter to send it. Send three messages as an example.

Querying the MySQL Database Table

Because the Flume agent is running, the messages produced at the Kafka producer are streamed to MySQL Database.
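Flume groups configuration keys by the agent.<kind>.<name>.<property> naming convention, which is why every property above carries the component name (channel1, kafkaSrc, jdbcSink). A minimal sketch of that grouping in plain Python (this is not Flume's actual parser, just an illustration of the key structure using a few lines from the configuration above):

```python
# Group flume.conf-style "agent.sinks.jdbcSink.batchSize=10" keys
# by component name to see each component's effective properties.
conf = """
agent.sources=kafkaSrc
agent.channels=channel1
agent.sinks=jdbcSink
agent.channels.channel1.type=org.apache.flume.channel.kafka.KafkaChannel
agent.sinks.jdbcSink.batchSize = 10
agent.sinks.jdbcSink.sqlDialect=MYSQL
"""

components = {}
for line in conf.strip().splitlines():
    key, _, value = line.partition("=")
    parts = key.strip().split(".")
    if len(parts) == 4:  # agent.<kind>.<name>.<property>
        _, kind, name, prop = parts
        components.setdefault(name, {})[prop] = value.strip()

print(components["jdbcSink"])  # {'batchSize': '10', 'sqlDialect': 'MYSQL'}
```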
In the MySQL CLI shell, run a SQL query to list the messages.

SELECT * FROM kafkamsg;

The three messages get listed. If the Flume agent and the Kafka producer are kept running, messages produced at the Kafka producer are streamed as and when they are produced. Send a few more messages at the Kafka producer. The messages get streamed to MySQL Database and can be listed with a SQL query.

In this tutorial we used Apache Flume to stream Kafka messages to MySQL Database.
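A closing note on timing: because the sink is configured with batchSize 10, inserts may be committed in groups rather than one message at a time, so a message does not necessarily appear in the table the instant it is produced. The following is a hypothetical pure-Python simulation of that buffering behavior (not the Stratio sink's actual code; a batch size of 3 is used for brevity):

```python
# Events accumulate in a buffer and are "committed" only once a full
# batch has arrived, mirroring agent.sinks.jdbcSink.batchSize = 10.
BATCH_SIZE = 3

committed = []  # rows visible to a SELECT on the table
buffer = []     # rows still held by the sink's open transaction

def process(event_body):
    buffer.append(event_body)
    if len(buffer) >= BATCH_SIZE:
        committed.extend(buffer)  # one commit for the whole batch
        buffer.clear()

for i in range(1, 8):  # send seven messages
    process("message%d" % i)

print(len(committed), len(buffer))  # 6 1
```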
