Spark Streaming + Kafka Integration Guide
Apache Kafka is publish-subscribe messaging rethought as a distributed, partitioned, replicated commit log service. Please read the Kafka documentation thoroughly before starting an integration using Spark.
The Kafka project introduced a new consumer API between versions 0.8 and 0.10, so two separate Spark Streaming packages are available, one for each API. Choose the package that matches your brokers and the features you need. Note that the 0.8 integration is compatible with later 0.9 and 0.10 brokers, but the 0.10 integration is not compatible with earlier brokers.
Note: Kafka 0.8 support is deprecated as of Spark 2.3.0.
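| | spark-streaming-kafka-0-8 | spark-streaming-kafka-0-10 |
| Broker Version | 0.8.2.1 or higher | 0.10.0 or higher |
| Language Support | Scala, Java, Python | Scala, Java |
| SSL / TLS Support | No | Yes |
| Offset Commit API | No | Yes |
| Dynamic Topic Subscription | No | Yes |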
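The compatibility rule described above can be sketched as a small helper. This is only an illustration, not part of Spark or Kafka: the function names `parse_version` and `choose_integration` and the `needs_new_consumer_features` flag are hypothetical, and the sketch assumes that SSL / TLS, the offset commit API, and dynamic topic subscription are only available through the 0.10 integration.

```python
def parse_version(version, width=4):
    """Turn '0.10.2' into (0, 10, 2, 0) so versions compare numerically.

    Padding with zeros keeps '0.10' and '0.10.0' equal when compared.
    """
    parts = [int(part) for part in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def choose_integration(broker_version, needs_new_consumer_features=False):
    """Pick a Spark Streaming Kafka package for a given broker version.

    Hypothetical helper: it encodes only the broker compatibility rule
    from the guide, nothing about your build tool or Spark version.
    """
    broker = parse_version(broker_version)

    if broker < parse_version("0.8.2.1"):
        raise ValueError("neither integration supports brokers older than 0.8.2.1")

    if broker >= parse_version("0.10.0"):
        # The 0.8 integration would also work against these brokers,
        # but it is deprecated as of Spark 2.3.0, so prefer 0.10 here.
        return "spark-streaming-kafka-0-10"

    if needs_new_consumer_features:
        # Assumption: these features require the 0.10 integration,
        # which is not compatible with pre-0.10 brokers.
        raise ValueError("upgrade brokers to 0.10.0 or higher for these features")

    return "spark-streaming-kafka-0-8"


if __name__ == "__main__":
    for version in ("0.8.2.1", "0.9.0.1", "0.10.2"):
        print(version, "->", choose_integration(version))
```

Versions are compared as integer tuples rather than strings because a plain string comparison would rank "0.10.0" below "0.8.2.1".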