Kafka in the Real World

Originally published in The New Stack Update.

If you’ve never used Apache Kafka, you might think it’s a magical elixir, because it’s applied to so many different situations. Even people with hands-on experience argue about how to classify it. Is it for pub-sub and messaging, and thus comparable to RabbitMQ? In Confluent’s recent survey of Kafka users, only half said they used it for messaging. Instead, two-thirds process data streams with it. In this regard, it is comparable to Spark Streaming and Akka Streams. However, Kafka is often used in conjunction with these technologies and/or other vendor products (e.g., StreamSets, which received Series B funding this week). Looking specifically at the Kafka Streams API, the top functionality was ETL (a flavor of data integration), which 40 percent use the API for. Broadly speaking, data pipelines are the top “use case” for Kafka. In practice, these pipelines most often appear in asynchronous apps, data warehouses and recommendation engines.