Siphon on HDInsight Kafka

Thomas Alex joins Lara Rubbelke to discuss how Microsoft uses Apache Kafka for HDInsight to power Siphon, its internal data ingestion service. Apache Kafka for HDInsight is an enterprise-grade, open-source streaming ingestion service. Microsoft built Siphon as a highly available, reliable service that ingests massive amounts of data for processing in near real time. Siphon ingests more than a trillion events per day across multiple business-critical scenarios at Microsoft. In this episode, learn how Siphon uses Apache Kafka for HDInsight as its scalable pub/sub message queue.

For more information:

  • Quickstart: Create a Kafka on HDInsight cluster
  • Tutorial: Use Spark Structured Streaming with Kafka on HDInsight
  • Apache Kafka for HDInsight overview
  • Azure HDInsight pricing
  • Siphon: Streaming data ingestion with Apache Kafka
  • Create a free account (Azure)
