More and more data-driven companies are looking to adopt stream processing and streaming analytics. With this concise ebook, you'll learn best practices for designing a reliable architecture that supports this emerging big-data paradigm.
Authors Ted Dunning and Ellen Friedman (Real World Hadoop) help you explore some of the best technologies to handle stream processing and analytics, with a focus on the upstream queuing or message-passing layer. To illustrate the effectiveness of these technologies, this book also includes specific use cases.
Ideal for developers and non-technical people alike, this book describes:
- Key elements in good design for streaming analytics, focusing on the essential characteristics of the messaging layer
- New messaging technologies, including Apache Kafka and MapR Streams, with links to sample code
- Technology choices for streaming analytics: Apache Spark Streaming, Apache Flink, Apache Storm, and Apache Apex
- How stream-based architectures are helpful to support microservices
- Specific use cases such as fraud detection and geo-distributed data streams
Ted Dunning is Chief Applications Architect at MapR Technologies and is active in the open source community. He currently serves as VP for Incubator at the Apache Foundation, as a champion and mentor for a large number of projects, and as committer and PMC member of the Apache ZooKeeper and Drill projects. Ted is on Twitter as @ted_dunning.
Ellen Friedman, a committer for the Apache Drill and Apache Mahout projects, is a solutions consultant and well-known speaker and author, currently writing mainly about big data topics. With a PhD in Biochemistry, she has years of experience as a research scientist and has written about a variety of technical topics. Ellen is on Twitter as @Ellen_Friedman.
About the Authors: Ted Dunning is Chief Applications Architect at MapR Technologies and is active in the open source community.
He currently serves as VP for Incubator at the Apache Foundation, as a champion and mentor for a large number of projects, and as committer and PMC member of the Apache ZooKeeper and Drill projects. He developed the t-digest algorithm used to estimate extreme quantiles. T-digest has been adopted by several open source projects. He also developed the open source log-synth project described in the book Sharing Big Data Safely (O'Reilly).
Ted was the chief architect behind the MusicMatch (now Yahoo Music) and Veoh recommendation systems, built fraud-detection systems for ID Analytics (LifeLock), and has 24 issued patents to date. Ted has a PhD in computing science from the University of Sheffield. When he's not doing data science, he plays guitar and mandolin. Ted is on Twitter as @ted_dunning.
Ellen Friedman is a solutions consultant and well-known speaker and author, currently writing mainly about big data topics. She is a committer for the Apache Drill and Apache Mahout projects. With a PhD in Biochemistry, she has years of experience as a research scientist and has written about a variety of technical topics, including molecular biology, nontraditional inheritance, and oceanography. Ellen is also coauthor of a book of magic-themed cartoons, A Rabbit Under the Hat (The Edition House). Ellen is on Twitter as @Ellen_Friedman.