Welcome!

From the Author of The Agile Architecture Revolution

Jason Bloomberg

Subscribe to Jason Bloomberg: eMailAlertsEmail Alerts
Get Jason Bloomberg via: homepageHomepage mobileMobile rssRSS facebookFacebook twitterTwitter linkedinLinkedIn


Related Topics: Nastel Journal, Agile Digital Transformation

Blog Feed Post

Monitoring Kafka: Real-Time Ops Challenge

Let’s say you’re a long-time expert on message-oriented middleware – say, an MQSeries (now IBM MQ) pro from back in the day – and someone asked you to create an entirely new middleware product following modern, cloud-native architectural principles, able to process trillions of messages per day. What would it look like?

The answer: it would probably look a lot like Apache Kafka. Kafka is an open-source stream processing platform that originated at LinkedIn, where the engineering team built Kafka to support real-time data feeds to facilitate real-time analytics.

Kafka is cloud-native, which means that its creators built it to leverage the architectural best practices of the cloud, including unlimited horizontal scale, rapid elasticity, high performance, and no single point of failure.

Kafka can, of course, run in the cloud as well – but as middleware, it can connect to both cloud or on-premises endpoints, or run fully on-premises if so desired. Regardless of such deployment details, Kafka brings the architectural advantages of cloud-native software to the middleware world.

The Real-Time Context for Kafka

Kafka’s ability to handle streaming data feeds in real-time enables executives and other line-of-business personnel to obtain real-time insights into information relevant to their business at any point in time.

To achieve this level of performance, Kafka operates with low latency, processing data in memory in a fully distributed manner, while scaling writes and reads with partitioned, distributed commit logs.

In addition, Kafka offers built-in load balancing, coupled with its rapid partitioning of data (also called sharding). In other words, the Kafka team designed the platform to be more like a distributed database transaction log than a traditional middleware system.

Following its cloud-native heritage, the Kafka team also built replication into its architecture – once again, in a horizontally scalable, elastic way. Kafka accomplishes this task with a complicated set of records, topics, consumers, producers, brokers, logs, partitions, and clusters – often with one-to-many relationships, with numerous message traffic flows among them.

The end result is both scalable as well as fault tolerant, as no node acts as a single point of failure. Traditional middleware like IBM MQ, in contrast, depends upon technologies like queues to guarantee message delivery – but with queues, failures delay the arrival of messages. As a real-time platform, the Kafka architecture routes around failures.

Monitoring Kafka

The real-time, fault tolerant performance of Kafka, as with any piece of software, depends upon its proper configuration and the proper operation of the infrastructure that supports it. It would be foolish to assume that the platform’s fault-tolerant architecture doesn’t require monitoring and management.

Monitoring the performance of Kafka, however, means monitoring its various components, as well as the message flows among them, either individually or as end-to-end transactions. Furthermore, such monitoring must be in real-time in order to keep up with the real-time data flowing through the Kafka implementation.

The diagram below from Nastel AutoPilot for Kafka illustrates this challenge. On the left is a Kafka sender, which interacts with three readers, and in turn, with five topics. Note, however, that Kafka is an elastic environment, so it may spawn additional components as necessary.

Monitoring Kafka (Source: Nastel)

Monitoring Kafka (Source: Nastel)

AutoPilot monitors each of the components and the message flows connecting them – a task that would overwhelm a traditional monitoring tool.

In fact, AutoPilot provides both operational and transactional monitoring. It also offers forensics to diagnose Kafka problems, and monitors Kafka performance and availability via end-to-end stream monitoring and metrics tracking from the various Kafka components as well as Zookeeper, Kafka’s configuration service.

The Intellyx Take

As streaming data become increasingly common, organizations are coming to depend upon the real-time visibility at scale that such technologies provide. The real-time streaming capabilities that Kafka delivers add new value to the business and can help evolve traditional data-centric offerings, from data warehouses to business intelligence.

This broad-based trend to real-time insights raises the bar on monitoring and management, both for operations as well as the transaction management essential for business visibility.

Customers aren’t going to wait for real-time performance, so the business cannot afford to wait either. Every aspect of the technology infrastructure, from the network to the middleware to the applications, must now perform in a cloud-native, real-time context – including ops management and monitoring.

Copyright © Intellyx LLC. Nastel is an Intellyx client. At the time of writing, none of the other organizations mentioned in this article are Intellyx clients. Intellyx retains full editorial control over the content of this paper.

Read the original blog entry...

More Stories By Jason Bloomberg

Jason Bloomberg is the leading expert on architecting agility for the enterprise. As president of Intellyx, Mr. Bloomberg brings his years of thought leadership in the areas of Cloud Computing, Enterprise Architecture, and Service-Oriented Architecture to a global clientele of business executives, architects, software vendors, and Cloud service providers looking to achieve technology-enabled business agility across their organizations and for their customers. His latest book, The Agile Architecture Revolution (John Wiley & Sons, 2013), sets the stage for Mr. Bloomberg’s groundbreaking Agile Architecture vision.

Mr. Bloomberg is perhaps best known for his twelve years at ZapThink, where he created and delivered the Licensed ZapThink Architect (LZA) SOA course and associated credential, certifying over 1,700 professionals worldwide. He is one of the original Managing Partners of ZapThink LLC, the leading SOA advisory and analysis firm, which was acquired by Dovel Technologies in 2011. He now runs the successor to the LZA program, the Bloomberg Agile Architecture Course, around the world.

Mr. Bloomberg is a frequent conference speaker and prolific writer. He has published over 500 articles, spoken at over 300 conferences, Webinars, and other events, and has been quoted in the press over 1,400 times as the leading expert on agile approaches to architecture in the enterprise.

Mr. Bloomberg’s previous book, Service Orient or Be Doomed! How Service Orientation Will Change Your Business (John Wiley & Sons, 2006, coauthored with Ron Schmelzer), is recognized as the leading business book on Service Orientation. He also co-authored the books XML and Web Services Unleashed (SAMS Publishing, 2002), and Web Page Scripting Techniques (Hayden Books, 1996).

Prior to ZapThink, Mr. Bloomberg built a diverse background in eBusiness technology management and industry analysis, including serving as a senior analyst in IDC’s eBusiness Advisory group, as well as holding eBusiness management positions at USWeb/CKS (later marchFIRST) and WaveBend Solutions (now Hitachi Consulting).