For the two days prior to TechEd Australia 2012 I experienced my first Mexia Code Camp down at the Gold Coast, where I got to enjoy geeking out with the rest of the Mexia team. We stayed in luxury beachfront apartments with amazing views of the Gold Coast coastline. The Code Camp was all about exploring new technology that excited us. This was definitely the coolest environment I’d ever written code in!
Mexia’s Ben Simmonds and I are both big fans of StreamInsight, Microsoft’s Complex Event Processing engine, and we are lucky enough to be part of Microsoft’s StreamInsight Advisory Group, which has afforded us early access to StreamInsight in Azure (codenamed Austin). I’m really excited by the possibilities of StreamInsight and complex event processing, and I really enjoyed exploring the technology at the Code Camp.
StreamInsight covers the Velocity dimension of Gartner’s 3Vs of Big Data – Volume, Velocity and Variety. Its beauty lies in its ability to extract relevant knowledge from one or more large streams of data. Big data can be analysed in real time, and events can be raised when something relevant is detected within the large stream of information. For example, if a monitoring heartbeat on some resource was not detected, that missing heartbeat would constitute the relevant signal we are interested in amid the irrelevant noise of the regular heartbeats.
The project that I tackled involved the following:
- Integrating with Ben’s StreamInsight Austin instance to host a custom IObservable that monitored a heartbeat running against a Windows Azure Service Bus subscription.
- Implementing a LINQ query to find missing heartbeats and drop an event into Ben’s Windows Azure SQL Database sink.
- An ASP.NET MVC application to display the events that these queries dropped into the SQL Database sink – Ben’s maximum and average latency events and my missing heartbeat events.
First up, I created an observable that generated point events for the regular heartbeat coming from code that checked whether or not a Windows Azure Service Bus subscription was available. This observable was turned into a stream, and I then wrote a LINQ query over the stream that filtered for events where the subscription was not available. The filtered events were then passed through a 10-second tumbling window and grouped by subscription to get a count of missed heartbeats within each window.
```csharp
var serviceBusHeartbeatMissingStream =
    myApp.DefineStreamable(() => serviceBusInputPointObservable);

var serviceBusHeartbeatWindow =
    from e in serviceBusHeartbeatMissingStream
    where e.IsAvailable == 0
    group e by e.Subscription into gs
    from win in gs.TumblingWindow(TimeSpan.FromSeconds(10))
    select new ServiceBusDown
    {
        Id = null,
        Subscription = gs.Key,
        DateTime = DateTime.UtcNow,
        Count = win.Count()
    };
```
This small block of code is doing something amazing. Every 10 seconds it aggregates the last 10 seconds of events that have passed through the engine and outputs an event of type ServiceBusDown if that aggregate contains any events where the heartbeat was missed. Extracting the relevant information from a torrent of data – it’s a beautiful thing!
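For context, the heartbeat observable feeding that stream can be sketched roughly as follows. This is an illustrative reconstruction, not the actual Code Camp code: the ServiceBusHeartbeat type and the CheckSubscriptionAvailable helper are hypothetical stand-ins, and it uses Rx’s Observable.Interval to poll the subscription once a second:

```csharp
using System;
using System.Reactive.Linq;

// Illustrative event type; IsAvailable is 0 when the heartbeat is missed.
public class ServiceBusHeartbeat
{
    public string Subscription { get; set; }
    public int IsAvailable { get; set; }
}

// Poll the subscription once a second and emit a heartbeat event each time.
// CheckSubscriptionAvailable is a placeholder for whatever availability
// check is made against the Windows Azure Service Bus subscription.
IObservable<ServiceBusHeartbeat> serviceBusInputPointObservable =
    Observable.Interval(TimeSpan.FromSeconds(1))
        .Select(_ => new ServiceBusHeartbeat
        {
            Subscription = "MySubscription",
            IsAvailable = CheckSubscriptionAvailable("MySubscription") ? 1 : 0
        });
```

Each emitted heartbeat becomes a point event in the stream, so a missed heartbeat (IsAvailable == 0) is what the query above filters for.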
These events were bound to the Windows Azure SQL Database sink, which made them available for reporting against in a dashboard.
A simple ASP.NET MVC application returned data from the Windows Azure SQL Database sink and displayed, in real time, the values of these business events:
- Maximum Latency (in ms)
- Average Latency (in ms)
- Service Bus Up/Down
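A controller action serving that dashboard might look something like the sketch below. Again this is hypothetical: DashboardContext, LatencyEvents and ServiceBusDownEvents are invented names standing in for the actual data access code over the SQL Database sink:

```csharp
using System;
using System.Linq;
using System.Web.Mvc;

// Minimal sketch of the dashboard controller; the entity and context
// names are illustrative stand-ins, not the real schema.
public class DashboardController : Controller
{
    private readonly DashboardContext db = new DashboardContext();

    // Returns the latest values as JSON so the dashboard page can poll
    // this action and refresh the displayed business events.
    public JsonResult LatestEvents()
    {
        var latest = new
        {
            MaxLatencyMs = db.LatencyEvents
                .OrderByDescending(e => e.DateTime)
                .Select(e => e.MaxLatency)
                .FirstOrDefault(),
            AvgLatencyMs = db.LatencyEvents
                .OrderByDescending(e => e.DateTime)
                .Select(e => e.AverageLatency)
                .FirstOrDefault(),
            // The Service Bus is considered down if a ServiceBusDown event
            // landed in the sink within the last tumbling-window interval.
            ServiceBusDown = db.ServiceBusDownEvents
                .Any(e => e.DateTime > DateTime.UtcNow.AddSeconds(-10))
        };
        return Json(latest, JsonRequestBehavior.AllowGet);
    }
}
```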
Even though these were really simple examples of what is possible with StreamInsight, I am blown away by the technology. As the sheer volume of data we process increases, it becomes critical to have the capability to extract business value from the latent information filling up our data stores or washing over our systems.
I’m presenting a session at the Brisbane Azure User Group this month (October 2012) on StreamInsight titled Extracting Realtime Business Value with StreamInsight.