Big Data DC Meetup #4 – Kafka
Live blogging from Big Data DC Meetup #4 – Chris Burroughs is presenting on Kafka.
Essentially, Kafka is a distributed publish-subscribe messaging system. It provides persistent messaging (that is, messages survive restarts and shutdowns). It also uses O(1) disk structures that give constant-time performance even with many TB of stored messages. (This alone is already fairly impressive.)
The main advantage of Kafka seems to be its high throughput: even on simple hardware, Kafka can support hundreds of thousands of messages per second.
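To make the constant-time idea a bit more concrete, here is a toy sketch of an append-only log in Python. It is only an illustration of the general principle (writes go to the end, readers fetch by offset), not Kafka's actual implementation or API. The names `AppendOnlyLog`, `append` and `read` are made up for this example, and the log lives in memory rather than on disk.

```python
class AppendOnlyLog:
    """Toy in-memory stand-in for one topic's log (hypothetical, not Kafka's API)."""

    def __init__(self):
        self._messages = []

    def append(self, message):
        """Add a message at the end of the log and return its offset."""
        self._messages.append(message)
        return len(self._messages) - 1

    def read(self, offset, max_messages=10):
        """Return up to max_messages messages starting at the given offset."""
        return self._messages[offset:offset + max_messages]


log = AppendOnlyLog()
for i in range(5):
    log.append(f"event-{i}")

# Each subscriber keeps its own offset, so subscribers read independently
# and the log itself never has to track who has seen what.
offset_a = 0
batch = log.read(offset_a, max_messages=2)
offset_a += len(batch)
```

Because a write only touches the end of the log and a read only needs a starting offset, neither operation in this sketch depends on how many messages are already stored.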
Hadoop
Just received from Amazon – “Hadoop: The Definitive Guide”. Now on to the difficult part of actually reading it.
Mindmap Notes from Hadoop Meetup
The Hadoop meetup of September 7th was awesome. Here is an image export of my (incomplete) mind map notes. If you need the original mind map file, please drop me an email.
Hadoop Meetup (Washington DC, Sep 7th)
Excellent gathering at the Hadoop Meetup this past Tuesday. Two good speakers: Tom White and Aaron Cordova. Tom presented some interesting possible additions to Hadoop (8 missing things he would like to see in Hadoop). Aaron presented ideas on how to use HBase to distribute Hadoop's single NameNode. I am new to Hadoop myself, but got a lot out of this meeting.
Tom is the author of “Hadoop: The Definitive Guide”. I have ordered the book but have not received it yet; from what I hear, it is one of a couple of good books on Hadoop.
My mind map notes from the meeting (obviously incomplete and inaccurate) are here. If there is another Hadoop meetup in DC, I am very likely to attend.
Apps