TriHUG Next Meeting featuring Josh Patterson of Cloudera set for Oct. 11
The next Triangle Hadoop User Group (TriHUG) meeting will be held on October 11th at Bronto Software and will feature Josh Patterson of Cloudera. RSVP here.
Title: Lumberyard: Time Series Indexing at Scale

Abstract: As time series data explodes in volume in the genomic, sensor, and financial realms [1], companies are looking for more effective ways to store and query this data. To handle this explosion in scale, systems are looking to the Hadoop, HBase, and NoSQL domain for components to build on. In this talk we introduce Lumberyard [3], a system which can potentially store terabytes of time series data and allow this data to be queried interactively at low latencies, providing real-time access. Lumberyard stores iSAX [4] indexes in HBase’s multi-dimensional sorted map storage system, which gives Lumberyard the reliability of HDFS along with the low latencies of HBase. Our approach leverages a multidimensional indexing structure which is stored in HBase’s highly available, distributed, multi-dimensional sorted map. We present the design of Lumberyard’s implementation and illustrate the differences between an in-memory iSAX index and a persisted, HBase-backed iSAX index.

Sponsored by Cloudera and Bronto Software. More info at www.trihug.org.

Bio:
Master’s Thesis: self-organizing mesh networks
Published in IAAI-09: TinyTermite: A Secure Routing Algorithm
Conceived, built, and led Hadoop integration for the openPDC project at TVA (Smartgrid stuff)
Led a small team which designed classification techniques for time series and MapReduce
Open source work at
Now: Sr. Solutions Architect at Cloudera





