Triangle Hadoop Users Group

TriHUG Next Meeting featuring Josh Patterson of Cloudera set for Oct. 11

The next Triangle Hadoop Users Group meeting will be held October 11th at Bronto Software and will feature Josh Patterson of Cloudera. RSVP here.

Title: Lumberyard: Time Series Indexing at Scale

Abstract: 

As time series data explodes in volume in the genomic, sensor, and financial realms [1], companies are looking for more effective ways to store and query this data. To handle this growth in scale, system builders are looking to the Hadoop, HBase, and NoSQL domain for components to build on. In this talk we introduce Lumberyard [3], a system which can potentially (1) store terabytes of time series data and (2) allow this data to be interactively queried at low latencies to provide real-time access. Lumberyard stores iSAX [4] indexes in HBase’s multi-dimensional sorted map storage system, which gives Lumberyard the reliability of HDFS along with the low latencies of HBase. Our approach leverages a multidimensional indexing structure stored in HBase’s highly available, distributed, multi-dimensional sorted map. We present the design of Lumberyard’s implementation and illustrate the differences between an in-memory iSAX index and a persisted HBase-backed iSAX index.
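
For readers new to SAX-style indexing, here is a minimal sketch of the general idea: a series is z-normalized, averaged into a few segments, and each segment mean is mapped to a symbol, so that the resulting word can serve as a key in a sorted map. This is not Lumberyard's code or schema. The function names, the fixed cardinality (real iSAX allows a different cardinality per symbol), the `row_key` layout, and the plain dict standing in for an HBase table are all assumptions made for illustration.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax_word(series, segments, cardinality):
    """Return a SAX word for `series` as a tuple of integer symbols."""
    if len(series) % segments != 0:
        raise ValueError("series length must be a multiple of segments")
    mu, sigma = mean(series), pstdev(series)
    # z-normalize; a constant series becomes all zeros
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]
    # Piecewise Aggregate Approximation: mean of each equal-length segment
    step = len(z) // segments
    paa = [mean(z[i:i + step]) for i in range(0, len(z), step)]
    # Breakpoints cut the standard normal into equiprobable regions
    breakpoints = [NormalDist().inv_cdf(k / cardinality)
                   for k in range(1, cardinality)]
    return tuple(bisect_right(breakpoints, v) for v in paa)


def row_key(word, cardinality):
    """Hypothetical key layout: fixed-width binary symbols joined by '_'."""
    bits = (cardinality - 1).bit_length()
    return "_".join(format(s, f"0{bits}b") for s in word)


# A plain dict stands in for the sorted map (assumption, not an HBase client).
index = {}
series = [1, 2, 3, 4, 5, 6, 7, 8]
key = row_key(sax_word(series, segments=4, cardinality=4), cardinality=4)
index.setdefault(key, []).append("series-001")  # hypothetical series id
```

Series with similar shapes map to the same or neighboring keys, which is what makes a key-ordered store a plausible home for this kind of index.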

Sponsored by Cloudera and Bronto Software.

More info at www.trihug.org.

Bio:

Master’s Thesis: self-organizing mesh networks

Published in IAAI-09: TinyTermite: A Secure Routing Algorithm

Conceived, built, and led Hadoop integration for the openPDC project at TVA (Smartgrid stuff). Led a small team which designed classification techniques for time series and MapReduce. Open source work at http://openpdc.codeplex.com

Now: Sr. Solutions Architect at Cloudera

Sep 14th, 2011