Next Meeting: November 15, 2011 @ Bronto Software
Our next meeting will be November 15 at Bronto Software. The speaker will be Alan Gates, the author of Programming Pig and a member of the Hortonworks team. RSVP here.
-----
Title: New Features in Pig 0.9 and Introducing HCatalog
Abstract: Pig 0.9 added several features to make Pig a more powerful data processing platform, including macros, include statements, and the ability to embed Pig in Python for control flow. We’ll cover these, discuss some new features that have been added since 0.9, and look at what’s next on Pig’s roadmap.
HCatalog is a table management and storage management layer for Hadoop that enables users with different data processing tools (Pig, MapReduce, Hive, Streaming) to more easily read and write data on the grid. HCatalog’s table abstraction presents users with a relational view of data in the Hadoop Distributed File System (HDFS) and ensures that users need not worry about where their data is stored or in what format, whether RCFile, text files, or sequence files. This talk will include an overview of HCatalog’s features and a discussion of its current roadmap.
Bio: Alan is a co-founder of Hortonworks as well as an original member of the engineering team that took Pig from a Yahoo! Labs research project to a successful Apache open source project. Alan also designed HCatalog and guided its adoption as an Apache Incubator project. Alan has a BS in Mathematics from Oregon State University and an MA in Theology from Fuller Theological Seminary. He is also the author of Programming Pig, a forthcoming book from O’Reilly Press. Follow Alan on Twitter: @alanfgates.





