This talk takes you inside Hadoop's development process: who the people are, where the code lives, and how things get tested and released. It shows how you can get involved, from submitting bugs, patches, and beta-test reports to larger undertakings. It will also raise the question of how to improve development, especially how to support larger and more exploratory projects.
The Apache Hadoop stack is rapidly becoming the most complicated suite of software projects ever undertaken in the ASF: it is effectively the operating system for future datacentres.
This talk introduces the projects from a development perspective: the mailing lists and their demographics, the codebase, why the different branches exist, and what it takes to get a patch accepted. Testing your patches and the current release process get special coverage, as code stability at scale is considered essential for production use.
A key message of this talk is that the core development group is a community: for your patches to be accepted, you need to earn its trust in the quality of your code, your testing process, and the likelihood that you will help support that code in future. Joining the community, setting up your own test process, and working on other people's issues are all ways to do this.
One topic will be the question "how can the Hadoop projects mentor development?" Pieces of work that would benefit Hadoop have fallen, or may yet fall, by the wayside because of the barriers to making large changes to the code. We need a mentoring/sponsorship process that enables interested parties to work on such projects, in a way that makes acceptance of their work more likely.
The target audience is people who are interested in getting involved in Hadoop's development, those who are curious about how it works, and, for the "how to improve things" discussion, those who are already part of the development community.