DAPLAB Technical Documentation
In this documentation, you'll find plenty of training material and examples, as well as crunchy infrastructure details. Feel free to draw inspiration from it, send us your remarks and comments, or contribute improvements by submitting pull requests to the GitHub project hosting the DAPLAB documentation.
Access to the DAPLAB Platform
The DAPLAB cluster follows a typical Hadoop deployment: it provides gateways and web interfaces as endpoints to interact with the Hadoop components, and gives no direct access to the servers running those components. See the architecture page for more details. To log in to a gateway over SSH, run:
ssh -p2201 <yourusername>@pubgw1.daplab.ch
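To avoid retyping the port and hostname, the connection details can be stored in an OpenSSH client configuration entry. The following is a minimal sketch, not an official DAPLAB setup: the function name `daplab_ssh_config` and the alias `daplab` are hypothetical, while the hostname and port are taken from the command above.

```shell
#!/bin/sh
# Print an OpenSSH client config stanza for the DAPLAB gateway.
# Assumptions: "daplab" is a hypothetical alias chosen for this example;
# the hostname and port are copied from the ssh command above.
daplab_ssh_config() {
    user="$1"   # your DAPLAB username
    printf 'Host daplab\n'
    printf '    HostName pubgw1.daplab.ch\n'
    printf '    Port 2201\n'
    printf '    User %s\n' "$user"
}

# Print the stanza for review before adding it to your SSH config.
daplab_ssh_config "yourusername"
```

Once you are happy with the printed stanza, you could append it to `~/.ssh/config` (for example by redirecting the function's output with `>>`) and then connect with `ssh daplab`.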
You don't need a Ph.D. in science to be interested in data or to bring a valuable perspective to it. Indeed, when searching for the needle in the data haystack, the wider the background and the broader the perspective, the better.
The DAPLAB embraces this reality and is thus open to everyone; the only requirement is to have a computer :).
We also meet every Thursday evening to hack on data and discuss various Hadoop-related technologies. Feel free to join us!
This documentation puts a strong focus on keeping the training material up to date. The training material breaks down into three main categories:
- Copy-paste-style Hello World tutorials: users can copy and paste a set of commands or instructions, which will produce a result. Very low entry barrier, but limited scope.
- Starter repos: git repos containing source code and unit/integration tests, so you can quickly run code in your favorite IDE. Low entry barrier, focus on a particular technology.
- Advanced selected topics: some topics are covered more in depth. Higher entry barrier and some prerequisites.
Please navigate through the links in the "Tutorial" tab at the very top of this page, or go to the page referencing all the tutorials.