Apache Spark Guide - Summary
Apache Spark is a general framework for distributed computing that offers high performance for both batch and interactive processing. It exposes APIs for Java, Python, and Scala, and consists of Spark Core and several related projects.
You can run Spark applications locally or distributed across a cluster, either through an interactive shell or by submitting an application. Running Spark applications interactively is commonly done during the data-exploration phase and for ad hoc analysis. The following topics are covered in this guide:
- Apache Spark Overview
- Running your first Spark Application
- Troubleshooting for Spark
- Frequently Asked Questions (FAQs) about Spark in CDH
- Spark Application Overview
- Developing Spark Applications
- Running Spark Applications
- Spark & Hadoop Integration
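The two ways of running Spark mentioned above (an interactive shell, or submitting a packaged application) can be sketched with the standard Spark CLI tools. The application JAR path, class name, and master setting below are placeholder assumptions for illustration:

```shell
# Interactive mode: launch the Scala shell against a local master
# (use pyspark for the Python shell instead)
spark-shell --master local[2]

# Batch mode: submit a packaged application to the cluster.
# The class name and JAR path are hypothetical examples.
spark-submit \
  --class com.example.MyApp \
  --master local[2] \
  my-spark-app.jar
```

`--master local[2]` runs Spark locally with two worker threads; on a managed cluster you would instead point `--master` at the cluster's resource manager (for example YARN in a CDH deployment).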