spark [2018/11/28 14:20] (current)
kbob01 created
Line 1: Line 1:
 +====== Apache Spark on Mogon ======
  
 +Apache Spark is an open-source framework for big data applications that runs on the Java Virtual Machine.
 +
 +When using it on Mogon, one typically occupies a full node, as Spark requires a lot of resources.
 +
 +Once a node has been allocated, there are two possible use cases.
 +=== Job based usage ===
 +If you have an already packaged application that should be submitted as a job, the following script can be used to start a Scala application (packaged as myJar.jar) whose entry point is the class Main in the package main.
 + 
 +<code bash>
 +#!/bin/bash
 +
 +# load module
 +module load devel/Spark/2.2.0-Hadoop-2.6-Java-1.8.0_162
 +
 +# start application
 +spark-submit --driver-memory 8G --master 'local[*]' --class main.Main myJar.jar
 +
 +</code>
 +
 +
 +The option ''--master local[*]'' runs Spark locally and lets it choose the number of worker threads on its own (one per available core).
 +The option ''--driver-memory'' sets the driver memory; the memory of the workers typically requires no changes.
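If a job does not occupy a full node, it can be useful to pin the number of worker threads instead of relying on ''local[*]''. The following is a minimal sketch, not part of the original setup: it assumes the job runs under Slurm and that ''SLURM_CPUS_PER_TASK'' is set (Slurm exports it only when ''--cpus-per-task'' is requested). The function name ''spark_master'' is hypothetical.

<code bash>
#!/bin/bash

# Hypothetical helper: print a Spark master URL for local mode.
# Assumption: under Slurm, SLURM_CPUS_PER_TASK holds the number of
# cores granted to the task. If it is unset or not a positive
# integer, fall back to local[*] and let Spark decide.
spark_master() {
    local cores="${SLURM_CPUS_PER_TASK:-}"
    if [[ "$cores" =~ ^[1-9][0-9]*$ ]]; then
        printf 'local[%s]\n' "$cores"
    else
        printf 'local[*]\n'
    fi
}

# Print the command that would be submitted (nothing is executed here).
echo spark-submit --driver-memory 8G --master "$(spark_master)" --class main.Main myJar.jar
</code>

With ''SLURM_CPUS_PER_TASK=16'' the function prints ''local[16]''; without it, the result is ''local[*]'', which matches the script above. Quoting ''"$(spark_master)"'' keeps the shell from treating the brackets as a glob pattern.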
 +
 +=== Interactive usage ===
 +If you want to work in a more exploratory manner, you can instead start the Spark shell with ''spark-shell''.
spark.txt · Last modified: 2018/11/28 14:20 by kbob01