Prerequisites
- Java 7 or higher
- Python Interpreter 2.6 or higher
Installation
Installation is very simple: just download the latest version of Spark and unpack it.
wget http://apache.rediris.es/spark/spark-1.5.0/spark-1.5.0-bin-hadoop2.6.tgz
tar -xf spark-1.5.0-bin-hadoop2.6.tgz
Interpreter execution
You can run code either interactively through the PySpark interpreter or by submitting a .py file.
./spark-1.5.0-bin-hadoop2.6/bin/pyspark
# inside the interactive shell a SparkContext is already available as sc;
# in a standalone script you create it yourself:
from pyspark import SparkConf, SparkContext
sc = SparkContext()
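A quick way to check that the shell works is to run a small job; this is just an illustrative sketch with made-up data:
rdd = sc.parallelize([1, 2, 3, 4, 5])
print(rdd.map(lambda x: x * 2).collect())   # prints [2, 4, 6, 8, 10]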
Direct execution
./spark-1.5.0-bin-hadoop2.6/bin/spark-submit file.py
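A minimal file.py that could be run this way might look like the following sketch; the word-count data is made up and only meant to illustrate the structure:
from pyspark import SparkContext

sc = SparkContext()

# simple word count over an in-memory list (illustrative data)
lines = sc.parallelize(["hello spark", "hello world"])
counts = (lines.flatMap(lambda l: l.split())
               .map(lambda w: (w, 1))
               .reduceByKey(lambda a, b: a + b))
print(counts.collect())

sc.stop()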
Use without installation
It is recommended to use Databricks' cloud service; you can register for free on their platform as a user of the “Community Edition”.
To use it:
- Upload or create a file to run.
- Assign a cluster for execution by clicking on the “detached” icon and creating a new cluster (see the sketch after this list). It is recommended to use a low Spark version to ensure compatibility.
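Inside a Databricks notebook the SparkContext is already provided as sc, so a cell only needs to use it; a small illustrative example with made-up data:
# sc is created by Databricks; do not instantiate a new SparkContext
squares = sc.parallelize(range(10)).map(lambda x: x * x)
print(squares.collect())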
Common libraries
#!/usr/bin/env python
from pyspark import SparkConf, SparkContext
sc = SparkContext()
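The SparkConf class imported above is used to name the application or set the master explicitly; a minimal sketch, in which the app name and master URL are placeholders to adjust to your environment:
from pyspark import SparkConf, SparkContext

# placeholder values: change the app name and master URL as needed
conf = SparkConf().setAppName("my-app").setMaster("local[*]")
sc = SparkContext(conf=conf)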