My Technical Diary (George Wen): January 2022

Wednesday, January 12, 2022

spark + jupyter notebook on ubuntu

Step 1: download spark from https://spark.apache.org/downloads.html

Step 2: extract the archive (Step 3 assumes the extracted directory ends up at /opt/spark)

$ tar zxvf ../spark-3.x.x.tar.gz

Step 3: set up bash by adding the following to ~/.bashrc

export SPARK_HOME=/opt/spark
export PATH=$SPARK_HOME/bin:$PATH
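
Before moving on, it can help to confirm that a new shell actually picked up these two lines. The following is a minimal sketch using only the Python standard library; check_spark_env is a hypothetical helper name introduced here for illustration, not part of Spark or Jupyter.

```python
import os
import shutil


def check_spark_env(environ=None):
    """Return a list of problems found in the Spark-related environment."""
    # Hypothetical helper: accepts a dict so it can be checked without
    # touching the real environment; defaults to os.environ.
    environ = os.environ if environ is None else environ
    problems = []

    spark_home = environ.get("SPARK_HOME")
    if not spark_home:
        problems.append("SPARK_HOME is not set")
    elif not os.path.isdir(spark_home):
        problems.append(f"SPARK_HOME points to a missing directory: {spark_home}")

    # shutil.which searches the given PATH string for an executable file.
    if shutil.which("spark-submit", path=environ.get("PATH", "")) is None:
        problems.append("spark-submit was not found on PATH")

    return problems


if __name__ == "__main__":
    issues = check_spark_env()
    print("environment looks fine" if not issues else "\n".join(issues))
```

An empty list means both exports took effect; otherwise each message points at the line in ~/.bashrc that is worth re-checking.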

Step 4: install jupyter notebook:

$ pip install jupyter

Step 5: start Spark (start-all.sh lives in $SPARK_HOME/sbin, which Step 3 does not put on PATH):

$ $SPARK_HOME/sbin/start-all.sh
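
One way to confirm that the standalone master came up is to probe its web UI port. This is a rough sketch, assuming the default web UI port 8080 has not been changed; is_port_open is a hypothetical helper name, and opening http://localhost:8080 in a browser tells you the same thing.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    # 8080 is the default port of the standalone master's web UI
    # (an assumption; adjust if your configuration differs).
    print("master web UI reachable:", is_port_open("localhost", 8080))
```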

Step 6: install findspark package:

$ pip install findspark

Step 7: launch jupyter notebook:

$ jupyter notebook

Step 8: create a new notebook and add the following code for testing:

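import findspark

# Locate the Spark installation (SPARK_HOME from Step 3) and make pyspark importable
findspark.init()

from pyspark.sql import SparkSession

# Reuse an existing session or create a new one
spark = SparkSession.builder.getOrCreate()

# Run a trivial query to confirm that Spark works
df = spark.sql("select 'spark' as hello")
df.show()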
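
The findspark step is needed because pyspark here comes from the Spark download rather than from pip, so Python cannot import it until Spark's bundled libraries are on sys.path. The sketch below approximates what findspark.init() does under that assumption; it is not findspark's actual source, and add_pyspark_to_path is a hypothetical name used only for illustration.

```python
import glob
import os
import sys


def add_pyspark_to_path(spark_home):
    """Approximate findspark.init(): expose Spark's bundled Python libraries."""
    spark_python = os.path.join(spark_home, "python")
    # Spark ships py4j as a zip under python/lib; its version varies by release.
    py4j_zips = glob.glob(os.path.join(spark_python, "lib", "py4j-*.zip"))
    if not py4j_zips:
        raise FileNotFoundError(f"no py4j zip found under {spark_python}/lib")

    os.environ["SPARK_HOME"] = spark_home
    # Prepend only the entries that are not already on sys.path.
    new_entries = [p for p in (spark_python, py4j_zips[0]) if p not in sys.path]
    sys.path[:0] = new_entries
    return new_entries
```

With the layout from Step 3, calling add_pyspark_to_path("/opt/spark") before "from pyspark.sql import SparkSession" should have roughly the same effect as findspark.init().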
