Monday, July 13, 2020

Load CSV File to DataFrame using Scala in IntelliJ

In this article we will learn how to read a CSV file into a Spark DataFrame using Scala in the IntelliJ IDE.

Below is what the input CSV file looks like.
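The screenshot of the input file is not reproduced in this text version, so as a hypothetical stand-in, a small comma-separated file like the following would work (the names and values are made up):

```text
1,John,Smith,40
2,Maria,Garcia,35
3,Wei,Chen,28
```

Note there is no header row, which matches the plain spark.read.csv call used in the program.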



Next, set up the SBT dependencies in IntelliJ.
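The dependency screenshot is likewise missing here; a build.sbt that would support this program might look like the sketch below. The versions are illustrative assumptions (Spark 2.4.x on Scala 2.11 was common around mid-2020), not taken from the original screenshot:

```scala
name := "SparkDf1"

version := "0.1"

scalaVersion := "2.11.12"

// spark-sql brings in SparkSession and the DataFrame API
// (spark-core comes in transitively).
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.5"
```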



The Scala program is provided below for reference.

import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

object Df1 extends App {

  // Spark configuration: run locally with a single worker thread.
  val conf = new SparkConf()
    .setMaster("local")
    .setAppName("testconf")

  // Entry point for the DataFrame API; getOrCreate reuses an
  // existing session if one is already running.
  val spark = SparkSession.builder()
    .master("local")
    .appName("testsparksession")
    .config(conf = conf)
    .getOrCreate()

  // Enables implicit conversions such as .toDF on Scala collections.
  import spark.implicits._

  // Read the CSV file; with no options, columns are named
  // _c0, _c1, ... and every value is typed as string.
  val df = spark.read.csv("C:\\Kallol\\SparkDf1\\data.txt")

  df.show()
}
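By default, spark.read.csv treats every line as data, names the columns _c0, _c1, and so on, and types everything as string. If the file's first line were a header row instead, a variant like the following would opt in (a sketch reusing the spark session from the program above, so it is not standalone):

```scala
// Sketch: re-read the same file, taking column names from the first
// row and letting Spark sample the data to infer column types.
val dfWithHeader = spark.read
  .option("header", "true")
  .option("inferSchema", "true")
  .csv("C:\\Kallol\\SparkDf1\\data.txt")

dfWithHeader.printSchema()
```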

And this is what the output looks like.



That's all folks! Enjoy!
