In this article, we will load nested JSON data into a Spark DataFrame using Scala.
The program expects a nested JSON file; its structure is shown below for reference.
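The exact file from the original run is not reproduced here; the sample below is a minimal sketch consistent with the fields the program selects (Id, Name.FirstName, Name.LastName, Age) and with the multiline option, with illustrative values only.

[
  {
    "Id": 1,
    "Name": {
      "FirstName": "John",
      "LastName": "Doe"
    },
    "Age": 30
  },
  {
    "Id": 2,
    "Name": {
      "FirstName": "Jane",
      "LastName": "Smith"
    },
    "Age": 25
  }
]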
The SBT library dependencies are shown below for reference.
scalaVersion := "2.11.12"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.0"
The Scala program is provided below.
import org.apache.spark.sql.SparkSession

object NestedJsonReader extends App {

  // Point Hadoop at a local winutils installation (needed when running Spark on Windows).
  System.setProperty("hadoop.home.dir", "C:\\intellij.winutils")

  // Create a local SparkSession for this small example.
  val spark = SparkSession.builder()
    .master("local")
    .appName("NestedJsonFileReader")
    .getOrCreate()

  // multiline = true lets Spark parse a JSON document that spans several lines.
  val df = spark.read
    .format("json")
    .option("multiline", "true")
    .load("C:\\data\\nested-data.json")

  // Nested fields are selected with dot notation, e.g. Name.FirstName.
  df.select("Id", "Name.FirstName", "Name.LastName", "Age").show()
}
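The original post stops at the plain select, but as a small variation the same flattening can be written with column expressions and explicit aliases, which gives control over the names of the flattened columns. A minimal sketch, assuming the same df as above:

import org.apache.spark.sql.functions.col

// Select the nested fields and give each flattened column an explicit name.
val flattened = df.select(
  col("Id"),
  col("Name.FirstName").alias("FirstName"),
  col("Name.LastName").alias("LastName"),
  col("Age")
)
flattened.show()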
Running the program, the df.select(...).show() call prints a flattened table with columns Id, FirstName, LastName, and Age; Spark names each flattened column after the leaf field.
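For the sample JSON sketched earlier (whose values are assumed, not from the original run), the output would look roughly like this:

+--+---------+--------+---+
|Id|FirstName|LastName|Age|
+--+---------+--------+---+
| 1|     John|     Doe| 30|
| 2|     Jane|   Smith| 25|
+--+---------+--------+---+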
Thanks. That is all for now!