In this article, we will load XML data into a Spark DataFrame using Scala.
The XML file is provided below for reference.
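The file contents do not appear here, so the following is only a sketch of what such a file might look like, inferred from the rowTag value ("person") and the column names in the output (Age, Id, Name). The root element name `persons`, the object name `SampleXmlWriter`, and the default target path are assumptions made for illustration; the values are copied from the output table.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Paths}

object SampleXmlWriter {
  // Hypothetical reconstruction of data.xml: one <person> element per row,
  // with Id, Name and Age as child elements. The <persons> root is assumed.
  val sampleXml: String =
    """<?xml version="1.0" encoding="UTF-8"?>
      |<persons>
      |  <person>
      |    <Id>1</Id>
      |    <Name>Name-1</Name>
      |    <Age>53</Age>
      |  </person>
      |  <person>
      |    <Id>2</Id>
      |    <Name>Name-2</Name>
      |    <Age>52</Age>
      |  </person>
      |  <person>
      |    <Id>3</Id>
      |    <Name>Name-3</Name>
      |    <Age>23</Age>
      |  </person>
      |</persons>
      |""".stripMargin

  // Writes the sample document to the given path as UTF-8.
  def write(path: String): Unit = {
    Files.write(Paths.get(path), sampleXml.getBytes(StandardCharsets.UTF_8))
    ()
  }

  def main(args: Array[String]): Unit = {
    // Target path is an assumption; pass a different one as the first argument.
    val target = args.headOption.getOrElse("data.xml")
    write(target)
  }
}
```

Running this once creates a small file that can be pointed to from the load(...) call in the reader program.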
The SBT library dependencies are shown below for reference.
scalaVersion := "2.11.12"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.3.0"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.3.0"
libraryDependencies += "com.databricks" %% "spark-xml" % "0.6.0"
The Scala program is provided below.
import org.apache.spark.sql.SparkSession

object XMLReader extends App {
  // On Windows, point Hadoop at the directory that contains winutils.
  System.setProperty("hadoop.home.dir", "C:\\intellij.winutils")

  // Start a local Spark session.
  val spark = SparkSession.builder()
    .master("local")
    .appName("XMLFileReader")
    .getOrCreate()

  // Read the XML file; each <person> element becomes one row.
  val df = spark.read
    .format("xml")
    .option("rowTag", "person")
    .load("C:\\data\\data.xml")

  df.show()
}
Here is the output after running the program.
+---+---+------+
|Age| Id|  Name|
+---+---+------+
| 53|  1|Name-1|
| 52|  2|Name-2|
| 23|  3|Name-3|
+---+---+------+
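The program above lets spark-xml infer the column types from the data. As a variation, the schema can be supplied explicitly through DataFrameReader.schema, which skips inference. The sketch below is not part of the original program: the object name `XMLReaderWithSchema` and the chosen column types (integers for Age and Id, string for Name) are assumptions based only on the output shown above.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{IntegerType, StringType, StructField, StructType}

object XMLReaderWithSchema {
  // Assumed schema: column names come from the output table,
  // the types are a guess from the values shown there.
  val personSchema: StructType = StructType(Seq(
    StructField("Age", IntegerType, nullable = true),
    StructField("Id", IntegerType, nullable = true),
    StructField("Name", StringType, nullable = true)
  ))

  def main(args: Array[String]): Unit = {
    // Same input path as the original program unless one is passed in.
    val path = args.headOption.getOrElse("C:\\data\\data.xml")

    // On Windows, hadoop.home.dir may need to be set as in the original program.
    val spark = SparkSession.builder()
      .master("local")
      .appName("XMLFileReaderWithSchema")
      .getOrCreate()

    try {
      val df = spark.read
        .format("xml")
        .option("rowTag", "person")
        .schema(personSchema) // use the declared schema instead of inferring one
        .load(path)

      df.printSchema()
      df.show()
    } finally {
      spark.stop() // release the local Spark resources
    }
  }
}
```

Declaring the schema up front also makes the column types stable if a later file contains values that would be inferred differently.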
Thanks. That is all for now!