Friday, 15 June 2018

Spark SQL to join Flat File and JSON File


Introduction
In this article we join a flat file with a JSON file using Spark SQL. In other words, we are going to join a structured file with a semi-structured file.
I hope you find it interesting.

Flat File and JSON file Meta Data

JSON file structure:

{"empid":101, "name":"Michael", "salary":3000}
{"empid":102, "name":"Andy", "salary":4500}
{"empid":103, "name":"Justin", "salary":3500}
{"empid":104, "name":"Berta", "salary":4000}

Flat File Structure:

101,Tripura
102,West Bengal
103,Bihar
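
Before bringing Spark in, it can help to see which rows the join should return. The following is a minimal sketch in plain Scala (no Spark), assuming the first flat-file column is the employee id and the second is the state, as the case class in the code further down suggests. The names `employees`, `states` and `joined` are only illustrative.

```scala
// Sample rows copied from the JSON file: (empid, name, salary)
val employees = Seq(
  (101L, "Michael", 3000L),
  (102L, "Andy", 4500L),
  (103L, "Justin", 3500L),
  (104L, "Berta", 4000L)
)

// Sample rows copied from the flat file: empid -> state
val states = Map(101L -> "Tripura", 102L -> "West Bengal", 103L -> "Bihar")

// Inner join on empid: an employee is kept only if the flat file has a matching id
val joined = employees.flatMap { case (empid, name, salary) =>
  states.get(empid).map(state => (empid, name, salary, state))
}

// Employee 104 (Berta) has no state row, so it does not appear
joined.foreach(println)
```

An inner join therefore yields three rows here, one for each id present in both files.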

Scala Code

//---------------------------------------
// Scala for SPARK to Read JSON File
// Join with FLAT file
// Creation Date: 05/31/2018
//-----------------------------------------
import org.apache.spark.sql.SparkSession

// One row of the flat file: employee id and state
case class Employeestate(empid: Long, state: String)

//Read JSON File
val spark = SparkSession.builder().appName("Spark SQL basic example").config("spark.some.config.option", "some-value").getOrCreate()
// Needed for toDF(); can only be imported after spark is defined
import spark.implicits._
val df = spark.read.json("examples/src/main/resources/empsalarydetails.json")

//Making View for JSON file
df.createOrReplaceTempView("employee")

//Read FLAT File
val employeestateDF = spark.sparkContext.textFile("d:/spark/bin/examples/src/main/resources/employeestate.txt").map(_.split(",")).map(attributes => Employeestate(attributes(0).trim.toLong, attributes(1).trim)).toDF()

//Making View for FLAT file
employeestateDF.createOrReplaceTempView("employeestate")

//Join the two views on empid (no trailing semicolon inside the SQL string)
val employeeDF = spark.sql("SELECT employee.empid, employee.name, employee.salary, employeestate.state FROM employee, employeestate WHERE employee.empid = employeestate.empid")

employeeDF.show
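
The same join can also be written with the DataFrame API instead of a SQL string. The sketch below is one possible variant, not the approach used above: it keeps the sample rows in memory so it runs without the two files, reads the flat data with Spark's CSV reader and an explicit schema, and uses a left outer join so that employee 104 is kept with a null state. The `local[*]` master, the application name, and the names `jsonLines`, `flatLines`, `stateSchema`, `employees`, `states` and `joined` are assumptions made for illustration, and reading from a `Dataset[String]` assumes Spark 2.2 or later.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{LongType, StringType, StructField, StructType}

// local[*] is an assumption so the sketch can run on a single machine
val spark = SparkSession.builder().appName("Join flat file and JSON").master("local[*]").getOrCreate()
import spark.implicits._

// Same sample rows as in the article, held in memory instead of files
val jsonLines = Seq(
  """{"empid":101, "name":"Michael", "salary":3000}""",
  """{"empid":102, "name":"Andy", "salary":4500}""",
  """{"empid":103, "name":"Justin", "salary":3500}""",
  """{"empid":104, "name":"Berta", "salary":4000}"""
).toDS()
val flatLines = Seq("101,Tripura", "102,West Bengal", "103,Bihar").toDS()

// JSON: Spark infers the columns empid, name and salary
val employees = spark.read.json(jsonLines)

// Flat data: the file has no header, so the column names and types are given explicitly
val stateSchema = StructType(Seq(
  StructField("empid", LongType, nullable = true),
  StructField("state", StringType, nullable = true)
))
val states = spark.read.schema(stateSchema).csv(flatLines)

// Left outer join on empid: every employee is kept, state is null when there is no match
val joined = employees.join(states, Seq("empid"), "left_outer")

joined.orderBy("empid").show()
```

Changing `"left_outer"` to `"inner"` gives the same rows as the SQL query above. Joining with `Seq("empid")` also keeps a single `empid` column in the result, so there is no need to qualify it with a table name.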

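Output

The query is an inner join, so the result contains the three employees that have a state record (101, 102 and 103). Employee 104 (Berta) has no matching row in the flat file and is left out.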



Hope you like it.

Posted By: MR. JOYDEEP DAS
