Friday, 15 June 2018

Spark SQL to join Flat File and JSON File


Introduction
In this article we join a flat file with a JSON file using Spark SQL. In other words, we are going to join a structured file with a semi-structured file.
Hope it will be interesting.

Flat File and JSON file Meta Data

JSON file structure:

{"empid":101, "name":"Michael", "salary":3000}
{"empid":102, "name":"Andy", "salary":4500}
{"empid":103, "name":"Justin", "salary":3500}
{"empid":104, "name":"Berta", "salary":4000}
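
When Spark reads JSON like this without a schema, it infers whole numbers as LongType, which is why empid is declared as Long later in the article. As a minimal sketch (not part of the original code), the schema can also be declared explicitly. The names employeeSchema, sampleJson and employeeJsonDF are made up for this illustration, the local[*] master is an assumption for running outside spark-shell, and the records are held in memory so that no file is needed.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.{LongType, StringType, StructField, StructType}

// Assumption: a local session for experimenting; in spark-shell `spark` already exists.
val spark = SparkSession.builder().appName("json-schema-sketch").master("local[*]").getOrCreate()
import spark.implicits._

// Explicit schema mirroring the three JSON fields shown above.
val employeeSchema = StructType(Seq(
  StructField("empid", LongType, nullable = true),
  StructField("name", StringType, nullable = true),
  StructField("salary", LongType, nullable = true)
))

// The four sample records, held in memory so the sketch needs no file.
val sampleJson = Seq(
  """{"empid":101, "name":"Michael", "salary":3000}""",
  """{"empid":102, "name":"Andy", "salary":4500}""",
  """{"empid":103, "name":"Justin", "salary":3500}""",
  """{"empid":104, "name":"Berta", "salary":4000}"""
).toDS()

// Read the JSON lines with the declared schema instead of inferring it.
val employeeJsonDF = spark.read.schema(employeeSchema).json(sampleJson)
employeeJsonDF.printSchema()
```

To read from a file instead, the same schema can be passed before calling json with a path in place of the in-memory Dataset.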

Flat File Structure:

101,Tripura
102,West Bengal
103,Bihar
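
Each line of the flat file is an employee id and a state separated by a comma. The sketch below shows one way such a line could be parsed in plain Scala before handing it to Spark. parseStateLine is a hypothetical helper, not something from the original code, and skipping blank or malformed lines is an assumption about the desired behaviour.

```scala
import scala.util.Try

// Same shape as the case class used in the Spark code of this article.
case class Employeestate(empid: Long, state: String)

// Hypothetical helper: parse one "empid,state" line.
// Returns None for blank or malformed lines instead of throwing.
def parseStateLine(line: String): Option[Employeestate] =
  line.split(",", 2).map(_.trim) match {
    case Array(id, state) if state.nonEmpty =>
      Try(id.toLong).toOption.map(Employeestate(_, state))
    case _ => None
  }

// The three sample lines from above plus an empty line.
val sampleLines = Seq("101,Tripura", "102,West Bengal", "103,Bihar", "")
val parsedStates = sampleLines.flatMap(parseStateLine)
```

Splitting with a limit of 2 keeps any further commas inside the state value rather than cutting it off.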

Scala Code

//---------------------------------------
// Scala for SPARK to Read JSON File
// Join with FLAT file
// Creation Date: 05/31/2018
//-----------------------------------------
import org.apache.spark.sql.SparkSession

// One row of the flat file: employee id and state.
// empid is Long to match the type Spark infers for JSON integers.
case class Employeestate(empid: Long, state: String)

//Read JSON File
val spark = SparkSession.builder().appName("Spark SQL basic example").config("spark.some.config.option", "some-value").getOrCreate()
import spark.implicits._
val df = spark.read.json("examples/src/main/resources/empsalarydetails.json")

//Making View for JSON file
df.createOrReplaceTempView("employee")

//Read FLAT File
val employeestateDF = spark.sparkContext.textFile("d:/spark/bin/examples/src/main/resources/employeestate.txt").map(_.split(",")).map(attributes => Employeestate(attributes(0).trim.toLong, attributes(1).trim)).toDF()

//Making View for FLAT file
employeestateDF.createOrReplaceTempView("employeestate")

//Join the two views on empid (no trailing semicolon inside spark.sql)
val employeeDF = spark.sql("SELECT employee.empid, employee.name, employee.salary, employeestate.state FROM employee, employeestate WHERE employee.empid = employeestate.empid")

employeeDF.show
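
The same inner join can also be written with the DataFrame API instead of a SQL string. This is only a sketch of that alternative, not the approach used above. The names employees, states and joined are invented for the example, the rows are copied from the two sample files and built in memory, and the local[*] master is an assumption for running outside spark-shell.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session for experimenting; in spark-shell `spark` already exists.
val spark = SparkSession.builder().appName("join-api-sketch").master("local[*]").getOrCreate()
import spark.implicits._

// Rows copied from the JSON sample.
val employees = Seq(
  (101L, "Michael", 3000L),
  (102L, "Andy", 4500L),
  (103L, "Justin", 3500L),
  (104L, "Berta", 4000L)
).toDF("empid", "name", "salary")

// Rows copied from the flat file sample.
val states = Seq(
  (101L, "Tripura"),
  (102L, "West Bengal"),
  (103L, "Bihar")
).toDF("empid", "state")

// Inner join on the shared empid column; joining on a column name
// keeps a single empid column in the result.
val joined = employees
  .join(states, Seq("empid"), "inner")
  .select("empid", "name", "salary", "state")

joined.orderBy("empid").show()
```

Passing "left_outer" instead of "inner" would keep employee 104 as well, with a null state.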

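Output

The join keeps only the employees present in both files, so the result has one row each for empid 101, 102 and 103 with name, salary and state. Employee 104 (Berta) has no entry in the flat file and does not appear.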

Hope you like it.

Posted By: MR. JOYDEEP DAS
