Introduction
Here in this
article we are trying to make SSIS conditional Split Transform by using SPARK
SQL. It is called Split Data Frame using Filter Transform.
Hope it will
be interesting.
What We Want to Do
We have a
Flat File Named Student Marks Details mentioned bellow.
101,Joydeep Das,Math,10
102,Deepasree Das,Math,89
103,Shipra Roy
Chowdhury,Math,100
104,Rajesh Roy,Math,45
105,Sunita Tendon,Math,98
106,Amit Basu,Math,20
107,Debalina
Bhattacharya,Math,99
108,Pritam Das,Math,40
109,Partha Deb Das,Math,89
110,Onkita Gupta,Math,100
The file contains Student Roll, Student Name, Subject and Marks. Based on Marks we need to display records
If [ Marks ] > 50
|
Display Records
|
If [ Marks ] < 50
|
Display Records
|
Scala Code
//---------------------------------------
// Scala for SPARK to Read Flat File
// Make Conditional Split Depends on
Makes (Marks>50)
// Creation Date: 06/172018
//-----------------------------------------
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
import org.apache.spark.sql.Encoder
import spark.implicits._
case class Student(roll: Long, name: String, subject: String, marks: Long)
//Read FLAT File
val studentDF =
spark.sparkContext.textFile("d:/spark/bin/examples/src/main/resources/studentmarksdetails.txt").map(_.split(",")).map(attributes
=> Student(attributes(0).trim.toInt, attributes(1).trim, attributes(2).trim, attributes(3).trim.toInt)).toDF()
studentDF.cache() // recommended to
prevent repeating the calculation
val condition = col("marks") > 50 // Condition
val studentDF1 =
studentDF.filter(condition)
val studentDF2 =
studentDF.filter(not(condition))
//Making View for FLAT file
studentDF1.createOrReplaceTempView("studentrecord1")
studentDF2.createOrReplaceTempView("studentrecord2")
val youngstersDF1 =
spark.sql("SELECT
roll, name, subject, marks FROM studentrecord1")
youngstersDF1.show
val youngstersDF2 =
spark.sql("SELECT
roll, name, subject, marks FROM studentrecord2")
youngstersDF2.show
Output
Hope you like
it.
Posted
by: MR. JOYDEEP DAS
Nice...
ReplyDeleteSimple and productive
ReplyDeletePengertian Correct Score Sbobet atau Tebak Skor adalah salah satu jenis taruhan yang sangat menguntungkan selain mix parlay. Dikarenakan perkalian Odds / Hadiah Taruhannya yang sangat besar sehingga banyak menarik perhatian para pecandu Judi Bola. Tak hanya itu, dari sistem bermain pun sangatlah gampang karena (Baca Selengkapnya Disini...)
ReplyDeleteMicrosoft SQL Server 2019 Standard provides provides additional capability and improvements database features. like SQL Server database engine, SQL Server Analysis Services, SQL Server Machine Learning Services, SQL Server on Linux, and SQL Server Master Data Services. Microsoft SQL Server Standard can build rich content management applications
ReplyDelete