Saturday, 24 February 2018

Azure Data Lake – U-SQL – Using C# Code Behind


Introduction

In this article we are trying to discuss about Using C# Code behind in U-SQL script. As we look that every U-SQL script has a C# code behind.




Now the question is why we are going to use this C# code behind as we can create function, stored procedure and other thing successfully in U-SQL. The answer is quite simple. If we want to use the power of C# and the entire library related to it, we can use the C# code behind.

 For an example, we need to create a complex scalar value function and using C# it is quite easy by using it’s math library inbuilt functionality.

 

Case Study

 To understand the C# code behind, we are not taking any complex example. Here we have a CSV file, that have “StudentID”, “StidentName” ,“Marks1”, “Marks2” and “Marks3”.

We are going to retrieve information from CSV file and try to put the information to another output CSV file.

We are doing little transformation work by adding “Marks1”, “Marks2” and “Marks3” and give it a “Total Marks”.

We are going to use C# code behind by creating a function name “GetTotalMarks”. It takes three input marks and returns the total of three input marks.



 

C# Code Behind

 

using Microsoft.Analytics.Interfaces;
using Microsoft.Analytics.Types.Sql;
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;

namespace TestApplication
{
    public static class StudentRecor
    {
        public static Double GetTotalMarks(int marks_1, int marks_2, int marks_3)
        {
            return marks_1 + marks_2 + marks_3;
        }
    }

}

 

U-SQL Script


@searchlog =
    EXTRACT StudentID     int,
            StudentName   string,
            Marks1        int,
            Marks2        int,
            Marks3        int
    FROM "C:/Users/Joydeep/AppData/Local/USQLDataRoot/Input-1/StudentRecords.csv"
    USING Extractors.Csv();

@filtering =
    SELECT StudentID,
           StudentName,
           Marks1,
           Marks2,
           Marks3,
           TestApplication.StudentRecor.GetTotalMarks(Marks1, Marks2, Marks3) AS TotalMarks
    FROM @searchlog;

OUTPUT @filtering 
    TO "C:/Users/Joydeep/AppData/Local/USQLDataRoot/output/Output-1/StudentResult.csv"

    USING Outputters.Csv();

 

Please look at the calling of Function in U-SQL code.

It is :

<Name Space Name> . <Class Name>.<Function Name>

Job Graph:



 

Output File

 



 

Hope you like it.

 




Posted by: MR. JOYDEEP DAS


Friday, 23 February 2018

Azure Data Lake – U-SQL – Lab Session -1

Introduction
Here in this article, we are moving with a simple lab session which include all the article that we published before. By this article we get a good concept to work with U-SQL.

Case study
We have four CSV files and one U-SQL Table object named Sales Order

o   Customer.csv
o   ItemGroupMaster.csv
o   ItemMaster.csv
o   ItemStockMaster.csv

Meta data information are given bellow

Sales Order Table
Column Name
Data Type
SrlNo
Int
OrderNumber
String
CustID
String
ItemCode
String
SaleQty
int

ItemMaster.csv
Column Name
Data Type
ItemCode
Int
ItemName
String
GroupCode
Int

ItemGroupMaster.csv
Column Name
Data Type
ItemGroupCode
Int
ItemGroupName
String

Customer.csv
Column Name
Data Type
CustomerCode
String
CustomerName
String

ItemStockMaster.csv
Column Name
Data Type
ItemCode
Int
StockQty
Int

The output we needed by combining all those things are:

Column Name
ItemCode
ItemName
ItemGroupName
StockQty
OrderNumber
CustomerID
CustomerName
SaleQty
RemainingStockQty

U-SQL Statement

USE DATABASE MyUSQLDb
USE SCHEMA AbcPvtLtd;

@itemmaster=
    EXTRACT ItemCode int,
            ItemName string,
            GroupCode   int
    FROM "D:/USQL-Project/Project-1/Input/ItemMaster.csv"
    USING Extractors.Csv();

@joining =
  
    SELECT a.ItemCode,
           a.ItemName,
           b.ItemGroupName,
           c.StockQty,
           d.OrderNumber,
           d.CustID,
           e.CustomerName,
           d.SaleQty,
           (c.StockQty - d.SaleQty) AS stockQty
    FROM
    (
        SELECT ItemCode,
               ItemName,
               GroupCode
        FROM @itemmaster
    ) AS a
    INNER JOIN
    (
    SELECT ItemGroupCode,
           ItemGroupName
    FROM ItemGroupCode
          ) AS b
    ON a.GroupCode == b.ItemGroupCode
    INNER JOIN
    (
    SELECT ItemCode,
           StockQty
    FROM ItemStock
          ) AS c
    ON a.ItemCode == c.ItemCode
    LEFT JOIN
    (
    SELECT OrderNumber,
           CustID,
           ItemCode,
           SUM(SaleQty) AS SaleQty
    FROM salesOrder
    GROUP BY OrderNumber,
             CustID,
             ItemCode
         ) AS d
    ON a.ItemCode == d.ItemCode
    LEFT JOIN
    (
    EXTRACT CustomerId string,
            CustomerName string
    FROM "D:/USQL-Project/Project-1/Input/Customer.csv"
    USING Extractors.Csv()
         ) AS e
    ON d.CustID == e.CustomerId;    


OUTPUT @joining
    TO "D:/USQL-Project/Project-1/Output/Result.csv"
    USING Outputters.Csv();


Hope you like it.





Posted by: MR. JOYDEEP DAS