In this article we are trying to discuss about basic U-SQL query construction Technique. Hope it will be informative.
First U-SQL Project Solution Explorer
View à Solution Explorer
The solution contains single U-SQL project. Here in this example it is “Test” project. Inside that project is the empty Script.usql file.
Notice that we can expand the .usql file node. It contains a C# file. This is called the U-SQL
Code-Behind file. U-SQL Code-behind is a convenience feature of ADLToolsForVS and not a part of the U-SQL language itself.
Input and Output
Compiling U-SQL file required Input. U-SQL script transforms Input into Output. There are two types of Input and Output.
· U-SQL Table
Location of Input and Output
During U-SQL Cloud Execution the inputs/outputs must all be in the cloud - typically this means Azure Data Lake Store.
During U-SQL Local Execution the inputs/outputs must all be on your own box. There's a special name for this location: The U-SQL Local Data Root.
We can find the local data root by going to Tools à Data Lake à Options and Settings. It in the field called DataRoot at the top.
Sample U-SQL Script
This script don’t have any transformation steps.
EXTRACT UserId int,
This script doesn't have any transformation steps. It reads from the source file called SearchLog.tsv, schematizes it, and writes the rowset back into a file called SearchLog-first-u-sql.csv.
Notice the question mark next to the data type in the Duration field. It means that the Duration field could be null.
Here we have to understand Rowset, Extract and Output.
· Rowset variables: Each query expression that produces a rowset can be assigned to a variable. U-SQL follows the T-SQL variable naming pattern (@searchlog, for example) in the script.
· The EXTRACT keyword reads data from a file and defines the schema on read. Extractors.Tsv is a built-in U-SQL extractor for tab-separated-value files. We can develop custom extractors.
· The OUTPUT writes data from a rowset to a file. Outputters.Csv() is a built-in U-SQL outputter to create a comma-separated-value file. You can develop custom outputters.
The EXTRACT and OUTPUT statements use file paths. File paths can be absolute or relative.
Here we can use scalar variable to make the script maintenance easier.
DECLARE @in string = "/Samples/Data/SearchLog.tsv";
DECLARE @out string = "/output/SearchLog-scalar-variables.csv";
EXTRACT UserId int,
Hope you like it.
Posted by: MR. JOYDEEP DAS