filmov
tv
SQL Server Tutorial : Intermediate SQL Server

Показать описание
---
Welcome to this DataCamp course on Developing SQL for Data Science. My name is Ginger Grant and I will be your instructor for this course. I have been working with Transact SQL, commonly known as T-SQL, the version of SQL used in SQL Server for over 10 years and I have been recognized by Microsoft as a Data Platform MVP and a Certified Trainer. As president of the data analytics consultancy, Desert Isle Group, I provide training and develop solutions using SQL and other tools to help clients find meaning from their data.
In this course we are going to use SQL to analyze data for exploratory data analysis. In the first chapter, we will looking at summary information, commonly used in other languages like R and Python. In Chapter 2 we are going to look at some SQL Functions useful for manipulating data by using math and date functions. Chapter 3 contains techniques for modifying the data, and lastly in Chapter 4 TSQL windowing functions.
Data Scientists spend a lot of time analyzing and aggregating data by generating summary statistics such as the mean, minimum and maximum values of columns.
If your data exists in a database, the fastest way to generate summary statistics is by using SQL.
In the videos of this chapter, we will run queries on the EconomicIndicators table. The data displays by country several interesting statistics like Gross Domestic Product or GDP and Internet Use. A sample of the table is included here which includes different integers, text, floats.
You can calculate common summary stats such as the minimum, maximum, and average of each column by using the MIN, MAX, and AVG functions, respectively.
As you can see on the slide, you need to pass the column name to the relevant functions to calculate the summary statistics.
Note that all the keywords such as SELECT, FROM, and WHERE along with the function names are written in capital letters. T-SQL is by default case insensitive, so you don't have to write them in any particular case, but capitalizing keywords and function names is a standard convention. As you saw in the previous example, these functions by default calculate summary statistics for the entire column. What if you want to calculate these statistics for a specific group, say only for Solomon Islands? You can do this by adding the WHERE clause as shown here.
In the previous example, we aggregated for the entire table. Here you see that to calculate these stats for each country, you will have to use the GROUP BY clause as shown here.
Now if you want to take this one step further by returning only the countries where InternetUse is greater than 100, you can't use the WHERE clause.
When you want to filter on calculations which contain GROUP BY, you can use the HAVING clause to filter rows.
Here is a complete example using the HAVING Clause. We are filtering the query to show only countries having InternetUse greater than 100 hours a month.
In the following exercises, you will practice calculating summary stats using data gathered from MUFON, the Mutual UFO Network on all unexplained sitings all over the world in the Incidents table.
Now let's try some examples
#DataCamp #SQLServerTutorial #IntermediateSQLServer #SQL