Competition - Certification competition

    Show off your SQL expertise by earning a Certification

    📖 Background

    Whether you are working in a data team or you need to use data to support decision making in your job, the chances are you will need to work with SQL. A SQL certification is a great way to demonstrate to employers that you are able to work with data and generate insights.

    🎓 Step 1: Get Certified!

    Paste the link to your newly earned SQL Associate Certification here

    πŸ“ Step 2: Explain your choice of metrics

    Your boss was impressed with the analysis work that you did and shared it with the team leaders. They have no background in working with data and don't understand what you mean by average, or why you picked the approach you did to calculate it.

    Can you explain to the team leaders, in no more than 200 words, how you calculated the average, what an average means and the different approaches to averages, and why you picked this summary?

    There are different types of averages:

    1- Arithmetic Mean: The sum of all values divided by the number of values.

    It is the most commonly used average. For example, suppose we have data on daily temperatures (in degrees Celsius): 20, 22, 23, 24, 25, 30, 35. The mean = (20 + 22 + 23 + 24 + 25 + 30 + 35) / 7 = 179 / 7 ≈ 25.6.

    2- Median: The middle value of a dataset when the data is arranged in ascending or descending order. For our data (20, 22, 23, 24, 25, 30, 35), the median = 24.

    3- Mode: The most frequently occurring value in the data. The mode can be used with categorical data too.

    Suppose we have the Sunday sales amounts of some products: 20, 15, 15, 30, 30, 10, 40, 30. In this case the mode = 30.

    In the temperature example, the arithmetic mean provides a typical temperature for the day or week. However, if there were extreme temperatures, such as a heatwave or a cold spell, the mean might not accurately represent typical conditions. In such cases the median would be more useful. If we had a dataset with repeating values, the mode would be useful in identifying the most frequent one.

    These examples illustrate how different averaging methods can be applied depending on the nature of the data.
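    The three averages described above can be checked with a short Python sketch. It uses only the standard `statistics` module and the temperature and sales figures from the examples; the extra heatwave reading of 60 is a hypothetical value added purely to illustrate how an extreme value affects the mean and the median.

```python
from statistics import mean, median, mode

# Daily temperatures (degrees Celsius) from the example above.
temperatures = [20, 22, 23, 24, 25, 30, 35]
# Sunday product sales amounts from the example above.
sales = [20, 15, 15, 30, 30, 10, 40, 30]

temp_mean = mean(temperatures)      # 179 / 7, roughly 25.57
temp_median = median(temperatures)  # middle (4th) of the 7 sorted values
sales_mode = mode(sales)            # 30 appears three times

# Hypothetical heatwave reading (not part of the original data) to show
# that one extreme value moves the mean much more than the median.
with_heatwave = temperatures + [60]
heat_mean = mean(with_heatwave)      # 239 / 8
heat_median = median(with_heatwave)  # average of the two middle values

print("mean:", round(temp_mean, 2), "median:", temp_median, "mode:", sales_mode)
print("with heatwave -> mean:", heat_mean, "median:", heat_median)
```

    In this sketch the single extreme reading raises the mean by more than four degrees while the median moves by only half a degree, which is why the median is often preferred when outliers are present.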

    An important measure related to the average, and to how the data is distributed around it, is skewness. What is **skewness**? Skewness is a statistical measure that quantifies the asymmetry of the probability distribution of a dataset. It indicates whether the data is distributed symmetrically around the mean or whether it tends to have more extreme values on one side of the mean than on the other. The mathematical formula for skewness is too long to explain here, so we will be satisfied with its general meaning and its relationship to the average. The relationship between skewness, mean and median is as follows:

    1- Positive Skewness: In a positively skewed distribution (also known as right-skewed), the tail of the distribution extends toward the higher values, meaning that there are more extreme values on the right side of the distribution. In this case the mean is typically greater than the median, because the higher extreme values pull the mean toward them while the median is barely affected.

    2- Negative Skewness: In a negatively skewed distribution (also known as left-skewed), the tail of the distribution extends toward the lower values, meaning that there are more extreme values on the left side of the distribution.

    In this case the mean is typically less than the median, because the lower extreme values pull the mean toward them.

    3- Symmetric Distribution: In a symmetric distribution there is no skewness (skewness equals zero), and the mean and median are equal. In this case the data is distributed evenly on both sides of the mean and there are no extreme values pulling the mean toward one side.
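    The relationship between skewness, mean and median can be illustrated with a small sketch. It assumes one common definition of skewness, the moment-based coefficient g1 = m3 / m2^1.5, where m2 and m3 are the second and third central moments; other definitions exist. The `skewness` helper and the three datasets are hypothetical and were invented only for illustration.

```python
from statistics import mean, median

def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (one common definition)."""
    n = len(values)
    avg = mean(values)
    m2 = sum((x - avg) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - avg) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical datasets, invented for illustration only.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]        # long tail toward high values
left_skewed = [1, 17, 18, 18, 18, 19, 19, 20]   # mirror image of the above
symmetric = [1, 2, 3, 4, 5]

for name, data in [("right-skewed", right_skewed),
                   ("left-skewed", left_skewed),
                   ("symmetric", symmetric)]:
    print(name, "skewness:", round(skewness(data), 2),
          "mean:", mean(data), "median:", median(data))
```

    For these made-up datasets the sign of the skewness matches the ordering of mean and median described in the three cases above: positive with the mean above the median, negative with the mean below it, and zero when the two coincide.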

    ✅ Checklist before publishing into the competition

    • Rename your workspace to make it descriptive of your work. N.B. you should leave the notebook name as notebook.ipynb.
    • Remove redundant cells like the judging criteria, so the workbook is focused on your story.
    • Make sure the workbook reads well and addresses the task you were given.

    ⏳ Time is ticking. Good Luck!