Implementing SQL in MySQL

Sammy
4 min readMar 20, 2024

--

Structured query Language (SQL) is a programming language for storing and processing information in database. SQL is frequently used in all types of application. Data analyst and developers use SQL because it can integrate with different programming languages.

This project I got from MySkill Course. This project provide with 7 questions: 1) Is there any null?; 2) See what job titles are there; 3) Any related to data analyst; 4) Average data analyst salary in rupiah; 5) Average data analyst salary based on experience level in rupiah; 6) Countries with attractive salaries for fulltime data analyst positions and experience level entry level and middle level; 7) in what year did the salary increase from mid to senior have the highest increase for work related to full-time data analyst.

I worked on this project using MySQL. This dataset is Data Science Salaries and I got from Kaggle. There are 12 columns in dataset: MyUnknownColumn, work_year, experience_level, employment_type, job_title, salary, salary_currency, salary_in_usd, employee_residence, remote_ratio, company_location, and compay_size. In the experience_level column, there are Entry-level/Junior (EN), Mid-level/Intermediate (MI), Senior-level/Expert (SE), and Executive-level/Director (EX). In the employment_type, there are part-time (PT), full-time (FT), contract-time (CT), and Freelance (FL).

Data Science Salaries Dataset

Is there any null?

Before analyzing the dataset, we have to check whether there is null data in the dataset. To do this I used SELECT * to show all the columns, FROM ds_salaries (dataset), WHERE used to filter records, and OR used to filter records based on more than one condition.

Query 1

The results, there are no null data in dataset.

See what job titles are there?

To see what job titles are there, I used clause SELECT DISTINCT, FROM, ORDER BY. SELECT DISTINCT is used to return only different values. ORDER BY used to sort result-set in ascending or descending order.

Query 2

The results are as follows:

Results 2

Any related to data analyst

To see any related to data analyst, I used SELECT DISTINCT, FROM, WHERE, LIKE, and ORDER BY. LIKE used in WHERE to search for a specified pattern in a column. The percent sign % represents zero, one, or multiple characters.

Query 3

The results are as follows:

Results 3

Average data analyst salary in rupiah

To see average data analyst salary in rupiah, I used SELECT, AVG, FROM, WHERE, and LIKE.

Query 4

The results are as follows:

Results 4

Average data analyst salary based on experience level in rupiah

To see average data analyst salary based on experience level in rupiah, I used SELECT, AVG, FROM, WHERE, GROUP BY, and ORDER BY. GROUP BY used to groups rows that have same values into summary rows.

Query 5

The results are as follows:

Results 5

Countries with attractive salaries for fulltime data analyst positions and experience level entry level and middle level

To see countries with attractive salaries for fulltime data analyst positions and experience level entry level and middle level, I used SELECT, AVG, FROM, WHERE, LIKE, GROUP BY, and HAVING. HAVING used to query because the WHERE cannot be used with aggregate functions.

Query 6

The results are as follows:

Results 6

In what year did the salary increase from mid to senior have the highest increase for work related to full-time data analyst

To see in what year did the salary increase from mid to senior have the highest increase for work related to full-time data analyst, I made 3 tables before I made results tables. The tables are ds_1, ds_2, and t_year.

Query 7.1

The reason I used 3 tables, because I want use to LEFT JOIN ds_1 with ds_2. LEFT JOIN used to returns all records from the left table (ds_1), and the matching records from the right table (ds_2). t_year table used to all record which don’t have matching records.

Query 7.2

The results are as follows:

Results 7

Conclusion

MySQL can be used to processing information in a database. Before we processing information in database, we must make sure the there are no null in dataset because it can influence the results.

--

--

Sammy
Sammy

No responses yet