SQL Interview Questions And Answers
This blog covers the most frequently asked SQL interview questions and answers. It is relevant to roles such as SQL Developer, Data Analyst, Business Analyst, and Data Scientist.
What is SQL and what is it used for? |
Sol. SQL (Structured Query Language) is a programming language used to manage and manipulate relational databases. It is used to retrieve and update data in databases, create and modify database structures, and perform administrative tasks such as granting access to databases and setting user permissions. |
What are the different types of SQL commands? |
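Sol. The different types of SQL commands are:
DDL (Data Definition Language): defines and changes database structures, e.g. CREATE, ALTER, DROP, TRUNCATE.
DML (Data Manipulation Language): changes the data itself, e.g. INSERT, UPDATE, DELETE.
DQL (Data Query Language): retrieves data, i.e. SELECT.
DCL (Data Control Language): manages permissions, e.g. GRANT, REVOKE.
TCL (Transaction Control Language): manages transactions, e.g. COMMIT, ROLLBACK, SAVEPOINT. |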
What is the difference between a join and a subquery? |
Sol. A join is used to combine data from two or more tables based on a related column. A subquery is a query nested inside another query that returns a result set used in the main query. The main difference between the two is that a join combines columns from different tables into a single result set, while a subquery computes an intermediate result (a single value, a list of values, or a table) that the outer query uses, most often to filter rows. Many subqueries can be rewritten as joins and vice versa. |
What is the difference between a left join and a right join? |
Sol. A left join returns all the rows from the left table and the matching rows from the right table, while a right join returns all the rows from the right table and the matching rows from the left table. In other words, a left join includes all the rows from the left table, even if there are no matches in the right table, while a right join includes all the rows from the right table, even if there are no matches in the left table. |
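The behaviour described above can be sketched with Python's built-in sqlite3 module and an in-memory database. The customers and orders tables and their contents are hypothetical, made up for illustration. Because older SQLite versions do not support RIGHT JOIN, the second query swaps the table order in a LEFT JOIN, which yields the same rows a right join would.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben');
    INSERT INTO orders VALUES (10, 1, 250.0);
""")

# LEFT JOIN keeps every customer; order columns are NULL (None) when nothing matches.
left_rows = conn.execute("""
    SELECT c.name, o.amount
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.id
""").fetchall()

# Swapping the tables keeps every order instead, like
# "customers RIGHT JOIN orders" would.
right_equivalent_rows = conn.execute("""
    SELECT c.name, o.amount
    FROM orders AS o
    LEFT JOIN customers AS c ON c.id = o.customer_id
    ORDER BY o.id
""").fetchall()

print(left_rows)
print(right_equivalent_rows)
```

Ben has no orders, so he appears only in the first result, with None in place of an amount.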
What is a self-join? |
Sol. A self join is a type of join where a table is joined with itself based on a related column. This is useful when data in the same table needs to be compared or when the data needs to be organized in a hierarchical manner. |
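As a small illustration of the hierarchical case, the sketch below joins a hypothetical employees table to itself to pair each employee with their manager. It assumes Python's built-in sqlite3 module; the table, columns, and names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO employees VALUES (1, 'Priya', NULL), (2, 'Omar', 1), (3, 'Lena', 1);
""")

# The same table appears twice under different aliases:
# e is the employee row, m is the row of that employee's manager.
pairs = conn.execute("""
    SELECT e.name AS employee, m.name AS manager
    FROM employees AS e
    JOIN employees AS m ON e.manager_id = m.id
    ORDER BY e.id
""").fetchall()

print(pairs)
```

Priya has no manager, so the inner join leaves her out; a LEFT JOIN would keep her with a NULL manager.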
What is a view in SQL? |
Sol. A view is a virtual table created by a query that can be used like a regular table. It contains data from one or more tables and presents it in a customized way. Views can be used to simplify complex queries, restrict access to sensitive data, and provide a consistent view of data across different applications. |
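The access-restriction use case can be sketched as follows, assuming Python's built-in sqlite3 module. The employees table and the employee_directory view are hypothetical names chosen for the example; the view exposes everything except the salary column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, department TEXT, salary REAL);
    INSERT INTO employees VALUES (1, 'Priya', 'Sales', 70000), (2, 'Omar', 'IT', 82000);

    -- The view stores only the query, not a copy of the data.
    CREATE VIEW employee_directory AS
        SELECT id, name, department FROM employees;
""")

# The view is queried exactly like a table.
cursor = conn.execute("SELECT * FROM employee_directory ORDER BY id")
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()

print(columns)
print(rows)
```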
What is the difference between a clustered and a non-clustered index? |
Sol. A clustered index determines the physical order of data in a table based on the values of one or more columns. This allows for faster retrieval of data that is stored in the same order as the index. A non-clustered index is a separate data structure that provides a quick lookup of data based on the values of one or more columns, but does not determine the physical order of data in the table. |
What is a primary key in SQL? |
Sol. A primary key is a column or combination of columns in a table that uniquely identifies each row. It is used to enforce data integrity and ensure that there are no duplicate rows in the table. A primary key can be defined when the table is created or added later using an ALTER TABLE statement. |
What is a foreign key in SQL? |
Sol. A foreign key is a column or combination of columns in one table that refers to the primary key of another table. It is used to enforce referential integrity between the two tables, ensuring that data in one table is consistent with data in the other table. A foreign key can be defined when the table is created or added later using an ALTER TABLE statement. |
What is normalization and why is it important? |
Sol. Normalization is the process of organizing data in a database to reduce redundancy and undesirable dependencies between columns. It involves breaking down large tables into smaller, more focused ones and establishing relationships between them. Normalization helps to avoid the data inconsistencies and update anomalies that can occur when the same data is duplicated in several places. It can also reduce storage requirements and make writes cheaper, although reads may need more joins. Normalization is important because it helps to ensure that data is accurate, consistent, and easy to maintain. |
What is denormalization and when is it useful? |
Sol. Denormalization is the process of intentionally adding redundancy to a database in order to improve performance. It involves duplicating data across multiple tables or adding extra columns to existing tables. Denormalization is useful when performance is a critical factor and when the data in the database is read frequently but updated infrequently. It can help to reduce the number of joins required to retrieve data and improve query performance. However, denormalization can also lead to data inconsistencies and update anomalies, so it should be used judiciously and only when necessary. |
What is a trigger in SQL? |
Sol. A trigger in SQL is a set of actions that are automatically performed in response to certain database events, such as inserts, updates, or deletions. Triggers are often used to enforce business rules or data integrity constraints, such as checking the validity of input data or updating related tables when data is modified. Depending on the database system, triggers can be defined to execute before, after, or instead of an event, and can perform a wide variety of actions, including modifying data, writing audit records, or invoking other procedures. |
What is a stored procedure in SQL? |
Sol. A stored procedure in SQL is a named set of SQL statements that is stored in the database and can be executed by calling the procedure name. Many database systems compile the procedure or cache its execution plan so that repeated calls are cheaper. Stored procedures can be used to perform complex operations on data, such as data validation, data aggregation, or data transformation. They can also improve performance by reducing the amount of data that needs to be sent between the database server and client applications. Stored procedures are often used in enterprise-level applications to keep database logic consistent and maintainable. |
What is a transaction in SQL? |
Sol. A transaction in SQL is a sequence of one or more SQL statements that are executed as a single unit of work. Transactions are used to ensure that database operations are completed successfully or rolled back if an error occurs. Transactions ensure the consistency and reliability of the data by allowing multiple statements to be executed as a single atomic operation. Transactions are typically used when updating data in a database, ensuring that all changes are made or none at all. |
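The all-or-nothing behaviour can be sketched with Python's built-in sqlite3 module. The accounts table, the CHECK constraint, and the transfer helper are hypothetical, introduced only for this illustration. The connection is used as a context manager, which commits if the block succeeds and rolls back if an exception escapes it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (
        id INTEGER PRIMARY KEY,
        balance INTEGER CHECK (balance >= 0)
    );
    INSERT INTO accounts VALUES (1, 100), (2, 50);
""")

def transfer(conn, source_id, target_id, amount):
    """Move money between accounts; both updates apply or neither does."""
    try:
        with conn:  # commit on success, rollback if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, target_id),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, source_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit was rolled back too.
        return False

ok = transfer(conn, 1, 2, 30)       # enough funds: committed
failed = transfer(conn, 1, 2, 500)  # overdraft: rolled back
balances = conn.execute("SELECT id, balance FROM accounts ORDER BY id").fetchall()

print(ok, failed, balances)
```

In the failed transfer the credit to account 2 runs first and succeeds on its own, yet it does not survive, because the whole transaction is rolled back when the debit fails.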
What is the difference between a function and a stored procedure? |
Sol. A function in SQL is a named set of instructions that performs a specific task and returns a value. A stored procedure, on the other hand, is a named set of instructions that can perform a wide range of tasks, including modifying data, performing calculations, or invoking other procedures. The key difference between the two is that functions must return a value, while stored procedures do not have to. Functions can be used inside SQL expressions, for example in a SELECT list or WHERE clause, whereas stored procedures are invoked with a separate statement such as CALL or EXECUTE. |
What is a scalar function in SQL? |
Sol. A scalar function in SQL is a function that operates on a single input value (per row) and returns a single value, such as a string or a number. Scalar functions can be used in SQL queries to perform calculations, manipulate strings, or perform other operations on data. Examples of scalar functions include UPPER, LOWER, ROUND, and ABS. SUM, AVG, and COUNT are not scalar functions; they are aggregate functions. |
What is an aggregate function in SQL? |
Sol. An aggregate function in SQL is a function that performs a calculation on a set of values and returns a single value. Examples of aggregate functions include SUM, AVG, and COUNT. |
What is a window function in SQL? |
Sol. A window function in SQL is a function that performs a calculation across a set of rows that are related to the current row. Unlike aggregate functions used with GROUP BY, window functions do not collapse rows into groups; every input row is kept, and the value is calculated over a window of rows defined with the OVER() clause (optionally using PARTITION BY and ORDER BY). This makes it possible to use data from adjacent or non-adjacent rows without grouping the data or writing a subquery. Examples of window functions include ROW_NUMBER(), RANK(), DENSE_RANK(), NTILE(), LAG(), and LEAD(); aggregate functions such as SUM and AVG can also be used with OVER(). Typical uses are ranking rows, calculating cumulative totals, and calculating moving averages. Window functions are supported by most popular relational database management systems, including SQL Server, Oracle, PostgreSQL, and MySQL (from version 8.0). |
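A cumulative total and a look-back at the previous row can be sketched as follows. This assumes Python's built-in sqlite3 module linked against SQLite 3.25 or later (the first version with window functions); the sales table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, sale_day INTEGER, amount INTEGER);
    INSERT INTO sales VALUES
        ('North', 1, 100), ('North', 2, 150),
        ('South', 1, 80),  ('South', 2, 40);
""")

# Each input row is kept; the window is "rows of the same region, ordered by day".
rows = conn.execute("""
    SELECT region, sale_day, amount,
           SUM(amount) OVER (PARTITION BY region ORDER BY sale_day) AS running_total,
           LAG(amount) OVER (PARTITION BY region ORDER BY sale_day) AS previous_amount
    FROM sales
    ORDER BY region, sale_day
""").fetchall()

for row in rows:
    print(row)
```

The first day in each region has no previous row, so LAG returns NULL there.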
What is a subquery? |
Sol. A subquery in SQL is a query that is nested inside another query. The result of the inner query is used as the input for the outer query. Subqueries can be used to filter, sort, or aggregate data and are a powerful tool for building complex queries. |
What is a correlated subquery? |
Sol. A correlated subquery in SQL is a subquery that refers to a column from the outer query. The subquery is executed for each row returned by the outer query and uses the value of the correlated column to filter the results. |
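A common interview example is finding employees who earn more than the average of their own department. The sketch below assumes Python's built-in sqlite3 module and a hypothetical employees table; the inner query refers to e.department from the outer query, which is what makes it correlated.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (name TEXT, department TEXT, salary INTEGER);
    INSERT INTO employees VALUES
        ('Priya', 'Sales', 70000), ('Omar', 'Sales', 50000),
        ('Lena', 'IT', 90000),     ('Ravi', 'IT', 95000);
""")

# The subquery is evaluated against the department of each outer row.
above_average = conn.execute("""
    SELECT e.name, e.department
    FROM employees AS e
    WHERE e.salary > (
        SELECT AVG(d.salary)
        FROM employees AS d
        WHERE d.department = e.department
    )
    ORDER BY e.name
""").fetchall()

print(above_average)
```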
What is a common table expression (CTE)? |
Sol. A common table expression (CTE) in SQL is a named, temporary result set that is defined with the WITH clause and exists only for the duration of a single SELECT, INSERT, UPDATE, or DELETE statement. CTEs are useful for complex queries where multiple subqueries would otherwise be needed, because each intermediate step gets a name and the statement becomes easier to read. |
What is a recursive CTE? |
Sol. A recursive CTE in SQL is a CTE that references itself within the query definition. Recursive CTEs are used to traverse hierarchical data structures such as organizational charts, file systems, or product categories. |
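Walking an organizational chart can be sketched as follows, assuming Python's built-in sqlite3 module. The employees table and the chain CTE name are hypothetical. The first SELECT is the anchor (the top of the hierarchy) and the second SELECT is the recursive step that joins back to the CTE itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO employees VALUES
        (1, 'Priya', NULL), (2, 'Omar', 1), (3, 'Lena', 2), (4, 'Ravi', 1);
""")

org_chart = conn.execute("""
    WITH RECURSIVE chain(id, name, depth) AS (
        -- Anchor member: employees with no manager.
        SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
        UNION ALL
        -- Recursive member: direct reports of rows already in the chain.
        SELECT e.id, e.name, chain.depth + 1
        FROM employees AS e
        JOIN chain ON e.manager_id = chain.id
    )
    SELECT name, depth FROM chain ORDER BY depth, id
""").fetchall()

print(org_chart)
```

The recursion stops on its own once a step produces no new rows.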
What is a temporary table? |
Sol. A temporary table in SQL is a table that is created for a specific purpose and dropped automatically when it is no longer needed, typically at the end of the session. Temporary tables are often used to store intermediate results during complex queries or to perform batch operations on large datasets. In SQL Server they are created in the tempdb database, and a local temporary table (prefixed with #) can be accessed only by the session that created it. |
What is the difference between a temporary table and a table variable? |
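Sol. In SQL Server, both temporary tables and table variables are stored in tempdb; it is a common misconception that table variables live only in memory. The main differences are scope and capabilities. A table variable exists only for the duration of the batch, function, or procedure in which it is declared, while a temporary table lasts for the session (or the procedure that created it). Temporary tables support explicit indexes, ALTER TABLE, and column statistics, and global temporary tables (prefixed with ##) can be shared among sessions. Table variables support only the constraints and indexes declared inline in their definition, have no statistics, and are not affected by transaction rollbacks. |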
What is a dynamic SQL? |
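Sol. Dynamic SQL is a method of constructing SQL statements as strings at runtime, rather than having them hard-coded in a query. This allows for more flexibility in queries that need to change based on user input or other factors. In SQL Server, dynamic SQL can be executed using the EXECUTE statement or the sp_executesql system stored procedure; sp_executesql is generally preferred because it accepts parameters, which helps with plan reuse and protects against SQL injection. |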
What is the difference between UNION and UNION ALL? |
Sol. The main difference between UNION and UNION ALL in SQL is that UNION removes duplicates from the result set, while UNION ALL does not. UNION combines the results of two or more SELECT statements into a single result set, while UNION ALL simply concatenates the result sets without removing duplicates. |
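The difference is easy to see on two small lists that share one value. The sketch assumes Python's built-in sqlite3 module; the online_customers and store_customers tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE online_customers (email TEXT);
    CREATE TABLE store_customers (email TEXT);
    INSERT INTO online_customers VALUES ('a@example.com'), ('b@example.com');
    INSERT INTO store_customers VALUES ('b@example.com'), ('c@example.com');
""")

# UNION removes the duplicate 'b@example.com'.
union_rows = conn.execute("""
    SELECT email FROM online_customers
    UNION
    SELECT email FROM store_customers
    ORDER BY email
""").fetchall()

# UNION ALL keeps every row from both inputs.
union_all_rows = conn.execute("""
    SELECT email FROM online_customers
    UNION ALL
    SELECT email FROM store_customers
    ORDER BY email
""").fetchall()

print(len(union_rows), len(union_all_rows))
```

Because UNION ALL skips the duplicate check, it is usually the cheaper choice when duplicates are impossible or acceptable.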
What is the difference between GROUP BY and ORDER BY? |
Sol. GROUP BY is a clause in SQL that is used to group rows with identical values in a specified column or set of columns. ORDER BY is a clause in SQL that is used to sort the result set in ascending or descending order based on one or more columns. While both GROUP BY and ORDER BY can be used to organize query results, GROUP BY is used to group data while ORDER BY is used to sort data. |
What is the difference between COUNT(*) and COUNT(column_name)? |
Sol. COUNT(*) and COUNT(column_name) are both aggregate functions in SQL that count rows in a result set. However, COUNT(*) returns the total number of rows, including rows that contain NULLs, while COUNT(column_name) returns only the number of rows in which the specified column is not NULL. |
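A minimal sketch of the NULL handling, assuming Python's built-in sqlite3 module and a hypothetical contacts table in which one phone number is missing:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contacts (name TEXT, phone TEXT);
    INSERT INTO contacts VALUES
        ('Asha', '555-0100'), ('Ben', NULL), ('Chen', '555-0102');
""")

# COUNT(*) counts rows; COUNT(phone) skips the row where phone is NULL.
total_rows, rows_with_phone = conn.execute(
    "SELECT COUNT(*), COUNT(phone) FROM contacts"
).fetchone()

print(total_rows, rows_with_phone)
```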
What is the difference between DISTINCT and GROUP BY? |
Sol. DISTINCT and GROUP BY can both be used to remove duplicates from a result set in SQL. DISTINCT applies to the whole SELECT list and returns each unique combination of the selected columns once. GROUP BY groups rows based on one or more columns and returns one row per group. The practical difference is that GROUP BY can be combined with aggregate functions (such as SUM or COUNT) to perform calculations on each group, while DISTINCT only removes duplicate rows. |
What is the difference between INNER JOIN and OUTER JOIN? |
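Sol. The main difference between INNER JOIN and OUTER JOIN is that INNER JOIN returns only the rows that have a match in both tables, while OUTER JOIN also returns unmatched rows: from the left table (LEFT OUTER JOIN), from the right table (RIGHT OUTER JOIN), or from both (FULL OUTER JOIN). The columns of the table with no match are filled with NULL. |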
What is a full outer join? |
Sol. A full outer join returns all the rows from both tables and matches the rows that have common values in both tables. Where a row has no match in the other table, the columns from that other table are returned as NULL. |
What is the difference between a left outer join and a right outer join? |
Sol. The difference between a left outer join and a right outer join is the table from which all the rows are returned. In a left outer join, all the rows from the left table are returned along with the matching rows from the right table, and in a right outer join, all the rows from the right table are returned along with the matching rows from the left table. |
What is a cross-join? |
Sol. A cross join, also known as a Cartesian product, returns all the possible combinations of rows between two tables. It does not use any join condition to match the rows between the tables. |
What is a pivot table? |
Sol. A pivot table is a type of summary table that is used to aggregate, sort, and filter data in a tabular format. It allows you to group data by one or more columns and perform calculations on other columns. The resulting table displays the summarized data in a matrix format, where each row represents a unique combination of the grouping columns, and each column represents a different calculation. |
What is the difference between a subquery and a derived table? |
Sol. A subquery is a query that is embedded within another query and is used to retrieve data needed for the main query. It is also known as a nested query, and it can appear in the SELECT list, the WHERE or HAVING clause, or the FROM clause. A derived table is the special case of a subquery placed in the FROM clause and given an alias; the outer query then treats its result like a table for the duration of the statement. In other words, every derived table is a subquery, but a subquery used in a WHERE clause typically filters rows, while a derived table acts as a data source that the outer query can select from and join to. |
What is a correlated subquery? |
Sol. A correlated subquery is a type of subquery that is used to reference a column from the outer query in the subquery. This allows the subquery to be executed multiple times, once for each row in the outer query. The result of the subquery is then used by the outer query to filter or modify the data returned. |
What is a natural join? |
Sol. A natural join is a type of join in SQL where the join condition is based on the common column names between two tables. In other words, it automatically matches columns with the same name in both tables and returns the rows where the values in these columns match. This type of join eliminates the need to specify the join condition explicitly. |
What is the difference between a primary key and a unique key? |
Sol. Both primary keys and unique keys are used to enforce data integrity by preventing duplicate values. A primary key is a column or combination of columns that uniquely identifies each row in a table; it cannot contain NULLs, and it is the usual target of foreign keys from other tables. A table can have only one primary key. A unique key is a column or combination of columns that must contain unique values but may allow NULLs. It can also be referenced by a foreign key, although this is less common. A table can have multiple unique keys. |
What is the difference between a transaction and a batch? |
Sol. A transaction is a sequence of one or more SQL statements that are executed as a single unit of work. Either all the statements are executed successfully, or none of them are. Transactions are used to ensure data consistency and integrity in a database. A batch is a set of SQL statements that are executed together as a group. The statements in a batch can be executed individually, but they are typically executed as a unit. Batches are used to improve performance by reducing network traffic and overhead. |
What is the difference between a delete and a truncate statement? |
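Sol. Both delete and truncate statements are used to remove data from a table, but there are some differences. A delete statement removes specific rows from a table based on a condition (or all rows if no condition is given), while a truncate statement always removes all the rows. Delete statements can be rolled back inside a transaction. Whether truncate can be rolled back depends on the database system: SQL Server and PostgreSQL allow it inside a transaction, while MySQL and Oracle commit it implicitly. Delete statements are usually slower than truncate statements for large tables. |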
What is a case statement in SQL? |
Sol. A case statement in SQL is used to perform conditional logic in a query. It allows you to compare a value or expression with one or more conditions, and then return a result based on the first condition that is true. The syntax of a case statement is similar to that of a switch statement in programming languages. It is often used to perform calculations or to create new columns based on existing data. |
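Creating a new column from existing data can be sketched as follows, assuming Python's built-in sqlite3 module. The orders table and the size bands are hypothetical. The conditions are checked top to bottom and the first one that is true wins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, amount INTEGER);
    INSERT INTO orders VALUES (1, 40), (2, 250), (3, 1200);
""")

labelled = conn.execute("""
    SELECT id,
           CASE
               WHEN amount >= 1000 THEN 'large'
               WHEN amount >= 100  THEN 'medium'
               ELSE 'small'
           END AS size_band
    FROM orders
    ORDER BY id
""").fetchall()

print(labelled)
```

Without the ELSE branch, rows that match no condition would get NULL.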
What is the difference between a clustered and a non-clustered index? |
Sol. A clustered index determines the physical order of the data rows in a table, based on the indexed column. Only one clustered index can be created per table. A non-clustered index is a separate structure from the data rows that stores the indexed column values and a pointer to the corresponding data rows. |
What is the difference between an inner join and a self join? |
Sol. An inner join combines data from two different tables based on a matching condition, whereas a self join is used to combine data from a single table based on a matching condition between two or more rows within the same table. |
What is the difference between a composite key and a clustered index? |
Sol. A composite key is a combination of two or more columns that uniquely identify each row in a table. A clustered index can be created on a composite key to determine the physical order of the data rows in a table based on the indexed columns. |
What is a subquery? |
Sol. A subquery is a SQL query that is embedded within another query. It can be used to retrieve data that will be used in the main query as a condition, as a filter, or as a column in the select list. |
What is a correlated subquery? |
Sol. A correlated subquery is a subquery that is related to the outer query and references a column from the outer query in its WHERE clause. It can be used to filter the result set based on the values from the outer query. |
What is a natural join? |
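Sol. A natural join is a type of join operation in SQL that combines two tables based on matching values in all columns that have the same name in both tables. The result contains every column from both tables, but each common column appears only once. Because the join condition is implicit, adding or renaming a column can silently change the result, so explicit join conditions are usually safer. |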
What is the difference between a primary key and a unique key? |
Sol. Both the primary key and unique key enforce uniqueness in the data stored in a table. However, there are some differences: A primary key is a column or a group of columns that uniquely identifies each row in a table. It cannot contain null values and there can only be one primary key per table. A unique key is a column or a group of columns that ensure that each row in a table is unique, but it can contain null values. A table can have multiple unique keys. |
What is the difference between a delete and a truncate statement? |
Sol. A delete statement is used to remove one or more rows from a table based on a specified condition. It is a DML (Data Manipulation Language) operation and can be rolled back using a transaction. A truncate statement is used to remove all rows from a table. It is a DDL (Data Definition Language) operation. Whether it can be rolled back depends on the database system: MySQL and Oracle commit it implicitly, while SQL Server and PostgreSQL allow it to be rolled back inside a transaction. In SQL Server and MySQL, truncate also resets the identity (auto-increment) counter of the table to its seed value. In general, truncate is faster than delete because it doesn’t log individual row deletions, but it also has more restrictions, such as not being able to delete rows based on a condition. |
What is SQL injection? |
Sol. SQL injection is a type of attack in which an attacker inserts malicious SQL into a database query, with the intention of gaining unauthorized access to sensitive data or changing what the query does. This type of attack can occur when an application builds SQL queries by concatenating user input into the query text. The attacker can then craft input that is interpreted as SQL rather than as data. For example, an attacker may try to inject a SQL command that retrieves all the usernames and passwords in the database. If successful, this can allow the attacker to gain access to sensitive information and compromise the security of the application. SQL injection attacks can be prevented by using prepared statements or parameterized queries, which send user input to the database separately from the SQL text so that it is always treated as data and never executed as SQL. It’s also important to follow other best practices for securing web applications, such as input validation and limiting the privileges of the database account the application uses. |
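The contrast between concatenation and a parameterized query can be sketched with Python's built-in sqlite3 module. The users table and the malicious input string are hypothetical, and the unsafe query is shown only to demonstrate the problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (username TEXT, is_admin INTEGER);
    INSERT INTO users VALUES ('alice', 0), ('root', 1);
""")

malicious_input = "nobody' OR '1'='1"

# UNSAFE: the input becomes part of the SQL text, so the OR clause is executed
# and the WHERE condition is true for every row.
unsafe_sql = "SELECT username FROM users WHERE username = '" + malicious_input + "'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# SAFE: the ? placeholder passes the input as a value, so it is compared
# literally against the username column and matches nothing.
safe_rows = conn.execute(
    "SELECT username FROM users WHERE username = ?",
    (malicious_input,),
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```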
How is a candidate key different from a primary key? |
Sol. In a database, a candidate key is a minimal column or set of columns that can uniquely identify each row in a table. A primary key is the one candidate key that the database designer chooses as the main method of identifying and accessing rows in the table. A table can have multiple candidate keys, but only one primary key; the remaining candidate keys are often called alternate keys and are typically enforced with unique constraints. The primary key ensures uniqueness, cannot be NULL, and is the key that foreign keys in other tables usually reference. In summary, a candidate key is any minimal unique identifier for a row, while the primary key is the specific candidate key chosen as the main identifier. |
What are integrity constraints? |
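Sol. Integrity constraints are rules that are enforced on the data in a database to ensure its consistency, accuracy, and reliability. These constraints define the acceptable values that can be entered into specific columns of a table or across multiple tables in a relational database. There are different types of integrity constraints that can be applied to a database:
Entity integrity: every row is uniquely identifiable, enforced with a PRIMARY KEY.
Referential integrity: a value that refers to another table must exist there, enforced with a FOREIGN KEY.
Domain integrity: column values must be of a valid type and range, enforced with data types, NOT NULL, CHECK, and DEFAULT.
Uniqueness: no two rows share the same value in a column or set of columns, enforced with UNIQUE.
By enforcing these constraints, the database rejects invalid changes and keeps the data consistent, accurate, and reliable. |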
What is the difference between relational and NoSQL databases? |
Sol. Relational and NoSQL databases are two different families of database management systems. The main differences between them are as follows:
Data model: Relational databases store data in tables with a predefined schema and explicit relationships between them. NoSQL databases use a variety of other models, such as key-value pairs, documents, wide columns, or graphs.
Scalability: Relational databases have traditionally been scaled vertically, by adding resources such as RAM or CPU to a single server, although replication and sharding are also possible. Many NoSQL databases are designed to scale horizontally across multiple servers, which makes very large amounts of data easier to handle.
Flexibility: Relational databases require the structure of the data to be defined up front, and schema changes can be costly on large tables. Many NoSQL databases are schema-less or schema-flexible, so the structure of the data can vary from record to record and change over time.
Querying: Relational databases use SQL (Structured Query Language), which provides a powerful and standardized way to retrieve data. NoSQL databases typically offer their own query languages or APIs, although some provide SQL-like languages.
ACID compliance: Relational databases generally guarantee ACID transactions, which means that transactions are processed reliably. NoSQL databases are not always ACID-compliant, and some relax these guarantees in favor of performance and scalability.
Use cases: Relational databases are ideal for applications that require complex queries, transactions, and strict data consistency. NoSQL databases are best suited for applications that require fast and flexible data storage and retrieval at scale, such as social media applications, gaming, and e-commerce websites. |
What are the ACID properties of databases? |
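Sol. ACID (Atomicity, Consistency, Isolation, Durability) properties are a set of principles that guarantee reliable and consistent transaction processing in databases.
Atomicity: a transaction is all or nothing; either every statement in it takes effect or none does.
Consistency: a transaction moves the database from one valid state to another, with all constraints and rules still satisfied.
Isolation: concurrent transactions do not see each other's intermediate changes; the result is as if they had run one after another (subject to the isolation level in use).
Durability: once a transaction is committed, its changes survive crashes and power failures.
The ACID properties ensure that the database is reliable and consistent even in the face of system failures or concurrent access by multiple users or applications. |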
Explain the CAP theorem. State where the CAP theorem is applicable. |
Sol. The CAP theorem is a theoretical result that helps in understanding the limitations of distributed computing systems. It states that it is impossible for a distributed data store to simultaneously provide all three of the following guarantees:
Consistency: every read receives the most recent write or an error.
Availability: every request receives a non-error response, without a guarantee that it reflects the most recent write.
Partition tolerance: the system keeps operating even when network failures drop or delay messages between nodes.
In other words, the CAP theorem asserts that only two of the three guarantees can be achieved at the same time. Because network partitions cannot be ruled out in practice, the real choice arises during a partition: the system must give up either consistency or availability. The CAP theorem is applicable to any distributed system that replicates data across nodes and requires high availability and fault tolerance, such as cloud-based applications, social networks, and e-commerce websites. It is essential for architects and developers to understand the implications of the CAP theorem when designing and building such systems. |
What is the difference between drop and truncate? |
Sol. In SQL, DROP and TRUNCATE are both DDL (Data Definition Language) commands that remove data or objects from a database, but there is a key difference between them. DROP removes an entire table or other database object. It removes not only the data but also the table structure, as well as any triggers, constraints, or indexes associated with it. Once a table is dropped, it can generally be recovered only from a backup. TRUNCATE, on the other hand, removes all the data from a table while keeping the structure intact. It deletes all the rows of the table, but the table itself remains in the database. Truncate is faster than delete as it does not log individual row deletions but rather deallocates the data pages. Additionally, truncate can only remove data from a table, not from a view. In summary, DROP removes the object itself, while TRUNCATE removes only the data and keeps the table structure intact. |
What are views in SQL? |
Sol. In SQL, a view is a virtual table that is based on the result set of an SQL statement. Views are stored as SELECT statements in the database, but they do not contain data themselves. Instead, they provide a way to simplify complex queries by storing them as a virtual table that can be queried like any other table. Views can be used for a variety of purposes, such as:
Simplifying complex queries by joining multiple tables and filtering data in a single view.
Providing a level of abstraction between the user and the underlying data, which can help to protect the data from unauthorized access or modification.
Implementing business logic by defining custom views that encapsulate complex calculations or data transformations.
An ordinary view does not cache anything; its query runs each time the view is used. Caching the results of expensive queries requires a materialized view (or an indexed view in SQL Server), where the database system supports one.
Views can be created, modified, and dropped using SQL commands. Once created, they can be queried like any other table, and their results can be filtered, sorted, and grouped as needed. |
What is the information schema and performance schema in MySQL database? |
Sol. Information Schema and Performance Schema are two important system schemas in MySQL. The Information Schema (INFORMATION_SCHEMA) is a read-only schema that provides information about the database’s structure and metadata. It contains tables that provide information about tables, views, columns, indexes, privileges, and other database objects. INFORMATION_SCHEMA is defined by the SQL standard, so several other relational database management systems, such as PostgreSQL and SQL Server, provide it too, although not all do (Oracle, for example, uses its own data dictionary views). On the other hand, Performance Schema (performance_schema) provides performance-related information about the running server. It was introduced in MySQL 5.5 and is designed to help database administrators identify performance bottlenecks and optimize database performance. Performance Schema provides detailed information about queries, locks, threads, and other aspects of server activity; most of its tables are read-only, while its setup tables can be updated to control what is collected. Both schemas are important for database administration and management, as they provide useful insights into the structure and performance of the database. |
How to check which database we are currently using in MySQL? |
Sol. To check which database you are currently using in MySQL, you can use the following command: SELECT DATABASE(); This will display the name of the current database being used in MySQL. |
Write an SQL program to swap the values of the gender column. |
Sol. Assuming that the gender column contains two distinct values, let’s say 'Male' and 'Female', the following SQL statement can be used to swap the values: UPDATE table_name SET gender = CASE WHEN gender = 'Male' THEN 'Female' WHEN gender = 'Female' THEN 'Male' ELSE gender END; Replace table_name with the actual name of the table that contains the gender column. The CASE expression is used to swap the values: if the current value is 'Male', it will be updated to 'Female', and if the current value is 'Female', it will be updated to 'Male'. The ELSE gender branch leaves any other value (including NULL) unchanged; without it, rows that match neither condition would be set to NULL. |
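The swap can be tried out on a few rows as follows. This is a sketch assuming Python's built-in sqlite3 module; the people table is a hypothetical stand-in for table_name, and the ELSE branch is included so that a row with no gender value is left unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (name TEXT, gender TEXT);
    INSERT INTO people VALUES ('Asha', 'Female'), ('Ben', 'Male'), ('Kai', NULL);
""")

# A single UPDATE evaluates CASE against each row's old value,
# so the swap needs no temporary placeholder value.
conn.execute("""
    UPDATE people
    SET gender = CASE gender
                     WHEN 'Male' THEN 'Female'
                     WHEN 'Female' THEN 'Male'
                     ELSE gender
                 END
""")

swapped = conn.execute("SELECT name, gender FROM people ORDER BY name").fetchall()
print(swapped)
```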
You have a table containing customer orders, and you need to find the total revenue generated by each customer. How would you approach this task using SQL? |
Sol. To find the total revenue generated by each customer from the customer orders table, you can use SQL’s GROUP BY clause and aggregate function. Here’s an example SQL query that can be used: SELECT customer_id, SUM(order_amount) AS total_revenue FROM orders GROUP BY customer_id; In this query, the SUM() function is used to calculate the total revenue generated by each customer by adding up all the order amounts associated with that customer. The GROUP BY clause is used to group the results by the customer ID, so that the total revenue is calculated for each individual customer. The AS keyword is used to alias the column name to “total_revenue” for better readability. Once this query is executed, it will return a table that lists the total revenue generated by each customer based on their orders. |
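The query can be run end to end on sample data as follows. The sketch assumes Python's built-in sqlite3 module; the rows in the orders table are made up, and an ORDER BY is added so that the highest-revenue customer comes first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, order_amount INTEGER);
    INSERT INTO orders VALUES (1, 101, 200), (2, 102, 50), (3, 101, 300);
""")

# One output row per customer_id, with that customer's order amounts summed.
revenue = conn.execute("""
    SELECT customer_id, SUM(order_amount) AS total_revenue
    FROM orders
    GROUP BY customer_id
    ORDER BY total_revenue DESC
""").fetchall()

print(revenue)
```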
You have a table containing employee information, including their name, department, and salary. You need to find the average salary for each department. How would you write the SQL query for this task? |
Sol. Assuming the table name is employee_info and the columns are name, department, and salary, the SQL query to find the average salary for each department would be: SELECT department, AVG(salary) AS avg_salary FROM employee_info GROUP BY department; This groups the employee rows by department and calculates the average salary for each group, which is displayed as avg_salary in the output. |
You have a table containing student grades, including their name, subject, and grade. You need to find the top-performing student in each subject. How would you approach this task using SQL? |
Sol. To find the top-performing student in each subject from the student grades table, you can use the following SQL query: SELECT subject, name, grade FROM ( SELECT subject, name, grade, ROW_NUMBER() OVER (PARTITION BY subject ORDER BY grade DESC) AS rn FROM student_grades ) t WHERE rn = 1; Explanation: The inner SELECT statement selects the subject, name, and grade from the student_grades table, and uses the ROW_NUMBER() function to number the rows within each subject, ordered by grade in descending order. The outer SELECT statement keeps only the rows numbered 1, i.e. the top-performing student in each subject. Note: ROW_NUMBER() returns exactly one row per subject, so if two students tie for the top grade, only one of them is returned and the choice is arbitrary. To return all tied students, replace ROW_NUMBER() with RANK(), which gives tied rows the same number. |
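The effect of ties can be seen with a small sketch. It uses Python's built-in sqlite3 module (window functions need SQLite 3.25 or later) rather than MySQL, and the sample students are hypothetical. Two students share the top Math grade, so ROW_NUMBER() and RANK() return different numbers of rows.

```python
import sqlite3

# In-memory SQLite database with hypothetical sample grades.
# Asha and Ben tie for the top grade in Math.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student_grades (name TEXT, subject TEXT, grade INTEGER)")
conn.executemany(
    "INSERT INTO student_grades (name, subject, grade) VALUES (?, ?, ?)",
    [
        ("Asha", "Math", 95),
        ("Ben", "Math", 95),
        ("Cara", "Math", 80),
        ("Dev", "Science", 88),
        ("Ela", "Science", 70),
    ],
)

# Same query shape for both functions; only the window function changes.
TOP_PER_SUBJECT = """
    SELECT subject, name, grade
    FROM (
        SELECT subject, name, grade,
               {func}() OVER (PARTITION BY subject ORDER BY grade DESC) AS rn
        FROM student_grades
    ) AS t
    WHERE rn = 1
    ORDER BY subject, name
"""

# ROW_NUMBER() keeps exactly one row per subject, even when grades tie.
row_number_top = conn.execute(TOP_PER_SUBJECT.format(func="ROW_NUMBER")).fetchall()

# RANK() gives tied rows the same rank, so both Math students are kept.
rank_top = conn.execute(TOP_PER_SUBJECT.format(func="RANK")).fetchall()

print(row_number_top)
print(rank_top)
```

Which of the two tied Math students ROW_NUMBER() keeps is not defined unless the ORDER BY inside the window adds a tiebreaker such as name.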
You have a table containing customer orders, including their order date, product name, and quantity. You need to find the total quantity sold for each product in the last month. How would you write the SQL query for this task? |
Sol. SELECT product_name, SUM(quantity) AS total_quantity FROM customer_orders WHERE order_date >= DATE_SUB(NOW(), INTERVAL 1 MONTH) GROUP BY product_name; This query returns each product name with the total quantity sold in the last month. The SUM function adds up the quantities and the GROUP BY clause groups the results by product name. The WHERE clause keeps only orders placed in the last month, using DATE_SUB to subtract one month from the current date and time (NOW()). Note that this is a rolling one-month window ending now, not the previous calendar month. |
You have a table containing employee information, including their name, hire date, and department. You need to find the employees who were hired in the last year and are still working in the company. How would you approach this task using SQL? |
Sol. Assuming the employee table has an is_active column that indicates whether the employee is still working in the company, the query is: SELECT name, hire_date, department FROM employee WHERE hire_date >= DATE_SUB(NOW(), INTERVAL 1 YEAR) AND is_active = 1; This selects the name, hire_date, and department columns for all employees who were hired within the last year (hire_date >= DATE_SUB(NOW(), INTERVAL 1 YEAR)) and are still active (is_active = 1). |
You have a table containing customer information, including their name, email, and phone number. You need to find all customers who have ordered a product in the last month but have not provided their phone number. How would you write the SQL query for this task? |
Sol. SELECT name, email FROM customer_info WHERE email IN ( SELECT email FROM customer_orders WHERE order_date >= DATE_SUB(NOW(), INTERVAL 1 MONTH) ) AND phone_number IS NULL; This query assumes that orders are stored in a customer_orders table that also has an email column. The subquery finds the email addresses of all customers who have ordered a product in the last month, and the IN operator matches those addresses against customer_info. The phone_number IS NULL condition then keeps only the customers who have not provided a phone number. |
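The subquery approach can be checked with a minimal sketch using Python's built-in sqlite3 module. SQLite has no DATE_SUB or NOW(), so the sketch uses date('now', '-1 month') as the equivalent cutoff; the sample customers, the example.com addresses, and the order dates are all hypothetical.

```python
import sqlite3

# In-memory SQLite database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer_info (name TEXT, email TEXT, phone_number TEXT)")
conn.execute("CREATE TABLE customer_orders (email TEXT, order_date TEXT)")

conn.executemany(
    "INSERT INTO customer_info (name, email, phone_number) VALUES (?, ?, ?)",
    [
        ("Ann", "ann@example.com", None),        # recent order, no phone
        ("Bob", "bob@example.com", "555-0100"),  # recent order, has phone
        ("Cy", "cy@example.com", None),          # old order, no phone
    ],
)

# Order dates are stored relative to today so the example does not go stale.
conn.executemany(
    "INSERT INTO customer_orders (email, order_date) VALUES (?, date('now', ?))",
    [
        ("ann@example.com", "-5 days"),
        ("bob@example.com", "-3 days"),
        ("cy@example.com", "-90 days"),
    ],
)

# Customers with an order in the last month and no phone number on file.
missing_phone = conn.execute(
    """
    SELECT name, email
    FROM customer_info
    WHERE email IN (
        SELECT email
        FROM customer_orders
        WHERE order_date >= date('now', '-1 month')
    )
    AND phone_number IS NULL
    ORDER BY name
    """
).fetchall()
print(missing_phone)
```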
You have a table containing product information, including their name, price, and category. You need to find the top 5 best-selling products in each category. How would you approach this task using SQL? |
Sol. To solve this task, we can use the ROW_NUMBER() function along with a subquery to rank the products within each category by sales quantity, and then select the top 5 products for each category. The query assumes that the products table has an id column and that sales are stored in an orders table with product_id, quantity, and order_date columns. SELECT name, price, category FROM ( SELECT products.name, products.price, products.category, ROW_NUMBER() OVER (PARTITION BY products.category ORDER BY sales.sales_quantity DESC) AS rn FROM products JOIN ( SELECT product_id, SUM(quantity) AS sales_quantity FROM orders WHERE order_date >= DATE_SUB(NOW(), INTERVAL 1 MONTH) GROUP BY product_id ) AS sales ON products.id = sales.product_id ) AS ranked_products WHERE rn <= 5; Explanation: We start by joining the products table with a subquery that calculates the total quantity sold for each product in the last month. We then use the ROW_NUMBER() function to number the products within each category by sales quantity, using the PARTITION BY clause to restart the numbering for each category. Finally, we select the top 5 products for each category by filtering on that number. The alias is rn rather than rank because RANK is a reserved word in MySQL 8.0. Note: Window functions require MySQL 8.0 or later, and the exact syntax varies between database systems; in SQL Server, for example, the date filter would be written as order_date >= DATEADD(month, -1, GETDATE()). |
You have a table containing employee information, including their name, hire date, and salary. You need to find the average salary of employees hired each year. How would you write the SQL query for this task? |
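Sol. SELECT YEAR(hire_date) AS hire_year, AVG(salary) AS avg_salary FROM employee GROUP BY YEAR(hire_date) ORDER BY hire_year; Explanation: The YEAR function extracts the year from each hire date. The GROUP BY clause groups the employees by that year, and the AVG function calculates the average salary within each group. The ORDER BY clause sorts the results by hire year. |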
You have a table containing customer orders, including their order date, product name, and price. You need to find the total revenue generated by each product in the last quarter. How would you approach this task using SQL? |
Sol. To find the total revenue generated by each product in the last quarter, we can use the following SQL query, which assumes the table also has a quantity column: SELECT product_name, SUM(price * quantity) AS revenue FROM customer_orders WHERE order_date >= DATE_SUB(NOW(), INTERVAL 3 MONTH) GROUP BY product_name ORDER BY revenue DESC; Explanation: We select the product_name column and calculate the total revenue by multiplying price by quantity for each order and adding the results up with the SUM function. The WHERE clause keeps only orders from the last three months, using DATE_SUB to subtract 3 months from the current date; this is a rolling window rather than the previous calendar quarter. We group the data by product_name using the GROUP BY clause and order the results by revenue in descending order using the ORDER BY clause. If each row holds a single unit and there is no quantity column, use SUM(price) instead. |
You have a table containing employee information, including their name, department, and salary. You need to find the highest-paid employee in each department. How would you write the SQL query for this task? |
Sol. Here's an example SQL query to find the highest salary in each department: SELECT department, MAX(salary) AS highest_salary FROM employee GROUP BY department; This query uses the MAX function to find the highest salary for each department and groups the results by department using the GROUP BY clause. The AS keyword gives the result column an alias. This returns the salary but not the employee. To also get the name of the employee with the highest salary, you can modify the query like this: SELECT department, name, salary FROM employee WHERE (department, salary) IN ( SELECT department, MAX(salary) FROM employee GROUP BY department ); This query uses a subquery to find the highest salary for each department and then selects the employees whose department and salary match one of those pairs. If several employees share the highest salary in a department, all of them are returned. |
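A minimal sketch of the second query, run through Python's built-in sqlite3 module (row-value comparisons such as (department, salary) IN (...) need SQLite 3.15 or later). The sample employees are hypothetical, and two of them deliberately share the top salary in IT to show how ties behave.

```python
import sqlite3

# In-memory SQLite database with hypothetical sample employees.
# Li and Tom share the highest salary in IT.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, department TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee (name, department, salary) VALUES (?, ?, ?)",
    [
        ("Ana", "Sales", 5000),
        ("Raj", "Sales", 7000),
        ("Li", "IT", 9000),
        ("Tom", "IT", 9000),
        ("Zed", "HR", 4000),
    ],
)

# Match each employee against the (department, max salary) pairs.
highest_paid = conn.execute(
    """
    SELECT department, name, salary
    FROM employee
    WHERE (department, salary) IN (
        SELECT department, MAX(salary)
        FROM employee
        GROUP BY department
    )
    ORDER BY department, name
    """
).fetchall()
print(highest_paid)
```

Both IT employees appear in the output, which is usually the desired behaviour for this question; a ROW_NUMBER() approach would return only one of them.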
How to remove duplicate rows in SQL? |
Sol. The approach depends on whether you want to hide duplicates in a result set or delete them from the table. To return only unique rows, use the DISTINCT keyword: SELECT DISTINCT column1, column2, column3 FROM table_name; or, equivalently, the GROUP BY clause: SELECT column1, column2, column3 FROM table_name GROUP BY column1, column2, column3; To find out which rows are duplicated, add a HAVING clause to the GROUP BY query: SELECT column1, column2, column3 FROM table_name GROUP BY column1, column2, column3 HAVING COUNT(*) > 1; This query does not remove anything; it returns only the combinations of values that appear more than once. Complete scenario: CREATE TABLE employees ( id INT PRIMARY KEY, name VARCHAR(50), age INT ); INSERT INTO employees (id, name, age) VALUES (1, 'John', 30), (2, 'Alice', 25), (3, 'Bob', 35), (4, 'John', 30), (5, 'Charlie', 40), (6, 'Alice', 25); Here rows 4 and 6 duplicate rows 1 and 2. To delete the duplicates and keep one row per name, use a subquery that selects the smallest id for each name: DELETE FROM employees WHERE id NOT IN ( SELECT min_id FROM ( SELECT MIN(id) AS min_id FROM employees GROUP BY name ) AS keep_rows ); The inner query selects the minimum id for each unique name, and the DELETE removes every row whose id is not in that list, effectively keeping only one row per name. The extra derived table (keep_rows) is needed in MySQL, which otherwise rejects a DELETE whose subquery reads from the table being modified. After running this statement, the employees table contains only one row per name. Grouping by name alone treats two different people with the same name as duplicates; use GROUP BY name, age if both columns must match. Note: In some cases, it might be more appropriate to use SELECT DISTINCT to retrieve the unique rows instead of DELETE. The choice depends on the specific use case and desired outcome. |
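The scenario above can be run as a short sketch with Python's built-in sqlite3 module. It first lists the duplicated (name, age) pairs, then deletes all but the lowest id for each pair. SQLite accepts a subquery on the table being deleted from directly, so the sketch omits the derived-table wrapper that MySQL needs, and it groups by both name and age, which is a slightly stricter definition of "duplicate" than grouping by name alone.

```python
import sqlite3

# In-memory SQLite database loaded with the sample rows from the scenario.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO employees (id, name, age) VALUES (?, ?, ?)",
    [
        (1, "John", 30),
        (2, "Alice", 25),
        (3, "Bob", 35),
        (4, "John", 30),
        (5, "Charlie", 40),
        (6, "Alice", 25),
    ],
)

# Step 1: report which (name, age) pairs occur more than once.
duplicates = conn.execute(
    """
    SELECT name, age, COUNT(*) AS occurrences
    FROM employees
    GROUP BY name, age
    HAVING COUNT(*) > 1
    ORDER BY name
    """
).fetchall()

# Step 2: delete every row except the lowest id in each (name, age) group.
conn.execute(
    """
    DELETE FROM employees
    WHERE id NOT IN (
        SELECT MIN(id) FROM employees GROUP BY name, age
    )
    """
)

remaining_ids = [row[0] for row in conn.execute("SELECT id FROM employees ORDER BY id")]
print(duplicates)
print(remaining_ids)
```

Running the HAVING query before the DELETE is a cheap way to review what will be removed.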
For the string "SevenMentor", write an SQL query to extract only the first 5 characters using wildcard operators. |
Sol. Assuming the string 'SevenMentor' is stored in a column named name in a table named company, the simplest way to extract the first 5 characters is the LEFT function: SELECT LEFT(name, 5) AS first_five_chars FROM company; This returns 'Seven'. Wildcard operators are used with LIKE to match patterns, so they can select which rows to extract from, but the extraction itself is still done by a string function such as SUBSTR: SELECT SUBSTR(name, 1, 5) AS first_five_chars FROM company WHERE name LIKE 'Seven%'; This returns the same result as the previous example. The % wildcard in the WHERE clause matches any sequence of zero or more characters after 'Seven'. The _ wildcard matches exactly one character, so name LIKE '_____%' (five underscores followed by %) would match any name that is at least five characters long. Author: Karishma Pawar. © Copyright 2021 SevenMentor Pvt Ltd. |