How Do You Tune Database Performance
For a database to perform well, its objects need to be correct and intact. For example, a query against a table runs well only if the table has valid data, constraints, and indexes. If any of these is incorrect or disabled, performance degrades.
Performance tuning is a continuous task, and some database objects benefit more from tuning than others. Many factors affect database performance. Some of them are:
- Using Triggers: If a table has triggers, a query that modifies its data also has to wait for those triggers to run. The statement and the triggers it fires are treated as a single transaction, so the statement completes only when the triggers have finished, and its total time includes the time spent executing the triggers. Hence we need to take care with triggers in performance-critical queries.
- Same Datatypes for Primary and Foreign Keys: Primary and foreign keys that are used to join tables should have the same datatype and size. Otherwise there is an extra cost to convert them to a common type on every comparison. For example, it is not good to have the primary key as VARCHAR and the foreign key as CHAR. It is always better to have both of them the same, either VARCHAR or CHAR.
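To make the trigger point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tables `orders` and `orders_audit` and the trigger `trg_orders_audit` are hypothetical names chosen for illustration, and other database engines may differ in detail. It shows that the trigger's work is part of the same transaction as the statement that fired it: when the statement is rolled back, the trigger's write disappears too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE orders_audit (order_id INTEGER, note TEXT);
    -- Hypothetical audit trigger: runs once for every inserted row.
    CREATE TRIGGER trg_orders_audit AFTER INSERT ON orders
    BEGIN
        INSERT INTO orders_audit (order_id, note) VALUES (NEW.id, 'inserted');
    END;
""")

# The INSERT does not finish until the trigger has written its audit row.
with conn:
    conn.execute("INSERT INTO orders (amount) VALUES (?)", (19.99,))
audit_rows = conn.execute("SELECT order_id, note FROM orders_audit").fetchall()

# A failure inside the transaction rolls back the INSERT and the trigger's work.
try:
    with conn:
        conn.execute("INSERT INTO orders (amount) VALUES (?)", (5.00,))
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass
audit_count = conn.execute("SELECT COUNT(*) FROM orders_audit").fetchone()[0]

print(audit_rows, audit_count)
```

Because every insert pays for the extra audit write, a bulk load into a table like this does roughly twice the write work of a table without the trigger.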
Data Transformation Interview Questions
These questions come up very frequently in interviews for Data Engineer or Machine Learning Engineer positions, and they would not be out of place in a Data Scientist interview for positions that require some experience. Data transformation, or more generally ETL (extract, transform, load), is a process that collects data from various sources, changes it according to business rules, and then loads the extracted and transformed data into a database.
When the data is extracted, it is done so from various data sources that, more often than not, store data in completely different formats.
Such data is loaded into another database or table that the analysts or any other users might use.
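The three ETL stages described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two in-memory sources (`csv_source`, `json_source`), the `sales` table, and the business rule (capitalize names, store amounts as numbers) are all invented for the example.

```python
import csv
import io
import json
import sqlite3

# Two hypothetical sources that store the same kind of data in different formats.
csv_source = "id,name,amount\n1,alice,10.50\n2,bob,7.25\n"
json_source = '[{"id": 3, "name": "carol", "amount": "12"}]'

def extract():
    """Read raw records from both sources into one list of dicts."""
    rows = list(csv.DictReader(io.StringIO(csv_source)))
    rows.extend(json.loads(json_source))
    return rows

def transform(rows):
    """Apply the (assumed) business rules: typed ids, title-cased names, numeric amounts."""
    return [(int(r["id"]), str(r["name"]).title(), float(r["amount"])) for r in rows]

def load(connection, records):
    """Write the cleaned records into the target table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS sales (id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    with connection:
        connection.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)

conn = sqlite3.connect(":memory:")
load(conn, transform(extract()))
loaded = conn.execute("SELECT id, name, amount FROM sales ORDER BY id").fetchall()
print(loaded)
```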
ETL is heavily used in data warehouses, which serve as the central source of integrated data, with data flowing in from one or more separate sources. If you want to perform well at the SQL job interview, these are the concepts you need to know:
- Data Definition Language keywords
- Data Manipulation Language keywords
- Data Control Language keywords
- Transaction Control Language keywords
- SQL constraints
- query optimization
This is one of the easiest and yet frequently asked questions from the Southwest Airlines Data Science SQL Interview:
DELETE and TRUNCATE
“What is the difference between DELETE and TRUNCATE?”
Another common question is:
Q1: What Is The Main Difference Between A Clustered And Non-Clustered Index
A Non-clustered index differs from a Clustered index in two ways: the underlying table rows are not stored and sorted based on the Non-clustered key, and the leaf level nodes of the Non-clustered index are made of index pages instead of data pages. The index pages of the Non-clustered index contain the Non-clustered index key values with pointers to the storage location of these rows in the underlying heap table or Clustered index.
Code In Logical Parts
When you're writing the code, pay attention to its structure. Divide the code into logical parts. That way, your code will be easier to read, which is also one of the requirements to get the job. There is no point in writing correct code that is such a mess that nobody can read or understand it afterwards. Not even you! If your code is divided into logical parts, it will also be easier to explain to the interviewer what you did.
What Is The Difference Between Drop And Truncate Commands
If a table is dropped, all things associated with that table are dropped as well. This includes the relationships defined on the table with other tables, access privileges, and grants that the table has, as well as the integrity checks and constraints.
To create and use the table again in its original form, all the elements associated with the table need to be redefined.
However, if a table is truncated, only its rows are removed and none of the problems above arise. The table retains its original structure, along with its constraints and privileges.
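The difference can be observed with a small sketch in Python's sqlite3 module. One loud assumption: SQLite has no TRUNCATE statement, so an unqualified `DELETE FROM` stands in for it here; the point being illustrated is only that emptying a table keeps its definition and index while dropping removes both. The `staff` table and `idx_staff_email` index are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
    CREATE UNIQUE INDEX idx_staff_email ON staff (email);
    INSERT INTO staff (email) VALUES ('a@example.com'), ('b@example.com');
""")

def schema_objects(connection):
    """List every table and index currently defined in the database."""
    rows = connection.execute("SELECT name FROM sqlite_master ORDER BY name").fetchall()
    return [row[0] for row in rows]

# Emptying the table (stand-in for TRUNCATE): the rows go, the structure stays.
conn.execute("DELETE FROM staff")
rows_left = conn.execute("SELECT COUNT(*) FROM staff").fetchone()[0]
after_empty = schema_objects(conn)

# Dropping the table: the definition and its index are removed as well.
conn.execute("DROP TABLE staff")
after_drop = schema_objects(conn)

print(rows_left, after_empty, after_drop)
```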
Discuss Row Chaining: How Does It Happen, How Can You Reduce It, How Do You Correct It
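Ans: Row chaining occurs when a row is too large to fit in a single data block, so it is stored as a chain of pieces across several blocks. The closely related row migration occurs when a VARCHAR2 value is updated, the new value is longer than the old one, and the row won't fit in the remaining block space.
This results in the whole row moving to another block. It can be reduced by setting the storage parameters on the table to appropriate values, for example by reserving more free space per block with PCTFREE.
It can be corrected by the export and import of the affected table.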
Technical Concepts Tested In SQL Interview Questions
Since SQL roles differ widely, the testing areas vary quite a bit as well. Depending on the type of role that you are applying for and the organization, you can expect one or more of these SQL Data Science Interview Question types:
- Fundamental SQL concepts
- Database Modeling interview questions
- Software Engineering SQL Interview questions
Let us look at these SQL interview question types individually to understand the similarities and distinctions among them.
How Do You Tune SQL Queries To Improve Performance
There are various methods to tune SQL queries, and several properties of the query and its tables affect performance. A typical sequence of steps is:
- Check the explain plan to understand the current execution path and the cost at each step.
- Check whether any subqueries or filters can reduce the number of rows the query processes.
- Reduce the number of columns in the SELECT list, i.e. keep only the columns that are actually required.
- Check which type of optimizer is in use. A cost-based optimizer relies on collected statistics to evaluate the query, so verify that the statistics are up to date.
- Check whether the order of conditions in the WHERE clause affects performance. If it does, rearrange the conditions.
- Check whether the order of table names makes any difference, and rearrange them too. Depending on the selectivity of the columns, this sometimes affects performance.
- Make sure indexes are used properly so that the query executes efficiently.
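The first and last steps, reading the plan and confirming that an index is used, can be tried out with SQLite's `EXPLAIN QUERY PLAN`. This is a sketch under assumptions: the `orders` table, its data, and the index name `idx_orders_customer` are invented, and the exact wording of the plan text varies between SQLite versions (other engines have their own EXPLAIN syntax).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

query = "SELECT total FROM orders WHERE customer_id = ?"

def plan(connection, sql, params):
    """Return the human-readable detail column of the query plan as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)

# Without an index the only option is to scan the whole table.
plan_before = plan(conn, query, (42,))

# Add an index on the filtered column and refresh the optimizer statistics.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
conn.execute("ANALYZE")
plan_after = plan(conn, query, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

The plan before the index reports a scan of `orders`; the plan after it reports a search that names `idx_orders_customer`.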
How Can You Use The SCOPE_IDENTITY Function In SQL Server
Ans: This function returns the last identity value inserted into an identity column within the same scope. Here, a scope is a module: a stored procedure, trigger, function, or batch. Two statements are in the same scope if they are in the same stored procedure, function, or batch.
The function takes no arguments and is called as SCOPE_IDENTITY().
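SQL Server itself is not available in a self-contained example, so the sketch below uses SQLite as a stand-in, and that substitution is an assumption worth stating loudly: SQLite has no SCOPE_IDENTITY(), but its `last_insert_rowid()` behaves similarly with respect to triggers, because an insert performed inside a trigger does not change the value seen by the calling statement once the trigger has finished. The tables and trigger are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY AUTOINCREMENT, item TEXT);
    CREATE TABLE audit_log (id INTEGER PRIMARY KEY AUTOINCREMENT, note TEXT);
    -- Pre-fill the log so its generated ids differ from the ids in orders.
    INSERT INTO audit_log (note) VALUES ('a'), ('b'), ('c');
    -- The trigger inserts into a second table in a different scope.
    CREATE TRIGGER trg_orders_log AFTER INSERT ON orders
    BEGIN
        INSERT INTO audit_log (note) VALUES ('order ' || NEW.id);
    END;
""")

conn.execute("INSERT INTO orders (item) VALUES ('book')")

# The id generated by our own INSERT, not the one generated inside the trigger.
order_id = conn.execute("SELECT last_insert_rowid()").fetchone()[0]
latest_audit_id = conn.execute("SELECT MAX(id) FROM audit_log").fetchone()[0]

print(order_id, latest_audit_id)
```

In SQL Server the same situation is the usual reason to prefer SCOPE_IDENTITY() over @@IDENTITY, which would report the identity value generated inside the trigger.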
What Is SQL Server Agent
SQL Server Agent plays an important role in the daily work of SQL Server administrators or DBAs. This is one of the important parts of SQL Server. The aim of the server agent is to easily implement tasks using a scheduler engine that enables the tasks to be performed at scheduled times. SQL Server Agent uses SQL Server to store scheduled management task information.
Subqueries, Common Table Expressions & Window Functions
A subquery is a query found within the query. It can occur in a SELECT clause, FROM clause, or WHERE clause.
A CTE, or Common Table Expression, is a temporary result set returned by a query and used by another query. In that way, it's similar to a subquery. But the main difference is that a CTE can be named and can reference itself.
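As an illustration of a CTE that references itself, the following sketch walks a small management hierarchy with `WITH RECURSIVE` in SQLite. The `employees` table and its rows are made up for the example; the CTE is named `reports`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employees (id INTEGER PRIMARY KEY, name TEXT, manager_id INTEGER);
    INSERT INTO employees VALUES
        (1, 'Ana', NULL), (2, 'Ben', 1), (3, 'Cy', 2), (4, 'Dee', 1);
""")

chain = conn.execute("""
    WITH RECURSIVE reports(id, name, depth) AS (
        -- Anchor member: the person with no manager.
        SELECT id, name, 0 FROM employees WHERE manager_id IS NULL
        UNION ALL
        -- Recursive member: the CTE refers to itself to find direct reports.
        SELECT e.id, e.name, r.depth + 1
        FROM employees AS e
        JOIN reports AS r ON e.manager_id = r.id
    )
    SELECT name, depth FROM reports ORDER BY depth, name
""").fetchall()

print(chain)
```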
Window functions are SQL functions that perform calculations over a defined set of rows (a window). Compared to the aggregate functions, which return a single value as a result, the window functions allow you to add the aggregated value to each row in a separate column. This means the rows are not grouped and all the rows are kept in the query result. Window functions fall into three broad groups: aggregate, ranking, and value functions.
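To see the contrast between grouping and windowing, here is a short sketch that computes the same department total both ways. It assumes an SQLite build of version 3.25 or later (window functions are not available before that), and the `salaries` table is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE salaries (name TEXT, dept TEXT, salary INTEGER);
    INSERT INTO salaries VALUES ('Ana', 'eng', 100), ('Ben', 'eng', 80), ('Cy', 'ops', 60);
""")

# Aggregate function with GROUP BY: one output row per department.
grouped = conn.execute(
    "SELECT dept, SUM(salary) FROM salaries GROUP BY dept ORDER BY dept"
).fetchall()

# Window function: every input row is kept and gains the department total.
windowed = conn.execute("""
    SELECT name, dept, salary,
           SUM(salary) OVER (PARTITION BY dept) AS dept_total
    FROM salaries
    ORDER BY name
""").fetchall()

print(grouped)
print(windowed)
```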
What Do You Mean By RDBMS
Ans: RDBMS stands for Relational Database Management System. It is the software that allows storing, managing, querying, and retrieving data from a relational database. An RDBMS sits between users and the database, and it can also carry out administrative tasks such as managing data storage, controlling data access, and monitoring database performance.
What Are The Four Types Of Physical Join Operations
- Nested loop joins: They are applied when one join input is small and the other is large and indexed on the join column, because nested loop joins then require minimal I/O and few comparisons.
- Merge joins: When both inputs are large, of similar size, and sorted on the join column, a merge join provides a fast join operation.
- Hash joins: They efficiently process large, non-indexed, and unsorted inputs. They are also helpful for intermediate results in complex queries.
- Adaptive joins: The choice between a hash join and a nested loop join is deferred until the first input has been scanned, and is then made based on a row-count threshold.
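The first three strategies can be sketched as plain Python functions over lists of tuples whose first element is the join key. This is a teaching sketch, not how SQL Server implements them internally; the function names and sample data are invented, and the adaptive join is omitted because it is a runtime choice between two of the others.

```python
from collections import defaultdict

def nested_loop_join(left, right):
    """Compare every left row with every right row."""
    return [(l, r) for l in left for r in right if l[0] == r[0]]

def hash_join(left, right):
    """Build a hash table on one input, then probe it with the other."""
    buckets = defaultdict(list)
    for l in left:
        buckets[l[0]].append(l)
    return [(l, r) for r in right for l in buckets.get(r[0], [])]

def merge_join(left, right):
    """Walk two inputs in step; both must already be sorted on the join key."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i][0] < right[j][0]:
            i += 1
        elif left[i][0] > right[j][0]:
            j += 1
        else:
            key = left[i][0]
            i_end, j_end = i, j
            # Find the run of rows sharing this key on each side.
            while i_end < len(left) and left[i_end][0] == key:
                i_end += 1
            while j_end < len(right) and right[j_end][0] == key:
                j_end += 1
            out.extend((l, r) for l in left[i:i_end] for r in right[j:j_end])
            i, j = i_end, j_end
    return out

customers = [(1, "Ana"), (2, "Ben"), (3, "Cy")]      # sorted on key
orders = [(1, "book"), (1, "pen"), (3, "lamp")]      # sorted on key

nl_result = sorted(nested_loop_join(customers, orders))
hash_result = sorted(hash_join(customers, orders))
merge_result = sorted(merge_join(customers, orders))
print(merge_result)
```

All three return the same pairs; they differ only in how much work they do to find them, which is why the optimizer picks among them based on input size, ordering, and available indexes.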
Q57 What Are Entities And Relationships
Entities: A person, place, or thing in the real world about which data can be stored in a database. Tables store data that represents one type of entity. For example, a bank database has a customer table to store customer information. The customer table stores this information as a set of attributes for each customer.
Relationships: Relations or links between entities that have something to do with each other. For example, the customer name is related to the customer account number and contact information, which might be in the same table. There can also be relationships between separate tables.
SQL Server Interview Questions
The following interview questions have been used on multiple interviewees and seem to work out well in finding the different areas of expertise for an individual. The rating part at the beginning gives you an understanding of where the interviewer thinks they are at in the listed categories. You can then judge yourself after you ask the questions.
What Do You Mean By Relationships Between Tables And Mention Their Types
Ans: Relationships between tables describe how a row in a table is related to a row or rows of another table in a database.
There are three types of relationships: one-to-one, one-to-many (or many-to-one), and many-to-many.
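One common way to model relationships between tables is sketched below in SQLite: a UNIQUE foreign key for one-to-one, a plain foreign key for one-to-many, and a junction table for many-to-many. All table and column names are hypothetical, and this is one possible design rather than the only one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    -- One-to-one: the UNIQUE foreign key allows at most one profile per customer.
    CREATE TABLE profiles (customer_id INTEGER UNIQUE REFERENCES customers(id), bio TEXT);
    -- One-to-many: many orders may point at the same customer.
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER REFERENCES customers(id));
    -- Many-to-many: a junction table links orders and products.
    CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE order_items (
        order_id INTEGER REFERENCES orders(id),
        product_id INTEGER REFERENCES products(id),
        PRIMARY KEY (order_id, product_id)
    );
""")

conn.execute("INSERT INTO customers VALUES (1, 'Ana')")
conn.execute("INSERT INTO profiles VALUES (1, 'first bio')")
conn.executemany("INSERT INTO orders VALUES (?, 1)", [(10,), (11,)])
conn.executemany("INSERT INTO products VALUES (?, ?)", [(100, "book"), (101, "pen")])
conn.executemany("INSERT INTO order_items VALUES (?, ?)", [(10, 100), (10, 101), (11, 100)])

# A second profile for the same customer violates the one-to-one rule.
try:
    conn.execute("INSERT INTO profiles VALUES (1, 'second bio')")
    second_profile_allowed = True
except sqlite3.IntegrityError:
    second_profile_allowed = False

orders_for_ana = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE customer_id = 1").fetchone()[0]
orders_with_book = conn.execute(
    "SELECT COUNT(*) FROM order_items WHERE product_id = 100").fetchone()[0]
print(second_profile_allowed, orders_for_ana, orders_with_book)
```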
Tip: Use WHERE Instead Of HAVING
The HAVING clause filters groups after the rows have been selected and aggregated. Use it only for conditions on aggregated values, not as a general-purpose row filter.
In the SQL order of operations, HAVING is evaluated after WHERE and GROUP BY. A condition placed in WHERE removes rows before they are grouped, so there is less data to aggregate and the query usually runs faster.
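A small sketch of the tip, using an invented `sales` table in SQLite: a condition on a plain column gives the same result in WHERE and in HAVING, so it belongs in WHERE, while a condition on an aggregate can only go in HAVING.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES
        ('north', 10), ('north', 30), ('south', 5), ('south', 50), ('east', 40);
""")

# Preferred: rows for 'east' are discarded before any grouping happens.
by_where = conn.execute("""
    SELECT region, SUM(amount) FROM sales
    WHERE region <> 'east'
    GROUP BY region ORDER BY region
""").fetchall()

# Same result, but every region is aggregated first and 'east' is thrown away afterwards.
by_having = conn.execute("""
    SELECT region, SUM(amount) FROM sales
    GROUP BY region
    HAVING region <> 'east' ORDER BY region
""").fetchall()

# Legitimate use of HAVING: the condition depends on the aggregated value.
big_regions = conn.execute("""
    SELECT region, SUM(amount) FROM sales
    GROUP BY region
    HAVING SUM(amount) > 40 ORDER BY region
""").fetchall()

print(by_where, by_having, big_regions)
```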
Hope you enjoyed reading these tips for SQL query optimization. If you have any questions, feel free to drop a comment.
You can learn more about SQL queries and syntax at W3Schools tutorial.
About Author: Avishek Kumar Singh is a Senior Tech Lead at Mantra Labs a leading application development service provider in insurtech and e-commerce domains. He has years of experience in developing robust web and mobile applications for enterprises.
Which Database Are We Using
Another important factor can be which database we are using. Every database is built for a specific use case. Using the wrong type of database can severely impact your performance.
For example, a row-oriented relational database like Postgres is generally not the best choice when you want to run heavy analytical queries. While relational databases can perform a good number of analytical functions, their features and performance in that domain are more limited than those of a database built specifically for such workloads, for example, a columnar data warehouse such as Redshift.
Many databases are built for a specific kind of problem and generally perform much better on that problem than a general-purpose database does. There are databases for searching, for geospatial data, for time series data, for storing temporary data, etc.
Explain The Difference Between OLTP And OLAP
OLTP: It stands for online transaction processing, a category of software applications designed to support transaction-oriented programs efficiently. One of the important attributes of an OLTP system is its ability to maintain consistency. OLTP systems often follow a decentralized design to avoid single points of failure. They are generally built for a large audience of end users performing short transactions. The queries involved are generally simple, need a fast response time, and return only a few records. So, the number of transactions per second is an effective measure for such systems.
OLAP: It stands for online analytical processing, a category of software programs characterized by a comparatively low frequency of online transactions. For OLAP systems, response time is the key measure of effectiveness. Such systems are generally used for data mining or for maintaining aggregated historical data, and they usually use multidimensional schemas.
Tip: Proper Indexing
An index is a data structure that improves the speed of data retrieval operations on a database table. A unique index additionally guarantees that no two rows have the same value in the indexed columns. Proper indexing ensures quicker access to the database, i.e. you'll be able to select or sort rows faster. The following diagram explains the basics of indexing while structuring tables.
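The uniqueness guarantee of a unique index can be checked with a short SQLite sketch. The `users` table and `idx_users_email` index are hypothetical names used only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
# The unique index speeds up lookups by email and also rejects duplicate values.
conn.execute("CREATE UNIQUE INDEX idx_users_email ON users (email)")

conn.execute("INSERT INTO users (email) VALUES ('ana@example.com')")
try:
    conn.execute("INSERT INTO users (email) VALUES ('ana@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

user_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(duplicate_rejected, user_count)
```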
Q1: What Is The Main Difference Between A Non
If a Non-clustered index is built over a heap table or view, the leaf level nodes of that index hold the index key values and Row ID (RID) pointers to the location of the rows in the heap table. The RID consists of the file identifier, the data page number, and the number of the row (its slot) on that data page.
On the other hand, if a Non-clustered index is created over a Clustered table, the leaf level nodes of that index contain the Non-clustered index key values and the clustering keys for the base table, which are used to locate the rows in the Clustered index data pages.
A RID Lookup operation is performed to retrieve the rest of the columns that are not available in the index from the heap table, based on the ID of each row.
A Key Lookup operation is performed to retrieve the rest of the columns that are not available in the index from the Clustered index, based on the Clustered key of each row.
Q1: How Could We Benefit From The Include Feature To Overcome Non
Rather than creating a Non-clustered index with a wide key, large columns that are needed to cover the query can be added to the Non-clustered index as non-key columns using the INCLUDE clause of the CREATE INDEX T-SQL statement, which was introduced in SQL Server 2005. An index can have up to 1023 non-key columns and must have at least one key column.
The INCLUDE feature extends the functionality of Non-clustered indexes by allowing us to cover more queries. The added non-key columns are stored only in the leaf level of the index, and their values do not appear in the root and intermediate levels. In this case, the SQL Server Query Optimizer will locate all required columns from that index, without the need for any extra lookups. Using included columns can help to avoid exceeding the Non-clustered index key limits (900 bytes and 16 columns in older versions, raised to 1,700 bytes and 32 columns in SQL Server 2016), as the SQL Server Database Engine does not count non-key columns when calculating the size and number of columns of the index key. In addition, SQL Server allows us to include columns with data types that are not allowed in the index key, such as VARCHAR(MAX), NVARCHAR(MAX), VARBINARY(MAX) and XML. The deprecated text, ntext and image types cannot be used as included columns.