A clustered index is a type of database index that organizes the data rows in a table according to the index key values, which determine the physical order of the data on disk. Each table can have only one clustered index because the data rows themselves can be stored in only one order.
Key Characteristics of Clustered Index:
Physical Order of Data: In a clustered index, the data rows are stored on disk in the same order as the index key. This means that the data is physically sorted and stored according to the values of the indexed column(s).
Unique per Table: There can be only one clustered index per table because the table rows themselves can be stored in only one order. This index is often created on the primary key by default.
B-tree Structure: Clustered indexes use a B-tree structure where the leaf nodes of the tree contain the actual data pages of the table. This structure helps in maintaining order and allows for efficient searching, insertion, and deletion operations.
Automatic Creation: By default, SQL Server creates a clustered index on the primary key column(s) if no clustered index exists and no other index is specified.
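As a short sketch (the table and column names are illustrative), a clustered index can also be created explicitly rather than relying on the primary key default:

```sql
-- Hypothetical Orders table: after this statement, the rows are
-- physically ordered on disk by OrderDate.
CREATE CLUSTERED INDEX IX_Orders_OrderDate
ON dbo.Orders (OrderDate);
```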
A non-clustered index in SQL is a type of index where the logical order of the index does not match the physical stored order of the rows on the disk. In other words, the data in the table is not stored in the order of the index keys. Instead, a non-clustered index maintains a separate structure from the data rows, containing pointers to the actual data rows in the table.
Key Characteristics of Non-Clustered Index:
Separate Structure: Non-clustered indexes are stored separately from the data rows. The index contains a structure that includes the indexed columns and pointers to the corresponding data rows.
Multiple Indexes: A table can have multiple non-clustered indexes. This allows for flexible indexing strategies on different columns to optimize various queries.
B-tree Structure: Non-clustered indexes also use a B-tree structure. The leaf nodes of this tree do not contain the actual data but rather pointers (row locators) to the data rows.
Logical Ordering: The rows are logically ordered in the non-clustered index, but this order does not affect the physical storage of the data rows in the table.
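The contrast can be sketched as follows (names are illustrative); the index is a separate B-tree whose leaf nodes point back to the rows:

```sql
-- A second access path over CustomerName; the table's physical order
-- (e.g., its clustered index) is unchanged.
CREATE NONCLUSTERED INDEX IX_Orders_CustomerName
ON dbo.Orders (CustomerName);
```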
Types of DBMS:
Relational DBMS (RDBMS):
Data is organized into tables (relations) with rows and columns.
Uses SQL for data manipulation.
Examples: MySQL, PostgreSQL, Oracle Database, Microsoft SQL Server.
NoSQL DBMS:
Designed for unstructured or semi-structured data.
Includes various types such as document-based, key-value, column-family, and graph databases.
Examples: MongoDB (document-based), Redis (key-value), Cassandra (column-family), Neo4j (graph).
In-Memory DBMS:
Stores data primarily in memory for faster access.
Examples: Redis, SAP HANA.
Hierarchical and Network DBMS:
Early forms of DBMS.
Hierarchical DBMS: Organizes data in a tree-like structure (e.g., IBM’s Information Management System (IMS)).
Network DBMS: Uses a graph structure to represent data relationships (e.g., Integrated Data Store (IDS)).
The DELETE command in SQL is used to remove one or more rows from a table. It is a Data Manipulation Language (DML) command, meaning it directly affects the data stored in the database but not the structure of the database itself. The DELETE command allows for selective deletion of rows based on conditions specified in a WHERE clause. If no WHERE clause is provided, all rows in the table are deleted.
Key Characteristics of the DELETE Command:
Selective Deletion:
Allows you to delete specific rows based on a condition.
If no condition is provided, all rows in the table are deleted.
Logging:
DELETE operations are fully logged in the transaction log. Each row deletion is recorded, which makes it possible to roll back the operation if necessary.
Syntax
DELETE FROM table_name WHERE condition;
Triggers: DELETE statements fire any DELETE triggers defined on the table (AFTER DELETE or INSTEAD OF DELETE triggers in SQL Server; BEFORE DELETE triggers in database systems that support them).
Transaction Control: DELETE can be part of a transaction. You can commit or roll back the transaction, which provides control over data modifications.
Foreign Key Constraints: DELETE operations respect foreign key constraints. If a row is referenced by a foreign key in another table, you must handle this relationship (e.g., using cascading deletes or first deleting the dependent rows).
Performance: For large datasets, DELETE can be slower than TRUNCATE because each row deletion is individually logged.
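A minimal sketch of the points above, assuming a hypothetical Customers table:

```sql
-- Selective deletion inside a transaction: each removed row is logged,
-- so the operation can be rolled back before committing.
BEGIN TRANSACTION;

DELETE FROM dbo.Customers
WHERE IsActive = 0;   -- without this WHERE clause, every row would be deleted

COMMIT TRANSACTION;   -- or ROLLBACK TRANSACTION to undo
```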
The TRUNCATE command in SQL is used to remove all rows from a table quickly and efficiently. Unlike the DELETE command, which removes rows one at a time and logs each deletion, TRUNCATE operates by deallocating the data pages used by the table, making it much faster for large datasets. It is considered a Data Definition Language (DDL) command rather than a Data Manipulation Language (DML) command because it affects the table's structure in a way that impacts data storage.
Key Characteristics of the TRUNCATE Command
Removes All Rows: TRUNCATE removes all rows from a table, leaving the table structure intact.
Syntax:
TRUNCATE TABLE table_name;
Performance: TRUNCATE is much faster than DELETE for large tables because it deallocates the data pages used by the table rather than logging individual row deletions.
Logging: TRUNCATE is minimally logged, meaning it records the deallocation of the data pages rather than each row deletion. This results in less transaction log space usage and faster execution.
Cannot Activate Triggers: Unlike DELETE, TRUNCATE does not fire DELETE triggers because it does not perform row-by-row deletions.
Resets Identity Columns: If a table has an identity column, TRUNCATE resets the counter for that column back to its seed value.
Constraints: TRUNCATE cannot be used on tables that are referenced by foreign keys. All such foreign key constraints must be dropped before truncating the table.
Transaction Control: TRUNCATE can be rolled back if it is part of a transaction. However, it is an all-or-nothing operation and does not allow for partial rollback.
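For comparison with DELETE, a minimal sketch (the table name is illustrative):

```sql
-- Removes every row by deallocating data pages; the table structure
-- remains, and any identity column restarts at its seed value.
TRUNCATE TABLE dbo.StagingOrders;
```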
OLTP, or Online Transaction Processing, is a type of data processing that focuses on managing transaction-oriented applications. These applications typically involve a large number of short, online transactions that are executed in real-time. OLTP systems are designed to handle a high volume of read and write operations and ensure data integrity in multi-user environments.
Key Characteristics of OLTP Systems:
High Transaction Volume: OLTP systems handle a large number of transactions per second. These transactions are typically short and involve insert, update, delete, or read operations.
Real-Time Processing: Transactions are processed immediately as they occur, providing real-time results and ensuring that the system reflects the most current data.
Data Integrity: OLTP systems enforce strict data integrity constraints through the use of ACID (Atomicity, Consistency, Isolation, Durability) properties. This ensures that transactions are processed reliably and that the database remains consistent even in the case of system failures.
Concurrency Control: OLTP systems manage simultaneous transaction execution by multiple users through concurrency control mechanisms such as locking and transaction isolation levels, ensuring data consistency and preventing conflicts.
Normalization: Database schemas in OLTP systems are typically highly normalized to reduce redundancy and improve data integrity.
Performance Optimization: OLTP systems are optimized for fast query processing and quick response times to ensure efficient transaction processing.
The UPDATE_STATISTICS command in SQL Server is used to update the statistics of indexes and columns. This is typically done to ensure the query optimizer has up-to-date information about the distribution of data, which helps in creating efficient query execution plans. It is particularly useful after significant data modifications (inserts, updates, deletes) or when the performance of queries is degrading.
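In T-SQL the command is written UPDATE STATISTICS (with a space). A brief sketch, using an illustrative table name:

```sql
-- Refresh all statistics on one table.
UPDATE STATISTICS dbo.Orders;

-- Or refresh out-of-date statistics for every table in the current database.
EXEC sp_updatestats;
```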
The key difference between a HAVING clause and a WHERE clause is:
WHERE Clause: Filters rows before any groupings are made. It is used to specify a condition for selecting rows from a table.
HAVING Clause: Filters groups after the grouping operation has been performed. It is used to specify a condition for groups created by the GROUP BY clause.
In essence, use WHERE to filter individual rows and HAVING to filter groups.
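A minimal sketch showing both filters in one query (table and column names are illustrative):

```sql
SELECT CustomerID, SUM(Amount) AS TotalAmount
FROM dbo.Orders
WHERE OrderDate >= '2024-01-01'   -- WHERE: filters rows before grouping
GROUP BY CustomerID
HAVING SUM(Amount) > 1000;        -- HAVING: filters the aggregated groups
```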
SQL Profiler is a diagnostic tool in Microsoft SQL Server that allows you to capture and analyze SQL queries and database activities in real time. It helps in monitoring performance, debugging queries, and auditing database operations.
SQL Server Agent is a component of Microsoft SQL Server that automates and schedules tasks, such as running jobs, handling alerts, and managing scripts and maintenance plans. It helps in automating routine database operations and administrative tasks.
Log Shipping is a high-availability feature in SQL Server that involves automatically backing up transaction logs from a primary database and restoring them to one or more secondary databases. This helps maintain a standby server that can quickly take over in case of a primary server failure, providing disaster recovery and data redundancy.
The key differences between local and global temporary tables in SQL Server are:
Local Temporary Table:
Prefixed with a single # (e.g., #TempTable).
Scope is limited to the session or connection that created it.
Automatically dropped when the session ends.
Global Temporary Table:
Prefixed with double ## (e.g., ##TempTable).
Available to all sessions and connections.
Dropped when the session that created it ends and all other sessions referencing it have finished using it.
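A brief sketch of the two forms:

```sql
-- Local temporary table: visible only to the session that creates it.
CREATE TABLE #TempTable (ID INT, Name NVARCHAR(50));

-- Global temporary table: visible to all sessions until the creating
-- session ends and no other session is still referencing it.
CREATE TABLE ##TempTable (ID INT, Name NVARCHAR(50));
```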
A PRIMARY KEY is a column or combination of columns in a database table that uniquely identifies each row in that table. It enforces uniqueness and ensures that no duplicate values exist in the specified column(s). Additionally, a primary key automatically creates a unique index and cannot contain NULL values.
A CHECK constraint is a rule applied to a column in a database table to enforce specific conditions on the data values that can be stored in that column. It ensures that all values in the column meet the defined criteria, enhancing data integrity. For example, a CHECK constraint can ensure that values in a column are greater than a certain number or within a specific range.
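Both constraints can be sketched in one illustrative table definition:

```sql
CREATE TABLE dbo.Employees (
    EmployeeID INT PRIMARY KEY,               -- unique, non-NULL identifier
    Salary DECIMAL(10, 2) CHECK (Salary > 0)  -- rejects zero or negative values
);
```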
Scheduled jobs or scheduled tasks are automated processes set to run at specific times or intervals in a database or operating system. In SQL Server, scheduled jobs are managed by SQL Server Agent and can include tasks such as running SQL scripts, backing up databases, and performing maintenance operations. These tasks help automate routine operations and improve efficiency by running them without manual intervention.
A NOT NULL constraint in a database ensures that a column cannot have a NULL value. It is used to enforce that every row in the table must have a value for that particular column, thereby guaranteeing data completeness and integrity for that column.
Yes, SQL Servers can be linked to other servers, including Oracle, using a feature called Linked Servers. Linked Servers allow SQL Server to execute commands against OLE DB data sources on remote servers. This enables querying and managing data across different database systems from within SQL Server, facilitating interoperability and integration between SQL Server and other database platforms such as Oracle.
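A hedged sketch of registering such a linked server; the server name, provider, and data source below are placeholders that depend on the environment:

```sql
EXEC sp_addlinkedserver
    @server     = N'ORACLE_LINK',      -- placeholder linked server name
    @srvproduct = N'Oracle',
    @provider   = N'OraOLEDB.Oracle',  -- Oracle OLE DB provider
    @datasrc    = N'OracleTnsAlias';   -- placeholder TNS alias
```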
An execution plan is a detailed map generated by the SQL Server query optimizer that outlines how a SQL query will be executed. It shows the steps and the order in which operations like scans, joins, and sorting will be performed to retrieve the desired data. Execution plans are used to analyze and optimize query performance.
A UNIQUE key constraint in a database ensures that all values in a column or a group of columns are unique across the rows of the table. Unlike a primary key, a table can have multiple unique keys, and a unique key can contain NULL values (SQL Server allows only one NULL per unique constraint, while many other database systems treat each NULL as distinct and permit several).
Key Characteristics:
Uniqueness: Enforces that no duplicate values are allowed in the specified column(s).
Multiple Constraints: A table can have multiple UNIQUE constraints, each on different columns.
NULL Values: Allows NULL values unless explicitly restricted; in SQL Server, at most one NULL is permitted per unique constraint.
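An illustrative table carrying two separate UNIQUE constraints:

```sql
CREATE TABLE dbo.Users (
    UserID   INT PRIMARY KEY,
    Email    NVARCHAR(100) UNIQUE,  -- no two rows may share an email
    Username NVARCHAR(50)  UNIQUE   -- independent uniqueness rule
);
```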
A FOREIGN KEY is a column or a set of columns in a database table that creates a relationship between two tables. It acts as a reference to the primary key (or a unique key) in another table, ensuring referential integrity by enforcing that the values in the foreign key column(s) must exist in the referenced primary key column(s).
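A minimal sketch of the relationship (table names are illustrative):

```sql
CREATE TABLE dbo.Customers (
    CustomerID INT PRIMARY KEY
);

CREATE TABLE dbo.Orders (
    OrderID    INT PRIMARY KEY,
    -- Every CustomerID stored here must exist in dbo.Customers.
    CustomerID INT REFERENCES dbo.Customers (CustomerID)
);
```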
A UNIQUE key constraint is a database constraint that ensures all values in a specified column or set of columns are unique across the rows of the table. This means no two rows can have the same value(s) in those columns; the handling of NULLs varies by system (SQL Server allows at most one NULL per unique constraint, while many other systems permit multiple).
The STUFF and REPLACE functions in SQL Server are both used for manipulating strings, but they serve different purposes and work in distinct ways.
The STUFF function is used to insert a string into another string, replacing a specified length of characters. It allows you to delete a specified length of characters in a string and then insert another substring starting at a specified position.
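The difference is easiest to see side by side:

```sql
-- STUFF: delete 3 characters starting at position 1, insert 'HTML' there.
SELECT STUFF('SQL Tutorial', 1, 3, 'HTML');            -- 'HTML Tutorial'

-- REPLACE: substitute every occurrence of a substring.
SELECT REPLACE('SQL Tutorial', 'Tutorial', 'Guide');   -- 'SQL Guide'
```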
Stored procedures offer several advantages in database management and application development. Here are some of the key benefits:
Performance Improvement
Precompilation: Stored procedures are parsed when they are created, and their execution plans are compiled and cached on first execution, which can result in faster execution times compared to ad-hoc queries that are compiled at runtime.
Reduced Network Traffic: Executing a stored procedure can reduce network traffic since a single call can execute multiple SQL statements, compared to sending multiple individual queries from the client to the server.
Reusability and Maintainability
Code Reuse: Stored procedures allow you to encapsulate complex logic that can be reused across multiple applications or parts of an application, reducing code duplication.
Easier Maintenance: Changes to business logic can be made in one place within the stored procedure rather than needing to update multiple queries scattered throughout the application.
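A minimal sketch of such a procedure (names are illustrative):

```sql
CREATE PROCEDURE dbo.GetCustomerOrders
    @CustomerID INT
AS
BEGIN
    SELECT OrderID, OrderDate, Amount
    FROM dbo.Orders
    WHERE CustomerID = @CustomerID;
END;
```

Callers then issue a single call, e.g. EXEC dbo.GetCustomerOrders @CustomerID = 42;, instead of sending the full query text each time.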
SQL Server by default runs on TCP/IP port 1433. This port is used for client-server communication when the SQL Server instance is configured to use the TCP/IP protocol.
SQL Server supports two authentication modes: Windows Authentication Mode and Mixed Mode.
Windows Authentication Mode: In Windows Authentication Mode, SQL Server validates the user's credentials using their Windows account. This mode leverages the Windows security infrastructure, providing seamless integration with Active Directory and centralized user management.
Mixed Mode: In Mixed Mode, SQL Server supports both Windows Authentication and SQL Server Authentication. SQL Server Authentication allows the use of SQL Server-specific logins and passwords, which are managed by SQL Server itself. This mode provides more flexibility, especially for applications and users that do not have Windows accounts.
Subqueries, also known as inner queries or nested queries, are queries nested within another query. They are a powerful feature in SQL that allows you to perform more complex operations and retrieve data based on the results of another query.
Types of Subqueries:
Single-row subquery, where the subquery returns only one row.
Multiple-row subquery, where the subquery returns multiple rows.
Multiple-column subquery, where the subquery returns multiple columns.
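Brief illustrative examples of the first two forms:

```sql
-- Single-row subquery: the inner query returns one scalar value.
SELECT Name
FROM dbo.Employees
WHERE Salary > (SELECT AVG(Salary) FROM dbo.Employees);

-- Multiple-row subquery: the inner query returns a set, compared with IN.
SELECT Name
FROM dbo.Customers
WHERE CustomerID IN (SELECT CustomerID FROM dbo.Orders);
```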
UNION combines the result sets of two or more SELECT statements and removes duplicate rows.
UNION ALL combines the result sets of two or more SELECT statements without removing duplicates.
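A quick sketch with illustrative tables:

```sql
-- Duplicate cities appear once in the result.
SELECT City FROM dbo.Customers
UNION
SELECT City FROM dbo.Suppliers;

-- Duplicate cities are kept, which is also cheaper to execute.
SELECT City FROM dbo.Customers
UNION ALL
SELECT City FROM dbo.Suppliers;
```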
Stored procedures are precompiled collections of SQL statements stored in the database. They provide performance benefits, code reusability, security through encapsulation, and ease of maintenance.
ACID stands for Atomicity, Consistency, Isolation, and Durability. These properties ensure reliable processing of database transactions.
Atomicity: Ensures that all operations within a transaction are completed successfully; if not, the transaction is aborted.
Consistency: Ensures that a transaction brings the database from one valid state to another.
Isolation: Ensures that the operations of one transaction are isolated from the operations of other transactions.
Durability: Ensures that the results of a committed transaction are permanent and survive system failures.
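Atomicity in particular is easy to sketch (table and values are illustrative): either both updates commit or neither does.

```sql
BEGIN TRANSACTION;
    UPDATE dbo.Accounts SET Balance = Balance - 100 WHERE AccountID = 1;
    UPDATE dbo.Accounts SET Balance = Balance + 100 WHERE AccountID = 2;
COMMIT TRANSACTION;
-- If an error occurs before COMMIT, ROLLBACK TRANSACTION undoes both updates.
```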
BCP (Bulk Copy Program) is a command-line tool provided by Microsoft SQL Server that is used to import and export large amounts of data efficiently between a SQL Server instance and a data file. It is particularly useful for transferring data to or from a database in a fast and efficient manner.
When BCP Is Used:
Bulk Data Import:
When you need to load large volumes of data into a SQL Server table from a file, such as CSV, TXT, or other delimited formats.
Bulk Data Export:
When you need to export data from a SQL Server table to a file, often for the purposes of data sharing, backup, or data migration.
Example: Exporting customer data to a CSV file for analysis in an external tool.
Data Migration:
When moving large datasets between different environments or SQL Server instances, especially when migrating databases or setting up replication.
Example: Migrating data from a development environment to a production environment.
Data Archival:
When archiving old data to a file for long-term storage and potentially freeing up space in the database.
Example: Archiving yearly financial data to a CSV file.
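Typical invocations look like the following sketch; the database, table, server, and file names are placeholders:

```
# Export a table to a comma-delimited file (-c character mode, -T Windows auth,
# -t, sets the field terminator to a comma).
bcp SalesDb.dbo.Customers out customers.csv -S MyServer -T -c -t,

# Import the same file back into a table.
bcp SalesDb.dbo.Customers in customers.csv -S MyServer -T -c -t,
```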