SSIS - sql server integration services Microsoft platform used for solving complex problems like reading,loading,copying large files, data transformation and migration task. SSIS Packages - organized collections of various connections and elements related to data flow, event handlers, variables, control flow, configurations, or parameters that would be used for assembling and programmatically building graphical design tools - Data flow - Control flow - Package explorer - Event handler . SSIS Expressions - used for filtering information based on parameters and conditions to get desired information. They make lives easy by helping to work with conditions, loops, and dynamic connections - Control Flow - Data Flow . Manifest File - the utility having information that is useful for deploying packages using file system wizard and database of SQL server. . Data transformation - process of extracting required data from a data source and is the most critical SSIS step. Post extraction, the process aids in managing and transferring the data to a specific file destination. . Solution explorer - Screen where you can view and access all the data sources, data sources views and projects . Data flow - Flow of data from corresponding sources to target the destination . Container - logicla group of tasks, and it allows to maanage the scope of a task togethor - Sequence container - For loop container - Foreach loop container - Task host container . Precedence Constraint - Enables you to define a logical sequence of tasks in the order they should be executed. Connect all the tasks using connectors . Variables - To store values, System variable and User variable . Connection managers - gathering data from different sources and writing it to a destination Connection manager facilitates the connection to the system that includes, information's like data provider information, server name, authentication mechanism, database name, etc. . Breakpoint - Enables you to pause the execution of the package in the BI studio during troubleshooting your package to undestand the performance package . EventLogging - allows you to select any specific event of a task or a package to be logged.
- Disabled: To disable logging of the component - Enabled: To enable logging of the part - Use Parent Setting: To use the parent's setting of the component . Data flow buffer - it is a kind of an in-memory virtual table to hold data. . Conditional Split - like IF condition, which checks for the given condition based on the condition evaluation . Views in SSIS - . Grid . Histogran . Scatter Plot . Column Chart . Event Handlers - On the event handlers tab, workflows can be configured to respond to package events. For instance, you can configure workflow when any task stops, fails or starts. . Deploy SSIS package on production - need to execute the manifest files and need to determine whether to deploy this into File System or onto SQL Server. Alternatively, you can also import package from SSMS from SQL Server or File System. . Handling early arrival facts or late arriving dimensions Late Arriving Dimension are unavoidable; to handle these we can create a dummy dimension with natural/business key and keep the rest of the attributes as null or default. So when the actual dimension arrives, the dummy dimension is updated with Type 1 change. This is also referred to as Inferred Dimensions . Incremental load - using timestamp column in the source table and storing the last ETL timestamp. . Data conversion transformation - convert data from one type to other . Disadvantages - SSIS sometimes create issues in non-windows environments • Unclear vision and strategy • SSIS doesn't provide support for alternative data integration styles Execute SQL Task - helps you execute SQL statement against relational database SSIS Catalog - database to store all deployed packages . Project a container for developing packages while packages is an object which helps you implement ETL . XML tasks allows you to split,merge,split or reformat any XML format C:\Users\Daniel\source\repos\Integration Services Project1\Integration Services Project1\Integration Services Project1.dtpro .Sequence Container - Sequence contain helps you to organize
subsidiary tasks by dividing them into the group. It will enable you to apply transaction or assign logging to the container . Control flow - SSIS includes dataflow tasks, containers and precedence constraints which helps you to connect containers and functions C:\Users\Daniel\source\repos\Integration Services Project1\Integration Services Project1\Integration Services Project1.dtproj . OLE DB for native driver ETLing large data for faster speeds and ADO.NET for normal ETL . expression to connect the flatfile/csv/xlxs in a foreloop container and make connections with other files . SMO enumaerators allows to iterate through databse objects such as the database, schemas,tables and columns from sql to flatfile conversion . Incremental data loading - by using CDC, Incremental data loading can be performed in which data from the last ETL is needed to be trasnferred and improving ETL perfomance . CDC - Change capture data, needs to be enabled on specific tables in the datasource, SQL than captures the changes and than stored in CDC tables in the source database . Derived column - allows you to create calculated columns using expressions, functions, and operators. EXAMPLE :- CONCAT . Error handling - add scrpit for error handling description with the red pipe . Deploying packages from Integrations Services catalog on SSMS, Just like SPs bur packages can be stored in SSMS . variables vs parameters - . Variable limited to one specific package in which it is defined . Parameters can be used in all the SSIS packages created in the particular project
No comments:
Post a Comment