SQL Server Integration Services

De Banane Atomic
Version datée du 9 mars 2023 à 16:33 par Nicolas (discussion | contributions) (→‎Expressions)
(diff) ← Version précédente | Voir la version actuelle (diff) | Version suivante → (diff)
Aller à la navigationAller à la recherche

Présentation

Permet de:

  1. extraire des données depuis une source: fichier, base de données
  2. transformer ces données
  3. charger ces données vers une destination: fichier, base de données

SSIS est un outil ETL (Extraction Transformation and Load)

Liens

Nouveau Projet

Visual Studio → File → New → Project → Integration Services Project

Control Flow

Récupération des données.

Exécuter du code SQL

Execute SQL Task

Télécharger un fichier depuis internet

  1. Spécifier l'url du fichier à télécharger: clique-droit dans Connection Managers → New Connection → Type: HTTP
  2. Spécifier l'emplacement où télécharger le fichier: clique-droit dans Connection Managers → New File Connection
  3. Ajouter une Script Task
ScriptMain.cs
try
{
    // Logging start of download
    bool fireAgain = true;
    Dts.Events.FireInformation(0, "Download File", "Start downloading " + Dts.Connections["HTTP Connection Manager"].ConnectionString, string.Empty, 0, ref fireAgain);

    // Get your newly added HTTP Connection Manager
    var mySSISConnection = Dts.Connections["HTTP Connection Manager"].AcquireConnection(null);

    // Create a new connection
    var myConnection = new HttpClientConnection(mySSISConnection);

    // Download file and use the File Connectionstring to save the file
    myConnection.DownloadFile(Dts.Connections["DownloadedFile"].ConnectionString, true);

    // Logging end of download
    Dts.Events.FireInformation(0, "Download File", "Finished downloading " + Dts.Connections["DownloadedFile"].ConnectionString, string.Empty, 0, ref fireAgain);

    // Quit Script Task succesful
    Dts.TaskResult = (int)ScriptResults.Success;
}
catch (Exception ex)
{
    // Logging why download failed
    Dts.Events.FireError(0, "Download File", "Download failed: " + ex.Message, string.Empty, 0);

    // Quit Script Task unsuccesful
    Dts.TaskResult = (int)ScriptResults.Failure;
}

Data Flow

Mise en forme des données.

Fichier CSV

SSIS Toolbox Box → Other Destinations/Sources → Flat File Destination/Source

Multiple sources

Connection → MULTIFLATFILE pour plusieurs fichiers CSV ayant exactement les mêmes colonnes
Union All pour réunir des données non triées
Merge pour réunir des données triées
Merge Join pour réunir des données découpées et liées

Data Conversion

Create a new column with a data converted in a new type.

Tips to convert from Excel to MariaDB
Excel type MariaDB type
DT_DATE DT_DBDATE

Derived Column

Create a new column or replace an existing one with computed result as value.

Tsql.svg
-- convert a DT_NUMERIC to DT_DBDATE (2020[int] → 2020-01-01[date])
COLUMN1 == 0 ? NULL(DT_DBDATE) : (DT_DBDATE)((DT_STR,4,1252)COLUMN1 + "-01-01")

Aggregate

Do an aggregate operation like Group by Count Sum Average Minimum Maximum

Lookup

Creates 2 outputs, one for the matched rows and another one for the non-matched rows.

Conditional Split

Creates multiples outputs, one for each filter.

Union All

Combines the rows of 2 sources.

Data Warehousing

A way to handle separately new rows and rows to update.
To do so we can use the Slowly Changing Dimension Wizard (only for small datasets).
Or doing it manually:

  1. Use a lookup to isolate new rows
    1. Add new rows to destination
    2. Use a conditional split to isolate rows to update from non-new rows
      1. Update the rows to update

Expressions

Concatenating values

Tsql.svg
[Column1] + " - " + [Column2]
-- if [Column1] or [Column2] is null, the result will be null

-- handle null values
(ISNULL([Column1]) ? "" : [Column1]) + " - " + (ISNULL([Column2]) ? "" : [Column2])

Cast to string

Tsql.svg
"Date: " + (DT_STR, 20, 1252)@[Column3]

Variables

Right-click on the designer → Variables

  • Filter: show system variables or variables of all scopes
  • Columns: add Namespace, Description, Raise event when var value changes

Erreurs

A required privilege is not held by the client

Getting this error while deploying.
This error happens after the installation of Visual Studio SSDT extension.
To fix it, run the repai process of SQL Server.

Unable to bulk copy data. You may need to run this package as an administrator

Utiliser OLE DB Destination au lieu de SQL Server Destination.

Installation

  1. New SQL stand-alone installation or add features to an existing installation
  2. Instance Features → Database Engine Services
  3. Shared Features → Integration Services

Enable named pipes

  • C:\Windows\SysWOW64\SQLServerManager16.msc
  • SQL Server Network Configuration → Protocols for MSSQLSERVER
  • Named Pipes → Enabled
  • TCP/IP → Enabled

SSIS Catalog

  • SSMS → right-click on Integration Services Catalogs → Create Catalog
  • Enable CLR integration
  • Enable automatic execution of IS stored procedure at SQL Server startup

SQL Server Agent

  • Start from SSMS or Services

Visual Studio SQL Server Integration Services Extension