Teradata Parallel Transporter TPT – Basics , Example and Syntax

Teradata Parallel Transporter - Basics, Syntax and Example

Teradata Parallel Transporter TPT – Basics , Example and Syntax

We receive so many requests for covering Teradata Utilities tutorials in our blog. The reason primarily I was not doing it because there are other blogs which already cover it. However one utility in specific I find the resources available online are not easy to understand or not very clear. I am talking about Teradata Parallel Transporter or in short TPT. This is my attempt to make it simple and easy to understand.

Before we proceed with writing TPT script, let me give you some background about this utility and why it is gaining popularity now. TPT is a single utility which internally uses other Teradata Utilities depending on the requirement specified by user. So if you want to load bulk data into empty table you can write TPT script which invokes FASTLOAD internally and loads the data. Similarly it uses MULTILOAD, TPUMP and of course BTEQ utilities to perform various functions. So now you don’t have to write separate scripts for different utilities. One utility which replaces all Teradata utilities.  In TPT for every activity there is an associated operator. Operator is nothing but you can think of role which the component plays. Different operator plays different role. For time being just remember:

BTEQ – DDL Operator
FASTLOAD – LOAD Operator
MULTILOAD – UPDATE Operator
FASTEXPORT – EXPORT Operator
TPUMP – STREAM Operator

Also, Producer Operator refers to Source eg reading from Data File. Consumer operator refers to Target eg extracting data into Data File. Job Steps are execution block of code which applies various operators to producer or consumer.

So let’s start with a simple example where we will read data from a file and load it into target table which is empty now. So before TPT was introduced we used to write FASTLOAD scripts for this scenario. Today we will write TPT script to load the data.

So we have a Data File which we want to load and an Empty Target Table.

Step 1 : Define FILE Schema and Producer Operator to read files

DEFINE SCHEMA SCHEMA_EMP_NAME
(
NAME VARCHAR(50),
AGE VARCHAR(2)
);

 

DEFINE OPERATOR op_EMP_NAME
TYPE DATACONNECTOR PRODUCER
SCHEMA SCHEMA_EMP_NAME
ATTRIBUTES
(
VARCHAR  DirectoryPath= ‘’,
VARCHAR  FileName = ‘NAME.CSV’,
VARCHAR  Format   = ‘Delimited’,
VARCHAR  OpenMode = ‘Read’,
VARCHAR  TextDelimiter =’|’
);

 

Perfect. So we have defined the file schema and it consists of only 2 columns Name & Age.  Also we have defined a PRODUCER operator specifying file details like File path, Filename, File Delimiter etc. There are more options available for it however to keep it simple we will check only these few options here.

 

Step 2: Define a DDL operator to drop/create Error tables and Target Table.

 

DEFINE OPERATOR od_EMP_NAME
TYPE DDL
ATTRIBUTES
(
VARCHAR PrivateLogName = ”,
VARCHAR LogonMech           = ”,
VARCHAR TdpId          = ”,
VARCHAR UserName       = ”,
VARCHAR UserPassword   = ”,
VARCHAR ErrorList      = ‘3807’
);

 

So now we have defined a DDL operator which will drop error tables if already exists and will re create target table. In our case we need target table to be empty so we can recreate it all the time.

 

Note: Fill in the details for LogonMech/TDPID/USERNAME/PASSWORD.

 

Step 3: Define a LOAD operator to load the data file into target table.

 

DEFINE OPERATOR ol_EMP_NAME
TYPE LOAD
SCHEMA *
ATTRIBUTES
(
VARCHAR LogonMech           = ”,
VARCHAR TdpId          = ”,
VARCHAR UserName       = ”,
VARCHAR UserPassword   = ”,
VARCHAR  LogTable     =  ‘EMP_NAME_LG’,
VARCHAR  ErrorTable1  =  ‘EMP_NAME_ET’,
VARCHAR  ErrorTable2  =  ‘EMP_NAME_UV’,
VARCHAR  TargetTable  =  ‘EMP_NAME’
);

Now we have defined LOAD operator and mentioned Target table and Error tables.

Note: Fill in the details for LogonMech/TDPID/USERNAME/PASSWORD.

 

Step 4: Define Job Step to execute DDL Operator

 

STEP stSetup_Tables
(
APPLY
(‘DROP TABLE EMP_NAME_LG;’),
(‘DROP TABLE EMP_NAME_ET;’),
(‘DROP TABLE EMP_NAME_UV;’),
(‘DROP TABLE EMP_NAME;’),
(‘CREATE TABLE EMP_NAME(NAME VARCHAR(50), AGE VARCHAR(2));’)
TO OPERATOR (od_EMP_NAME);
);

Step 5: Define Job Step to execute LOAD operator

STEP stLOAD_FILE_NAME
(
APPLY
(‘INSERT INTO EMP_NAME
(Name,Age)
VALUES
(:Name,:Age);
‘)
TO OPERATOR (ol_EMP_NAME)
SELECT * FROM OPERATOR(op_EMP_NAME);
);

 

Summary:

  1. Define FILE Schema and Producer Operator to read files.
  2. Define a DDL operator to drop/create Error tables and Target Table.
  3. Define a LOAD operator to load the data file into target table.
  4. Define Job Step to execute DDL Operator
  5. Define Job Step to execute LOAD operator

 

Keeping everything together and giving a JOB name:
DEFINE JOB SAMPLE_FILE_LOAD
DESCRIPTION ‘Load a Teradata table from a file’
(
DEFINE SCHEMA SCHEMA_EMP_NAME
(
EMP_NAME VARCHAR(50),
AGE VARCHAR(2)
);

DEFINE OPERATOR op_EMP_NAME
TYPE DATACONNECTOR PRODUCER
SCHEMA SCHEMA_EMP_NAME
ATTRIBUTES
(
VARCHAR DirectoryPath= ”,
VARCHAR FileName = ‘EMP_NAME.CSV’,
VARCHAR Format = ‘Delimited’,
VARCHAR OpenMode = ‘Read’,
VARCHAR TextDelimiter =’|’
);

DEFINE OPERATOR od_EMP_NAME
TYPE DDL
ATTRIBUTES
(
VARCHAR LogonMech = ”,
VARCHAR TdpId = ”,
VARCHAR UserName = ”,
VARCHAR UserPassword = ”,
VARCHAR ErrorList = ‘3807’
);

DEFINE OPERATOR ol_EMP_NAME
TYPE LOAD
SCHEMA *
ATTRIBUTES
(
VARCHAR LogonMech = ”,
VARCHAR PrivateLogName = ‘load_log’,
VARCHAR TdpId = ”,
VARCHAR UserName = ”,
VARCHAR UserPassword = ”,
VARCHAR LogTable = ‘EMP_NAME_LG’,
VARCHAR ErrorTable1 = ‘EMP_NAME_ET’,
VARCHAR ErrorTable2 = ‘EMP_NAME_UV’,
VARCHAR TargetTable = ‘EMP_NAME’
);

STEP stSetup_Tables
(
APPLY
(‘DROP TABLE EMP_NAME_LG;’),
(‘DROP TABLE EMP_NAME_ET;’),
(‘DROP TABLE EMP_NAME_UV;’),
(‘DROP TABLE EMP_NAME;’),
(‘CREATE TABLE EMP_NAME(EMP_NAME VARCHAR(50), AGE VARCHAR(2));’)
TO OPERATOR (od_EMP_NAME);
);

STEP stLOAD_FILE_NAME
(
APPLY
(‘INSERT INTO EMP_NAME
(EMP_NAME,Age)
VALUES
(:EMP_NAME,:Age);
‘)
TO OPERATOR (ol_EMP_NAME)
SELECT * FROM OPERATOR(op_EMP_NAME);
);
);

Executing the script:

tbuild –f TPT_Scriptname.tpt

Whenever we execute TPT there is a .out file created which is like a log file for TPT. You cannot read contents of .out file by cat or more or any other UNIX command if running on UNIX or in Notepad or any other Text Editor. To view the file use below command:

tlogview –l TPT_Script.out

So we created one basic TPT script and successfully executed it to load data from File into Target Table. I have tried to make this post simple to understand. Hence I have explained many things here in my language which you may find different than other books or materials on TPT. However I hope if you understand the concept here you can relate to what others have written about TPT.

Watch the video to see working of TPT script explained above

19 Comments

  1. This is fantastic. It is very true that there are not many examples on TPT online. This explanation of yours above, of the whole process is concise, accurate and what more?!, you actually have given me a dead on example of what I was looking for, a file to import into a CSV. It really takes passion to write articles to help others out. Thank you for your hard work and keep up the good work!

    1. Correction…….I meant an import from a csv.

    2. Author

      Hi Karan
      Glad you like the post. Let us know if you are looking for any specific topic in Teradata. We will try to cover it in our future posts.

      Cheers
      Nitin

      1. Hi Raj,

        I do have a request, took you up on it very quickly there :D, but I’m trying to assign the OS command output value to a variable, so I can insert into a table. Essentially, all I’m trying to do, is to count the number of lines in a csv and load it and check if all the records have made it into the database. I could always use the error tables to judge that, but it is a standard where we work, so I’m trying to count the lines in the csv using powershell, and once I have that count, I want to be able to insert into an audits table.

        Thanks,
        Karan

        1. Author

          Hi Karan

          I added below two steps in tpt script to get the data file row count:

          DEFINE OPERATOR OS_COMMAND_OPERATOR
          TYPE OS COMMAND
          ATTRIBUTES
          (
          VARCHAR PrivateLogName = ‘oscommand_log’,
          VARCHAR OsCmd = ‘wc -l NAME.CSV|cut -d” ” -f1>count.txt’,
          VARCHAR IgnoreError = ‘YES’
          );

          STEP Count_Row_Data_File
          (
          APPLY TO OPERATOR (OS_COMMAND_OPERATOR);
          );
          Now you can read the file count.txt to check number of records in file. You cannot directly load the variable value into table as the os command operator is not allowed for TD operations.

          Let me know if you find any other work around to achieve this.

          Thanks
          Raj

  2. Hi Raj

    I am having issues running the script u shared.
    Getting this error
    TPT_INFRA: Syntax error at or near line 6 of Job Script File ‘TPTScript.tpt’:
    TPT_INFRA: At “NAME” missing RPAREN_ in Rule: Explicit Schema Element List
    TPT_INFRA: Syntax error at or near line 8 of Job Script File ‘TPTScript.tpt’:

    My script is as below:

    DEFINE JOB SAMPLE_FILE_LOAD
    DESCRIPTION ‘Load a Teradata table from a file’
    (
    DEFINE SCHEMA SCHEMA_EMP_NAME
    (
    NAME VARCHAR(50),
    AGE VARCHAR(2)
    );

    DEFINE OPERATOR op_EMP_NAME
    TYPE DATACONNECTOR PRODUCER
    SCHEMA SCHEMA_EMP_NAME
    ATTRIBUTES
    (
    VARCHAR DirectoryPath= ‘/TPT/’,
    VARCHAR FileName = ‘NAME.CSV*’,
    VARCHAR Format = ‘Delimited’,
    VARCHAR OpenMode = ‘Read’,
    VARCHAR TextDelimiter =’|’
    );

    DEFINE OPERATOR od_EMP_NAME
    TYPE DDL
    ATTRIBUTES
    (
    VARCHAR PrivateLogName = ‘ddl_log’,
    VARCHAR LogonMech = ‘LDAP’,
    VARCHAR TdpId =”,
    VARCHAR UserName =”,
    VARCHAR UserPassword =”,
    VARCHAR ErrorList = ‘3807’
    );

    DEFINE OPERATOR ol_EMP_NAME
    TYPE LOAD
    SCHEMA *
    ATTRIBUTES
    (
    VARCHAR LogonMech = ‘LDAP’,
    VARCHAR TdpId =”,
    VARCHAR UserName =”,
    VARCHAR UserPassword =”,
    VARCHAR LogTable = ‘EMP_NAME_LG’,
    VARCHAR ErrorTable1 = ‘EMP_NAME_ET’,
    VARCHAR ErrorTable2 = ‘EMP_NAME_UV’,
    VARCHAR TargetTable = ‘EMP_NAME’
    );

    STEP stSetup_Tables
    (
    APPLY
    (DROP TABLE EMP_NAME_LG;),
    (DROP TABLE EMP_NAME_ET;),
    (DROP TABLE EMP_NAME_UV;),
    (DROP TABLE EMP_NAME;),
    (‘CREATE TABLE SCHEMA_EMP_NAME.EMP_NAME(NAME VARCHAR(50), AGE VARCHAR(2));’)
    TO OPERATOR (od_EMP_NAME);
    );

    STEP stLOAD_FILE_NAME
    (
    APPLY
    (INSERT INTO SCHEMA_EMP_NAME.EMP_NAME
    (Name,Age)
    VALUES
    (:Name,:Age);
    )
    TO OPERATOR (ol_EMP_NAME)
    SELECT * FROM OPERATOR(op_EMP_NAME);
    );
    );

    Please advice what all changes need to be done in the script shared to work as the script itself is creating the Table but the Schema name is not defined.

    1. Author

      Hi Kshitij

      I executed similar TPT script in different TPT versions. I got the error in V15.00
      Below is the complete Error Description.

      “Teradata Parallel Transporter Version 15.00.00.00
      TPT_INFRA: Syntax error at or near line 6 of Job Script File ‘TPT_Load.tpt’:
      TPT_INFRA: At “NAME” missing RPAREN_ in Rule: Explicit Schema Element List
      TPT_INFRA: Syntax error at or near line 8 of Job Script File ‘TPT_Load.tpt’:
      TPT_INFRA: TPT03020: Rule: DEFINE SCHEMA
      Compilation failed due to errors. Execution Plan was not generated.
      Job script compilation failed.
      Job terminated with status 8.”

      Looks like you are also using the same TPT version. To overcome this error all I did is changed NAME to EMP_NAME in DEFINE SCHEMA section. So the new Schema definition looks like

      DEFINE SCHEMA SCHEMA_EMP_NAME
      (
      EMP_NAME VARCHAR(50),
      AGE VARCHAR(2)
      );

      After this change the TPT script executed successfully.
      Let me know the results once you try this change.

      Cheers
      Raj

  3. Can you give an example on TPT Upsert, using ODBC as source

  4. Hi Kiran,

    Could you please give an example, how we can export the data from teradata to HDFS(Hadoop distributed file system) location.

    Thanks,
    Ranjit


  5. Hello, this was extremely helpful. Can you please also provide a sample TPT script where instead of one file (as shown above), we are using multiple (lets say, 3) data files to load data into one empty table. I have searched through internet but there’s no example as easy as your to follow. Greatly appreciate your help!


  6. Hi,

    Need a help in TPT script while loading data from flat file to a Teradata table.

    Eg:

    STEP load_data

    (

    APPLY

    $INSERT
    TO OPERATOR (LOAD_OPERATOR () [1] )

    SELECT * FROM OPERATOR (EXPORT_OPERATOR() [1] );

    );

    How to pass SELECT * FROM statement as a parameter ?
    eg:

    STEP load_data

    (

    APPLY

    $INSERT
    TO OPERATOR (LOAD_OPERATOR () [1] )

    @SELECTSQL OPERATOR (EXPORT_OPERATOR() [1] );

    );

    I m using the command to execute the script:

    tbuild -f FF_TGT_TBL_LOAD.txt -u “TgtDatabase = ‘DEV_XXX_XXX’, TgtTable = ‘FF_ABC_XYZ’, ,SELECTSQL=’SELECT * FROM ‘ ”

    But it is getting failed. Could anyone of you please help me to fix this ?

  7. Thanks for such a good article.. Simple scripting style

    1. Author

      Glad you liked it Ganesh.

      Cheers
      Raj

  8. Dear Auther,

    Could you please help me to get script for data source as a Oracle ‘ s table:
    Like, I have to extract data from oracle database table in keep in Teradata database table without landing in file.

    Much appreciated for your support.

    Thanks & Regards
    AK

    1. Hi,

      I am also looking for a similar script to load a teradata table using any ODBC connection as source. Please let me know if you have already have a working solution for it.

Leave a Reply