Sunday, June 28, 2015

tNormalize: How to normalize a multivalued attribute in Talend

Use the tNormalize component to break a multivalued attribute stored in an RDBMS column into individual rows.
This simple scenario illustrates a Job that normalizes a multivalued attribute and displays the result in a table on the Run console. The source Excel sheet contains a multivalued attribute in a column called CourceOffered.
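To make the idea concrete, here is a minimal plain-Java sketch of what normalizing a multivalued column means. It is not the code Talend generates for tNormalize; the class name, the method, the comma separator, and the sample values are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in for the effect of tNormalize; not Talend's generated code.
class NormalizeSketch {

    // Turns one input row into one output row per value of the multivalued field.
    // The other column (here, a name) is repeated unchanged on every output row.
    static List<String[]> normalize(String name, String multiValued, String separator) {
        List<String[]> rows = new ArrayList<>();
        // Pattern.quote treats the separator as literal text, not as a regex.
        for (String value : multiValued.split(Pattern.quote(separator))) {
            String trimmed = value.trim();
            if (!trimmed.isEmpty()) {
                rows.add(new String[] { name, trimmed });
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        // Sample data is invented; the real sheet's contents are not shown in this post.
        for (String[] row : normalize("Alice", "Math, Physics, Chemistry", ",")) {
            System.out.println(row[0] + " | " + row[1]);
        }
    }
}
```

Each value of the multivalued field ends up on its own row, which is the shape the Job displays on the Run console.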



Monday, June 1, 2015

Talend Tutorial Step By Step - Post 3: tHashInput and tHashOutput

Talend provides a pair of components, tHashInput and tHashOutput, for processing large amounts of data at high speed. The tHashInput component belongs to the Technical family of components and lets you quickly read row data from memory that was previously saved by tHashOutput.


tHashInput


This component is used along with tHashOutput. It reads the data that tHashOutput has loaded into cache memory. Together, these twin components offer high-speed data access to facilitate transactions involving a massive amount of data.

tHashOutput

This component writes data to the cache memory and is closely related to tHashInput. Together, these twin components offer high-speed data access to facilitate transactions involving a massive amount of data.
tHashOutput and tHashInput work in conjunction and serve no purpose when used separately. If you have a large data set that you want to process several times quickly, you can use the tHashOutput component to write the data into cache memory and then read it back with the tHashInput component.
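The write-once, read-many pattern described above can be sketched in plain Java as follows. This is only an illustration of the idea and makes no claim about how Talend implements these components internally; the class, the method names, and the "customers" cache name are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of caching rows in memory; not Talend's internals.
class HashCacheSketch {

    // In-memory store shared by the writing and reading sides, keyed by a cache name.
    private static final Map<String, List<String[]>> CACHE = new HashMap<>();

    // Plays the role of tHashOutput: appends a row to the named cache.
    static void write(String cacheName, String[] row) {
        CACHE.computeIfAbsent(cacheName, k -> new ArrayList<>()).add(row);
    }

    // Plays the role of tHashInput: returns the cached rows without touching the source again.
    static List<String[]> read(String cacheName) {
        return Collections.unmodifiableList(
                CACHE.getOrDefault(cacheName, Collections.emptyList()));
    }

    public static void main(String[] args) {
        write("customers", new String[] { "1", "Alice" });
        write("customers", new String[] { "2", "Bob" });
        // The same rows can be read as many times as needed.
        for (int pass = 1; pass <= 2; pass++) {
            for (String[] row : read("customers")) {
                System.out.println("pass " + pass + ": " + row[0] + " | " + row[1]);
            }
        }
    }
}
```

The sketch also shows why neither side is useful alone: reading a cache that nothing has written to simply returns no rows.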

Enabling tHashInput and tHashOutput

Many of the exercises rely on the tHashInput and tHashOutput components. Talend 5.2.3 does not automatically enable these components for use in Jobs. To enable them, follow the instructions in the next section:

How to do it…

  1. On the main menu bar navigate to File | Edit Project properties to open the properties dialogue.
  2. Select Designer then Palette Settings.
  3. Click on the Technical folder and then click on the button shown in the following screenshot to add this folder to the Show panel.


The following example shows how to create a tHashOutput and then retrieve the data using tHashInput.


The following image shows how to create a sample tHashOutput.
The following image shows how to connect tHashOutput with tHashInput. Once you run the Job, you will see that the same records are retrieved from tHashOutput.
Cheers!

Talend Tutorial Step by Step - Post 1: Built-in schema for CSV source with context

In this post, you will create a Built-in schema for a CSV file that has no column header, with the source file path supplied as a context variable.
Create a context for the source path variable:
In the repository, right-click Contexts, select Create context group, and create a context variable for sourcePath.
Once you create the context, you have to select it under the Job as shown below.
Drag a tFileInputDelimited component from the palette, and open it by double-clicking it. Click the Edit Schema button (…), shown in the following screenshot, to open the schema editor:
Built-In Schema: Use the following steps to create a Built-in schema for the following source file.
Rename the subjob as you want.
Once you run the Job, you will get the following output in the tLogRow component.
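Outside Talend Studio, what this Job does can be approximated in plain Java: take the file path from a context value, read a file that has no header row, and name the fields from a schema defined in the Job itself. The sketch below is an assumption-laden illustration rather than Talend's generated code. The column names, the semicolon separator, and the sample rows are invented, and only the sourcePath variable name comes from this post.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch of a headerless CSV read with a built-in schema.
class HeaderlessCsvSketch {

    // Built-in schema: column names live in the Job because the file has no header row.
    // These names are invented for the example.
    static final String[] SCHEMA = { "id", "name", "city" };

    // Maps each line's fields to the schema columns by position.
    static List<Map<String, String>> parse(List<String> lines, String separator) {
        List<Map<String, String>> rows = new ArrayList<>();
        for (String line : lines) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            String[] fields = line.split(Pattern.quote(separator), -1);
            Map<String, String> row = new LinkedHashMap<>();
            for (int i = 0; i < SCHEMA.length; i++) {
                row.put(SCHEMA[i], i < fields.length ? fields[i] : null);
            }
            rows.add(row);
        }
        return rows;
    }

    // Reads the file whose path is stored under the sourcePath context variable.
    static List<Map<String, String>> readFile(Map<String, String> context, String separator) {
        try {
            return parse(Files.readAllLines(Paths.get(context.get("sourcePath"))), separator);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Write a small sample file so the example runs on its own.
        Path tmp = Files.createTempFile("customers", ".csv");
        Files.write(tmp, Arrays.asList("1;Alice;Paris", "2;Bob;Berlin"));

        Map<String, String> context = new HashMap<>();
        context.put("sourcePath", tmp.toString());

        for (Map<String, String> row : readFile(context, ";")) {
            System.out.println(row);
        }
        Files.delete(tmp);
    }
}
```

Keeping the path in the context map rather than in the code mirrors the point of the context variable: the same Job can be pointed at a different file without editing the component.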
Cheers!

Talend Tutorial Step by Step - Post 2: Creating a generic schema from the existing metadata

Generic schemas

Generic schemas aren’t tied to a particular source, so they can be used as a shared resource across multiple types of data source, or they can be used to define data sources that are generated, such as the output from custom SQL queries.
Any schema can be easily converted into a generic schema to enable it to be re-used.
The following recipe shows two methods of creating generic schemas:
  1. From a pre-existing schema in the metadata repository
Rename it as you want by editing the generic schema.
  2. From a built-in schema
This will open a Windows file save dialogue. Save the file as customerDelimited.xml


Now create a new generic schema from the saved XML file by right-clicking Generic schemas, and selecting the option Create generic schema from XML.

Cheers!