5 Loading Strategies


This chapter explains the loading strategies used for loading data into the staging area. These strategies are implemented in the Loading Knowledge Modules.

This chapter includes the following sections:

  • Section 5.1, "Loading Process"

  • Section 5.2, "Case Studies"

5.1 Loading Process

A loading process is required when source data needs to be loaded into the staging area. This loading is needed when some transformations take place in the staging area and the source schema is not located on the same data server as the staging area. The staging area is the target of the loading phase.

5.1.1 Loading Process Overview

A typical loading process works in the following way:

  1. A temporary loading table is dropped (if it exists) and then created in the staging area.

  2. Data is loaded from the source into this loading table using a loading method.

    Steps 1 and 2 are repeated for all the source data that needs to be moved to the staging area.

    The data is used in the integration phase to load the integration table.

  3. After the integration phase, before the mapping completes, the temporary loading table is dropped.

5.1.2 Loading Table Structure

The loading process creates a loading table in the staging area. This loading table is typically prefixed with C$.

A loading table represents an execution unit and not a source datastore. There is no direct mapping between the source datastore and the loading table. Execution units appear in the physical diagram of the mapping editor.

The following cases illustrate the notion of execution unit:

  • If a source CUSTOMER table has only two attributes, CUST_NAME and CUST_ID, used in mappings and joins on the staging area, then the loading table will only contain an image of these two attributes. Attributes not needed for the rest of the integration flow do not appear in the loading table.

  • If a CUSTOMER table is filtered on CUST_AGE on the source, and CUST_AGE is not used afterwards, then the loading table will not include CUST_AGE. The loading process executes the filter on the source data server, and the loading table contains only the filtered records.

  • If two tables CUSTOMER and SALES_REPS are combined using a join on the source and the resulting execution unit is used in transformations taking place in the staging area, the loading table will contain the combined attributes from CUSTOMER and SALES_REPS (see the sketch after this list).

  • If all the attributes of a source datastore are mapped and this datastore is not joined on the source, then the execution unit is the whole source datastore. In that case, the loading table is the exact image of the source datastore. This is the case for source technologies with no transformation capabilities such as File.
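As a sketch of the joined case above, the loading table for such an execution unit could carry attributes from both sources in a single C$ table. The table name, column names, and data types below are assumptions chosen for illustration, not names generated by a particular KM:

create table C$_0CUSTOMER
(
  C1_CUST_ID    NUMBER(10)    NULL,
  C2_CUST_NAME  VARCHAR2(50)  NULL,
  C3_REP_NAME   VARCHAR2(50)  NULL
)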

5.1.3 Loading Method

The loading method is the key to optimal performance when loading data from a source to the staging area. There are several loading methods, which can be grouped in the following categories:

  • Loading Using the Agent

  • Loading File Using Loaders

  • Loading Using Unload/Load

  • Loading Using RDBMS-Specific Strategies

5.1.3.1 Loading Using the Agent

The run-time agent is able to read a result set using JDBC on a source server and write this result set using JDBC to the loading table in the staging area. To use this method, the knowledge module needs to include a command with a SELECT on the source and a corresponding INSERT on the target.

This method may not be suited for large volumes, as all the data flows through the run-time agent: rows are read from the source in batches using the array fetch feature and written to the loading table in batches using the batch update feature.

5.1.3.2 Loading File Using Loaders

When the mapping contains a flat file as a source, you may want to use a strategy that leverages the most efficient loading utility available for the staging area technology, rather than the standard LKM File to SQL that uses the ODI built-in driver for files. Most RDBMSs have fast loading utilities to load flat files into tables, such as Oracle's SQL*Loader, Microsoft SQL Server bcp, Teradata FastLoad or MultiLoad.

Such an LKM loads the source file into the staging area, and all transformations then take place in the staging area.

A typical LKM using a loading utility will include the following sequence of steps:

  1. Drop and create the loading table in the staging area.

  2. Generate the script required by the loading utility to load the file to the loading table.

  3. Execute the appropriate operating system command to start the load and check its return code (see the sketch after this list).

  4. Possibly analyze any log files produced by the utility for error handling.

  5. Drop the loading table once the integration phase has completed.
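As a sketch of steps 2 and 3 for an Oracle staging area, the KM could generate a SQL*Loader control file named after the loading table and then invoke the utility from an operating system command step. The file paths, credentials, and naming convention below are placeholders for illustration, not values produced by a specific KM:

sqlldr userid=<staging_user>/<staging_password>@<staging_tns> control=<temp_dir>/<loading_table>.ctl log=<temp_dir>/<loading_table>.log

The return code of this command is then checked, and the log file can be parsed for rejected records as part of step 4.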

5.1.3.3 Loading Using Unload/Load

When the source result set is on a remote database server, an alternate solution to using the agent to transfer the data is to unload it to a file and then load that file into the staging area.

This is usually the most efficient method when dealing with large volumes across heterogeneous technologies. For example, you can unload data from a Microsoft SQL Server source using bcp and load this data into an Oracle staging area using SQL*Loader.

The steps of LKMs that follow this strategy are often as follows:

  1. Drop and create the loading table in the staging area.

  2. Unload the data from the source to a temporary flat file using either a source database unload utility (such as Microsoft SQL Server bcp or DB2 unload) or the built-in OdiSqlUnload tool (a sketch follows this list).

  3. Generate the script required by the loading utility to load the temporary file to the loading table.

  4. Execute the appropriate operating system command to start the load and check its return code.

  5. Possibly analyze any log files produced by the utility for error handling.

  6. Drop the loading table once the integration KM has terminated, and drop the temporary file.
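As a sketch of step 2 using the built-in OdiSqlUnload tool, the command below unloads a filtered result set to a temporary file. The connection settings, file name, and query are placeholders for illustration; an actual KM would normally build them with substitution methods rather than hard-code them:

OdiSqlUnload -FILE=/tmp/CUSTOMER.unl -DRIVER=<source_jdbc_driver> -URL=<source_jdbc_url> -USER=<source_user> -PASS=<encoded_password> -FIELD_SEP=;
select CUST_ID, CUST_NAME from CUSTOMER where CUST_AGE > 21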

When using an unload/load strategy, data needs to be staged twice: once in the temporary file and a second time in the loading table, resulting in extra disk space usage and potential efficiency issues. A more efficient alternative is to use a pipeline between the unload utility and the load utility. Unfortunately, not all operating systems support file-based pipelines (FIFOs).

5.1.3.4 Loading Using RDBMS-Specific Strategies

Certain RDBMSs have a mechanism for transferring data across servers. For example:

  • Oracle: database links

  • Microsoft SQL Server: linked servers

  • IBM DB2 400: DRDA file transfer

Other databases implement specific mechanisms for loading files into a table, such as Oracle's External Table feature.

These loading strategies are implemented in specific KMs that create the appropriate objects (views, database links, and so forth) and implement the appropriate commands for using these features.
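For illustration, a KM relying on Oracle database links could replace the agent-based transfer with a single set-based statement executed on the staging area. The link, table, and column names below are hypothetical:

insert into C$_0CUSTOMER (C1_CUST_ID, C2_CUST_NAME)
select CUST_ID, UPPER(CUST_NAME)
from CUSTOMER@SOURCE_DB_LINK
where CUST_AGE > 21

Because both the read and the write happen inside the database engines, no data flows through the run-time agent.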

5.2 Case Studies

This section provides examples of loading strategies.

5.2.1 LKM SQL to SQL

The LKM SQL to SQL is a typical example of the loading phase using the agent.

The commands below are extracted from this KM and are provided as examples. You can review the code of this knowledge module by editing it in Oracle Data Integrator Studio.

5.2.1.1 Drop Work Table

This task drops the loading table. This command is always executed and has the Ignore Errors flag activated. It will not stop the LKM if the loading table is not found.

Command on Target

drop table <%=snpRef.getTable("L", "COLL_NAME", "A")%>

5.2.1.2 Create Work Table

This task creates the loading table in the staging area. This command is always executed.

Note the use of the property COLL_NAME of the getTable method that returns the name of the loading table.

Command on Target

create table <%=snpRef.getTable("L", "COLL_NAME", "A")%>
(
  <%=snpRef.getColList("", "[CX_COL_NAME]\t[DEST_WRI_DT] " + snpRef.getInfo("DEST_DDL_NULL"), ",\n\t", "","")%>
)
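With a hypothetical CUSTOMER execution unit reduced to CUST_ID and CUST_NAME (the first case described in Section 5.1.2; the names and data types below are assumptions for illustration), this template could expand to something like:

create table C$_0CUSTOMER
(
  C1_CUST_ID    NUMBER(10)    NULL,
  C2_CUST_NAME  VARCHAR2(50)  NULL
)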

5.2.1.3 Load Data

This task reads data on the source connection and loads it into the loading table. This command is always executed.

Note:

The loading phase always uses auto commit, as ODI temporary tables do not contain unrecoverable data.

Command on Source

Note the use of the getFilter, getJoin, getFrom, etc. methods. These methods are shortcuts that return contextual expressions. For example, getFilter returns all the filter expressions executed on the source. Note also the use of the EXPRESSION property of getColList, which returns the source attributes and the expressions executed on the source. These expressions and source attributes are aliased after CX_COL_NAME, which is the name of their corresponding attribute in the loading table.

This SELECT statement causes the appropriate transformations (mappings, joins, filters, and so on) to be executed by the source engine.

select <%=snpRef.getPop("DISTINCT_ROWS")%>
  <%=snpRef.getColList("", "[EXPRESSION]\t[ALIAS_SEP] [CX_COL_NAME]", ",\n\t", "", "")%>
from <%=snpRef.getFrom()%>
where (1=1)
<%=snpRef.getFilter()%>
<%=snpRef.getJrnFilter()%>
<%=snpRef.getJoin()%>
<%=snpRef.getGrpBy()%>
<%=snpRef.getHaving()%>
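With the same hypothetical CUSTOMER execution unit (a source filter on CUST_AGE and an UPPER transformation executed on the source), the generated statement might look like the following; all names are assumptions for illustration:

select CUST.CUST_ID            C1_CUST_ID,
       UPPER(CUST.CUST_NAME)   C2_CUST_NAME
from   CUSTOMER CUST
where  (1=1)
and    CUST.CUST_AGE > 21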

Command on Target

Note here the use of binding with :[CX_COL_NAME]. The bound CX_COL_NAME value matches the alias set on the source attribute by the SELECT statement above.

insert into <%=snpRef.getTable("L", "COLL_NAME", "A")%>
(
  <%=snpRef.getColList("", "[CX_COL_NAME]", ",\n\t", "","")%>
)
values
(
  <%=snpRef.getColList("", ":[CX_COL_NAME]", ",\n\t", "","")%>
)
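Expanded with the same hypothetical names, the agent would execute the following statement on the staging area, binding each row fetched from the source to the :C1_CUST_ID and :C2_CUST_NAME placeholders:

insert into C$_0CUSTOMER
(
  C1_CUST_ID,
  C2_CUST_NAME
)
values
(
  :C1_CUST_ID,
  :C2_CUST_NAME
)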

5.2.1.4 Drop Work Table

This task drops the loading table. This command is executed only if the DELETE_TEMPORARY_OBJECTS knowledge module option is selected. Deselecting this option allows you to preserve the loading table for debugging purposes.

Command on Target

drop table <%=snpRef.getTable("L", "COLL_NAME", "A")%>