Chapter 9. Translators

Data Virtualization uses the Teiid Connector Architecture (TCA), which provides a robust mechanism for integrating with external systems. The TCA defines a common client interface between Data Virtualization and an external system that includes metadata as to what SQL constructs are available for pushdown and the ability to import metadata from the external system.

A Translator is the heart of the TCA and acts as the bridge logic between Data Virtualization and an external system.

Translators can have a number of configurable properties. These are broken down into execution properties, which determine aspects of how data is retrieved, and import settings, which determine what metadata is read for import.

The execution properties for a translator typically have reasonable defaults. For specific translator types, such as the Derby translator, base execution properties are already tuned to match the source. In most cases the user will not need to adjust their values.

Table 9.1. Base execution properties - shared by all translators

NameDescriptionDefault

Immutable

Set to true to indicate that the source never changes. The transactional capability is reported as NONE, and update commands will fail.

false

RequiresCriteria

Set to true to indicate that source SELECT/UPDATE/DELETE queries require a where clause.

false

SupportsOrderBy

Set to true to indicate that the ORDER BY clause can be used.

false

SupportsOuterJoins

Set to true to indicate that OUTER JOINs can be used.

false

SupportsFullOuterJoins

If SupportsOuterJoins is set to true, true indicates that FULL OUTER JOINs can be used.

false

SupportsInnerJoins

Set to true to indicate that INNER JOINs can be used.

false

SupportedJoinCriteria

If join capabilities are enabled, defines the criteria that can be used as the join criteria. May be one of (ANY, THETA, EQUI, or KEY).

ANY

MaxInCriteriaSize

If the use of IN criteria is enabled, specifies the maximum number of IN entries per predicate. -1 indicates no limit.

-1

MaxDependentInPredicates

If the use of IN criteria is enabled, defines what the maximum number of predicates that can be used for a dependent join. Values less than 1 indicate to use only one IN predicate per dependent value pushed (which matches the pre-7.4 behavior).

-1

DirectQueryProcedureName

If SupportsDirectQueryProcedure is set to true for the translator, this property indicates the name of the procedure.

native

SupportsDirectQueryProcedure

Set to true to indicate that direct execution of commands is available for the translator.

false

ThreadBound

Set to true to indicate the translator’s Executions should be processed by only a single thread

false

CopyLobs

If true, then returned large object (LOB) data (clob, blob, sql/xml) is copied by the engine in a memory safe manner. Use this option if the source does not provide memory safe LOBS or if you want to disconnect LOBS from the source connection.

false

TransactionSupport

The highest level of transaction capability. Used by the engine as a hint to determine if a transaction is needed for autoCommitTxn=DETECT mode. Can be one of XA, NONE, or LOCAL. If XA or LOCAL then access under a transaction will be serialized.

XA

Note

Only a subset of the available metadata can be set through execution properties on the base ExecutionFactory. All methods are available on the BaseDelegatingExecutionFactory.

There are no base importer settings.

Override execution properties

For all translators, you can override Execution Properties in the main vdb file.

Example: Overriding a translator property

.

CREATE FOREIGN DATA WRAPPER "oracle-override" TYPE oracle OPTIONS (RequiresCriteria 'true');

CREATE SERVER ora FOREIGN DATA WRAPPER "oracle-override" OPTIONS ("resource-name" 'java:/oracle');

CREATE SCHEMA ora SERVER ora;

SET SCHEMA ora;

IMPORT FROM SERVER ora INTO ora;

The preceding example overrides the oracle translator and sets the RequiresCriteria property to true. The modified translator is only available in the scope of this VDB. As many properties as desired may be overriden together.

See also VDB Definition.

Parameterizable native queries

In some situations the teiid_rel:native-query property and native procedures accept parameterizable strings that can positionally reference IN parameters. A parameter reference has the form `$integer, for example, $1. Note that one-based indexing is used and that only IN parameters may be referenced. Dollar-sign integer is therefore reserved, but may be escaped with another $`, for example, $$1. The value will be bound as a prepared value or a literal is a source specific manner. The native query must return a result set that matches the expectation of the calling procedure.

For example the native-query select c from g where c1 = $1 and c2 = '$$1' results in a JDBC source query of select c from g where c1 = ? and c2 = '$1', where ?` will be replaced with the actual value bound to parameter 1.

General import properties

Several import properties are shared by all translators.

When specifying an importer property, it must be prefixed with importer.. For example, importer.tableTypes.

NameDescriptionDefault

autoCorrectColumnNames

Replace any usage of . in a column name with _ as the period character is not valid in Data Virtualization column names.

true

renameDuplicateColumns

If true, rename duplicate columns caused by either mixed case collisions or autoCorrectColumnNames replacing . with _. A suffix _n where n is an integer will be added to make the name unique.

false

renameDuplicateTables

If true, rename duplicate tables caused by mixed case collisions. A suffix _n where n is an integer will be added to make the name unique.

false

renameAllDuplicates

If true, rename all duplicate tables, columns, procedures, and parameters caused by mixed case collisions. A suffix _n where n is an integer will be added to make the name unique. Supersedes the individual rename duplicate options.

false

nameFormat

Set to a Java string format to modify table and procedure names on import. The only argument will be the original name Data Virtualization name. For example use prod_%s to prefix all names with prod_.

 

9.1. Amazon S3 Translator

The Amazon Simple Storage Service (S3) translator, known by the type name amazon-s3, exposes stored procedures to leverage Amazon S3 object resources.

This translator is typically used with the TEXTTABLE or XMLTABLE functions to consume CSV or XML formatted data, or to read Microsoft Excel files or other object files that are stored in Amazon S3. The S3 translator can access Amazon S3 by using an AWS access key ID and secret access key.

Usage

In the following example, a virtual database reads a CSV file with the name g2.txt from an Amazon S3 bucket called teiidbucket:

e1,e2,e3
5,'five',5.0
6,'six',6.0
7,'seven',7.0
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<vdb name="example" version="1">
    <model name="s3">
        <source name="web-connector" translator-name="user-s3" connection-jndi-name="java:/amazon-s3"/>
    </model>
    <model name="Stocks" type="VIRTUAL">
        <metadata type="DDL"><![CDATA[
        CREATE VIEW G2 (e1 integer, e2 string, e3 double,PRIMARY KEY (e1))
            AS  SELECT SP.e1, SP.e2,SP.e3
                FROM (EXEC s3.getTextFile(name=>'g2.txt')) AS f,
                TEXTTABLE(f.file COLUMNS e1 integer, e2 string, e3 double HEADER) AS SP;
        ]]> </metadata>
    </model>
    <translator name="user-s3" type="amazon-s3">
      <property name="accesskey" value="xxxx"/>
      <property name="secretkey" value="xxxx"/>
      <property name="region" value="us-east-1"/>
      <property name="bucket" value="teiidbucket"/>
    </translator>
</vdb>

Execution properties

Use the translator override mechanism to supply the following properties.

NameDescriptionDefault

Encoding

The encoding that should be used for CLOBs returned by the getTextFiles procedure. The value should match an encoding known to the JRE.

The system default encoding.

Accesskey

Amazon security access key. Log in to Amazon console to find your security access key. When provided, this becomes the default access key.

n/a

Secretkey

Amazon security secret key. Log in to Amazon console to find your security secret key. When provided, this becomes the default secret key.

n/a

Region

Amazon region to be used with the request. When provided, this will be default region used.

n/a

Bucket

Amazon S3 bucket name. If provided, this will serve as default bucket to be used for all the requests

n/a

Encryption

When server-side encryption with customer-provided encryption keys (SSE-C) is used, the key is used to define the "type" of encryption algorithm used. You can configure the translator to use the AES-256 or AWS-KMS encryption algorithms. If provided, this will be used as default algorithm for all "get" based calls.

n/a

Encryptionkey

When SSE-C type encryption used, where customer supplies the encryption key, this key will be used for defining the "encryption key". If provided, this will be used as default key for all "get" based calls.

n/a

Tip

For information about setting properties, see Override execution property in Translators, and review the examples in the sections that follow.

Procedures exposed by translator

When you add the a model (schema) like above in the example, the following procedure calls are available for user to execute against Amazon S3.

Note

bucket, region, accesskey, secretkey, encryption and encryptionkey are optional or nullable parameters in most of the methods provided. Provide them only if they are not already configured by using translator override properties as shown in preceding example.

getTextFile(…​)

Retrieves the given named object as a text file from the specified bucket and region by using the provided security credentials as CLOB.

getTextFile(string name NOT NULL, string bucket, string region,
   string endpoint, string accesskey, string secretkey,string encryption, string encryptionkey, boolean stream default false)
   returns TABLE(file blob, endpoint string, lastModified string, etag string, size long);
Note

endpoint is optional. When provided the endpoint URL is used instead of the one constructed by the supplied properties. Use encryption and encryptionkey only in when server side security with customer supplied keys (SSE-C) in force.

If the value of stream is true, then returned LOBs are read only once and are not typically buffered to disk.

Examples

exec getTextFile(name=>'myfile.txt');

SELECT SP.e1, SP.e2,SP.e3, f.lastmodified
   FROM (EXEC getTextFile(name=>'myfile.txt')) AS f,
   TEXTTABLE(f.file COLUMNS e1 integer, e2 string, e3 double HEADER) AS SP;

getFile(…​)

Retrieves the given named object as binary file from specified bucket and region using the provided security credentials as BLOB.

getFile(string name NOT NULL, string bucket, string region,
   string endpoint, string accesskey, string secretkey, string encryption, string encryptionkey, boolean stream default false)
   returns TABLE(file blob, endpoint string, lastModified string, etag string, size long)
Note

endpoint is optional. When provided the endpoint URL is used instead of the one constructed by the supplied properties. Use encryption and encryptionkey only in when server side security with customer supplied keys (SSE-C) in force.

If the value of stream is true, then returned lOBs are read once and are not typically buffered to disk.

Examples

exec getFile(name=>'myfile.xml', bucket=>'mybucket', region=>'us-east-1', accesskey=>'xxxx', secretkey=>'xxxx');

select b.* from (exec getFile(name=>'myfile.xml', bucket=>'mybucket', region=>'us-east-1', accesskey=>'xxxx', secretkey=>'xxxx')) as a,
XMLTABLE('/contents' PASSING XMLPARSE(CONTENT a.result WELLFORMED) COLUMNS e1 integer, e2 string, e3 double) as b;

saveFile(…​)

Save the CLOB, BLOB, or XML value to given name and bucket. In the following procedure signature, the contents parameter can be any of the LOB types.

call saveFile(string name NOT NULL, string bucket, string region, string endpoint,
   string accesskey, string secretkey, contents object)
Note

You cannot use saveFile to stream or chunk uploads of a file’s contents. If you try to load very large objects, out-of-memory issues can result. You cannot configure saveFile to use SSE-C encryption.

Examples

exec saveFile(name=>'g4.txt', contents=>'e1,e2,e3\n1,one,1.0\n2,two,2.0');

deleteFile(…​)

Delete the named object from the bucket.

call deleteFile(string name NOT NULL, string bucket, string region, string endpoint, string accesskey, string secretkey)

Examples

exec deleteFile(name=>'myfile.txt');

list(…​)

Lists the contents of the bucket.

call list(string bucket, string region, string accesskey, string secretkey, nexttoken string)
    returns Table(result clob)

The result is the XML file that Amazon S3 provides in the following format

<?xml version="1.0" encoding="UTF-8"?>/n
<ListBucketResult
    xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
    <Name>teiidbucket</Name>
    <Prefix></Prefix>
    <KeyCount>1</KeyCount>
    <MaxKeys>1000</MaxKeys>
    <IsTruncated>false</IsTruncated>
    <Contents>
        <Key>g2.txt</Key>
        <LastModified>2017-08-08T16:53:19.000Z</LastModified>
        <ETag>&quot;fa44a7893b1735905bfcce59d9d9ae2e&quot;</ETag>
        <Size>48</Size>
        <StorageClass>STANDARD</StorageClass>
    </Contents>
</ListBucketResult>

You can parse this into a view by using a query similar to the one in the following example:

select b.* from (exec list(bucket=>'mybucket', region=>'us-east-1')) as a,
 XMLTABLE(XMLNAMESPACES(DEFAULT 'http://s3.amazonaws.com/doc/2006-03-01/'), '/ListBucketResult/Contents'
 PASSING XMLPARSE(CONTENT a.result WELLFORMED) COLUMNS Key string, LastModified string, ETag string, Size string,
 StorageClass string,	NextContinuationToken string PATH '../NextContinuationToken') as b;

If all properties (bucket, region, accesskey, and secretkey) are defined as translator override properties, you can run the following simple query:

SELECT * FROM Bucket
Note

If there are more then 1000 object in the bucket, then the value 'NextContinuationToken' need to be supplied as 'nexttoken' into the list call to fetch the next batch of objects. This can be automated in Data Virtualization with enhancement request.

9.2. Delegator translators

You can use the delegator translator, which is available in the core Data Virtualization installation, to modify the capabilities of a existing translator. Often times for debugging purposes, or in special situations, you might want to turn certain capabilities of a translator on or off. For example, say that the latest version of a Hive database supports the ORDER BY construct, but the current Data Virtualization version of the Hive translator does not. You could use the delegator translator to enable ORDER BY compatibility without actually writing any code. Similarly, you could do the reverse, and turn off certain capabilities to produce a better plan.

To use the delegator translator, you must define it in the DDL. The following example shows how to override the "hive" translator and turn off the ORDER BY capability.

CREATE DATABASE myvdb;
USE DATABASE myvdb;
CREATE FOREIGN DATA WRAPPER "hive-delegator" TYPE delegator OPTIONS (delegateName 'hive', supportsOrderBy 'false');
CREATE SERVER source FOREIGN DATA WRAPPER "hive-delegator" OPTIONS ("resource-name" 'java:hive-ds');
CREATE SCHEMA mymodel SERVER source;
SET SCHEMA mymodel;
IMPORT FROM SERVER source INTO mymodel;

For more information about the translator capabilities that you can override by using execution properties, see Translator_Capabilities in the Translator Development Guide. The preceding example shows how you might modify the default ORDER BY compatibility of the Hive translator.

9.2.1. Extending the delegator translator

You can create a delegating translator by extending the org.teiid.translator.BaseDelegatingExecutionFactory. After your classes are packaged as a custom translator, you can wire another translator instance into your delegating translator at runtime in order to intercept all of the calls to the delegate. This base class does not provide any functionality on its own, other than delegation. You can hard code capabilities into the translator instead of defining them as part of the DDL configuration. You can also override methods to provide alternate behavior.

Table 9.2. Execution properties  

Name

Description

Default

delegateName

Translator instance name to delegate to.

n/a

cachePattern

Regex pattern of queries that should be cached using the translator caching API.

n/a

cacheTtl

Time to live in milliseconds for queries matching the cache pattern.

n/a

 

For example, if you use the oracle translator in your virtual database, and you want to intercept calls that go through the translator, you could write a custom delegating translator, as in the following example:

@Translator(name="interceptor", description="interceptor")
public class InterceptorExecutionFactory extends org.teiid.translator.BaseDelegatingExecutionFactory{
    @Override
    public void getMetadata(MetadataFactory metadataFactory, C conn) throws TranslatorException {
        // do intercepting code here..

        // If you want call the original delegate, do not call if do not need to.
        // but if you did not call the delegate fullfill the method contract
        super.getMetadata(metadataFactory, conn);

        // do more intercepting code here..
    }
}

You could then deploy this translator in the Data Virtualization engine. Then in your DDL file, define an interceptor translator as in the following example:

CREATE DATABASE myvdb VERSION '1';
USE DATABASE myvdb VERSION '1';
CREATE FOREIGN DATA WRAPPER "oracle-interceptor" TYPE interceptor OPTIONS (delegateName 'oracle');
CREATE SERVER source FOREIGN DATA WRAPPER "oracle-interceptor" OPTIONS ("resource-name" 'java:oracle-ds');
CREATE SCHEMA mymodel SERVER source;
SET SCHEMA mymodel;
IMPORT FROM SERVER source INTO mymodel;

We have defined a "translator" override called oracle-interceptor, which is based on the custom translator "interceptor" from above, and supplied the translator it needs to delegate to "oracle" as its delegateName. Then, we used this override translator oracle-interceptor in the VDB. Future calls going into this VDB model’s translator are intercepted by your code to do whatever you want to do.

9.3. File translator

The file translator, known by the type name file, exposes stored procedures to leverage file resources. The translator is typically used with the TEXTTABLE or XMLTABLE functions to consume CSV or XML formatted data.

Table 9.3. Execution properties

NameDescriptionDefault

Encoding

The encoding that should be used for CLOBs returned by the getTextFiles procedure. The value should match an encoding known to Data Virtualization. For more information, see TO_CHARS and TO_BYTES in String functions.

The system default encoding.

ExceptionIfFileNotFound

Throw an exception in getFiles or getTextFiles if the specified file/directory does not exist.

true

Tip

For information about how to set properties, see the following example, and Override execution properties in Translators.

Example: Virtual datbase DDL override

CREATE SERVER "file-override"
    FOREIGN DATA WRAPPER file
    OPTIONS(
        Encoding 'ISO-8859-1', "ExceptionIfFileNotFound" false
    );

CREATE SCHEMA file SERVER "file-override";

getFiles

getFiles(String pathAndPattern) returns
TABLE(file blob, filePath string, lastModified timestamp, created timestamp, size long)

Retrieve all files as BLOBs matching the given path and pattern.

call getFiles('path/*.ext')

If the path is a directory, then all files in the directory are returned. If the path matches a single file, the file is returned.

The * character is treated as a wildcard to match any number of characters in the path name. Zero or matching files will be returned.

If *’ is not used, and if the path doesn’t exist and `ExceptionIfFileNotFound is true, then an exception is raised.

getTextFiles

getTextFiles(String pathAndPattern) returns
TABLE(file clob, filePath string, lastModified timestamp, created timestamp, size long)

Note

The size reports the number of bytes.

Retrieve all files as CLOBs matching the given path and pattern.

call getTextFiles('path/*.ext')

Retrieves the same files getFiles, but with the difference that the results are CLOB values that use the encoding execution property as the character set.

saveFile

Save the CLOB, BLOB, or XML value to given path

call saveFile('path', value)

deleteFile

Delete the file at the given path

call deleteFile('path')

The path should reference an existing file. If the file does not exist and ExceptionIfFileNotFound is true, then an exception will be thrown. An exception is also thrown if the file cannot be deleted.

Native queries

This feature is not applicable for the File translator.

Direct query procedure

This feature is not applicable for the File translator.

9.4. Google spreadsheet translator

The google-spreadsheet translator is used to connect to a Google Sheets spreadsheet.

The query approach expects that the data in the worksheet has the following characteristics:

  • All columns that contains data can be queried.
  • Any column with an empty cell has the value retrieved as null. However, differentiating between null string and empty string values may not always be possible as Google treats them interchangeably. Where possible, the translator may provide a warning or throw an exception if it cannot differentiate between null and empty strings.
  • If the first row is present and contains string values, then the row is assumed to represent the column labels.

If you are using the default native metadata import, the metadata for your Google account (worksheets and information about columns in worksheets) is loaded upon translator start up. If you make any changes in data types, it is advisable to restart your virtual database.

The translator can submit queries against a single sheet only. It provides ordering, aggregation, basic predicates, and most of the functions available in the spreadsheet query language.

The google-spreadsheet translator does not provide importer settings, but it can provide metadata for VDBs.

Warning

If you remove all data rows from a sheet with a header that is defined in Data Virtualization, you can no longer access the sheet through Data Virtualization. The Google API will treat the header as a data row at that point, and queries to it will no longer be valid.

Warning

Non-string fields are updated using the canonical Data Virtualization SQL value. In cases where the spreadsheet is using a non-conforming locale, consider disallowing updates. For more information, see TEIID-4854 and the following information about the allTypesUpdatable import property.

Importer properties

  • allTypesUpdatable- Set to true to mark all columns as updatable. Set to false to enable update only on string or Boolean columns that are not affected by TEIID-4854. Defaults to true.

Native queries

Google spreadsheet source procedures may be created using the teiid_rel:native-query extension. For more information, see Parameterizable native queries in Translators. The procedure will invoke the native-query similar to an native procedure call, with the benefits that the query is predetermined, and that result column types are known, rather than requiring the use of ARRAYTABLE or similar functionality. For more information, see the Select section that follows.

Direct query procedure

This feature is turned off by default, because of the security risk in permitting any command to execute against the data source. To enable this feature, set the property SupportsDirectQueryProcedure to true. For more information, see Override execution properties in Translators.

Tip

By default the name of the procedure that executes the queries directly is called native. You can change its name by overriding the execution property DirectQueryProcedureName. For more information, see Override execution properties in Translators.

The Google spreadsheet translator provides a procedure to execute any ad-hoc query directly against the source without any Data Virtualization parsing or resolving. Because the metadata of this procedure’s execution results are not known to Data Virtualization, they are returned as an object array. You can use ARRAYTABLE to construct tabular output for consumption by client applications. For more information, see ARRAYTABLE.

Data Virtualization exposes this procedure with a simple query structure as shown in the following example:

Select example

SELECT x.* FROM (call google_source.native('worksheet=People;query=SELECT A, B, C')) w,
 ARRAYTABLE(w.tuple COLUMNS "id" string , "type" string, "name" String) AS x

The first argument takes semicolon-separated (;) name-value pairs of the following properties to execute the procedure:

PropertyDescriptionRequired

worksheet

Google spreadsheet name.

yes

query

Spreadsheet query.

yes

limit

Number of rows to fetch.

no

offset

Offset of rows to fetch from limit or beginning.

no

9.5. JDBC translators

The JDBC translators bridge the SQL semantics and data type differences between Data Virtualization and a target RDBMS. Data Virtualization has a range of specific translators that target the most popular open source and proprietary relational databases.

Usage

Usage of a JDBC source is straight-forward. Using Data Virtualization SQL, the source can be queried as if the tables and procedures were local to the Data Virtualization system.

If you are using a relational data source, or a data source that has a JDBC driver, and you do not find a specific translator available for that data source type, then start with the JDBC ANSI translator. The JDBC ANSI translator should enable you to perform the SQL basics. If there specific data source capabilities that are not available, you can define a custom translator that does what you need. For more information, see Translator Development.

Type conventions

UID types including UUID, GUID, or UNIQUEIDENTIFIER are typically mapped to the Data Virtualization string type. JDBC data sources treat UID strings as non-case sensitive, but they are case-sensitive in Data Virtualization. If the source does not support the implicit conversion to the string type, then usage in functions that expect a string value might fail at the source.

The following table lists the execution properties that are shared by all JDBC translators.

Table 9.4. Execution properties — Shared by all JDBC translators

NameDescriptionDefault

DatabaseTimeZone

The time zone of the database. Used when fetching date, time, or timestamp values.

The system default time zone

DatabaseVersion

The specific database version. Used to further tune the use of pushdown operations.

The base compatible version, or the version that is derived from the DatabaseMetadata.getDatabaseProductVersion string. Automatic detection requires a connection. If there are circumstances where you are getting an exception due to capabilities being unavailable (for example, because a connection is not available), then set DatabaseVersion property. Use the JDBCExecutionFactory.usesDatabaseVersion()` method to control whether your translator requires a connection to determine capabilities.

TrimStrings

true trims trailing whitespace from fixed length character strings. Note that Data Virtualization only has a string, or varchar, type that treats trailing whitespace as meaningful.

false

RemovePushdownCharacters

Set to a regular expression to remove characters that not allowed or undesirable for the source. For example [\u0000] will remove the null character, which is problematic for sources such as PostgreSQL and Oracle. Note that this does effectively change the meaning of the affected string literals and bind values, which must be carefully considered.

 

UseBindVariables

true indicates that PreparedStatements should be used and that literal values in the source query should be replace with bind variables. If false only LOB values will trigger the use of PreparedStatements.

true

UseCommentsInSourceQuery

This will embed a leading comment with session/request id in the source SQL for informational purposes. Can be customized with the CommentFormat property.

false

CommentFormat

MessageFormat string to be used if UseCommentsInSourceQuery is enabled. You can set the format to one of the following values:

  • 0 - Session ID string.
  • 1 - Parent request ID string.
  • 2 - Request part ID string.
  • 3 - Execution count ID string.
  • 4 - User name string.
  • 5 - VDB name string.
  • 6 - VDB version integer.
  • 7 - Is transactional boolean.

/*teiid sessionid:\{0}, requestid:\{1}.\{2}*/

MaxPreparedInsertBatchSize

The max size of a prepared insert batch.

2048

StructRetrieval

Specify one of the following Struct retrieval modes:

  • OBJECT - getObject value returned.
  • COPY - Returned as a SerialStruct.
  • ARRAY - Returned as an array.

OBJECT

EnableDependentJoins

Allow dependent join pushdown for sources that use temporary tables (DB2, Derby, H2, HSQL 2.0+, MySQL 5.0+, Oracle, PostgreSQL, SQLServer, SQP IQ, Sybase).

false

Importer properties — Shared by all JDBC translators

When specifying the importer property, it must be prefixed with importer.. Example: importer.tableTypes

NameDescriptionDefault

catalog

See DatabaseMetaData.getTables [1]

null

schemaName

Recommended setting to import from a single schema. The schema name will be converted into an escaped pattern,overriding schemaPattern if it is also set.

null

schemaPattern

See DatabaseMetaData.getTables [1]

null

tableNamePattern

See DatabaseMetaData.getTables [1]

null

procedureNamePattern

See DatabaseMetaData.getProcedures [1]

null

tableTypes

Comma separated list — without spaces — of imported table types. See DatabaseMetaData.getTables [1]

null

excludeTables 

A case-insensitive regular expression that when matched against a fully qualified table name [2] will exclude it from import. Applied after table names are retrieved.  Use a negative look-ahead (?!<inclusion pattern>).* to act as an inclusion filter.

null 

excludeProcedures 

A case-insensitive regular expression that when matched against a fully qualified procedure name [2] will exclude it from import.  Applied after procedure names are retrieved.  Use a negative look-ahead (?!<inclusion pattern>).* to act as an inclusion filter.

null 

importKeys

true to import primary and foreign keys.

NOTE: Foreign keys to tables that are not imported will be ignored.

true

autoCreateUniqueConstraints

true to create a unique constraint if one is not found for a foreign keys

true

importIndexes

true to import index/unique key/cardinality information

false

importApproximateIndexes

true to import approximate index information. See DatabaseMetaData.getIndexInfo [1].

WARNING: Setting to false may cause lengthy import times.

true

importProcedures

true to import procedures and procedure columns - Note that it is not always possible to import procedure result set columns due to database limitations. It is also not currently possible to import overloaded procedures.

false

importSequences

true to import sequences. Compatible only with Db2, Oracle, PostgreSQL, SQL Server, and H2. A matching sequence will be imported to a 0-argument Data Virtualization function name_nextval.

false

sequenceNamePattern

LIKE pattern string to use when importing sequences. Null or % will match all.

null

useFullSchemaName

When false, directs the importer to use only the object name as the Data Virtualization name. It is expected that all objects will come from the same foreign schema. When true (not recommended) the Data Virtualization name will be formed using the catalog and schema names as directed by the useCatalogName and useQualifiedName properties, and it will be allowed for objects to come from multiple foreign schema. This option does not affect the name in source property.

false (only change when importing from multiple foreign schema).

useQualifiedName

true will use name qualification for both the Data Virtualization name and name in source as further refined by the useCatalogName and useFullSchemaName properties.  Set to false to disable all qualification for both the Data Virtualization name and the name in source, which effectively ignores the useCatalogName and useFullSchemaName properties. 

WARNING: When you set this option to false, it can lead to objects with duplicate names when importing from multiple schemas, which results in an exception.

true (rarely needs changed)

useCatalogName

true will use any non-null/non-empty catalog name as part of the name in source, e.g. "catalog"."schema"."table"."column", and in the Data Virtualization runtime name if applicable. false will not use the catalog name in either the name in source nor the Data Virtualization runtime name. Only required to be set to false for sources such as HSQL that do not use the catalog concept, but return a non-null/non-empty catalog name in their metadata.

true (rarely needs changed)

widenUnsignedTypes

true to convert unsigned types to the next widest type. For example, SQL Server reports tinyint as an unsigned type. With this option enabled, tinyint would be imported as a short instead of a byte.

true

useIntegralTypes

true to use integral types rather than decimal when the scale is 0.

false

quoteNameInSource

false will override the default and direct Data Virtualization to create source queries using unquoted identifiers.

true

useAnyIndexCardinality

true will use the maximum cardinality returned from DatabaseMetaData.getIndexInfo. importKeys or importIndexes needs to be enabled for this setting to have an effect. This allows for better stats gathering from sources that don’t return a statistical index.

false

importStatistics

true will use database dependent logic to determine the cardinality if none is determined. Not available for all database types — currently available for Oracle and MySQL only.

false

importRowIdAsBinary

true will import RowId columns as varbinary values.

false

importLargeAsLob

true will import character and binary types larger than the Data Virtualization max as CLOB or BLOB respectively. If you experience memory issues even with the property enabled, you should use the copyLob execution property as well.

false

[1] JavaDoc for DatabaseMetaData
[2] The fully qualified name for exclusion is based upon the settings of the translator and the particulars of the database. All of the applicable name parts used by the translator settings (see useQualifiedName and useCatalogName) including catalog, schema, table will be combined as catalogName.schemaName.tableName with no quoting. For example, Oracle does not report a catalog, so the name used with default settings for comparison would be just schemaName.tableName.

Warning

The default import settings will crawl all available metadata. This import process is time-consuming, and full metadata import is not needed in most situations. Most commonly you’ll want to limit the import by at least schemaName or schemaPattern and tableTypes.

Example: Importer settings to import only tables and views from my-schema.

SET SCHEMA ora;

IMPORT FOREIGN SCHEMA "my-schema" FROM SERVER ora INTO ora OPTIONS ("importer.tableTypes" 'TABLE,VIEW');

For more information about importer settings, see Virtual databases.

Native queries

Physical tables, functions, and procedures may optionally have native queries associated with them.  No validation of the native query is performed, it is simply used in a straight-forward manner to generate the source SQL.  For a physical table setting the teiid_rel:native-query extension metadata will execute the native query as an inline view in the source query.  This feature should only be used against sources that provide inline views.  The native query is used as is and is not treated as a parameterized string. For example, on a physical table y with nameInSource="x"` and teiid_rel:native-query="select c from g", the Data Virtualization source query "SELECT c FROM y" would generate the SQL query "SELECT c FROM (select c from g) as x".  Note that the column names in the native query must match the nameInSource of the physical table columns for the resulting SQL to be valid.

For physical procedures you may also set the teiid_rel:native-query extension metadata to a desired query string with the added ability to positionally reference IN parameters . For more information, see Parameterizable native queries in Translators.  The teiid_rel:non-prepared extension metadata property can be set to false to turn off parameter binding. 

Be careful in setting this option, because inbound allows for SQL injection attacks if not properly validated.  The native query does not need to call a stored procedure.  Any SQL that returns a result set that positionally matches the result set that is expected by the physical stored procedure metadata will work.  For example on a stored procedure x with teiid_rel:native-query="select c from g where c1 = $1 and c2 = `$$1"', the Data Virtualization source query `"CALL x(?)"` would generate the SQL query `"select c from g where c1 = ? and c2 = `$1"'`.  Note that ? in this example will be replaced with the actual value bound to parameter 1.

Direct query procedure

This feature is turned off by default, because of the inherent security risk in allowing any command to be run against the source. To enable this feature, override the execution property called SupportsDirectQueryProcedure and set it to true. For more information, see Override execution properties in Translators.

By default, the name of the procedure that executes the queries directly is native. To change the name, override the execution property DirectQueryProcedureName.

The JDBC translator provides a procedure to execute any ad-hoc SQL query directly against the source without Data Virtualization parsing or resolving. Since the metadata of this procedure’s results are not known to Data Virtualization, they are returned as an object array. ARRAYTABLE can be used construct tabular output for consumption by client applications. For more information, see arraytable.

SELECT example

SELECT x.* FROM (call jdbc_source.native('select * from g1')) w,
 ARRAYTABLE(w.tuple COLUMNS "e1" integer , "e2" string) AS x

INSERT example

SELECT x.* FROM (call jdbc_source.native('insert into g1 (e1,e2) values (?, ?)', 112, 'foo')) w,
 ARRAYTABLE(w.tuple COLUMNS "update_count" integer) AS x

UPDATE example

SELECT x.* FROM (call jdbc_source.native('update g1 set e2=? where e1 = ?','blah', 112)) w,
 ARRAYTABLE(w.tuple COLUMNS "update_count" integer) AS x

DELETE example

SELECT x.* FROM (call jdbc_source.native('delete from g1 where e1 = ?', 112)) w,
 ARRAYTABLE(w.tuple COLUMNS "update_count" integer) AS x

9.5.1. Actian Vector translator (actian-vector)

Also see common JDBC Translators information.

The Actian Vector translator, known by the type name actian-vector, is for use with Actian Vector in Hadoop.

Download the JDBC driver at http://esd.actian.com/platform. Note the port number in the connection URL is "AH7", which maps to 16967.

9.5.2. Apache Phoenix Translator (phoenix)

Also see common JDBC Translators information.

The Apache Phoenix translator, known by the type name phoenix, exposes querying functionality to HBase tables. Apache Phoenix is a JDBC SQL interface for HBase that is required for this translator as it pushes down commands into Phoenix SQL.

The translator is also known by the deprecated name hbase. The name change reflects that the translator is specific to Phoenix and that there could be other translators introduced in the future to connect to HBase.

Do not use the DatabaseTimezone property with this translator.

The HBase translator cannot process Join commands. Phoenix uses the HBase Table Row ID as the Primary Key. This Translator is developed with Phoenix 4.3 or greater for HBase 0.98.1 or greater.

Note

The translator implements INSERT/UPDATE through the Phoenix UPSERT operation. This means you can see different behavior than with standard INSERT/UPDATE. For example, repeated inserts will not throw a duplicate key exception, but will instead update the row in question.

Note

Due to Phoenix driver limitations, the importer does not look for unique constraints, and defaults to not importing foreign keys.

Note

The translator can process SQL OFFSET arguments and other features starting with Phoenix 4.8. The Phoenix driver hard codes the server version in PhoenixDatabaseMetaData, and does not otherwise provide a way to detect the server version at runtime. If a newer driver is used with an older server, set the database version translator property manually.

Warning

The Phoenix driver does not have robust handling of time values. If your time values are normalized to use a date component of 1970-01-01, then the default handling will work correctly. If not, then the time column should be modeled as timestamp instead.

9.5.3. Cloudera Impala translator (impala)

Also see common JDBC Translators information.

The Cloudera Impala translator, known by the type name impala, is for use with Cloudera Impala 1.2.1 or later.

Impala has limited support for data types. It is does not have native support for time/date/xml or LOBs. These limitations are reflected in the translator capabilities. A Data Virtualization view can use these types, however the transformation would need to specify the necessary conversions. Note that in those situations, the evaluations will be done in the Data Virtualization engine.

Do not use the DatabaseTimeZone translator property with the Impala translator.

Impala only supports EQUI join, so using any other joins types on its source tables will result in inefficient queries.

To write criteria based on partitioned columns, model them on the source table, but do not include them in selection columns.

Note

Impala Hive importer does not have concept of catalog or source schema, nor does it import keys, procedures, indexes, etc.

Impala specific importer properties

useDatabaseMetaData
Set to true to use the normal import logic with the option to import index information disabled. Defaults to false.

If the value of useDatabaseMetaData is false, the typical JDBC DatabaseMetaData calls are not used, so not all of the common JDBC importer properties are applicable to Impala. You may still use excludeTables, regardless.

Important

Some versions of Impala require the use of a LIMIT when performing an ORDER BY. If no default is configured in Impala, an exception can occur when a Data Virtualization query with an ORDER BY but no LIMIT is issued. You should set an Impala-wide default, or configure the connection pool to use a new connection SQL string to issue a SET DEFAULT_ORDER_BY_LIMIT statement. For more information about Impala limit options, such as how to control what happens when the limit is exceeded, see the Cloudera documentation.

Note

If the Impala JDBC driver has problems processing PreparedStatements or parsing statements in general, try disabling useBindVariables. For more information, see https://issues.redhat.com/browse/TEIID-4610.

9.5.4. Db2 Translator (db2)

Also see common JDBC Translators information.

The Db2 translator, known by the type name db2, is for use with IBM Db2 V8 or later, or IBM Db2 for i V5.4 or later.

Db2 execution properties

DB2ForI
Indicates that the the Db2 instance is Db2 for i. Defaults to false.
supportsCommonTableExpressions
Indicates that the Db2 instance supports common table expressions (CTEs). Defaults to true. Common table expression are not fully supported on some older versions of Db2, and on instances of Db2 that run in a conversion mode. If you encounter errors working with CTEs in these environments, set the CTE property to false.

9.5.5. Derby translator (derby)

Also see common JDBC Translators information.

The Derby translator, known by the type name derby, is for use with Derby 10.1 or later.

9.5.6. Exasol translator (exasol)

Also see common JDBC Translators information.

The Exasol translator, known by the type name exasol, is for use with Exasol version 6 or later.

Usage

The Exasol database has the NULL HIGH default ordering, whereas the Data Virtualization engine works in the NULL LOW mode. As a result, depending on whether the ordering is pushed down to Exasol or done by the engine, you might observe NULLs at either the beginning or end of returned results. To enforce consistency, you can run Data Virtualization with org.teiid.pushdownDefaultNullOrder=true to specify NULL LOW ordering. Enforcing NULL LOW ordering can result in decreased performance.

9.5.7. Greenplum Translator (greenplum)

Also see common JDBC Translators information.

The Greenplum translator, known by the type name greenplum, is for use with the Greenplum database. This translator is an extension of the PostgreSQL translator, and inherits its options.

9.5.8. H2 Translator (h2)

Also see common JDBC Translators information.

The H2 Translator, known by the type name h2, is for use with H2 version 1.1 or later.

9.5.9. Hive Translator (hive)

Also see common JDBC Translators information.

The Hive translator, known by the type name hive, is for use with Hive v.10 and SparkSQL v1.0 and later.

Capabilities

Hive is compatible with a limited set of data types. It does not have native support for time/XML or large objects (LOBs). These limitations are reflected in the translator capabilities. Although a Data Virtualization view can use these types, the transformation must specify the necessary conversions. Note that in those situations, evaluations are processed in Data Virtualization engine.

Do not use the DatabaseTimeZone translator property with the Hive translator.

Hive only supports EQUI join, so using any other joins types on its source tables will result in inefficient queries.

To write criteria based on partitioned columns, model them on the source table, but do not include them in selection columns.

Note

The Hive importer does not have the concept of catalog or source schema, nor does it import keys, procedures, indexes, and so forth.

Import properties

trimColumnNames
For Hive 0.11.0 and later, the DESCRIBE command metadata is inappropriately returned with padding. Set this property to true to remove white space from column names. Defaults to false.
useDatabaseMetaData
For Hive 0.13.0 and later, the normal JDBC DatabaseMetaData facilities are sufficient to perform an import. Set to true to use the normal import logic with the option to import index information disabled. Defaults to false. When true, trimColumnNames has no effect. If it is set to false, the typical JDBC DatabaseMetaData calls are not used, so not all of the common JDBC importer properties are applicable to Hive. You can still use excludeTables anyway.

"Database Name"

When the database name used in Hive differs from default, the metadata retrieval and execution of queries does not work as expected in Data Virtualization. The Hive JDBC driver seems to be implicitly connecting (tested with < 0.12) to "default" database, thus ignoring the database name mentioned on connection URL. You can work around this issue if you configure your connection source to send the command use {database-name}.

This is fixed in version 0.13 and later of the Hive JDBC driver. For more information, see https://issues.apache.org/jira/browse/HIVE-4256.

Limitations

Empty tables might report their description without datatype information. To work around this problem when importing, you can exclude empty tables, or use the useDatabaseMetaData option.

9.5.10. HSQL Translator (hsql)

Also see common JDBC Translators information.

The HSQL Translator, known by the type name hsql, is for use with HSQLDB 1.7 or later.

9.5.11. Informix translator (informix)

Also see common JDBC Translators information.

The Informix translator, known by the type name informix, is for use with any Informix version.

Known issues

TEIID-3808
The Informix driver’s handling of timezone information is inconsistent, even if the databaseTimezone translator property is set. Verify that the Informix server and the application server are in the same time zone.

9.5.12. Ingres translators (ingres / ingres93)

Also see common JDBC Translators information.

You can use one of the following Ingres translators, depending on your Ingres version:

ingres
The Ingres translator, known by the type name ingres, is for use with Ingres 2006 or later.
ingres93
The Ingres93 translator, known by the type name ingres93, is for use with Ingres 9.3 or later.

9.5.13. Intersystems Caché translator (intersystems-cache)

Also see common JDBC Translators information.

The Intersystem Caché translator, known by the type name intersystems-cache, is for use with Intersystems Caché Object database (relational aspects only).

9.5.14. JDBC ANSI translator (jdbc-ansi)

Also see common JDBC Translators information.

The JDBC ANSI translator, known by the type name jdbc-ansi, works with most of the SQL constructs used in Data Virtualization, except for row LIMIT/OFFSET and EXCEPT/INTERSECT. It translates source SQL into ANSI compliant syntax. This translator should be used when another more specific type is not available. If source exceptions arise due to the use of incompatible SQL constructs, then consider using the JDBC simple translator to further restrict capabilities, or create a custom translator. For more information, see the Custom Translator documentation in the Teiid community.

9.5.15. JDBC simple translator (jdbc-simple)

Also see common JDBC Translators information.

The JDBC Simple translator, known by the type name jdbc-simple, is the same as the jdbc-ansi-translator, except that, to provide maximum compatibility, it does not handle most pushdown constructs.

9.5.16. Microsoft Access translators

Also see common JDBC Translators information.

access

The Microsoft Access translator known by the type name access is for use with Microsoft Access 2003 or later via the JDBC-ODBC bridge.

If you are using the default native metadata import, or the Data Virtualization connection importer, the importer defaults to importKeys=false and excludeTables=.[.]MSys. to avoid issues with the metadata provided by the JDBC ODBC bridge. You might need to adjust these values if you use a different JDBC driver.

ucanaccess
The Microsoft Access translator known by the type name ucanaccess is for use with Microsoft Access 2003 or later via the UCanAccess driver.

9.5.17. Microsoft SQL Server translator (sqlserver)

Also see common JDBC translators information.

The Microsoft SQL Server translator, known by the type name sqlserver, is for use with SQL Server 2000 or later. A SQL Server JDBC driver version 2.0 or later (or a compatible driver such as, JTDS 1.2 or later) should be used. The SQL Server DatabaseVersion property can be set to 2000, 2005, 2008, or 2012, but otherwise expects a standard version number, for example, 10.0.

Sequences

With Data Virtualization 8.5+, sequence operations may be modeled as source functions.

With Data Virtualization 10.0+, sequences may be imported automatically import properties.

Example: Sequence native query

CREATE FOREIGN FUNCTION seq_nextval () returns integer OPTIONS ("teiid_rel:native-query" 'NEXT VALUE FOR seq');

Execution properties

SQL Server specific execution properties:

JtdsDriver
Specifies that use of the open source JTDS driver. Defaults to false.

9.5.18. MySQL translator (mysql/mysql5)

Also see common JDBC translators information.

You can use the following translators with MySQL and MariaDB:

mysql
The MySQL translator, known by the type name mysql, is for use with MySQL version 4.x.
mysql5
The MySQL5 translator, known by the type name mysql5, is for use with MySQL version 5 or later. The translator also works with other compatible MySQL derivatives, such as MariaDB.

Usage

The MySQL translators expect the database or session to be using ANSI mode. If the database is not using ANSI mode, you can set ANSI mode on the pool by submitting the following initialization query:

set SESSION sql_mode = 'ANSI'

When data includes null timestamp values, Data Virtualization generates the following conversion error: 0000-00-00 00:00:00 cannot be converted to a timestamp. To avoid error, if you expect data with null timestamp values, set the connection property zeroDateTimeBehavior=convertToNull.

Warning

If you must retrieve large result sets, consider setting the connection property useCursorFetch=true. Otherwise MySQL will fully fetch result sets into memory on the Data Virtualization instance.

Note

MySQL reports TINYINT(1) columns as a JDBC BIT type - however the value range is not actually restricted and may cause issues if for example you are relying on -1 being recognized as a true value. If not using the native importer, change the BOOLEAN columns in the affected source to have a native type of "TINYINT(1)" rather than BIT so that the translator can appropriately handle the Boolean conversion.

9.5.19. Netezza translator (netezza)

Also see common JDBC translators information.

The Netezza translator, known by the type name netezza, is for use with any version of the IBM Netezza appliance.

Usage

The current vendor-supplied JDBC driver for Netezza performs poorly with single transactional updates. It is best to perform batched updates whenever possible.

Execution properties

Netezza-specific execution properties:

SqlExtensionsInstalled
Indicates that SQL extensions, including the ability to process Netezza REGEXP_LIKE functions, are installed. All other REGEXP functions are then available as pushdown functions. Defaults to false.

9.5.20. Oracle translator (oracle)

Also see common JDBC translators information.

The Oracle translator, known by the type name oracle, is for use with Oracle Database 9i or later.

Note

The Oracle-provided JDBC driver uses large amounts of memory. Because the driver caches a high volume of data in the buffer, problems can occur on computers that lack sufficient memory allocation.

For more information, see the following resources:

Importer properties

useGeometryType

Use the Data Virtualization Geometry type when importing columns with a source type of SDO_GEOMETRY. Defaults to false.

Note

Metadata import from Oracle may be slow. It is recommended that at least a schema name filter is specified. There is also the useFetchSizeWithLongColumn=true connection property that can increase the fetch size for metadata queries. It significantly improves the metadata load process, especially when there are a large number of tables in a schema.

Execution properties

OracleSuppliedDriver
Indicates that the Oracle supplied driver (typically prefixed by ojdbc) is being used. Defaults to true. Set to false when using DataDirect or other Oracle JDBC drivers.

Oracle-specific metadata

Sequences

You can use sequences with the Oracle translator. You can model a sequence as a table with a name in source of DUAL, and setting column names in the source set to <sequence name>.[nextval|currval]

With Data Virtualization 10.0+, you can import sequences automatically.

For more information, see Importer properties in JDBC translators. Data Virtualization 8.4 and Prior Oracle Sequence DDL

CREATE FOREIGN TABLE seq (nextval integer OPTIONS (NAMEINSOURCE 'seq.nextval'), currval integer options (NAMEINSOURCE 'seq.currval') ) OPTIONS (NAMEINSOURCE 'DUAL')

With Data Virtualization 8.5 it’s no longer necessary to rely on a table representation and Oracle-specific handling for sequences.

For information about representing currval and nextval as source functions, see DDL metadata for schema objects

8.5 Example: Sequence native query

CREATE FOREIGN FUNCTION seq_nextval () returns integer OPTIONS ("teiid_rel:native-query" 'seq.nextval');

You can also use a sequence as the default value for insert columns by setting the column to autoincrement, and setting the name in source to <element name>:SEQUENCE=<sequence name>.<sequence value>.

Rownum

A rownum column can also be added to any Oracle physical table to enable use of the rownum pseudo-column. A rownum column should have a name in source of rownum. These rownum columns do not have the same semantics as the Oracle rownum construct so care must be taken in their usage.

Out parameter result set

Out parameters for procedures may also be used to return a result set, if this is not represented correctly by the automatic import you need to manually create a result set and represent the output parameter with native type REF CURSOR.

DDL for out parameter result set

create foreign procedure proc (in x integer, out y object options (native_type 'REF CURSOR'))
returns table (a integer, b string)

Geospatial functions

You can use the following geospatial functions with the translator for Oracle:

Relate = sdo_relate
CREATE FOREIGN FUNCTION sdo_relate (arg1 string,  arg2 string,  arg3 string) RETURNS string;
CREATE FOREIGN FUNCTION sdo_relate (arg1 Object,  arg2 Object,  arg3 string) RETURNS string;
CREATE FOREIGN FUNCTION sdo_relate (arg1 string,  arg2 Object,  arg3 string) RETURNS string;
CREATE FOREIGN FUNCTION sdo_relate (arg1 Object,  arg2 string,  arg3 string) RETURNS string;
Nearest_Neighbor = sdo_nn
CREATE FOREIGN FUNCTION sdo_nn (arg1 string,  arg2 Object,  arg3 string,  arg4 integer) RETURNS string;
CREATE FOREIGN FUNCTION sdo_nn (arg1 Object,  arg2 Object,  arg3 string,  arg4 integer) RETURNS string;
CREATE FOREIGN FUNCTION sdo_nn (arg1 Object,  arg2 string,  arg3 string,  arg4 integer) RETURNS string;
Within_Distance = sdo_within_distance
CREATE FOREIGN FUNCTION sdo_within_distance (arg1 Object,  arg2 Object,  arg3 string) RETURNS string;
CREATE FOREIGN FUNCTION sdo_within_distance (arg1 string,  arg2 Object,  arg3 string) RETURNS string;
CREATE FOREIGN FUNCTION sdo_within_distance (arg1 Object,  arg2 string,  arg3 string) RETURNS string;
Nearest_Neigher_Distance = sdo_nn_distance
CREATE FOREIGN FUNCTION sdo_nn_distance (arg integer) RETURNS integer;
Filter = sdo_filter
CREATE FOREIGN FUNCTION sdo_filter (arg1 Object,  arg2 string,  arg3 string) RETURNS string;
CREATE FOREIGN FUNCTION sdo_filter (arg1 Object,  arg2 Object,  arg3 string) RETURNS string;
CREATE FOREIGN FUNCTION sdo_filter (arg1 string,  arg2 object,  arg3 string) RETURNS string;

Pushdown functions

Depending on the Oracle version, the Oracle translator, registers the following non-geospatial pushdown functions with the engine:

TRUNC
Both numeric and timestamp versions.
LISTAGG
Requires the Data Virtualization SQL syntax "LISTAGG(arg [, delim] ORDER BY …​)"

SQLXML

If you need to retrieve SQLXML values from Oracle and are getting oracle.xdb.XMLType or OPAQUE instances instead, you make the following changes:

  • Use client driver version 11, or later.
  • Place the xdb.jar and xmlparserv2.jar files in the classpath.
  • Set the system property oracle.jdbc.getObjectReturnsXMLType="false".

    For more information, see the Oracle documentation.

9.5.21. PostgreSQL translator (postgresql)

Also see common JDBC translators information.

The PostgreSQL translator, known by the type name postgresql, is for use with the following PostgreSQL client and server versions: * Client — 8.0 or later * Server — 7.1 or later.

Execution properties

PostgreSQL-specific execution properties:

PostGisVersion
Indicates the PostGIS version in use. Defaults to 0, which means that PostGIS is not installed. Will be set automatically if the database version is not set.
ProjSupported
Boolean that indicates if the PostGis version supports PROJ coordinate transformation software. Will be set automatically if the database version is not set.
Note

Some driver versions of PostgreSQL will not associate columns to "INDEX" type tables. The current version of Data Virtualization omits such tables automatically.

Older versions of Data Virtualization may need the importer.tableType property or other filtering set.

9.5.22. PrestoDB translator (prestodb)

Also see common JDBC translators information.

The PrestoDB translator, known by the type name prestodb, exposes querying functionality to Presto data sources. In data integration respect, PrestoDB has capabilities that are similar to Data Virtualization, however it goes beyond in terms of distributed query execution with multiple worker nodes. Data Virtualization’s execution model is limited to single execution node and focuses more on pushing the query down to sources. Data Virtualization provides more complete querying capabilities and many enterprise features.

Capabilities

You can use the PrestoDB translator only with SELECT statements. The translator provides a restricted set of capabilities.

Because PrestoDB exposes a relational model, Data Virtualization can use it as it does other RDBMS sources, such as Oracle, Db2, and so forth. For information about configuring PrestoDB, see the Presto documentation.

Tip

In SQL JOIN operations, PrestoDB does not support multiple ORDER BY columns well. If errors occur during JOIN operations that involve more than one ORDER BY column, set the translator property supportsOrderBy to disable the use of the ORDER BY clause.

Note

Some versions of Presto generate errors when you include null values in subqueries.

Tip

PrestoDB does not support transactions. To overcome issues caused by this limitation, define the data source as non-transactional.

Note

By default, every catalog in PrestoDB has an information_schema. If you have to configure multiple catalogs, duplicate table errors can cause deployment of a virtual database to fail. To prevent duplicate table errors, use import options to filter the schemas.

If you want to configure multiple Presto catalogs, set one of the following import options to filter the schemas and tables in the source:

  • Set catalog to a specific catalog name to match the name of the source catalog in Presto.
  • Set schemaName to a regular expression to filter schemas by matching result.
  • Set excludeTables to a regular expression to filter tables by matching results.

9.5.23. Redshift translator (redshift)

Also see common JDBC translators information.

The Redshift translator, known by the type name redshift, is for use with the Amazon Redshift database. This translator is an extension of the PostgreSQL translator and inherits its options.

9.5.24. SAP HANA translator (hana)

Also see common JDBC translators information.

The SAP HANA translator, known by the name of hana, is for use with SAP HANA.

Known issues

TEIID-3805
The pushdown of the SUBSTRING function is inconsistent with the Data Virtualization SUBSTRING function when the FROM index exceeds the length of the string. SAP HANA will return an empty string, while Data Virtualization produces a null value.

9.5.25. SAP IQ translator (sap-iq)

Also see common JDBC translators information.

The SAP IQ translator, known by the type name sap-iq, is for use with SAP IQ version 15.1 or later. The translator name sybaseiq has been deprecated.

9.5.26. Sybase translator (sybase)

Also see common JDBC Translators information.

The Sybase translator, known by the type name sybase, is for use with SAP ASE (Adaptive Server Enterprise), formerly known as Sybase SQL Server, version 12.5 or later.

If you use the default native import, you can avoid exceptions during the retrieval of system table information, if you specify import properties. If errors occur when retrieving table information, specify a schemaName or schemaPattern, or use excludeTables to exclude system tables. For more information about using import properties, see Importer properties in JDBC translators.

If the name in the source metadata contains quoted identifiers (such as reserved words, or words that contain characters that would not otherwise be allowed), and you are using a jConnect Sybase driver, you must first configure the connection pool to enable quoted_identifier:

Example: Driver URL with SQLINITSTRING

jdbc:sybase:Tds:host.at.some.domain:5000/db_name?SQLINITSTRING=set quoted_identifier on

Important

If you are using a jConnect Sybase driver and will target the source for dependent joins, set the JCONNECT_VERSION to 6 or later to increase the number of values that the translator can send. If you do not set the JCONNECT_VERSION, an exception occurs with statements that have more than 481 bind values.

Example: Driver URL with JCONNECT_VERSION

jdbc:sybase:Tds:host.at.some.domain:5000/db_name?SQLINITSTRING=set quoted_identifier on&JCONNECT_VERSION=6

Execution properties specific to Sybase

JtdsDriver_
Indicates that the open source JTDS driver is being used. Defaults to false.

9.5.27. Data Virtualization translator (teiid)

Also see common JDBC translators information.

Use the Teiid translator, known by the type name teiid, when creating a virtual database from a Teiid data source.

9.5.28. Teradata translator (teradata)

Also see common JDBC translators information.

The Teradata translator, known by the type name teradata, is for use with Teradata Database V2R5.1 or later.

By default, Teradata driver version 15, adjusts date, time, and timestamp values to match the Data Virtualization server timezone. To remove this adjustment, set the translator DatabaseTimezone property to GMT or whatever the Teradata server defaults to.

9.5.29. Vertica translator (vertica)

Also see common JDBC translators information.

The Vertica translator, known by the type name vertica, is for use with Vertica 6 or later.

9.6. Loopback translator

The Loopback translator, known by the type name loopback, provides a quick testing solution. It works with all SQL constructs and returns default results, with some configurable behavior.

Table 9.5. Execution properties 

NameDescriptionDefault

ThrowError

true to always throw an error.

false

RowCount

Rows returned for non-update queries.

1

WaitTime

Wait randomly up to this number of milliseconds with each source query.

0

PollIntervalInMilli

If positive, results will be asynchronously returned — that is a DataNotAvailableException will be thrown initially and the engine will wait the poll interval before polling for the results.

-1

DelegateName

Set to the name of the translator which is to be mimicked.

na

You can also use the Loopback translator to mimic how a real source query would be formed for a given translator (although loopback will still return dummy data that might not be useful for your situation). To enable this behavior, set the DelegateName property to the name of the translator that you want to mimic. For example, to disable all capabilities, set the DelegateName property to jdbc-simple.

9.7. Microsoft Excel translator

The Microsoft Excel Translator, known by the type name excel, exposes querying functionality to a Microsoft Excel document. This translator provides an easy way read a Excel spreadsheet and provide the contents of the spreadsheet in a tabular form that can be integrated with other sources in Data Virtualization.

Note

This translator works on all platforms, including Windows and Linux. The translator uses Apache POI libraries to access the Excel documents which are platform independent.

Translation mapping

The following table describes how Excel translator interprets the data in Excel document into relational terms.

Excel TermRelational term

Workbook

schema

Sheet

Table

Row

Row of data

Cell

Column Definition or Data of a column

The Excel translator provides a "source metadata" feature, where for a given Excel workbook, it can introspect and build the schema based on the worksheets that are defined within it. There are options available to detect header columns and data columns in a worksheet to define the correct metadata of a table.

DDL example

The following example shows how to expose an Excel spreadsheet in a virtual database.

CREATE DATABASE excelvdb;
USE DATABASE excelvdb;
CREATE SERVER connector FOREIGN DATA WRAPPER excel OPTIONS ("resource-name" 'java:/fileDS');
CREATE SCHEMA excel SERVER connector;
SET SCHEMA excel;
IMPORT FROM SERVER connector INTO excel OPTIONS (
    "importer.headerRowNumber" '1',
    "importer.ExcelFileName" 'names.xls');

Headers in document

If the Excel document contains headers, you can guide the import process to select the cell headers as the column names in the table creation process. For information about defining import properties, see the following table, and also see Importer Properties in JDBC translators.

Import properties

Import properties guide the schema generation part during the deployment of the VDB. This can be used in a native import.

Property NameDescriptionDefault

importer.excelFileName

Defines the name of the Excel Document to import metadata. This can be defined as a file pattern (*.xls), however when defined as pattern all files must be of same format, and the translator will choose an arbitrary file to import metadata from. Use file patterns to read data from multiple Excel documents in the same directory. In the case of a single file, specify the absolute name.

Required

importer.headerRowNumber

Defines the cell header information to be used as column names.

Optional. Default is first data row of sheet

importer.dataRowNumber

Defines the row number where the data rows start.

Optional. Default is first data row of sheet.

To enable information in the Excel spreadsheet to be interpreted correctly, it is best to define all the preceding importer properties.

Note

Purely numerical cells in a column contain containing mixed types will have a string form matching their decimal representation, thus integral values will have .0 appended. If you need the exact text representation, then the cell must be a string value. You can force a string value by preceding the numeric text of a cell with a single quote ('), or a single space.

Translator extension properties

  • Excel specific execution properties

    FormatStrings
    Format non-string cell values in a string column according to the worksheet format. Defaults to false.
  • Metadata extension properties

    Properties that are defined on schema artifacts, such as Table, Column, Procedure and so forth. These properties describe how the translator interacts with or interprets source systems. All the properties are defined with the following namespace: "http://www.teiid.org/translator/excel/2014[http://www.teiid.org/translator/excel/2014\]", which also has a recognized alias teiid_excel.

Property NameSchema item property belongs toDescriptionMandatory

FILE

Table

Defines Excel Document name or name pattern (*.xls). File pattern can be used to read data from multiple files.

Yes

FIRST_DATA_ROW_NUMBER

Table

Defines the row number where records start in the sheet (applies to every sheet).

Optional

CELL_NUMBER

Column of Table

Defines cell number to use for reading data of particular column.

Yes

The following example shows a table that is defined by using the extension metadata properties.

CREATE DATABASE excelvdb;
USE DATABASE excelvdb;
CREATE SERVER connector FOREIGN DATA WRAPPER excel OPTIONS ("resource-name" 'java:/fileDS');
CREATE SCHEMA excel SERVER connector;
SET SCHEMA excel;
CREATE FOREIGN TABLE Person (
                ROW_ID integer OPTIONS (SEARCHABLE 'All_Except_Like', "teiid_excel:CELL_NUMBER" 'ROW_ID'),
                FirstName string OPTIONS (SEARCHABLE 'Unsearchable', "teiid_excel:CELL_NUMBER" '1'),
                LastName string OPTIONS (SEARCHABLE 'Unsearchable', "teiid_excel:CELL_NUMBER" '2'),
                Age integer OPTIONS (SEARCHABLE 'Unsearchable', "teiid_excel:CELL_NUMBER" '3'),
                CONSTRAINT PK0 PRIMARY KEY(ROW_ID)
             ) OPTIONS ("NAMEINSOURCE" 'Sheet1',"teiid_excel:FILE" 'names.xlsx', "teiid_excel:FIRST_DATA_ROW_NUMBER" '2')

Extended capabilities using ROW_ID column

If you define a column that has extension metadata property CELL_NUMBER with value ROW_ID, then that column value contains the row information from Excel document. You can mark this column as Primary Key. You can use this column in SELECT statements with a restrictive set of capabilities including: comparison predicates, IN predicates and LIMIT. All other columns cannot be used as predicates in a query.

Tip

Importing source metadata is not the only way to create the schema of an Excel document. You can also create a source table manually, and then add the extension properties that you need to create a fully functional model. Metadata imports result in schema models similar to the one in the preceding example.

The Excel translator processes updates with the following limitations:

  • The ROW_ID cannot be directly modified or used as an insert value.
  • UPDATE and INSERT values must be literals.
  • UPDATEs are not transactional. That is, the write lock is held while the file is written, but not throughout the entire update. As a result, it is possible for one update to overwrite another.

The ROW_ID of an inserted row can be returned as a generated key.

Native queries

This feature is not applicable for the Excel translator.

Direct query procedure

This feature is not applicable for the Excel translator.

9.8. MongoDB Translator

The MongoDB translator, known by the type name mongodb, provides a relational view of data that resides in a MongoDB database. This translator is capable of converting Data Virtualization SQL queries into MongoDB based queries. It provides for a full range of SELECT, INSERT, UPDATE and DELETE calls.

MongoDB is a document based "schema-less" database with it own query language. It does not map perfectly with relational concepts or the SQL query language. More and more systems are using NOSQL stores such as MongoDB to improve scalability and performance. For example, applications like storing audit logs, or managing web site data, are well-suited to MongoDB, and do not require the structure of relational databases. MongoDB uses JSON documents as its primary storage unit, and those documents can have additional embedded documents inside the parent document. By using embedded documents, MongoDB co-locates related information to achieve de-normalization that typically requires either duplicate data or joins to achieve querying in a relational database.

For MongoDB to work with Data Virtualization the challenge for the MongoDB translator is to design a MongoDB store that can achieve the balance between relational and document based storage. The advantages of "schema-less" design are great at development time. But "schema-less" design can pose problems during migration between application versions, and when querying data, and making effective use of the returned information.

Since it is hard and may be impossible in certain situations to derive a schema based on existing the MongoDB collection(s), Data Virtualization approaches the problem in reverse compared to other translators. When working with MongoDB, Data Virtualization requires you to define the MongoDB schema upfront, by using Data Virtualization metadata. Because Data Virtualization only allows relational schema as its metadata, you must define your MongoDB schema in relational terms, using tables, procedures, and functions. For the purposes of MongoDB, the Data Virtualization metadata has been extended to provide extension properties that can be defined on a table to convert it into a MongoDB based document. These extension properties let you define how a MongoDB document is structured and stored. Based on the relationships (primary-key, foreign-key) that are defined on a table, and their cardinality (ONE-to-ONE, ONE-to-MANY, MANY-to-ONE), relations between tables are mapped such that related information can be embedded along with the parent document for co-location (as mentioned earlier in this topic). Thus, a relational schema-based design, but document-based storage in MongoDB.

Who is the primary audience for the MongoDB translator?

The above may not satisfy every user’s needs. The document structure in MongoDB can be more complex than what Data Virtualization can currently define. We hope this will eventually catch up in future versions of Data Virtualization. This is currently designed for:

  • Users who are using relational databases and would like to move/migrate their data to MongoDB to take advantage of scaling and performance without modifying end user applications that they currently run.
  • Users who are seasoned SQL developers, but do not have experience with MongoDB. This provides a low barrier of entry compared to using MongoDB directly as an application developer.
  • Users who want to integrate MongoDB-based data with data from other enterprise data sources.

Usage

The name of the translator to use in a virtual database DDL is "mongodb". For example:

CREATE DATABASE nothwind;
USE DATABASE nothwind;
CREATE SERVER local FOREIGN DATA WRAPPER mongodb OPTIONS ("resource-name" 'java:/mongoDS');
CREATE SCHEMA northwind SERVER local;

SET SCHEMA northwind;
IMPORT FROM SERVER local INTO northwind;

The MongoDB translator can derive the metadata based on existing document collections in some scenarios. However, when working with complex documents the interpretation of metadata can be inaccurate. In such cases, you must define the metadata. For example, you can define a schema using DDL, as shown in the following example:

<vdb name="nothwind" version="1">
    <model name="northwind">
        <source name="local" translator-name="mongodb" connection-jndi-name="java:/mongoDS"/>
            <metadata type="DDL"><![CDATA[
                CREATE FOREIGN TABLE  Customer (
                    customer_id integer,
                    FirstName varchar(25),
                    LastName varchar(25)
                ) OPTIONS(UPDATABLE 'TRUE');
            ]]> </metadata>
    </model>
<vdb>

When the following INSERT operation is executed against a table using Data Virtualization, the MongoDB translator creates a document in the MongoDB:

    INSERT INTO Customer(customer_id, FirstName, LastName) VALUES (1, 'John', 'Doe');
{
  _id: ObjectID("509a8fb2f3f4948bd2f983a0"),
  customer_id: 1,
  FirstName: "John",
  LastName: "Doe"
}

If a PRIMARY KEY is defined on the table, then that column name is automatically used as "_id" field in the MongoDB collection, and then the document structure is stored in the MongoDB, as shown in the following examples:

    CREATE FOREIGN TABLE  Customer (
        customer_id integer PRIMARY KEY,
        FirstName varchar(25),
        LastName varchar(25)
    ) OPTIONS(UPDATABLE 'TRUE');
{
  _id: 1,
  FirstName: "John",
  LastName: "Doe"
}

If you defined the composite PRIMARY KEY on Customer table, the document structure that results is shown in the following example:

    CREATE FOREIGN TABLE  Customer (
        customer_id integer,
        FirstName varchar(25),
        LastName varchar(25),
        PRIMARY KEY (FirstName, LastName)
    ) OPTIONS(UPDATABLE 'TRUE');
{
  _id: {
         FirstName: "John",
         LastName:  "Doe"
       },
  customer_id: 1,
}

Data types

The MongoDB translator provides automatic mapping of Data Virtualization data types into MongoDB data types, including BLOBS, CLOBS and XML. The LOB mapping is based on GridFS in MongoDB. Arrays are in the following form:

{
  _id: 1,
  FirstName: "John",
  LastName: "Doe"
  Score: [89, "ninety", 91.0]
}

Users can get individual items in the array using the function array_get, or can transform the array into tabular structure using ARRAYTABLE.

Note

Note that even though embedded documents can also be in arrays, the handling of embedded documents is different from array with scalar values.

Note

The translator does not work with regular Expressions, MongoDB::Code, MongoDB::MinKey, MongoDB::MaxKey, and MongoDB::OID.

Note

In documents that contain values of mixed types for the same key, you must mark the column as unsearchable, or MongoDB will not correctly match predicates against the column. A key is used as a mixed type of it is represented as a string value in one document, and an integer in another. For more information, see the importer.sampleSize property in the following table.

Importer Properties

Importer properties define the behavior of the translator during the metadata import from the physical source.

Importer Properties

NameDescriptionDefault

excludeTables

Regular expression to exclude the tables from import.

null

includeTables

Regular expression to include the tables from import.

null

sampleSize

Number of documents to sample to determine the structure. If documents have different fields, or fields with different types, this should be greater than 1.

1

fullEmbeddedNames

Whether to prefix embedded table names with their parents, e.g. parent_embedded. If false the name of the table will just be the name of the field - which may lead to conflicts with existing tables or other embedded tables.

false

MongoDB metadata extension properties for building complex documents

Using the preceding DDL, or any other metadata facility, you can map a table in a relational store into a document in MongoDB. However, to make effective use of MongoDB, you must be able to build complex documents that can co-locate related information, so that data can queried in a single MongoDB query. Unlike a relational database, you cannot run join operations in MongoDB. As as a result, unless you can build complex documents, you would have to issue multiple queries to retrieve data and then join it manually. The power of MongoDB comes from its "embedded" documents, its support for complex data types, such as arrays, and its use of an aggregation framework to query them. This translator provides a way to achieve the goals.

When you do not define the complex embedded documents in MongoDB, Data Virtualization can step in for join processing and provide that functionality. However, if you want to make use of the power of MongoDB itself in querying the data and avoid bringing the unnecessary data and improve performance, you need to look into building these complex documents.

MongoDB translator defines two additional metadata properties along with other Teiid metadata properties to aid in building the complex "embedded" documents. For more information about Data Virtualization schema metadata, see Section 2.2, “DDL metadata for schema objects”. You can use the following metadata properties in your DDL:

teiid_mongo:EMBEDDABLE
Means that data defined in this table is allowed to be included as an "embeddable" document in any parent document. The parent document is referenced by the foreign key relationships. In this scenario, Data Virtualization maintains more than one copy of the data in MongoDB store, one in its own collection, and also a copy in each of the parent tables that have relationship to this table. You can even nest embeddable table inside another embeddable table with some limitations. Use this property on table, where table can exist, encompass all its relations on its own. For example, a "Category" table that defines a "Product"’s category is independent of Product, which can be embeddable in "Products" table.
teiid_mongo:MERGE
Means that data of this table is merged with the defined parent table. There is only a single copy of the data that is embedded in the parent document. Parent document is defined using the foreign key relationships.

Using the above properties and FOREIGN KEY relationships, we will illustrate how to build complex documents in MongoDB.

Usage

A given table can contain either the teiid_mongo:EMBEDDABLE property or the teiid_mongo:MERGE property defining the type of nesting in MongoDB. You cannot use both properties within one table.

ONE-2-ONE Mapping

If your current DDL structure representing ONE-2-ONE relationship is like

    CREATE FOREIGN TABLE  Customer (
        CustomerId integer PRIMARY KEY,
        FirstName varchar(25),
        LastName varchar(25)
    ) OPTIONS(UPDATABLE 'TRUE');

    CREATE FOREIGN TABLE Address (
        CustomerId integer,
        Street varchar(50),
        City varchar(25),
        State varchar(25),
        Zipcode varchar(6),
        FOREIGN KEY (CustomerId) REFERENCES Customer (CustomerId)
     ) OPTIONS(UPDATABLE 'TRUE');

By default, this will produce two different collections in MongoDB, like with sample data it will look like

Customer
{
  _id: 1,
  FirstName: "John",
  LastName: "Doe"
}

Address
{
  _id: ObjectID("..."),
   CustomerId: 1,
   Street: "123 Lane"
   City: "New York",
   State: "NY"
   Zipcode: "12345"
}

You can enhance the storage in MongoDB to a single collection by using teiid_mongo:MERGE extension property on the table’s OPTIONS clause.

    CREATE FOREIGN TABLE  Customer (
        CustomerId integer PRIMARY KEY,
        FirstName varchar(25),
        LastName varchar(25)
    ) OPTIONS(UPDATABLE 'TRUE');

    CREATE FOREIGN TABLE Address (
        CustomerId integer PRIMARY KEY,
        Street varchar(50),
        City varchar(25),
        State varchar(25),
        Zipcode varchar(6),
        FOREIGN KEY (CustomerId) REFERENCES Customer (CustomerId)
     ) OPTIONS(UPDATABLE 'TRUE', "teiid_mongo:MERGE" 'Customer');

this will produce single collection in MongoDB, like

Customer
{
  _id: 1,
  FirstName: "John",
  LastName: "Doe",
  Address:
     {
        Street: "123 Lane",
        City: "New York",
        State: "NY",
        Zipcode: "12345"
     }
}

With the above both tables are merged into a single collection that can be queried together using the JOIN clause in the SQL command. Since the existence of child/additional record has no meaning with out parent table using the "teiid_mongo:MERGE" extension property is right choice in this situation.

Note

The Foreign Key defined on a child table must refer to Primary Keys on both the parent and child tables to form a One-2-One relationship.

ONE-2-MANY Mapping.

Typically there can be more than two (2) tables involved in this relationship. If MANY side is only associated single table, then use teiid_mongo:MERGE property on MANY side of table and define ONE as the parent. If associated with more than single table then use teiid_mongo:EMBEDDABLE.

For example, if you define a virtual database as in the following DDL:

    CREATE FOREIGN TABLE  Customer (
        CustomerId integer PRIMARY KEY,
        FirstName varchar(25),
        LastName varchar(25)
    ) OPTIONS(UPDATABLE 'TRUE');

    CREATE FOREIGN TABLE  Order (
        OrderID integer PRIMARY KEY,
        CustomerId integer,
        OrderDate date,
        Status integer,
        FOREIGN KEY (CustomerId) REFERENCES Customer (CustomerId)
    ) OPTIONS(UPDATABLE 'TRUE');

then a Single Customer can have MANY Orders. There are two options to define the how we store the MongoDB document. If in your schema, the Customer table’s CustomerId is only referenced in Order table (i.e. Customer information used for only Order purposes), you can use

    CREATE FOREIGN TABLE  Customer (
        CustomerId integer PRIMARY KEY,
        FirstName varchar(25),
        LastName varchar(25)
    ) OPTIONS(UPDATABLE 'TRUE');

    CREATE FOREIGN TABLE  Order (
        OrderID integer PRIMARY KEY,
        CustomerId integer,
        OrderDate date,
        Status integer,
        FOREIGN KEY (CustomerId) REFERENCES Customer (CustomerId)
    ) OPTIONS(UPDATABLE 'TRUE', "teiid_mongo:MERGE" 'Customer');

that will produce a single document for Customer table like

{
  _id: 1,
  FirstName: "John",
  LastName: "Doe",
  Order:
  [
     {
       _id: 100,
        OrderDate: ISODate("2000-01-01T06:00:00Z")
        Status: 2
     },
     {
       _id: 101,
        OrderDate: ISODate("2001-03-06T06:00:00Z")
        Status: 5
     }
     ...
   ]
}

If Customer table is referenced in more tables other than Order table, then use "teiid_mongo:EMBEDDABLE" property

    CREATE FOREIGN TABLE Customer (
        CustomerId integer PRIMARY KEY,
        FirstName varchar(25),
        LastName varchar(25)
    ) OPTIONS(UPDATABLE 'TRUE', "teiid_mongo:EMBEDDABLE" 'TRUE');

    CREATE FOREIGN TABLE Order (
        OrderID integer PRIMARY KEY,
        CustomerId integer,
        OrderDate date,
        Status integer,
        FOREIGN KEY (CustomerId) REFERENCES Customer (CustomerId)
    ) OPTIONS(UPDATABLE 'TRUE');

    CREATE FOREIGN TABLE Comments (
        CommentID integer PRIMARY KEY,
        CustomerId integer,
        Comment varchar(140),
        FOREIGN KEY (CustomerId) REFERENCES Customer (CustomerId)
    ) OPTIONS(UPDATABLE 'TRUE');

This creates three different collections in MongoDB.

Customer
{
  _id: 1,
  FirstName: "John",
  LastName: "Doe"
}

Order
{
  _id: 100,
  CustomerId: 1,
  OrderDate: ISODate("2000-01-01T06:00:00Z")
  Status: 2
  Customer:
   {
     FirstName: "John",
     LastName: "Doe"
   }
}

Comment
{
  _id: 12,
  CustomerId: 1,
  Comment: "This works!!!"
  Customer:
   {
     FirstName: "John",
     LastName: "Doe"
   }
}

Here as you can see the Customer table contents are embedded along with other table’s data where they were referenced. This creates duplicated data where multiple of these embedded documents are managed automatically in the MongoDB translator.

Note

All the SELECT, INSERT, DELETE operations that are generated against the tables with "teiid_mongo:EMBEDDABLE" property are atomic, except for UPDATES, as there can be multiple operations involved to update all the copies. Since there are no transactions in MongoDB, Data Virtualization plans to provide automatic compensating transaction framework around this in future releases TEIID-2957.

MANY-2-ONE Mapping.

This is same as ONE-2-MANY, see above to define relationships.

Note

A parent table can have multiple "embedded" and as well as "merge" documents inside it, it not limited so either one or other. However, please note that MongoDB imposes document size is limited can not exceed 16MB.

MANY-2-MANY Mapping.

This can also mapped with combination of "teiid_mongo:MERGE" and "teiid_mongo:EMBEDDABLE" properties (partially). For example if DDL looks like

    CREATE FOREIGN TABLE Order (
        OrderID integer PRIMARY KEY,
        OrderDate date,
        Status integer
    ) OPTIONS(UPDATABLE 'TRUE');

    CREATE FOREIGN TABLE OrderDetail (
        OrderID integer,
        ProductID integer,
        PRIMARY KEY (OrderID,ProductID),
        FOREIGN KEY (OrderID) REFERENCES Order (OrderID),
        FOREIGN KEY (ProductID) REFERENCES Product (ProductID)
    ) OPTIONS(UPDATABLE 'TRUE');

    CREATE FOREIGN TABLE Products (
       ProductID integer PRIMARY KEY,
       ProductName varchar(40)
    ) OPTIONS(UPDATABLE 'TRUE');

you modify the DDL like below, to have

    CREATE FOREIGN TABLE Order (
        OrderID integer PRIMARY KEY,
        OrderDate date,
        Status integer
    ) OPTIONS(UPDATABLE 'TRUE');

    CREATE FOREIGN TABLE OrderDetail (
        OrderID integer,
        ProductID integer,
        PRIMARY KEY (OrderID,ProductID),
        FOREIGN KEY (OrderID) REFERENCES Order (OrderID),
        FOREIGN KEY (ProductID) REFERENCES Product (ProductID)
    ) OPTIONS(UPDATABLE 'TRUE', "teiid_mongo:MERGE" 'Order');

    CREATE FOREIGN TABLE Products (
       ProductID integer PRIMARY KEY,
       ProductName varchar(40)
    ) OPTIONS(UPDATABLE 'TRUE',  "teiid_mongo:EMBEDDABLE" 'TRUE');

That will produce a document like

{
   _id : 10248,
   OrderDate : ISODate("1996-07-04T05:00:00Z"),
   Status : 5
   OrderDetails : [
     {
       _id : {
               OrderID : 10248,
               ProductID : 11
               Products : {
                  ProductID: 11
                  ProductName: "Hammer"
               }
       }
     },
     {
       _id : {
         OrderID : 10248,
         ProductID : 14
         Products : {
             ProductID: 14
             ProductName: "Screw Driver"
         }
       }
     }
   ]
}

Products
{
    {
      ProductID: 11
      ProductName: "Hammer"
    }
    {
      ProductID: 14
      ProductName: "Screw Driver"
    }
}

Limitations

  • Nested embedding of documents is limited due to capabilities of handling nested arrays is limited in the MongoDB. Nesting of "EMBEDDABLE" property with multiple levels is OK, however more than two levels with MERGE is not recommended. Also, you need to be caution about not exceeding the document size of 16 MB for single row, so deep nesting is not recommended.
  • JOINS between related tables, MUST use either the "EMBEDDABLE" or "MERGE" properties, otherwise the query will result in error. In order for Data Virtualization to correctly plan and work with JOINS, in the case that any two tables are NOT embedded in each other, use allow-joins=false property on the Foreign Key that represents the relation. For example:
    CREATE FOREIGN TABLE  Customer (
        CustomerId integer PRIMARY KEY,
        FirstName varchar(25),
        LastName varchar(25)
    ) OPTIONS(UPDATABLE 'TRUE');

    CREATE FOREIGN TABLE  Order (
        OrderID integer PRIMARY KEY,
        CustomerId integer,
        OrderDate date,
        Status integer,
        FOREIGN KEY (CustomerId) REFERENCES Customer (CustomerId) OPTIONS (allow-join 'FALSE')
    ) OPTIONS(UPDATABLE 'TRUE');

with the example above, Data Virtualization will create two collections, however when user issues query such as

  SELECT OrderID, LastName FROM Order JOIN Customer ON Order.CustomerId = Customer.CustomerId;

instead of resulting in error, the JOIN processing will happen in the Data Virtualization engine, without the above property it will result in an error.

When you use above properties and carefully design the MongoDB document structure, Data Virtualization translator can intelligently collate data based on their co-location and take advantage of it while querying.

Geo Spatial functions

MongoDB translator enables you to use geo spatial query operators in the "WHERE" clause, when the data is stored in the GeoJSon format in the MongoDB Document. The following functions are available:

CREATE FOREIGN FUNCTION geoIntersects (columnRef string,  type string, coordinates double[][]) RETURNS boolean;
CREATE FOREIGN FUNCTION geoWithin (ccolumnRef string,  type string, coordinates double[][]) RETURNS boolean;
CREATE FOREIGN FUNCTION near (ccolumnRef string,  coordinates double[], maxdistance integer) RETURNS boolean;
CREATE FOREIGN FUNCTION nearSphere (ccolumnRef string, coordinates double[], maxdistance integer) RETURNS boolean;
CREATE FOREIGN FUNCTION geoPolygonIntersects (ref string, north double, east double, west double, south double) RETURNS boolean;
CREATE FOREIGN FUNCTION geoPolygonWithin (ref string, north double, east double, west double, south double) RETURNS boolean;

a sample query looks like

SELECT loc FROM maps where mongo.geoWithin(loc, 'LineString', ((cast(1.0 as double), cast(2.0 as double)), (cast(1.0 as double), cast(2.0 as double))))

Same functions using built-in Geometry type (the versions of the functions in the preceding list will be deprecated and removed in future versions)

CREATE FOREIGN FUNCTION geoIntersects (columnRef string,  geo geometry) RETURNS boolean;
CREATE FOREIGN FUNCTION geoWithin (ccolumnRef string,  geo geometry) RETURNS boolean;
CREATE FOREIGN FUNCTION near (ccolumnRef string, geo geometry, maxdistance integer) RETURNS boolean;
CREATE FOREIGN FUNCTION nearSphere (ccolumnRef string, geo geometry, maxdistance integer) RETURNS boolean;
CREATE FOREIGN FUNCTION geoPolygonIntersects (ref string, geo geometry) RETURNS boolean;
CREATE FOREIGN FUNCTION geoPolygonWithin (ref string, geo geometry) RETURNS boolean;

a sample query looks like

SELECT loc FROM maps where mongo.geoWithin(loc, ST_GeomFromGeoJSON('{"coordinates":[[1,2],[3,4]],"type":"Polygon"}'))

There are various "st_geom.." methods are available in the Geo Spatial function library in Data Virtualization.

Capabilities

MongoDB translator is designed on top of the MongoDB aggregation framework. You must use a MongoDB version that the aggregation framework. Apart from SELECT queries, the MongoDB translator also works with INSERT, UPDATE and DELETE queries.

You can use the MongoDB translator with the following functions:

  • Grouping.
  • Matching.
  • Sorting.
  • Filtering.
  • Limits.
  • Working with LOBs stored in GridFS.
  • Composite primary and foreign keys.

Native queries

MongoDB source procedures may be created using the teiid_rel:native-query extension. For more information, see Parameterizable native queries in Translators. The procedure will invoke the native-query similar to a direct procedure call with the benefits that the query is predetermined and that result column types are known, rather than requiring the use of ARRAYTABLE or similar functionality.

Direct query procedure

This feature is turned off by default because of the security risk this exposes to execute any command against the source. To enable direct query procedures, set the execution property called SupportsDirectQueryProcedure to true. For more information, see Override the execution properties in Chapter 9, Translators.

By default the name of the procedure that executes the queries directly is called native. For information about how to change the default name, see Override the execution properties in Chapter 9, Translators.

The MongoDB translator provides a procedure to execute any ad-hoc aggregate query directly against the source without Data Virtualization parsing or resolving. Since the metadata of this procedure’s results are not known to Data Virtualization, they are returned as an object array containing single blob at array location one(1). This blob contains the JSON document. XMLTABLE can be used construct tabular output for consumption by client applications.

Example MongoDB Direct Query

    select x.* from TABLE(call native('city;{$match:{"city":"FREEDOM"}}')) t,
          xmltable('/city' PASSING JSONTOXML('city', cast(array_get(t.tuple, 1) as BLOB)) COLUMNS city string, state string) x

In the above example, a collection called "city" is looked up with filter that matches the "city" name with "FREEDOM", using "native" procedure and then using the nested tables feature the output is passed to a XMLTABLE construct, where the output from the procedure is sent to a JSONTOXML function to construct a XML then the results of that are exposed in tabular form.

The direct query MUST be in the format

     "collectionName;{$pipeline instr}+"

From Data Virtualization 8.10, MongoDB translator also allows to execute Shell type java script commands like remove, drop, createIndex. For this the command needs to be in format

     "$ShellCmd;collectionName;operationName;{$instr}+"

and example looks like

   "$ShellCmd;MyTable;remove;{ qty: { $gt: 20 }}"

9.9. OData translator

The OData translator, known by the type name "odata" exposes the OData V2 and V3 data sources and uses the Data Virtualization web services resource adapter for making web service calls. This translator is an extension of the Web services translator.

What is OData?

The Open Data Protocol (OData) web protocol is for querying and updating data that provides a way to unlock your data and free it from silos that exist in applications today. OData does this by applying and building upon Web technologies such as HTTP, Atom Publishing Protocol (AtomPub) and JSON to provide access to information from a variety of applications, services, and stores. OData is being used to expose and access information from a variety of sources including, but not limited to, relational databases, file systems, content management systems and traditional Web sites.

Using this specification from the OASIS group, with help from the OData4J framework, Data Virtualization maps OData entities into relational schema. Data Virtualization can read CSDL (Conceptual Schema Definition Language) from a provided OData endpoint, and convert the OData schema into a relational schema. The following table shows the mapping selections in the OData translator from a CSDL document.

ODataMapped to relational entity

EntitySet

Table

FunctionImport

Procedure

AssociationSet

Foreign keys on the table*

ComplexType

ignored**

  • A many-to-many association will result in a link table that can not be selected from, but can be used for join purposes.

    • When used in functions, an implicit table is exposed. When used to define a embedded table, all the columns will be in-lined.

All CRUD operations will be appropriately mapped to the resulting entity based on the SQL submitted to the OData translator.

  1. Usage

Usage of a OData source is similar to that of a JDBC translator. The metadata import is provided through the translator, once the metadata is imported from the source system and exposed in relational terms, then this source can be queried as if the EntitySets and Function Imports were local to the Data Virtualization system.

Table 9.6. Execution properties

NameDescriptionDefault

DatabaseTimeZone

The time zone of the database. Used when fetchings date, time, or timestamp values.

The system default time zone

SupportsOdataCount

Enables the use of the $count option in system queries.

true

SupportsOdataFilter

Enables the use of the $filter option in system queries.

true

SupportsOdataOrderBy

Enables the use of the $orderby option in system queries.

true

SupportsOdataSkip

Enables the use of the $skip option in system queries.

true

SupportsOdataTop

Enables the use of the $top option in system queries.

true

Table 9.7. Importer Properties

NameDescriptionDefault

schemaNamespace

Namespace of the schema to import.

null

entityContainer

Entity Container Name to import.

default container

Example: Importer settings to import only tables and views from NetflixCatalog

<property name="importer.schemaNamespace" value="System.Data.Objects"/>
<property name="importer.entityContainer" value="NetflixCatalog"/>

OData Server is not fully compatible

The OData server that you connect to might not fully implement the entire OData specification. If the server’s OData implementation does not support a feature, set "execution properties" to turn off the corresponding capability, so that Data Virtualization will not push down invalid queries to the translator.

For example, to turn off $filter, add the following statement to the virtual database DDL:

CREATE SERVER odata FOREIGN DATA WRAPPER "odata-override" OPTIONS ("SupportOdataFilter" 'false');
Native queries

The OData translator cannot perform native or direct query execution. However, you can use the invokehttp method of the Web services translator to issue REST-based calls, and then use SQLXML to parse results.

Using OData as server.

Data Virtualization can not only consume OData-based data sources, but it can also expose any data source as an OData-based web service.

For more information about configuring an OData server, see OData support in the Client Developer’s Guide.

9.10. OData V4 translator

The OData V4 translator, known by the type name "odata4" exposes the OData Version 4 data sources and uses the Data Virtualization web services resource adapter for making web service calls. This translator is extension of Web Services Translator. The OData V4 translator is not for use with older OData V1-3 sources. Use the OData translator ("odata") for older OData sources.

What is OData

The Open Data Protocol (OData) Web protocol is for querying and updating data that provides a way to unlock your data and free it from silos that exist in applications today. OData does this by applying and building upon Web technologies such as HTTP, Atom Publishing Protocol (AtomPub), and JSON to provide access to information from a variety of applications, services, and stores. OData is being used to expose and access information from a variety of sources including, but not limited to, relational databases, file systems, content management systems and traditional Web sites.

Using this specification from the OASIS group, with the help from the Olingo framework, Data Virtualization maps OData V4 CSDL (Conceptual Schema Definition Language) document from the OData endpoint provided and converts the OData metadata into Data Virtualization’s relational schema. The following table shows the mapping selections in the OData V4 translator from a CSDL document

Using OData as a server

Data Virtualization can not only consume OData-based data sources, but it can expose any data source as an OData based web service. For more information see OData Support in the Client Developer’s Guide.

ODataMapped to relational entity

EntitySet

Table

EntityType

Table see [1]

ComplexType

Table see [2]

FunctionImport

Procedure [3]

ActionImport

Procedure [3]

NavigationProperties

Table [4]

[1] Only if the EntityType is exposed as the EntitySet in the Entity container. [2] Only if the complex type is used as property in the exposed EntitySet. This table will be designed as child table with foreign key [1-to-1] or [1-to-many] relationship to the parent.
[3] If the return type is EntityType or ComplexType, the procedure is designed to return a table. [4] Navigation properties are exposed as tables. The table will be created with foreign key relationship to the parent.

All CRUD operations will be appropriately mapped to the resulting entity based on the SQL submitted to the OData translator.

Usage

Usage of a OData source is similar a JDBC translator. The metadata import is supported through the translator, once the metadata is imported from source system and exposed in relational terms, then this source can be queried as if the EntitySets, Function Imports and Action Imports were local to the Data Virtualization system.

It is not recommended to define your own metadata using Data Virtualization DDL for complex services. There are several extension metadata properties required to enable proper functioning. On non-string properties, a NATIVE_TYPE property is expected and should specify the full EDM type name - Edm.xxx.

The below is sample VDB that can read metadata service from TripPin service on http://odata.org site.

<vdb name="trippin" version="1">
    <model name="trippin">
         <source name="odata4" translator-name="odata4" connection-jndi-name="java:/tripDS"/>
    </model>
</vdb>

You can connect to the VDB deployed using Data Virtualization JDBC driver and issue SQL statements like

SELECT * FROM trippin.People;
SELECT * FROM trippin.People WHERE UserName = 'russelwhyte';
SELECT * FROM trippin.People p INNER JOIN trippin.People_Friends pf ON p.UserName = pf.People_UserName; (note that People_UserName is implicitly added by Data Virtualization metadata)
EXEC GetNearestAirport(lat, lon) ;

Execution properties

Sometimes default properties need to adjusted for proper execution of the translator. The following execution properties extend or limit the functionality of the translator based on the physical source capabilities.

NameDescriptionDefault

SupportsOdataCount

Supports $count

true

SupportsOdataFilter

Supports $filter

true

SupportsOdataOrderBy

Supports $orderby

true

SupportsOdataSkip

Supports $skip

true

SupportsOdataTop

Supports $top

true

SupportsUpdates

Supports INSERT/UPDATE/DELETE

true

The OData server that you connect to might not fully implement the entire OData specification. If the server’s OData implementation does not support a feature, set "execution properties" to turn off the corresponding capability, so that Data Virtualization does not push down invalid queries to the translator.

<translator name="odata-override" type="odata">
     <property name="SupportsOdataFilter" value="false"/>
</translator>

then use "odata-override" as the translator name on your source model.

Importer properties

The following table lists the importer properties that define the behavior of the translator during metadata import from the physical source.

NameDescriptionDefault

schemaNamespace

Namespace of the schema to import

null

Example importer settings to only import tables and views from Trippin service exposed on odata.org

<property name="importer.schemaNamespace" value="Microsoft.OData.SampleService.Models.TripPin"/>

You can leave this property undefined. If the translator does not detect a configured instance of the property, it specifies the default name of the EntityContainer.

Tip

Native queries - Native or direct query execution is not supported through the OData translator. However, you can use the invokehttp method of the Web services translator to issue REST-based calls, and then use SQLXML to parse results.

9.11. OpenAPI translator

The OpenAPI translator, known by the type name "openapi" exposes OpenAPI data sources via relational concepts and uses the Data Virtualization WS resource adapter for making web service calls.

What is OpenAPI?

[OpenAPI is a simple yet powerful representation of your RESTful API. With the largest ecosystem of API tooling on the planet, thousands of developers are supporting OpenAPI in almost every modern programming language and deployment environment. With an OpenAPI-enabled API, you get interactive documentation, client SDK generation, and discoverability.

This translator is compatible with OpenAPI/Swagger v2 and OpenAPI v3.

Usage

Usage of a OpenAPI source is similar any other translator in Data Virtualization. The translator enables metadata import. The metadata is imported from source system’s metadata file and then exposed as stored procedures in Data Virtualization. The source system can be queried by executing these stored procedures in Data Virtualization system.

Note

Although parameter order is guaranteed by the Swagger libraries, if you rely upon the native import, it is best if you call procedures using named, rather than positional parameters.

The below is sample VDB that can read metadata from Petstore reference service on http://petstore.swagger.io/ site.

<vdb name="petstore" version="1">
    <model visible="true" name="m">
        <property name="importer.metadataUrl" value="/swagger.json"/>
        <source name="s" translator-name="openapi" connection-jndi-name="java:/openapi"/>
    </model>
</vdb>

The required resource-adapter configuration will look like

<resource-adapter id="openapi">
    <module slot="main" id="org.jboss.teiid.resource-adapter.webservice"/>
    <transaction-support>NoTransaction</transaction-support>
    <connection-definitions>
        <connection-definition class-name="org.teiid.resource.adapter.ws.WSManagedConnectionFactory" jndi-name="java:/openapi" enabled="true" use-java-context="true" pool-name="teiid-openapi-ds">
            <config-property name="EndPoint">
                http://petstore.swagger.io/v2
            </config-property>
        </connection-definition>
    </connection-definitions>
</resource-adapter>

After you configure the preceding resource-adapter and deploy the VDB successfully, then you can connect to the VDB deployed using Data Virtualization JDBC driver and issue SQL statements such as the following:

EXEC findPetsByStatus(('sold',))
EXEC getPetById(1461159803)
EXEC deletePet('', 1461159803)

Execution properties

Execution properties extend/limit the functionality of the translator based on the physical source capabilities. Sometimes default properties must be adjusted for proper execution of the translator.

Execution properties

None.

Importer properties

The following table lists the importer properties that define the behavior of the translator during the import of from the physical source.

NameDescriptionDefault

metadataUrl

URL from which to obtain the OpenAPI metadata. May be a local file using a file: URL.

true

server

The server to use. Otherwise the first server listed will be used.

null

preferredProduces

Preferred Accept MIME type header, this should be one of the OpenAPI 'produces' types;

application/json

preferredConsumes

Preferred Content-Type MIME type header, this should be one of the OpenAPI 'consumer' types;

application/json

Tip

Native queries - The OpenAPI translator cannot perform native or direct query execution. However, you can use the invokehttp method of the Web services translator to issue REST-based calls, and then use SQLXML to parse results.

Limitations

The OpenAPI translator does not fully implement all of the features of OpenAPI. The following limitations apply:

  • You cannot set the MIME type to application/xml in either the Accept or Content-Type headers.
  • File and Map properties cannot be used. As a result, any multi-part payloads are not supported.
  • The translator does not process security metadata.
  • The translator does not process custom properties that start with x-.
  • The translator does not work with following JSON schema keywords:

    • allOf
    • multipleOf
    • items

9.12. Salesforce translators

You can use the Salesforce translator to run SELECT, DELETE, INSERT, UPSERT, and UPDATE operations against a Salesforce.com account.

salesforce

The translator, known by the type name salesforce, works with Salesforce API 37.0 and later.

Table 9.8. Execution properties

NameDescriptionDefault

MaxBulkInsertBatchSize

Batch Size to use to insert bulk inserts.

2048

SupportsGroupBy

Enables GROUP BY Pushdown. Set to false to have Data Virtualization process group by aggregations, such as those returning more than 2000 rows which error in SOQL.

true

The Salesforce translator can import metadata.

Table 9.9. Import properties

Property NameDescriptionRequiredDefault

NormalizeNames

If the importer should attempt to modify the object/field names so that they can be used unquoted.

false

true

excludeTables

A case-insensitive regular expression that when matched against a table name will exclude it from import. Applied after table names are retrieved. Use a negative look-ahead (?!<inclusion pattern>).* to act as an inclusion filter.

false

n/a

includeTables

A case-insensitive regular expression that when matched against a table name will be included during import. Applied after table names are retrieved from source.

false

n/a

importStatstics

Retrieves cardinalities during import using the REST API explain plan feature.

false

false

ModelAuditFields

Add Audit Fields To Model. This includes CreatedXXX, LastModifiedXXX, and SystemModstamp fields.

false

false

NOTE: When both includeTables and excludeTables patterns are present during the import, the includeTables pattern matched first, then the excludePatterns will be applied.

Note

If you need connectivity to an API version other than what is built in, you may try to use an existing connectivity pair, but in some circumstances - especially accessing a later remote api from an older Java API - this is not possible and results in what appears to be hung connections.

Extension metadata properties

Salesforce is not relational database, however Data Virtualization provides ways to map Saleforce data into relational constructs like Tables and Procedures. You can define a foreign table using DDL in Data Virtualization VDB, which maps to Salesforce’s SObject. At runtime, to interpret this table back to a SObject, Data Virtualization decorates or tags this table definition with additional metadata. For example, a table is defined as in the following example:

CREATE FOREIGN TABLE Pricebook2 (
	Id string,
	Name string,
	IsActive boolean,
	IsStandard boolean,
	Description string,
	IsDeleted boolean)
	OPTIONS (
	  UPDATABLE 'TRUE',
	  "teiid_sf:Supports Query" 'TRUE');

In the preceding example, the property in the OPTIONS clause with the property "teiid_sf:Supports Query" set to TRUE indicates that you can run SELECT commands against this table. The following table lists the metadata extension properties that can be used in a Salesforce schema.

Property NameDescriptionRequiredDefaultApplies To

Supports Query

You can run SELECT commands against the table.

false

true

Table

Supports Retrieve

You can retrieve the results of SELECT commands run against the table.

false

true

Table

SQL processing

Salesforce does not provide the same set of functionality as a relational database. For example, Salesforce does not support arbitrary joins between tables. However, working in combination with the Data Virtualization Query Planner, the Salesforce connector can use nearly all of the SQL syntax capabilities in Data Virtualization. The Salesforce Connector executes SQL commands by "pushing down" the command to Salesforce whenever possible, depending on the available capabilities. Data Virtualization will automatically provide additional database functionality when the Salesforce Connector does not explicitly enable use of a given SQL construct. In cases where certain SQL capabilities cannot be pushed down to Salesforce, Data Virtualization will push down the capabilities that it can, and fetch a set of data from Salesforce. Then, Data Virtualization will evaluate the additional capabilities, creating a subset of the original data set. Finally, Data Virtualization will pass the result to the client.

If you issue queries with a GROUP BY clause, and you receive a Salesforce error that indicates that queryMore is not supported, you can either add limits, or set the execution property SupportsGroupBy to false.

SELECT array_agg(Reports) FROM Supervisor where Division = 'customer support';

Neither Salesforce, nor the Salesforce Connector support the array_agg() scalar. however, both are compatible with the CompareCriteriaEquals query, so the connector transforms the query that it receives into this query to Salesforce.

SELECT Reports FROM Supervisor where Division = 'customer support';

The array_agg() function will be applied by the Data Virtualization Query Engine to the result set returned by the connector.

In some cases, multiple calls to the Salesforce application will be made to process the SQL that is passed to the connector.

DELETE From Case WHERE Status = 'Closed';

The API in Salesforce to delete objects can delete by object ID only. In order to accomplish this, the Salesforce connector will first execute a query to get the IDs of the correct objects, and then delete those objects. So the above DELETE command will result in the following two commands.

SELECT ID From Case WHERE Status = 'Closed';
DELETE From Case where ID IN (<result of query>);

NOTE: The Salesforce API DELETE call is not expressed in SQL, but the above is an equivalent SQL expression.

It’s useful to be aware of incompatible capabilities, in order to avoid fetching large data sets from Salesforce and making you queries as performant as possible. For information about the SQL constructs that you can push down to Salesforce, see Compatible SQL capabilities.

Selecting from multi-select picklists

A multi-select picklist is a field type in Salesforce that can contain multiple values in a single field. Query criteria operators for fields of this type in SOQL are limited to EQ, NE, includes and excludes. For the Salesforce documentation about how to select from multi-select picklists, see Querying Multi-select Picklists

Data Virtualization SQL does not support the includes or excludes operators, but the Salesforce connector provides user-defined function definitions for these operators that provide equivalent functionality for fields of type multi-select. The definition for the functions is:

boolean includes(Column column, String param)
boolean excludes(Column column, String param)

For example, take a single multi-select picklist column called Status that contains all of these values.

  • current
  • working
  • critical

For that column, all of the below are valid queries:

SELECT * FROM Issue WHERE true = includes (Status, 'current, working' );
SELECT * FROM Issue WHERE true = excludes (Status, 'current, working' );
SELECT * FROM Issue WHERE true = includes (Status, 'current;working, critical' );

EQ and NE criteria will pass to Salesforce as supplied. For example, these queries will not be modified by the connector.

SELECT * FROM Issue WHERE Status = 'current';
SELECT * FROM Issue WHERE Status = 'current;critical';
SELECT * FROM Issue WHERE Status != 'current;working';

Selecting all objects

You can use the Salesforce connector to call the queryAll operation from the Salesforce API. The queryAll operation is equivalent to the query operation with the exception that it returns data about all current and deletedobjects in the system.

The connector determines if it will call the query or queryAll operation via reference to the isDeleted property present on each Salesforce object, and modeled as a column on each table generated by the importer. By default this value is set to false when the model is generated and thus the connector calls query. Users are free to change the value in the model to true, changing the default behavior of the connector to be queryAll.

The behavior is different if isDeleted is used as a parameter in the query. If the isDeleted column is used as a parameter in the query, and the value is true, then the connector calls queryAll.

select * from Contact where isDeleted = true;

If the isDeleted column is used as a parameter in the query, and the value is false, then the connector that performs the default behavior will call the query.

select * from Contact where isDeleted = false;

Selecting updated objects

If the option is selected when importing metadata from Salesforce, a GetUpdated procedure is generated in the model with the following structure:

GetUpdated (ObjectName IN string,
    StartDate IN datetime,
    EndDate IN datetime,
    LatestDateCovered OUT datetime)
returns
    ID string

See the description of the GetUpdated operation in the Salesforce documentation for usage details.

Selecting deleted objects

If the option is selected when importing metadata from Salesforce, a GetDeleted procedure is generated in the model with the following structure:

GetDeleted (ObjectName IN string,
    StartDate IN datetime,
    EndDate IN datetime,
    EarliestDateAvailable OUT datetime,
    LatestDateCovered OUT datetime)
returns
    ID string,
    DeletedDate datetime

See the description of the GetDeleted operation in the Salesforce documentation for usage details.

Relationship queries

Unlike a relational database, Salesforce does not support join operations, but it does have support for queries that include parent-to-child or child-to-parent relationships between objects. These are termed Relationship Queries. You can run Relationship Queries in the SalesForce connector through Outer Join syntax.

SELECT Account.name, Contact.Name from Contact LEFT OUTER JOIN Account
on Contact.Accountid = Account.id

This query shows the correct syntax to query a SalesForce model with to produce a relationship query from child to parent. It resolves to the following query to SalesForce.

SELECT Contact.Account.Name, Contact.Name FROM Contact
select Contact.Name, Account.Name from Account Left outer Join Contact
on Contact.Accountid = Account.id

This query shows the correct syntax to query a SalesForce model with to produce a relationship query from parent to child. It resolves to the following query to SalesForce.

SELECT Account.Name, (SELECT Contact.Name FROM
Account.Contacts) FROM Account

See the description of the Relationship Queries operation in the SalesForce documentation for limitations.

Bulk insert queries

You can also use bulk insert statements in the SalesForce translator by using JDBC batch semantics or SELECT INTO semantics. The batch size is determined by the execution property MaxBulkInsertBatchSize, which can be overridden in the vdb file. The default value of the batch is 2048. The bulk insert feature uses the async REST based API exposed by Salesforce for execution for better performance.

Bulk selects

When querying tables with more than 10,000,000 records, or if experiencing timeouts with just result batching, Data Virtualization can issue queries to Salesforce using the bulk API. When using a bulk select, primary key (PK) chunking is enabled if it is compatible with the query.

The use of the bulk api requires a source hint in the query:

SELECT /*+ sh salesforce:'bulk' */ Name ... FROM Account

Where salesforce is the source name of the target source.

The default chunk size of 100,000 records will be used.

Note

This feature is only supported in the Salesforce API version 28 or higher.

Compatible SQL capabilities

You ca use the following SQL capabilities with the Salesforce Connector. These SQL constructs will be pushed down to Salesforce.

  • SELECT command
  • INSERT Command
  • UPDATE Command
  • DELETE Command
  • NotCriteria
  • OrCriteria
  • CompareCriteriaEquals
  • CompareCriteriaOrdered
  • IsNullCritiera
  • InCriteria
  • LikeCriteria - Can be used for String fields only.
  • RowLimit
  • Basic Aggregates
  • OuterJoins with join criteria KEY

Native Queries

Salesforce procedures may optionally have native queries associated with them. For more information, see Parameterizable native queries in Translators. The operation prefix (select;, insert;, update;, delete; - see below for more) must be present in the native-query, but it will not be issued as part of the query to the source.

Example DDL for a Salesforce native procedure

CREATE FOREIGN PROCEDURE proc (arg1 integer, arg2 string) OPTIONS ("teiid_rel:native-query" 'search;SELECT ... complex SOQL ... WHERE col1 = $1 and col2 = $2')
returns (col1 string, col2 string, col3 timestamp);

Direct query procedure

This feature is turned off by default because of the security risk this exposes to execute any command against the source. To enable direct query procedures, set the execution property called SupportsDirectQueryProcedure to true. For more information, see Override the execution properties in Chapter 9, Translators.

Tip

By default the name of the procedure that executes the queries directly is called native. For information about how to change the default name, see Override the execution properties in Chapter 9, Translators.

The Salesforce translator provides a procedure to execute any ad-hoc SOQL query directly against the source without Data Virtualization parsing or resolving. Since the metadata of this procedure’s results are not known to Data Virtualization, they are returned as an object array. ARRAYTABLE can be used construct tabular output for consumption by client applications. Data Virtualization exposes this procedure with a simple query structure as follows:

Select example

SELECT x.* FROM (call sf_source.native('search;SELECT Account.Id, Account.Type, Account.Name FROM Account')) w,
 ARRAYTABLE(w.tuple COLUMNS "id" string , "type" string, "name" String) AS x

from the above code, the "search" keyword followed by a query statement.

Note

The SOQL is treated as a parameterized native query so that parameter values may be inserted in the query string properly. For more information, see Parameterizable native queries in Translators. The results returned by search may contain the object Id as the first column value regardless of whether it was selected. Also queries that select columns from multiple object types will not be correct.

Delete Example

SELECT x.* FROM (call sf_source.native('delete;', 'id1', 'id2')) w,
 ARRAYTABLE(w.tuple COLUMNS "updatecount" integer) AS x

form the above code, the "delete;" keyword followed by the ids to delete as varargs.

Create example

SELECT x.* FROM
 (call sf_source.native('create;type=table;attributes=one,two,three', 'one', 2, 3.0)) w,
 ARRAYTABLE(w.tuple COLUMNS "update_count" integer) AS x

form the above code, the "create" or "update" keyword must be followed by the following properties. Attributes must be matched positionally by the procedure variables - thus in the example attribute two will be set to 2.

Property NameDescriptionRequired

type

Table Name

Yes

attributes

comma separated list of names of the columns

no

The values for each attribute is specified as separate argument to the "native" procedure.

Update is similar to create, with one more extra property called "id", which defines identifier for the record.

Update example

SELECT x.* FROM
 (call sf_source.native('update;id=pk;type=table;attributes=one,two,three', 'one', 2, 3.0)) w,
 ARRAYTABLE(w.tuple COLUMNS "update_count" integer) AS x

Tip

By default the name of the procedure that executes the queries directly is called native, however you can add set an override execution property in the DDL file to change it.

9.13. Rest translator

The Rest translator, known by the type name rest, exposes stored procedures for calling REST services.  Results from this translator will commonly be used with the TEXTTABLE, JSONTABLE, or XMLTABLE table functions to use CSV, JSON, or XML formated data.

Execution properties 

There are no rest importer settings, but it can provide metadata for VDBs.

Usage

The rest translator exposes low level procedures for accessing web services.

InvokeHTTP procedure

invokeHttp can return the byte contents of an HTTP(S) call.

Procedure invokeHttp(action in STRING, request in OBJECT, endpoint in STRING, stream in BOOLEAN, contentType out STRING, headers in CLOB) returns BLOB

Action indicates the HTTP method (GET, POST, etc.), which defaults to POST.

A null value for endpoint will use the default value. The default endpoint is specified in the rest source configuration. The endpoint URL may be absolute or relative. If it’s relative then it will be combined with the default endpoint.

Since multiple parameters are not required to have values, it is often more clear to call the invokeHttp procedure with named parameter syntax.

call invokeHttp(action=>'GET')

The request can be one of SQLXML, STRING, BLOB, or CLOB. The request will be sent as the POST payload in byte form. For STRING/CLOB values this will default to the UTF-8 encoding. To control the byte encoding, see the to_bytes function.

The optional headers parameter can be used to specify the request header values as a JSON value. The JSON value should be a JSON object with primitive or list of primitive values.

call invokeHttp(... headers=>jsonObject('application/json' as "Content-Type", jsonArray('gzip', 'deflate') as "Accept-Encoding"))

Recommendations for setting headers parameter:

  • Content-Type might be necessary if the HTTP POST/PUT method is invoked.
  • Accept is necessary if you want to control return Media Type.
Native queries

You cannot use native queries or direct query execution procedures with the web services translator.

Streaming considerations

If the stream parameter is set to true, then the resulting LOB value may only be used a single time. If stream is null or false, then the engine may need to save a copy of the result for repeated use. Care must be used as some operations, such as casting or XMLPARSE might perform validation which results in the stream being consumed.

9.14. Web services translator

The Web Services translator, known by the type name soap or ws, exposes stored procedures for calling web/SOAP services.  Results from this translator will commonly be used with the TEXTTABLE or XMLTABLE table functions to use CSV or XML formated data.

Execution properties 

NameDescriptionWhen UsedDefault

DefaultBinding

The binding that should be used if one is not specified. Can be one of HTTP, SOAP11, or SOAP12.

invoke*

SOAP12

DefaultServiceMode

The default service mode. For SOAP, MESSAGE mode indicates that the request will contain the entire SOAP envelope. and not just the contents of the SOAP body. Can be one of MESSAGE or PAYLOAD

invoke* or WSDL call

PAYLOAD

XMLParamName

Used with the HTTP binding (typically with the GET method) to indicate that the request document should be part of the query string.

invoke*

null - unused

Note

Setting the proper binding value on the translator is recommended as it removes the need for callers to pass an explicit value. If your service is actually uses SOAP11, but the binding used SOAP12 you will receive execution failures.

There are no importer settings, but it can provide metadata for VDBs. If the connection is configured to point at a specific WSDL, the translator will import all SOAP operations under the specified service and port as procedures.

Importer properties

When specifying the importer property, it must be prefixed with "importer.". Example: importer.tableTypes

NameDescriptionDefault

importWSDL

Import the metadata from the WSDL URL configured in resource-adapter.

true

Usage

The translator exposes low level procedures for accessing web services.

Invoke procedure

Invoke allows for multiple binding, or protocol modes, including HTTP, SOAP11, and SOAP12.

Procedure invoke(binding in STRING, action in STRING, request in XML, endpoint in STRING, stream in BOOLEAN) returns XML

The binding may be one of null (to use the default) HTTP, SOAP11, or SOAP12. Action with a SOAP binding indicates the SOAPAction value. Action with a HTTP binding indicates the HTTP method (GET, POST, etc.), which defaults to POST.

A null value for the binding or endpoint will use the default value. The default endpoint is specified in the source configuration. The endpoint URL may be absolute or relative. If it’s relative then it will be combined with the default endpoint.

Since multiple parameters are not required to have values, it is often more clear to call the invoke procedure with named parameter syntax.

call invoke(binding=>'HTTP', action=>'GET')

The request XML should be a valid XML document or root element.

InvokeHTTP procedure

invokeHttp can return the byte contents of an HTTP(S) call.

Procedure invokeHttp(action in STRING, request in OBJECT, endpoint in STRING, stream in BOOLEAN, contentType out STRING, headers in CLOB) returns BLOB

Action indicates the HTTP method (GET, POST, etc.), which defaults to POST.

A null value for endpoint will use the default value. The default endpoint is specified in the source configuration. The endpoint URL may be absolute or relative. If it’s relative then it will be combined with the default endpoint.

Since multiple parameters are not required to have values, it is often more clear to call the invokeHttp procedure with named parameter syntax.

call invokeHttp(action=>'GET')

The request can be one of SQLXML, STRING, BLOB, or CLOB. The request will be sent as the POST payload in byte form. For STRING/CLOB values this will default to the UTF-8 encoding. To control the byte encoding, see the to_bytes function.

The optional headers parameter can be used to specify the request header values as a JSON value. The JSON value should be a JSON object with primitive or list of primitive values.

call invokeHttp(... headers=>jsonObject('application/json' as "Content-Type", jsonArray('gzip', 'deflate') as "Accept-Encoding"))

Recommendations for setting headers parameter:

  • Content-Type might be necessary if the HTTP POST/PUT method is invoked.
  • Accept is necessary if you want to control return Media Type.

WSDL based procedures

The procedures above give you anonymous way to execute any web service methods by supplying an endpoint, with this mechanism you can alter the endpoint defined in WSDL with a different endpoint. However, if you have access to the WSDL, then you can configure the WSDL URL in the web-service resource-adapter’s connection configuration, Web Service translator can parse the WSDL and provide the methods under configured port as pre-built procedures as its metadata. If you are using the default native metadata import, you will see the procedures in your web service’s source model.

Native queries

You cannot use native queries or direct query execution procedures with the web services translator.

Streaming considerations

If the stream parameter is set to true, then the resulting LOB value may only be used a single time. If stream is null or false, then the engine may need to save a copy of the result for repeated use. Care must be used as some operations, such as casting or XMLPARSE might perform validation which results in the stream being consumed.