Configuration of the Data Import
2017 R1 Update Rollup 1

Configuration of the Data ImportPermanent link for this heading

In Data Import objects or Data Import Component Objects (can be delivered with a software component) is configured how single records or fields of a data source are converted to Fabasoft Folio objects and their properties.

Apply the following principles:

  • From a single record of the data source, one or more objects of the same class or objects of different classes can be created.
  • In a separate allocation table, object relations can be mapped.
  • In addition, functions that affect the data import can be implemented.

Note: In a Linux system environment the Fabasoft web service user (:<hostname>:\fscsrv) needs writing properties (“Change Properties“) for the Data Import object and the Data Import Log object. So for the Data Import object an appropriate ACL has to be selected. Additionally in the Log Object field or the Data Import, a Data Import Log has to be created manually before the first import and an appropriate ACL has to be selected.

Example of a Data Import:

Data SourcePermanent link for this heading

In the Data Source property, select the type of data source (ODBC, OLE DB, Report Converter, Script, LDAP, ADE-DB, Spreadsheet Document or XML) and enter a reference to the data source. The available types of data sources are described in chapter “Definition of Data Sources”.

Examples of references:

  • Using “OLE-DB”, for example the path to a UDL file can be defined
    e.g. “E:\migration.udl”
  • Using “ODBC”, for example a Microsoft Office Access database can be defined
    e.g. “DSN=Migration;DBQ=E:\migration.mdb;DriverId=25;FIL=MS Access;MaxBufferSize=2048;PageTimeout=5;”
  • Using “Report Converter”, define a report converter object in a Fabasoft product environment
    e.g. “LOCAL@1.115:ReportConverterRC1”
  • Using “Script”, define a script component object in a Fabasoft product environment
    e.g. “LOCAL@1.115:SkriptS3”
  • Using “LDAP”, for example an LDAP directory can be defined as data source (default timeout: 10 seconds).
    e.g. “host=192.168.8.138;port=389;username=CN=root,DC=adminusers,DC=faba,DC=local;
    password=mypassword;searchbase=DC=faba,DC=local;timeout=100”
  • Using “ADE-DB”, for example a Microsoft SQL Server connection can be defined as data source
    e.g. “Provider=SQLOLEDB;Datasource=(local);Catalog=FSCReporting”
  • Using “ADE-DB”, for example an Oracle connection can be defined data source
    e.g. “Transport=OCI;Location=oradb;Username=system;Password=manager”
  • Using “ADE-DB” for example a PostgreSQL connection cab be defined as data source
  • e.g. “Transport=PGSI;Location=192.168.100.83;Catalog=postgres;ConnectString=PORT=5432;
    Username=postgres;Password=postgres”
  • Using “Spreadsheet Document”, the file path is defined in the Table field. Optionally the separator in case of a CSV file can be defined in the Data Source line.
    e.g. “separator=';'”
  • Using “XML“, define the path to an XML file
    e.g. “E:\migration.xml“
  • Using “File System“, define the path to the directory
    e.g. “E:\importdata“

ADE-DB Parameter:

Transport:e.g. “OCI” for Oracle, “PGSI” for PostgreSQL
Provider:e.g. “SQLOLEDB” for Microsoft SQL Server
Location: For defining the location
Datasource:For defining the data source
Catalog:For defining the name of the database catalogue
Username:Name of the database user
Password:Password of the database user
Connectstring:Additional settings (e.g. “PORT=5432” for PostgreSQL)

Data Source ObjectPermanent link for this heading

This property can be used to reference a Fabasoft object as data source.

Using “Spreadsheet Document”, “XML” or “File System”, data of a Content (Unknown Type) can be imported, which needs the value “xlsx”, “xls”, “sxc”, “ods”, “csv“, “xml” or “zip” in the property File Extension (COOSYSTEM@1.1:contextension).

TablePermanent link for this heading

In this property the name of the table (or the view) is defined, that contains the data to be imported.

Note:

  • Using “OLE DB”, a SELECT statement can be defined.
    e.g. “SELECT * FROM Organization WHERE id>10;“
  • Using “LDAP”, define an LDAP filter.
    e.g. “(objectClass=inetOrgPerson)“
  • Using “Spreadsheet Document”, the file path can be defined.
    e.g. “C:\personen.xlsx“
    Alternatively a Fabasoft Folio object can be selected in the Data Source Object field (see chapter “Data Source Object“).
  • Using “XML”, the path to the desired elements can be defined.
    e.g. “/staff/contactdata/person“
  • Using “File System”, define the extensions considered in the import.
    e.g. “{jpg;png;gif}“

MappingPermanent link for this heading

This property is used to map external data to Fabasoft Folio properties.

  • Column
    In this field enter the name of the database column, the attribute name of the LDAP directory, or using “XML”, a path similar to XPath (see chapter ”XML“).
  • Object Class
    In this field select the object class of the object to be filled with the data of the specified data source column.
  • Object ID
    From a record, several objects of the same object classes can be created. Use this field to distinguish objects of the same object class. For each different object a unique object id is required (integer).
  • Property
    In this field select the property that is to be filled with the data from the specified data source column. For compound properties, a property path can be defined.
    Note: Which properties are available depends on the Object Class. Defining a property path, only the available parts of the compound property can be selected.
  • Value
    This field is used if a property gets a constant value, regardless of the data source values. In this case the Column field is left blank.
  • Key Mode/Update Mode
    In this field, the import behavior is determined.
    • Key
      The value of the specified column is set when Fabasoft objects are created. The key is used for finding objects.
      Following property types can be used as a key: string, integer, enumeration, date and (conditioned) float.
      Following property types cannot be used as a key: field, content, dictionary and object pointer.
      Note: All properties with selection “Key“ are used together as key for unique identification of objects of one object class having the same id.
    • Set if Created
      The value is only set if a new object is created.
    • Update
      The value is set or added to a list. Existing values are not overwritten by blanks.
    • Overwrite
      The value is set or added to a list. Existing values are overwritten by blanks.
    • Ignore
      The line is ignored.
  • Options
    For the values to be imported further conditions can be defined.
    • String List
      This option can be selected to split a string into a list of strings.
    • Aggregate Key
      For aggregates, instead of using the keys defined in the data model, this option can be used to define own key properties.
    • Group by
      This option can be selected to keep certain changes of an object within one transaction. Selecting this option, a variable transaction size is achieved. Usually “Group by” is used for key properties.
    • Content as Value / Filename
      When assigning values to content properties, the value is interpreted as a file name. If there is no matching file, then the value is written into a temporary file and the file is assigned to the property.
      Using “OLE DB” and “ODBC”, values longer than 4000 characters are ​​written to a temporary file and the file name is passed as value.
      This behavior can be changed through the options “Content as Value“ and “Filename”.
      • Content as Value
        The value is interpreted as a string, never as a file name. Using “OLE DB”, values are not written to temporary files but kept in the memory as strings. Values exceeding a size of 500 KB are cut.
      • Filename
        The value is interpreted as file name.
    • Unique Entries in List
      Before adding a value to a list, the list is searched. If the list already contains a certain value, it is not added. Existing double entries are not removed.
      Note: This Option can be used to repeat data imports without producing double entries.
    • Must Be Defined in Aggregate
      An aggregate entry is only produced if the property at which this option is set has a value. (e.g. used for a list of phone numbers).
    • Skip Create Value Method
      The action, defined in the Constructor Action field of the property will not be executed.
    • Skip Set Method
      The action, defined in the Action Called Before Property is Saved field of the property will not be executed.

Object RelationsPermanent link for this heading

In this field, imported objects can be assigned to object pointer properties.

  • Source Object Class
    In this filed select the object class of the source object. The source object is the object you want to assign to a certain object pointer property.
  • S-ID
    In this field enter the object ID defined for this objects in the Mapping field.
  • Target Object Class
    In this field select the object class of the target object. The target object is the object to whose object pointer property the source object shall be assigned.
  • T-ID
    In this field enter the object ID defined for this objects in the Mapping field.
  • Target Property
    In this field, select the object pointer property you want to fill.
  • Update Mode
    In this field, the import behavior is determined.
    • Set if Created
      The value is only set if a new object is created.
    • Update
      The value is set or added to a list. Existing values are not overwritten by blanks.
    • Overwrite
      The value is set or added to a list. Existing values are overwritten by blanks.
    • Ignore
      The line is ignored.
  • Options
    For the values to be imported further conditions can be defined.
    • Aggregate Key
      For aggregates, instead of using the keys defined in the data model, this option can be used to define own key properties.
    • Unique Entries in List
      Before adding a value to a list, the list is searched. If the list already contains a certain value, it is not added. Existing double entries are not removed.
      Note: This Option can be used to repeat data imports without producing double entries.
    • Must Be Defined in Aggregate
      An aggregate entry is only produced if the property at which this option is set has a value. (e.g. used for a list of phone numbers).
    • Skip Create Value Method
      The action defined in the Constructor Action field of the property will not be executed.
    • Skip Set Method
      The action defined in the Action Called Before Property is Saved field of the property will not be executed.

Class PropertiesPermanent link for this heading

This field is used to control the behavior of creating and updating objects of certain object classes. To search for objects different methods are available.

  • Object Class
    Create an entry for each object class to be imported. If for an object class different IDs are used, create an entry for each ID of the object class.
  • ID
    Enter the ID assigned to the object classes in the Mapping field.
  • Create Objects
    Select this option to create objects of that object class that do not yet exist in Fabasoft Folio.
  • Avoid Duplicate Objects
    To avoid creating duplicate objects, different methods are available. They are described in chapter “Avoide Duplicate Objects“.
    • “No Check of Existing Objects”
    • “Check by Collecting Keys of All Objects“
    • ”Check by Query for Each Object“
    • “Check by Query for Each Object (No Cache)“
    • “Check by Hash Table Query“
    • “Check by Local Query“
    • “Use Same Object as Lower Object ID“
  • Query Scope
    In this field, a Query Scope object can be used for restricting the search. This allows improving the performance of the search.
  • Options
    These options can be used to prevent running certain actions for objects of the selected object class.
    • “Skip Prepare Commit Method“
      The action COOSYSTEM@1.1:ObjectPrepareCommit will not be executed.
    • “Skip Finalize Commit Method“
      The action COOSYSTEM@1.1:ObjectFinalizeCommit will not be executed.
    • “Skip Committed Method“
      The action COOSYSTEM@1.1:ObjectCommitted will not be executed.
    • “Skip Object Constructor Method“
      The action COOSYSTEM@1.1:ObjectConstructor will not be executed.

Avoide Duplicate ObjectsPermanent link for this heading

This section describes the various configuration options in property Avoid Duplicate Objects.

No Check of Existing Objects

This setting is recommended if the import process does not create objects that already exist in the Fabasoft product environment. A key property is not required. Objects are always created (even if Create Objects is set to "No"), except when key properties (one or more) are defined, but these are all empty.

Advantages:

  • Quick
  • Simple

Disadvantages:

  • No check for duplicate objects
  • Difficult error handling

Check by Collecting Keys of All Objects

In an initialization, a local search tree (Cache) is set up in which the key values and the object address of all objects of the object class are stored.

During the import, these keys are searched for a matching object. If no matching object is found, a new object is created and the key also is added to the list of existing objects. This method is useful if many changes are carried out, references are placed on objects or new objects are created and the number of existing objects is not too extensive.

Advantages:

  • Quick search in the local search tree

Disadvantages:

  • Initialization needs much time
  • High memory usage
  • Problems, if objects of many object classes shall be collected

Check by Query for Each Object

For each imported object, a search is performed, which checks whether the object already exists. The keys of found objects are stored locally and thus quickly found again. This method is much slower in finding than the method described above. It is particularly well suited when only few objects of a class are accessed. Moreover, less memory is needed.

Advantages:

  • No initialization

Disadvantages:

  • Slow search per record

Check by Query for Each Object (No Cache)

This option corresponds to the option "Check by Query for Each Object", but keys of found objects are not stored. It makes sense when objects are not used more than once or the memory consumption would be too great.

Advantages:

  • No initialization
  • Low memory usage

Disadvantages:

  • Slow search per record
  • No cache for already found keys

Check by Hash Table Query

The object address is calculated from the key value using a hash algorithm.

Advantages:

  • In special cases more efficient

Disadvantages:

  • Creating objects only possible via COLD or scripts
  • Only makes sense for a limited number of objects

Check by Local Query

The keys of the objects that are created by the current import are stored. Not the server is searched for objects but the stored keys. This method is suitable if the keys of the imported data are not already in use for existing objects.

Advantages:

  • No initialization
  • No search per record
  • Quick search in the local search tree

Disadvantages:

  • Existing objects are not found

Use Same Object as Lower Object ID

This method can only be used in combination with another search method.

Using this option it is possible to make several list entries (based on one record) within one object, e.g. several phone numbers of a person.

This requires defining the keys and the search method for the object class. Further list entries can be defined using a higher object ID and selecting the search method “Use Same Object as Lower Object ID“.

Digression: Selecting the Search MethodPermanent link for this heading

The following graphic may serve as a decision aid for selecting the search method, but for each loader the right method must be verified.

Search via object address

If the object address of an object is known, it can be used as a key. Through that the performance is very high.

To do so, before the import static object classes should be exported and it should be possible to get the object address in a JOIN on the database using the key properties.

Search for the reference

The reference name of component objects can be used for searching. Thereby two possibilities are available: It is possible to use the reference including the software component or, if the reference is unique, it can also be used without software component. To use this option, for the Mapping the property Reference (COOSYSTEM@1.1:reference) has to be selected.

Get methods for key properties

Get methods for key properties can have the consequence that the value in the database does not match the value in the object property. So the query for each object will have another result than the check by collecting keys of all objects.

Set methods for key properties

If the value of a key property is changed by a set method, this change is not considered in the local search tree, whereby the search can get inconsistent.

Optimization

It is advantageous to use only one key property per object class. Using search methods that search per object, in the database this property should be provided with an index on the value.

Particularly suitable as an index are External Key (COOSYSTEM@1.1:objexternalkey) and Subject (COOSYSTEM@1.1:objsubject). The property External Key has been defined for this purpose, but has the disadvantage that it is not efficiently indexed. The property Subject is displayed in many forms and is only suitable if the key values contain “readable” information. This property is after definition of an index very efficient in finding.

Number of Records for CommitPermanent link for this heading

Use this property to determine the maximum number (and using the group change also the minimum number) of records to be processed in one transaction.

Some actions when loading objects can be performed on multiple objects (for example, the "commit", which writes the data to the database), whereby the number of communication steps is greatly reduced. However, very large transactions involve the risk that they collide with other transactions, and thus take longer than several small transactions. In many cases the default value of 150 is a value that leads to a good result.

Number of ThreadsPermanent link for this heading

Use this field to define the number of threads that are used by the client to generate objects. By multiple threads, it is possible to parallelize the processing. Depending on the type of import (e.g. creating or modifying objects) and, depending on the performance of individual components, a large increase in performance can be achieved. On the other hand, a certain synchronization effort is connected with the use of parallel transactions, so that an optimum value mostly is found in a small number (less than 5). The optimum value should be determined experimentally for the particular application.

The use of numerator properties prevents the use of multiple threads, because numerators remain locked until the end of the "Commit" and therefore allow no more parallelism. The restriction applies only when new objects of that class may be created (hence if Create Objects is set to "Yes" or "Undefined").

First RecordPermanent link for this heading

In this property, specify the record number that is applied in the database to the record from which starts the loading process.

Number of Records to Be ReadPermanent link for this heading

Use this property to determine the number of records to be loaded. Otherwise all records are loaded.

Logging ModePermanent link for this heading

Use this property to determine whether a log is created or not and which information of each import process is logged.

Available options:

  • “No Log“
    No log is created.  
  • “Log Errors“
    Only errors (and their raw data) are logged.
  • “Full Log“
    All raw data, the addresses of the created objects and error messages are logged.

Log ObjectPermanent link for this heading

The created logs are written to a Data Import Log object which is assigned to this field.

Maximum Number of Entries in LogPermanent link for this heading

Use this property to determine the maximum number of logs in the log object (Data Import Log).

Note: If the maximum number of logs is exceeded the eldest log is deleted.

Skip Methods Without Check?Permanent link for this heading

Use this property to determine whether a review action is called, which determines whether object classes and attributes match the specified object classes and attributes.

Skip Setting Modification PropertiesPermanent link for this heading

Use this property to determine whether the properties Last Change on/at and Last Change by are updated for the objects that are changed during the data import.

Abort Data Import Immediately on ErrorPermanent link for this heading

Use this property to determine whether the data import is aborted if an error occurs.

Skip Object RefreshPermanent link for this heading

Use this property to determine whether objects are updated during import or not. If it is ensured that problems such as inconsistent caches cannot occur, the data import can be accelerated using this option (e.g. for the initial data import).

FilterPermanent link for this heading

On the “Filter” tab of the data import object, filters can be configured. The available properties are described in the following chapters.

Filter Expression for RAW DataPermanent link for this heading

This property defines an expression, which is called for each record of the database. The expression can be used to modify raw data. More information about the parameters contained in the global scope can be found in the reference documentation.

For modifying raw data, a Fabasoft app.ducx expression can be used.

The modification of raw data can include

  • the calculation of values
  • the formatting and
  • filtering.

Filter Expression for ObjectsPermanent link for this heading

This property defines an expression, which is called for each handled record after all properties from the current record have been set and before the transaction is committed. More information about the parameters contained in the global scope can be found in the reference documentation.

  • The modification of objects can include
  • versioning,
  • setting additional properties and
  • key numerators.

Filter Expression for Objects After CommitPermanent link for this heading

This property defines an expression, which is called for each data record after the transaction has been committed. More information about the parameters contained in the global scope can be found in the reference documentation.

Digression: NumeratorsPermanent link for this heading

The value of numerators is calculated by default from the property constructor. This is done as follows: The numerator is locked in a separate transaction, subsequently the value is increased and finally the process is finished by a "commit".

Using Fabasoft Folio/COLD the following possibilities are available for handling numerators:

Automatically

  • The numerator is locked in the COLD transaction.
  • All objects within the transaction use the numerator.
  • COLD transaction commit.

Note: The import can be done using only one thread.

Manually

  • The property constructor is ignored.
  • The value is set explicitly.
  • In an own step the numerator object is updated using the action NUMERATOR@1.1001:UpdateValue.