Generic (customizable)

This generic import format can be used to import feature and observation data for any feature type. It can read a wide range of tabular data files: you define which column of the source file contains which information, along with many other settings. This gives you great flexibility to adapt the import to the structure of a given source file.

The file format expected by the generic import type is a text file structured in lines and columns. The import type tries to detect the encoding of the source files automatically. Listing 1 shows the supported encodings.

A line in the source file must be terminated by a line feed character (U+000A, UTF-8: 0x0A, typical for Unix, Linux, Android, Mac OS X, BSD, and other operating systems), a carriage return (U+000D, UTF-8: 0x0D, typical for Mac OS up to version 9 and other operating systems), or a carriage return immediately followed by a line feed (typical for Windows operating systems).

Leading and trailing white space characters (see table 1) are automatically trimmed from each line. If the column separator is set to tab (U+0009), tabs are not trimmed. The first character of a line is defined as its first non-white space character, the last character as its last non-white space character.

Lines whose first characters (after trimming) equal the user-defined comment designator (see table 2) are treated as comment lines and are ignored. If a comment designator occurs in a line but not at its start, it is treated as data.

UTF-8

UTF-16 (BE and LE)

UTF-32 (BE and LE)

windows-1252 (mostly equivalent to ISO-8859-1)

windows-1251 and ISO-8859-5 (Cyrillic)

windows-1253 and ISO-8859-7 (Greek)

windows-1255 (logical Hebrew; includes ISO-8859-8-I and most of x-mac-hebrew)

ISO-8859-8 (visual Hebrew)

Big-5

gb18030 (superset of gb2312)

HZ-GB-2312

Shift-JIS

EUC-KR, EUC-JP, EUC-TW

ISO-2022-JP, ISO-2022-KR, ISO-2022-CN

KOI8-R

x-mac-cyrillic

IBM855 and IBM866

X-ISO-10646-UCS-4-3412 and X-ISO-10646-UCS-4-2413 (unusual BOM)

ASCII

Listing 1: Encodings supported by this import format
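
How the import type detects encodings internally is not described here. As a rough illustration of one ingredient that such detection commonly relies on, the following Python sketch checks a file's first bytes for a byte order mark (BOM). The function names sniff_bom and decode_with_bom and the UTF-8 fallback are assumptions made for this example; files without a BOM require statistical detection, which is not shown.

```python
import codecs

# Longer marks are tested first: the UTF-32 LE BOM (FF FE 00 00)
# begins with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return (encoding, bom_length) if data starts with a known BOM, else None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, len(bom)
    return None


def decode_with_bom(data: bytes, fallback: str = "utf-8") -> str:
    """Decode data using its BOM, or the (assumed) fallback encoding if none is present."""
    hit = sniff_bom(data)
    if hit is None:
        return data.decode(fallback)
    encoding, skip = hit
    return data[skip:].decode(encoding)
```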

Empty lines (i.e. lines that only contain white space characters) are ignored.

Each non-comment and non-empty line is separated into cells by the user-defined column separator character (see table 2).
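
The line handling described above can be sketched in Python as follows. This is a minimal illustration, not the Application Server's implementation: the function name iter_data_rows, the default separator and comment designator, and the abbreviated white space set are assumptions made for the example.

```python
import re

# Line terminators named in the text: CR LF, LF alone, or CR alone.
_LINE_BREAK = re.compile(r"\r\n|\n|\r")

# Abbreviated subset of the white space characters in table 1 (assumption:
# enough for illustration; the full set is longer).
_WHITESPACE = " \t\x0b\x0c\u0085\u00a0"


def iter_data_rows(text, separator=";", comment="#"):
    """Yield the list of cells for every line that is neither empty nor a comment."""
    # Tabs are not trimmed when the tab is the column separator.
    trim_chars = _WHITESPACE.replace("\t", "") if separator == "\t" else _WHITESPACE
    for raw in _LINE_BREAK.split(text):
        # Lines that contain only white space are ignored.
        if not raw.strip(_WHITESPACE):
            continue
        line = raw.strip(trim_chars)
        # Only a designator at the start of the trimmed line marks a comment.
        if comment and line.startswith(comment):
            continue
        yield line.split(separator)
```

For example, list(iter_data_rows("# note\r\n a;b;c \n\n")) yields a single row with the cells a, b and c.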

Class

Members

Space Separators

SPACE (U+0020), OGHAM SPACE MARK (U+1680), MONGOLIAN VOWEL SEPARATOR (U+180E), EN QUAD (U+2000), EM QUAD (U+2001), EN SPACE (U+2002), EM SPACE (U+2003), THREE-PER-EM SPACE (U+2004), FOUR-PER-EM SPACE (U+2005), SIX-PER-EM SPACE (U+2006), FIGURE SPACE (U+2007), PUNCTUATION SPACE (U+2008), THIN SPACE (U+2009), HAIR SPACE (U+200A), NARROW NO-BREAK SPACE (U+202F), MEDIUM MATHEMATICAL SPACE (U+205F), and IDEOGRAPHIC SPACE (U+3000)

Line Separator

LINE SEPARATOR character (U+2028)

Other

CHARACTER TABULATION (U+0009), LINE FEED (U+000A), LINE TABULATION (U+000B), FORM FEED (U+000C), CARRIAGE RETURN (U+000D), NEXT LINE (U+0085), and NO-BREAK SPACE (U+00A0).

Table 1: White space characters that are removed from the beginning or end of a string when a string is trimmed
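
To reproduce this trimming outside the application, the code points of table 1 can be passed explicitly to a strip operation. The Python sketch below does so; the names TABLE_1_WHITESPACE and trim are hypothetical, and the keep_tabs flag models the rule that tabs are not trimmed when the tab is the column separator. An explicit set is used because a language's built-in notion of white space may differ from the one listed in table 1.

```python
# Code points listed in table 1.
TABLE_1_WHITESPACE = (
    "\u0020\u1680\u180e"
    "\u2000\u2001\u2002\u2003\u2004\u2005\u2006\u2007\u2008\u2009\u200a"
    "\u202f\u205f\u3000"                          # space separators
    "\u2028"                                      # line separator
    "\u0009\u000a\u000b\u000c\u000d\u0085\u00a0"  # other
)


def trim(value: str, keep_tabs: bool = False) -> str:
    """Remove the table 1 characters from both ends of value."""
    chars = TABLE_1_WHITESPACE.replace("\t", "") if keep_tabs else TABLE_1_WHITESPACE
    return value.strip(chars)
```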

Setting

Description

General

True value text

If the string in a cell that is treated as a Boolean cell equals this text, the value of the cell is treated as Boolean true. All other strings are treated as false.

This comparison is case sensitive and is performed on the trimmed (see table 1) value of the cell.

Null value

If the string in a cell matches this string, the Application Server treats the cell as Null.

This comparison is case sensitive.

Column separator

This specifies the character used in the source files to delimit columns. You can choose from a list of predefined characters.

Merge multiple separators

If this flag is not set, each column separator character instructs the Application Server to skip to the next column. If this flag is set, consecutive column separator characters are treated as a single separator character.

Number of header lines

This defines the number of lines at the beginning of the file that are treated as header lines and are therefore not parsed.

Comment designator

If a line in the source file starts with this text, it is treated as a comment line, i.e. it is not parsed by the import.

Column mapping

This setting allows you to define the column mapping, i.e. the association of columns in the source file with specific properties of the feature or observation to be imported.

Localization

Settings that are used to handle localized data.

Date time format

The pattern that is used to parse date time values given as strings in the source files. See Date-time formatting for more information on date time patterns. Note that the Culture setting (see below) is also used to interpret the date time pattern defined here.

Culture

The culture used for parsing date time values (see also above), numeric values etc.

Table 2: Import type specific settings
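
The effect of several settings in table 2 can be illustrated with a short Python sketch. All function names are hypothetical. Whether the Null value comparison is applied to the trimmed or the untrimmed cell is not stated above, so the sketch assumes the untrimmed cell; it also uses Python's built-in strip() as an approximation of the trimming defined in table 1.

```python
import re


def skip_header_lines(lines, header_count):
    """Drop the first header_count lines; they are not parsed."""
    return lines[header_count:]


def split_cells(line, separator, merge_separators=False):
    """Split a line into cells, optionally merging runs of separators."""
    if merge_separators:
        # Consecutive separators count as a single separator.
        return re.split(re.escape(separator) + "+", line)
    # Every separator advances to the next column, so empty cells are kept.
    return line.split(separator)


def to_bool(cell, true_text):
    """Case-sensitive comparison on the trimmed cell; anything else is False."""
    return cell.strip() == true_text


def is_null(cell, null_text):
    """Case-sensitive comparison (assumption: on the untrimmed cell)."""
    return cell == null_text
```

With the separator ";" the line "a;;b" produces three cells (the middle one empty) when Merge multiple separators is off, and two cells when it is on.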

Column Mapping

The flexibility of this import type rests mainly on the ability to map columns in the source files to specific properties or other characteristics of the feature or observation to be imported. The column mapping is part of the import type's specific settings and consists of some general settings (see table 3) and a series of column associations (see table 4).

Setting

Description

Store source files

If enabled, the source files imported with each import session created from this import definition are stored on the Application Server and linked to the import session. This lets you keep the original data files along with your import session.

Default: false

Feature type

Defines the feature type of the data to be imported.

Container feature type

This import format supports the creation of container features and the mapping of features to these container features. This setting defines the feature type of container features to be created.

Feature name resolving

This import type offers several ways to determine the name of the feature for which data is imported:

Fixed feature: Import data for a fixed (see below) feature.

Read from file name: Read feature name from (a part of) the file name. You can use a regular expression to filter out the part of the file name that defines the feature name.

Read from column: Read the feature name from a specific column of the source file. If you select this option, you have to define a column mapping for the feature name.

Feature

This setting is only available if the Feature name resolving is set to Fixed feature.

Defines the feature for which data shall be imported.

Container Feature

This setting is only available if the Feature name resolving is set to Fixed feature.

Defines the container feature for which data shall be imported.

Regular expression for feature name

This setting is only available if the Feature name resolving is set to Read from file name.

Here you specify a regular expression that identifies the part of the file name that contains the feature name. The actual feature name is the match for the first sub-expression of that regular expression.

Example:

Original file name: AB_123-4.csv

Regular expression: AB_([0-9]{3}-[0-9])\.csv

Resulting feature name: 123-4

Regular expression for container feature name

This setting is only available if the Feature name resolving is set to Read from file name. See above for a description of how to use regular expressions for defining the feature name.

Spatial system of feature

Defines the type of spatial system to use for newly created features.

Spatial system of container feature

Defines the type of spatial system to use for newly created container features.

Axis reference system

Only available if Spatial system of feature is set to Axis reference.

Defines the axis reference that shall be used with newly created features.

Table 3: General settings for the column mapping
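
The file name example given for the Regular expression for feature name setting can be reproduced with Python's re module, as sketched below. The function name feature_name_from_file is hypothetical, and the application's own regular expression engine and matching mode (searching versus matching the whole file name) may differ from what is assumed here.

```python
import re


def feature_name_from_file(file_name, pattern):
    """Return the match of the first sub-expression, or None if there is none."""
    match = re.search(pattern, file_name)
    if match is None or match.lastindex is None:
        # No match at all, or the pattern contains no sub-expression.
        return None
    return match.group(1)


name = feature_name_from_file("AB_123-4.csv", r"AB_([0-9]{3}-[0-9])\.csv")
print(name)  # 123-4
```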

Setting

Description

Column index in file

This is the 1-based index (meaning the first column has index 1) of the column or columns that are mapped to a single property or characteristic of a feature or observation.

You can define a single column index, or a range or collection of column indices, to be read from the source file and written to the mapped property. Multiple columns are defined by specifying a comma-separated list of column indices or column ranges.

Examples:

1-3, 5 maps columns 1, 2, 3 and 5

7-9 maps columns 7, 8 and 9

1, 2, 3, 4, 7-9 maps columns 1, 2, 3, 4, 7, 8 and 9

Multiple column indices are only respected if you choose Join columns with delimiter as pre-processing type (see below). Otherwise, only the first index is mapped.

Pre-processing type

You can select a pre-processing step that is performed before the source string is treated as a value. The following pre-processing types are available:

None: No pre-processing will be performed

Join columns with delimiter: Multiple columns in the source file are concatenated using the specified delimiter.

Example:

The input file contains the following data:
axis; point_id; datetime; value
*KW23*; *4*; 2015-07-23 09:34; -32,43

The expected output is:
name: *KW23-4*
date: 2015-07-23 09:34
value: -32,43

The import can then be configured as follows:
Column index: 1,2
Preprocessing type: Join columns with delimiter

Post-processing type

Post-processing takes the result of the pre-processing as input. If no pre-processing is configured, it takes the content of a single column.

The output of the post-processing is then mapped to the property of the feature. The following post-processing types are available:

None: No post processing is performed.

Split and take word: Splits the value at the given split character and takes the word at the given (1-based) index.
This type needs two settings: Split character (the character[s] used to split the text) and Word number to take (which word to take after the split). Example: if the input text is 'KW23-4' and the user wants to extract "KW23" and map it to a property, the post-processing definition should be: Split characters: -, Word number to take: 1.

Skip from beginning and end: Defines how many characters are ignored at the beginning and at the end of the input text. This type has two settings: Characters to skip at the beginning and Characters to skip at the end. Example: the post-processor gets the text "abc4957xyz" as input, and the user wants to extract the numeric part, i.e. 4957, and map it to a property. The post-processing settings are then: Characters to skip at the beginning: 3, Characters to skip at the end: 3. The output is: 4957

Property unit

If a mapping is defined for a property of property type quantity, you have to choose the unit of the values in the source file.

All imported values will be interpreted as values in this unit.

Table 4: Settings for each mapped column
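
The column index syntax and the pre- and post-processing steps of table 4 can be sketched in Python as shown below. This is an illustration under assumptions, not the import's actual code: all function names are hypothetical, the delimiter "-" is inferred from the expected name KW23-4 in the example above, and the split characters are treated as one delimiter string.

```python
def parse_column_indices(spec):
    """Turn a specification such as '1-3, 5' into [1, 2, 3, 5] (1-based)."""
    indices = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = (int(p) for p in part.split("-", 1))
            indices.extend(range(first, last + 1))
        else:
            indices.append(int(part))
    return indices


def join_columns(cells, indices, delimiter):
    """Pre-processing: concatenate the cells at the given 1-based indices."""
    return delimiter.join(cells[i - 1] for i in indices)


def split_and_take_word(text, split_chars, word_number):
    """Post-processing: split the text and take the word at a 1-based index."""
    return text.split(split_chars)[word_number - 1]


def skip_from_beginning_and_end(text, skip_start, skip_end):
    """Post-processing: drop characters from both ends of the text."""
    return text[skip_start:len(text) - skip_end]


cells = ["KW23", "4", "2015-07-23 09:34", "-32,43"]
name = join_columns(cells, parse_column_indices("1,2"), "-")
print(name)                                              # KW23-4
print(split_and_take_word(name, "-", 1))                 # KW23
print(skip_from_beginning_and_end("abc4957xyz", 3, 3))   # 4957
```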

Container relationship import

This import type also supports container relationships. You can map the name of a container feature to one of the columns. In this case, each feature created during the import is linked to the container feature whose name is given in the corresponding row. You can also specify the spatial system type and spatial reference system of container features. Container features that do not exist yet are created.

© 2021 AFRY Austria GmbH, www.redbex.com