This import format is a generic format that can be used to import feature and observation data for any feature type. It can read data from a wide range of tabular-structured data files by letting you define which column in the source file contains which information, along with many other settings. This import type therefore gives you great flexibility to adapt to the structure of a given source file.
The file format expected by the generic import type is a text file structured in lines and columns. The import type tries to detect the encoding of the source files automatically. Listing 1 shows the supported encodings.
A line in the source file must be terminated by a line feed (U+000A, UTF-8: 0x0A; typical for Unix, Linux, Android, Mac OS X, BSD, and other operating systems), a carriage return (U+000D, UTF-8: 0x0D; typical for Mac OS up to version 9 and other operating systems), or a carriage return immediately followed by a line feed (typical for Windows operating systems).
Leading and trailing white space characters (see table 1) are automatically trimmed from each line. If the column separator is set to tab (U+0009), tabs are not trimmed. The first character of a line is defined as the first non-white space character, the last character as the last non-white space character.
Lines that (after trimming) start with the user-defined comment designator (see table 2) are treated as comment lines and are ignored. If a comment designator appears in a line but not at its beginning, the line is treated as data.
• UTF-8
• UTF-16 (BE and LE)
• UTF-32 (BE and LE)
• windows-1252 (mostly equivalent to ISO-8859-1)
• windows-1251 and ISO-8859-5 (Cyrillic)
• windows-1253 and ISO-8859-7 (Greek)
• windows-1255 (logical Hebrew; includes ISO-8859-8-I and most of x-mac-hebrew)
• ISO-8859-8 (visual Hebrew)
• Big-5
• gb18030 (superset of gb2312)
• HZ-GB-2312
• Shift-JIS
• EUC-KR, EUC-JP, EUC-TW
• ISO-2022-JP, ISO-2022-KR, ISO-2022-CN
• KOI8-R
• x-mac-cyrillic
• IBM855 and IBM866
• X-ISO-10646-UCS-4-3412 and X-ISO-10646-UCS-4-2413 (unusual BOM)
• ASCII

Listing 1: Encodings supported by this import format
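Several of the Unicode encodings in listing 1 can be recognized by their byte order mark (BOM) alone. As a rough illustration, a BOM check could be sketched as follows; this is an assumption for illustration only, since the import's actual detector also uses statistical heuristics for BOM-less files, which this sketch does not reproduce (it just falls back to a single-byte encoding).

```python
import codecs

# BOMs must be checked longest-first: the UTF-32 LE BOM (FF FE 00 00)
# begins with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]

def detect_encoding(data: bytes, fallback: str = "windows-1252") -> str:
    """Return the encoding indicated by a BOM, or a fallback (assumed here)."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return fallback  # assumption: BOM-less files need heuristics instead

print(detect_encoding(codecs.BOM_UTF8 + b"a;b;c"))  # utf-8-sig
print(detect_encoding(b"plain ascii"))              # windows-1252
```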
Empty lines (i.e. lines that only contain white space characters) are ignored.
Each non-comment and non-empty line is separated into cells by the user-defined column separator character (see table 2).
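The line-handling rules above (line terminators, trimming, comment lines, empty lines, cell splitting) can be sketched in a few lines of Python. The function name and the separator/designator values are hypothetical placeholders for the user-defined settings, not the Application Server's actual implementation.

```python
COMMENT_DESIGNATOR = "#"  # assumption: user-defined setting (see table 2)
COLUMN_SEPARATOR = ";"    # assumption: user-defined setting (see table 2)

def parse_lines(text: str) -> list[list[str]]:
    rows = []
    # splitlines() accepts LF, CR, and CRLF terminators alike
    for raw in text.splitlines():
        line = raw.strip()            # trim leading/trailing white space
        if not line:                  # empty lines are ignored
            continue
        if line.startswith(COMMENT_DESIGNATOR):  # comment lines are ignored
            continue
        rows.append(line.split(COLUMN_SEPARATOR))
    return rows

print(parse_lines("# header comment\r\n a;b;c \n\nd;e;f"))
# [['a', 'b', 'c'], ['d', 'e', 'f']]
```

Note that `str.strip()` would also trim tabs; per the rule above, a real implementation must exclude the tab character from trimming when tab is chosen as the column separator.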
| Class | Members |
| --- | --- |
| Space Separators | SPACE (U+0020), OGHAM SPACE MARK (U+1680), MONGOLIAN VOWEL SEPARATOR (U+180E), EN QUAD (U+2000), EM QUAD (U+2001), EN SPACE (U+2002), EM SPACE (U+2003), THREE-PER-EM SPACE (U+2004), FOUR-PER-EM SPACE (U+2005), SIX-PER-EM SPACE (U+2006), FIGURE SPACE (U+2007), PUNCTUATION SPACE (U+2008), THIN SPACE (U+2009), HAIR SPACE (U+200A), NARROW NO-BREAK SPACE (U+202F), MEDIUM MATHEMATICAL SPACE (U+205F), and IDEOGRAPHIC SPACE (U+3000) |
| Line Separator | LINE SEPARATOR (U+2028) |
| Other | CHARACTER TABULATION (U+0009), LINE FEED (U+000A), LINE TABULATION (U+000B), FORM FEED (U+000C), CARRIAGE RETURN (U+000D), NEXT LINE (U+0085), and NO-BREAK SPACE (U+00A0) |

Table 1: White space characters that are removed from the beginning and end of a string when it is trimmed
| Setting | Description |
| --- | --- |
| General | |
| True value text | If the trimmed (see table 1) string in a cell treated as a Boolean cell equals this text, the value is treated as Boolean true. All other strings are treated as false. The comparison is case sensitive. |
| Null value | If the string in a cell matches this string, the Application Server treats the cell as null. The comparison is case sensitive. |
| Column separator | The character used in the source files to delimit columns. You can choose from a list of predefined characters. |
| Merge multiple separators | If this flag is not set, each column separator character advances the Application Server to the next column. If it is set, consecutive column separator characters are treated as a single separator character. |
| Number of header lines | The number of lines at the beginning of the file that are treated as header lines and are therefore not parsed. |
| Comment designator | If a line in the source file starts with this text, it is treated as a comment line, i.e. it is not parsed by the import. |
| Column mapping | Defines the column mapping, i.e. the association of columns in the source file with specific properties of the feature or observation to be imported. |
| Localization | Settings that are used to handle localized data. |
| Date time format | The pattern used to parse date-time values given as strings in the source files. See Date-time formatting for more information on date-time patterns. Note that the Culture setting (see below) is also used for the interpretation of the date-time pattern defined here. |
| Culture | The culture used for parsing date-time values (see above), numeric values, etc. |

Table 2: Import type specific settings
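The interaction of the Merge multiple separators, True value text, and Null value settings can be illustrated with a small sketch. The setting values and function names below are hypothetical; they only demonstrate the behavior described in table 2.

```python
import re

# Assumed example values for the table 2 settings:
SEPARATOR = " "          # column separator
MERGE_SEPARATORS = True  # "Merge multiple separators" flag
TRUE_VALUE = "yes"       # "True value text"
NULL_VALUE = "NA"        # "Null value"

def split_cells(line: str) -> list[str]:
    if MERGE_SEPARATORS:
        # consecutive separator characters act as a single separator
        return re.split(re.escape(SEPARATOR) + "+", line)
    return line.split(SEPARATOR)  # every separator advances one column

def to_bool(cell: str) -> bool:
    # case-sensitive comparison on the trimmed cell value
    return cell.strip() == TRUE_VALUE

def to_value(cell: str):
    # a cell matching the null value text is treated as null
    return None if cell == NULL_VALUE else cell

cells = split_cells("A1   yes  NA")
print(cells)              # ['A1', 'yes', 'NA']
print(to_bool(cells[1]))  # True
print(to_value(cells[2])) # None
```

Without the merge flag, the same line would yield empty cells between the repeated spaces, which is why the flag matters for space-aligned files.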
Column Mapping
The flexibility of this import type stems mainly from the ability to map columns in the source files to specific properties or other characteristics of the feature or observation to be imported. The column mapping is part of the import type's specific settings and consists of some general settings (see table 3) and a series of column associations (table 4).
| Setting | Description |
| --- | --- |
| Store source files | If enabled, the source files imported with each import session created from this import definition are stored on the Application Server and linked to the import session. This lets you keep the original data files along with your import session. Default: false |
| Feature type | Defines the feature type of the data to be imported. |
| Container feature type | This import format supports the creation of container features and the mapping of features to these container features. This setting defines the feature type of container features to be created. |
| Feature name resolving | Determines how the import detects the name of the feature for which data is imported. • Fixed feature: import data for a fixed (see below) feature. • Read from file name: read the feature name from (a part of) the file name; you can use a regular expression to filter out the part of the file name that defines the feature name. • Read from column: read the feature name from a specific column of the source file; with this option you have to define a column mapping for the feature name. |
| Feature | Only available if Feature name resolving is set to Fixed feature. Defines the feature for which data shall be imported. |
| Container feature | Only available if Feature name resolving is set to Fixed feature. Defines the container feature for which data shall be imported. |
| Regular expression for feature name | Only available if Feature name resolving is set to Read from file name. Specifies a regular expression that defines the part of the file name containing the feature name. The actual feature name is the match of the first sub-expression of that regular expression. Example: original file name AB_123-4.csv, regular expression AB_([0-9]{3}-[0-9])\.csv, resulting feature name 123-4. |
| Regular expression for container feature name | Only available if Feature name resolving is set to Read from file name. See above for a description of how to use regular expressions to define the feature name. |
| Spatial system of feature | Defines the type of spatial system to use for newly created features. |
| Spatial system of container feature | Defines the type of spatial system to use for newly created container features. |
| Axis reference system | Only available if Spatial system of feature is set to Axis reference. Defines the axis reference that shall be used with newly created features. |

Table 3: General settings for the column mapping
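The Read from file name example from table 3 can be reproduced with standard regular expressions: the feature name is the match of the first sub-expression (capture group). Whether the server anchors the pattern to the whole file name is an assumption here; with this pattern, `re.search` would behave the same way.

```python
import re

# Example values from table 3:
pattern = r"AB_([0-9]{3}-[0-9])\.csv"
file_name = "AB_123-4.csv"

match = re.fullmatch(pattern, file_name)
# group(1) is the first sub-expression, i.e. the feature name
feature_name = match.group(1) if match else None
print(feature_name)  # 123-4
```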
| Setting | Description |
| --- | --- |
| Column index in file | The 1-based index (the first column has index 1) of the column(s) mapped to a single property or characteristic of a feature or observation. You can define a single column index, or a range or collection of column indices, to be read from the source file and written to the mapped property. Multiple columns are defined as a comma-separated list of column indices or column ranges. Examples: • 1-3, 5 maps columns 1, 2, 3 and 5 • 7-9 maps columns 7, 8, 9 • 1, 2, 3, 4, 7-9 maps columns 1, 2, 3, 4, 7, 8, 9. Multiple column indices are only respected if you choose Join columns with delimiter as pre-processing type (see below); otherwise only the first index is mapped. |
| Pre-processing type | A pre-processing step performed before the source string is treated as a value. The following pre-processing types are available: • None: no pre-processing is performed. • Join columns with delimiter: multiple columns in the source file are concatenated using the specified delimiter. |
| Post-processing type | Post-processing takes the result of the pre-processing as input and processes the text accordingly; if there is no pre-processing, it simply takes the content of a single column. The output of the post-processing is then mapped to the property of the feature. • None: no post-processing is performed. • Split and take word: split the value with the given split character and take the word at the given (1-based) index. • Skip from beginning and end: ignore the given number of characters at the beginning and at the end of the input text. This processor has two settings: characters to skip at the beginning and characters to skip at the end. Example: the post-processor receives the text "abc4957xyz" and the numeric part 4957 shall be extracted; with characters to skip at the beginning: 3 and characters to skip at the end: 3, the output is 4957. |
| Property unit | If a mapping is done for a property of property type quantity, you have to choose the unit of the values in the source file. All imported values will be interpreted as values in this unit. |

Table 4: Settings for each mapped column
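The index syntax and the two processing steps from table 4 can be sketched as follows. The function names are hypothetical; the behavior mirrors the descriptions above (1-based indices, range expansion, join with delimiter, skip from beginning and end).

```python
def parse_column_spec(spec: str) -> list[int]:
    """Expand a 1-based index list such as '1-3, 5' into [1, 2, 3, 5]."""
    indices = []
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            lo, hi = part.split("-")
            indices.extend(range(int(lo), int(hi) + 1))
        else:
            indices.append(int(part))
    return indices

def join_columns(row: list[str], indices: list[int], delimiter: str) -> str:
    """Pre-processing 'Join columns with delimiter' (indices are 1-based)."""
    return delimiter.join(row[i - 1] for i in indices)

def skip_from_beginning_and_end(text: str, skip_start: int, skip_end: int) -> str:
    """Post-processing 'Skip from beginning and end'."""
    return text[skip_start:len(text) - skip_end]

print(parse_column_spec("1-3, 5"))                      # [1, 2, 3, 5]
print(join_columns(["a", "b", "c", "d"], [1, 3], "_"))  # a_c
print(skip_from_beginning_and_end("abc4957xyz", 3, 3))  # 4957
```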
Container relationship import
This import type also supports container relationships. It is possible to select the name of a container feature as one of the columns. In this case, every feature created during the import is linked to the container feature whose name is given in the corresponding row. It is also possible to specify the spatial system type and spatial reference system of container features. Non-existent container features will be created.