Redbex Reference Manual - Data condensation: Simple

Data condensation: Simple

The aim of data condensation is to create a set of observations that is significantly smaller than the observation set of the source feature but is still a reasonable representation of the real-world phenomenon. Condensing data can be a good way to obtain a reasonably sized set of observations that can be examined manually, presented graphically, or processed further with reasonable performance.

The Data condensation: Simple calculation type provides very simple condensation algorithms. They do not find a reasonable representation of the real-world phenomenon in all cases, but they are fast to process and prove useful if you can make certain assumptions about the characteristics of your source data.

Characteristic	Description
Supports incremental execution	Yes
Output typing	Implicit by pin, derived from feature connected to input pin 1
Locking of source features	Observation modification, observation deletion
Spatial data handling	Copy

Table 1: Calculation brief

No	Name	Type, Constraint	Multiplicity (Min,Max)
1	Feature to process	Features or calculations	1,1

Table 2: Input pins

Configuration	Type	Notes	Default value
Include erroneous	Boolean	If set to true, erroneous observations of the source feature or calculation will be processed.	False
Condensation algorithm	Enumeration	Options are: Each n-th, One per interval.	100
Include last observation	Boolean	Setting for the Each n-th condensation algorithm. If true, the last observation of the source feature or calculation is always added to the output.	False
n-th	Numeric	Setting for the Each n-th condensation algorithm. Defines which observations of the source feature are picked. Must be an integer greater than or equal to 1.	10
Start timestamp	Date & time	Setting for the One per interval condensation algorithm. Defines the start time from which intervals are calculated. Can be left undefined.
Time interval	Time span	Setting for the One per interval condensation algorithm. Defines the time span. Must be a time span greater than 0.	24:00:00

Table 3: Configuration settings

If the calculation is the final calculation of the algorithm, the classifications used by the source features or calculations must be the classification used in the domain of the calculated feature. If this is not the case, the calculation fails.

Each n-th Algorithm

This algorithm simply takes each n-th observation of the source feature or calculation. Figure 1 shows a formal specification of the algorithm.

The configuration setting n-th influences the condensation ratio, defined as NumberOfGeneratedObservations / NumberOfOriginalObservations. Higher values for n-th produce smaller condensation ratios. If the configuration setting Include last observation is set to true, the last observation of the source feature or calculation is always added to the output.

Figure 2 shows an example output of the Each n-th algorithm.

The Each n-th algorithm is useful when dealing with observations recorded at a high and regular sampling rate, with little change in the observed property values, and when the exact time at which a property value changed is not important.

Figure 1: Selection of output observations by the Each n-th Algorithm.

Figure 2: Example of Each n-th algorithm
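
The selection rule and the condensation ratio can be illustrated with a minimal Python sketch. It is not the Redbex implementation: the function names each_nth and condensation_ratio are hypothetical, and the sketch assumes that observations are ordered by sampling timestamp and that the observations at positions n, 2n, 3n, ... (counted from 1) are picked. The formal specification in Figure 1 may count differently.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def each_nth(observations: Sequence[T], n: int, include_last: bool = False) -> List[T]:
    """Pick the observations at positions n, 2n, 3n, ... (counted from 1).

    Assumption: 'observations' is already ordered by sampling timestamp.
    """
    if n < 1:
        raise ValueError("n-th must be an integer greater than or equal to 1")
    # Start at index n - 1 and step by n: positions n, 2n, 3n, ...
    picked = list(observations[n - 1::n])
    # Append the last observation only if the stride did not already pick it.
    if include_last and len(observations) % n != 0:
        picked.append(observations[-1])
    return picked


def condensation_ratio(generated: int, original: int) -> float:
    """NumberOfGeneratedObservations / NumberOfOriginalObservations."""
    return generated / original


source = list(range(1, 26))            # 25 observations, numbered 1..25
output = each_nth(source, n=10)        # positions 10 and 20
ratio = condensation_ratio(len(output), len(source))
```

Under these assumptions, 25 source observations and n-th = 10 yield 2 output observations, a condensation ratio of 0.08. With Include last observation enabled, observation 25 is appended as a third output observation.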

One per interval Algorithm

This algorithm chooses only one observation per interval. The intervals are defined by the Start timestamp and Time interval configuration settings, using the algorithm shown in Figure 3. Figure 4 shows which observation is picked from all the observations found in an interval.

This algorithm is useful if you have observations recorded at a high and irregular sampling rate, with little change in the observed property values, and if the exact time at which a property value changed is not important.

Figure 3: Specification of the intervals for the One per interval algorithm
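
As a rough illustration of how a start timestamp and a time interval can partition the time axis, the following Python sketch assigns a sampling timestamp to an interval. It is an assumption, not the specification from Figure 3: the names interval_index and interval_bounds are hypothetical, the intervals are assumed to be half-open ranges [start + k * interval, start + (k + 1) * interval), and the sketch requires an explicit start timestamp. How Redbex behaves when Start timestamp is left undefined, or how it treats observations before the start timestamp, is not covered here.

```python
from datetime import datetime, timedelta
from typing import Tuple


def interval_index(timestamp: datetime, start: datetime, interval: timedelta) -> int:
    """Return k such that start + k*interval <= timestamp < start + (k+1)*interval."""
    if interval <= timedelta(0):
        raise ValueError("Time interval must be greater than 0")
    # Floor division of two timedeltas yields an int (negative before 'start').
    return (timestamp - start) // interval


def interval_bounds(index: int, start: datetime, interval: timedelta) -> Tuple[datetime, datetime]:
    """Return the inclusive lower and exclusive upper bound of interval 'index'."""
    lower = start + index * interval
    return lower, lower + interval


start = datetime(2024, 1, 1)
day = timedelta(hours=24)              # matches the default Time interval 24:00:00
k = interval_index(datetime(2024, 1, 2, 8, 30), start, day)
bounds = interval_bounds(k, start, day)
```

In this sketch a timestamp of 2024-01-02 08:30 falls into interval 1, which spans 2024-01-02 00:00 (inclusive) to 2024-01-03 00:00 (exclusive).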

Figure 4: Picking one observation per interval

Examples for picking an observation from an interval:

•If the interval contains observations with sampling timestamps {t1, t2, t3, t4, t5}, t3 will be chosen.

•If the interval contains observations with sampling timestamps {t1, t2, t3, t4}, t2 will be chosen.

•If the interval contains observations with sampling timestamps {t1, t2}, t1 will be chosen.

•If the interval contains observations with sampling timestamps {t1}, t1 will be chosen.
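
The four examples above are consistent with picking the middle observation by position and, for an even count, the earlier of the two middle observations. The following Python sketch expresses that rule. The function name pick_from_interval is hypothetical, and the rule is inferred from the examples only; the definition in the figure may differ, for instance if it is based on the distance in time to the middle of the interval.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def pick_from_interval(timestamps: Sequence[T]) -> T:
    """Pick the lower-median element of the timestamps found in one interval."""
    if len(timestamps) == 0:
        raise ValueError("the interval contains no observations")
    ordered: List[T] = sorted(timestamps)
    # 0-based index (count - 1) // 2: 5 -> 2, 4 -> 1, 2 -> 0, 1 -> 0
    return ordered[(len(ordered) - 1) // 2]


# Timestamps are stood in for by the labels used in the examples above.
picks = [
    pick_from_interval(["t1", "t2", "t3", "t4", "t5"]),
    pick_from_interval(["t1", "t2", "t3", "t4"]),
    pick_from_interval(["t1", "t2"]),
    pick_from_interval(["t1"]),
]
```

The list picks then holds t3, t2, t1 and t1, matching the four documented cases.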