2013-12-29

NullAttributeMapper Use Case: Translate Mesh Code to Geometry

(FME 2014 Beta build 14223)

There is a standard mesh system that divides the whole area of Japan into uniform rectangular areas in a geographic coordinate system.
mesh type | mesh width | mesh height
Primary | 1 degree | 2/3 degrees (40 minutes)
Secondary | 7.5 minutes (Primary width / 8) | 5 minutes (Primary height / 8)
Tertiary | 45 seconds (Secondary width / 10) | 30 seconds (Secondary height / 10)

Every mesh has a unique code that identifies its location; the code is defined by the coordinates (latitude, longitude) of the left-bottom corner of the mesh.
mesh type | code format | definition of code elements
Primary | AABB | AA = North latitude * 1.5, BB = East longitude - 100
Secondary | AABB-CD | C = Row index, D = Column index in the Primary Mesh
Tertiary | AABB-CD-EF | E = Row index, F = Column index in the Secondary Mesh
* Row / Column indexes are 0-based; they start from the south-west (left-bottom) corner of the higher rank mesh.
* Hyphens may be omitted.

For example, the left-bottom coordinate of "5438-23-45" tertiary mesh is:
Lat. = 54 / 1.5 (deg.) + 2 * 5 (min.) + 4 * 30 (sec.) = 36.2 (deg.) = N 36d 12m 00s
Lon. = (38 + 100) (deg.) + 3 * 7.5 (min.) + 5 * 45 (sec.) = 138.4375 (deg.) = E 138d 26m 15s
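
The arithmetic above can be checked outside FME with a few lines of plain Python. This is only a minimal sketch: the function name `mesh_lower_left` and the single combined pattern are my own choices for illustration, and elements that a primary or secondary code does not have are simply treated as 0.

```python
import re

# Hypothetical helper; it only mirrors the arithmetic of the worked example.
MESH_RE = re.compile(r'^(\d{2})(\d{2})(?:-?([0-7])([0-7])(?:-?(\d)(\d))?)?$')

def mesh_lower_left(code):
    """Return (lat, lon) in decimal degrees of the left-bottom corner."""
    m = MESH_RE.match(code)
    if m is None:
        raise ValueError('not a mesh code: %r' % code)
    # Elements missing from a primary / secondary code count as 0.
    aa, bb, c, d, e, f = (int(g) if g else 0 for g in m.groups())
    lat = aa / 1.5 + c * 5.0 / 60.0 + e * 30.0 / 3600.0
    lon = bb + 100.0 + d * 7.5 / 60.0 + f * 45.0 / 3600.0
    return lat, lon

lat, lon = mesh_lower_left('5438-23-45')
```

For "5438-23-45" this gives a latitude of about 36.2 and a longitude of 138.4375, matching the values above.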

The mesh system is a national standard of Japan. It's used frequently in various geographic data processing scenarios, and I often translate mesh codes to geometries with an FME workspace.
The following workflow example creates a rectangular polygon representing a mesh area based on the mesh code. Assume every input feature has a primary, secondary or tertiary mesh code in an attribute named "_mesh_code".

Bookmark 1
The three StringSearchers validate the format of the mesh code and split it into code elements (AA BB [C D [E F]]); the elements are stored in a list attribute named "_mcode{}". Then the AttributeCreator calculates the mesh width and height depending on the mesh type. The mesh type can be determined from the number of code elements: a primary mesh has 2, a secondary mesh has 4 and a tertiary mesh has 6 elements.











Bookmark 2
The AttributeCreator calculates the coordinates (latitude, longitude) of the left-bottom corner of the mesh from the code elements. Here, the number of list elements coming from Bookmark 1 is variable: 2, 4 or 6. So I've placed the NullAttributeMapper beforehand so that the number of elements is always 6 and the previously missing elements have the value 0. This way, the AttributeCreator can always calculate without any error.










The workflow above is a use case of the NullAttributeMapper. I think it can also be used in many cases other than handling <null>.
And I guess many countries have standards for geographic data. I hope such information will be exchanged through various channels.

====
2014-01-02: The number of StringSearchers (Bookmark 1) can be reduced to just one with this regular expression.
-----
^([0-9]{2})([0-9]{2})(-?([0-7])([0-7])(-?([0-9])([0-9]))?)?$
-----
But in that case, since the number of list elements will always be 8, the mesh type cannot be determined from it. Another approach will be necessary instead of the ListElementCounter and the AttributeCreator. Also, the NullAttributeMapper (Bookmark 2) settings should be:
- Map: Selected Attributes
- Selected Attributes: _mcode{3} _mcode{4} _mcode{6} _mcode{7}
- If Attribute Value Is: Empty
- Map To: New Value
- New Value: 0
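
One possible "other approach" is to look at which optional groups actually matched. The sketch below shows the idea with Python's re module rather than the StringSearcher, so it is an illustration only: the function name `split_mesh_code` is hypothetical, and in Python an unmatched group is None, whereas the settings above assume the corresponding list elements are empty.

```python
import re

# The regular expression above; groups 3 and 6 are the optional wrappers.
PATTERN = re.compile(
    r'^([0-9]{2})([0-9]{2})(-?([0-7])([0-7])(-?([0-9])([0-9]))?)?$')

def split_mesh_code(code):
    """Return (mesh_type, [AA, BB, C, D, E, F]), or None for an invalid code."""
    m = PATTERN.match(code)
    if m is None:
        return None
    g = m.groups()  # always 8 entries; unmatched groups are None
    if g[5] is not None:      # the tertiary part matched
        mesh_type = 3
    elif g[2] is not None:    # only the secondary part matched
        mesh_type = 2
    else:
        mesh_type = 1
    # Indexes 3, 4, 6, 7 correspond to _mcode{3}, {4}, {6}, {7}; map empty ones to 0.
    elements = [int(g[i]) if g[i] is not None else 0 for i in (0, 1, 3, 4, 6, 7)]
    return mesh_type, elements
```

For example, `split_mesh_code('5438-23')` returns `(2, [54, 38, 2, 3, 0, 0])`.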
=====

=====
Yes, I like scripting :-)
-----
# Python Example (FME 2014 Beta build 14223)
# 2013-12-30 Updated
import fmeobjects, re

def createMeshPolygon(feature):
    mcode = feature.getAttribute('_mesh_code')
    if mcode is None:
        return

    mcode = str(mcode)
    mtype, matched = None, None
    matchers = [(1, re.compile('^([0-9]{2})([0-9]{2})$')),
                (2, re.compile('^([0-9]{2})([0-9]{2})-?([0-7])([0-7])$')),
                (3, re.compile('^([0-9]{2})([0-9]{2})-?([0-7])([0-7])-?([0-9])([0-9])$'))]
    for t, m in matchers:
        matched = m.match(mcode)
        if matched:
            mtype = t
            break
    if matched:      
        codes = [int(n) for n in matched.groups()]
        for i in range(6 - len(codes)):
            codes.append(0)
         
        w = {1: 1.0, 2: 7.5 / 60.0, 3: 45.0 / 3600.0}
        h = {1: 40.0 / 60.0, 2: 5.0 / 60.0, 3: 30.0 / 3600.0}
        xmin = codes[1] + 100.0 + codes[3] * w[2] + codes[5] * w[3]
        ymin = codes[0] / 1.5 + codes[2] * h[2] + codes[4] * h[3]
        xmax, ymax = xmin + w[mtype], ymin + h[mtype]
     
        bndry = fmeobjects.FMELine([(xmin, ymin), (xmin, ymax), (xmax, ymax), (xmax, ymin)])
        feature.setGeometry(fmeobjects.FMEPolygon(bndry))
        feature.setAttribute('_mesh_type', mtype)
-----
=====
-----
# Tcl Example (FME 2014 Beta build 14223)
# 2013-12-30 Updated
proc createMeshPolygon {} {
  array set formats [list  \
      1 {^([0-9]{2})([0-9]{2})$}  \
      2 {^([0-9]{2})([0-9]{2})-?([0-7])([0-7])$}  \
      3 {^([0-9]{2})([0-9]{2})-?([0-7])([0-7])-?([0-9])([0-9])$}]
  set mcode [FME_GetAttribute "_mesh_code"]
  set mtype 0
  foreach {t} {1 2 3} {
    if {[regexp $formats($t) $mcode m aa bb c d e f]} {
      set mtype $t
      break
    }
  }
  if {0 < $mtype} {
    if {[string length $e] < 1} {
      set e [set f 0]
      if {[string length $c] < 1} {
        set c [set d 0]
      }
    }
    array set w [list 1 {1.0} 2 [expr 7.5 / 60.0] 3 [expr 45 / 3600.0]]
    array set h [list 1 [expr 2 / 3.0] 2 [expr 5 / 60.0] 3 [expr 30 / 3600.0]]
    set xmin [expr $bb + 100.0 + $d * $w(2) + $f * $w(3)]
    set ymin [expr $aa / 1.5 + $c * $h(2) + $e * $h(3)]
    set xmax [expr $xmin + $w($mtype)]
    set ymax [expr $ymin + $h($mtype)]
    foreach {x y} [list $xmin $ymin $xmin $ymax $xmax $ymax $xmax $ymin] {
      FME_Coordinates addCoord $x $y
    }
    FME_Coordinates geomType fme_polygon
    return $mtype
  }
}
-----

2013-12-23

Null in FME 2014: Handling Null with Python / Tcl

Important, 2014-01-29: I heard that Safe is planning to change the implementation of Python API fmeobjects.FMEFeature.getAttribute() method, so that it returns an empty string when specified attribute stores <null>. Currently - FME 2014 build 14234 - it returns "None" in that case.
After confirming the change, I will revise related descriptions (underlined) in this article.
-----
2014-02-14: I noticed that the method in FME 2014 SP1 Beta (build 14255) returns an empty string for <null>. The change of implementation seems to be done for SP1.
-----
2014-02-25: The change about FME Objects Python API has been announced. I revised related descriptions in this article (underlined).

(FME 2014 Beta build 14223)

These Python API methods and Tcl procedures have been added in FME 2014 to handle <null> attributes appropriately.
New Python Methods:
- FMEFeature.getAttributeNullMissingAndType(attrName)
- FMEFeature.setAttributeNullWithType(attrName, attrType)
New Tcl Procedures:
- FME_IsAttributeNull attrName
- FME_SetAttributeNull attrName

I tried them to learn their functions and usage; this article summarizes the results. If there are any wrong descriptions, please point them out.
Official descriptions of those methods / procedures can be found in
- FME Objects Python API: [FME_HOME]/fmeobjects/python/apidoc/index.html
- FME pre-defined Tcl procedures: TclCaller transformer help documentation

1. Determine if an attribute contains <null>
Python: FMEFeature.getAttributeNullMissingAndType Method

FME 2014 SP1+ (build 14252 or later):
The FMEFeature.getAttribute method returns an empty string when the specified attribute is <null>.
If it's necessary to distinguish <null> from an empty string in the script, we have to use the getAttributeNullMissingAndType method (added in FME 2014).

FME 2014 without SP*:
The FMEFeature.getAttribute method returns None when the specified attribute is <null> or <missing>.
If it's necessary to distinguish <null> from <missing> in the script, we have to use the getAttributeNullMissingAndType method (added in FME 2014).

The method returns a tuple consisting of 3 elements - null flag (boolean), missing flag (boolean) and data type identifier (int). The null flag indicates whether the attribute contains <null>, and the missing flag indicates whether the attribute is <missing>.
For example, when a feature has two attributes
  attrEmpty = <empty string>
  attrNull = <null>
a PythonCaller with this script replaces them with string values indicating the original status. It also creates "attrMissing", which did not exist, with a value saying so.
-----
import fmeobjects
def testNullMissingEmpty(feature):
    for name in ['attrEmpty', 'attrNull', 'attrMissing']:
        value = ''
        isNull, isMissing, type = feature.getAttributeNullMissingAndType(name)
        if isMissing:
            value = 'this was missing'
        elif isNull:
            value = 'this was null'
        else:
            value = feature.getAttribute(name)
            if len(str(value)) < 1:
                value = 'this was empty'
        feature.setAttribute(name, value)
-----

Tcl: FME_IsAttributeNull Procedure
A pre-defined Tcl procedure named FME_IsAttributeNull has been added in FME 2014.
A TclCaller with this script does the same job as the PythonCaller above.
-----
proc testNullMissingEmpty {} {
  foreach name {"attrEmpty" "attrNull" "attrMissing"} {
    set value {}
    if {[FME_AttributeExists $name] == 0} {
      set value "this was missing"
    } elseif {[FME_IsAttributeNull $name]} {
      set value "this was null"
    } else {
      set value [FME_GetAttribute $name]
      if {[string length $value] < 1} {
        set value "this was empty"
      }
    }
    FME_SetAttribute $name $value
  }
}
-----









2. Set <null> to attributes
Python: FMEFeature.setAttributeNullWithType Method
If we need to set <null> to an attribute, the FMEFeature.setAttributeNullWithType method (added in FME 2014) can be used.
A PythonCaller with this script example replaces "toNull" with <null>, replaces "toEmpty" with <empty string>, and removes attributes containing "toMissing".
-----
import fmeobjects
def mapToNullMissingEmpty(feature):
    for name in feature.getAllAttributeNames():
        isNull, isMissing, type = feature.getAttributeNullMissingAndType(name)
        if not isNull and not isMissing:
            value = str(feature.getAttribute(name))
            if value == 'toNull':
                feature.setAttributeNullWithType(name, type)
            elif value == 'toEmpty':
                feature.setAttribute(name, '')
            elif value == 'toMissing':
                feature.removeAttribute(name)
-----

Tcl: FME_SetAttributeNull procedure
A pre-defined Tcl procedure named FME_SetAttributeNull has been added in FME 2014.
A TclCaller with this script does the same job as the PythonCaller above.
-----
proc mapToNullMissingEmpty {} {
  foreach name [FME_AttributeNames] {
    set value [FME_GetAttribute $name]
    if {[string compare $value "toNull"] == 0} {
      FME_SetAttributeNull $name
    } elseif {[string compare $value "toEmpty"] == 0} {
      FME_SetAttribute $name {}
    } elseif {[string compare $value "toMissing"] == 0} {
      FME_UnsetAttributes $name
    }
  }
}
-----









3. Handle <null> elements in a list attribute
The Python FMEFeature.getAttribute method returns a string list when an existing list attribute name (e.g. "_list{}") is passed as its argument. This is very convenient for handling list attributes; I have often used it.
But now (in FME 2014+), we should be aware that every <null> element will be interpreted as an empty string in that case.

For example, "copyList1" function in the following script copies
_src{} = A,<null>,B,<null>,C
to
_dest{} = A,,B,,C
-----
import fmeobjects
def copyList1(feature):
    src = feature.getAttribute('_src{}')
    feature.setAttribute('_dest{}', src)
-----

If <null> elements in the source list have to be treated as <null> in the destination list too, the script should be like this. Assume there is no <missing> element in the source list.
-----
import fmeobjects
def copyList2(feature):
    i = 0
    while True:
        isNull, isMissing, type = feature.getAttributeNullMissingAndType('_src{%d}' % i)
        if isMissing:
            break
        if isNull:
            feature.setAttributeNullWithType('_dest{%d}' % i, type)
        else:
            feature.setAttribute('_dest{%d}' % i, feature.getAttribute('_src{%d}' % i))
        i += 1
-----

I think the difference between the results of "copyList1" and "copyList2" is worth remembering.

Alternatively, if the number of list elements has been stored as an attribute (e.g. _element_count) beforehand, this script makes the same result. The number of list elements can be stored with the ListElementCounter transformer.
2014-01-29: If implementation of getAttribute changed, this script would not be able to determine whether the value is <null>. It would bring unexpected result.
I will remove the description and this script after confirming the change.
-----
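2014-02-25: Removed. The following script applies only to FME 2014 without SP*, where getAttribute returns None for <null>.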
-----
import fmeobjects
def copyList2(feature):
    num = int(feature.getAttribute('_element_count'))
    for i in range(num):
        value = feature.getAttribute('_src{%d}' % i)
        if value == None:
            type = feature.getAttributeType('_src{%d}' % i)
            feature.setAttributeNullWithType('_dest{%d}' % i, type)
        else:
            feature.setAttribute('_dest{%d}' % i, value)
-----

The Tcl script performing the same job is a little simpler.
-----
proc copyList2 {} {
  for {set i 0} {[FME_AttributeExists "_src{$i}"]} {incr i} {
    if {[FME_IsAttributeNull "_src{$i}"]} {
      FME_SetAttributeNull "_dest{$i}"
    } else {
      FME_SetAttribute "_dest{$i}" [FME_GetAttribute "_src{$i}"]
    }
  }
}
-----

2013-12-21

Default Attribute Names of XLSXR in FME 2014

The Excel reader (XLSXR) creates default attribute names when the user doesn't specify "Field Names Row".
In FME 2013, those are formatted as "col_**" (** is a 1-based sequential number), but this has changed in FME 2014. The default attribute names are now equal to the Excel column names, i.e. A, B, C, ...
Since many Excel users are familiar with the column names, this change will be welcomed. I basically welcome it too.

Now, I have several workspaces which have Excel readers. Since the source Excel spreadsheets don't have any usable field names row, the workspaces rename the old style default names (col_**) to appropriate names using the SchemaMapper. I have also created common schema definition tables for the SchemaMappers and Dynamic Schema writers.
The same approach would be effective for a new, similar workspace which has to read Excel spreadsheets without a field names row.
In such a case, I want to use the old style default attribute names, because A, B, C, ... have to be typed manually when creating schema definition tables; that is not only troublesome but also prone to typos. "col_**" can be entered easily by drag-copying on an Excel spreadsheet.

As a workaround, I'm thinking of using a PythonCaller which renames the Excel column names (A, B, C, ...) to the old style default attribute names (col_**).
This script is a prototype and has not been tested enough.
-----
import fmeobjects, re

def replaceXlsColumnNames(feature):
    for name in feature.getAllAttributeNames():
        if re.match('^[A-Z]+$', name):
            num, m = 0, 1
            for i in range(len(name) - 1, -1, -1):
                num += (ord(name[i]) - 64) * m
                m *= 26
            value = feature.getAttribute(name)
            feature.setAttribute('col_%d' % num, value)
            feature.removeAttribute(name)
-----
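
The conversion at the heart of the script treats the column name as a base-26 number whose digits run from A = 1 to Z = 26. The following standalone sketch (the function name `column_number` is hypothetical) does the same calculation from left to right, which makes it easy to check without FME.

```python
def column_number(name):
    """Convert an Excel column name (A, B, ..., Z, AA, ...) to a 1-based number."""
    num = 0
    for ch in name:
        # Shift the digits read so far by one place, then add this letter (A=1 ... Z=26).
        num = num * 26 + (ord(ch) - ord('A') + 1)
    return num
```

For example, "A" gives 1, "Z" gives 26 and "AA" gives 27, so the renamed attributes would be col_1, col_26 and col_27.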

Important, 2014-02-05: I heard that Safe is planning to change the implementation of Python API fmeobjects.FMEFeature.getAttribute() method, so that it returns an empty string when specified attribute stores <null>. Currently - FME 2014 build 14234 - it returns "None" in that case.
After confirming the change, I will revise related descriptions (underlined) in this article.
-----
2014-02-14: I noticed that the method in FME 2014 SP1 Beta (build 14255) returns an empty string for <null>. The change of implementation seems to be done for SP1.
-----
2014-02-25: The change about FME Objects Python API has been announced. I revised related descriptions in this article (underlined).
=====
2013-12-22: There are two issues in the script example above.

1) It cannot distinguish null and missing attributes from others.
FME 2014 SP1+ (build 14252 or later):
If the specified attribute is missing, the fmeobjects.FMEFeature.getAttribute method returns None, and the setAttribute method throws an error when receiving it. The getAttribute method returns an empty string for <null>.
FME 2014 without SP*:
If the specified attribute is null or missing, the fmeobjects.FMEFeature.getAttribute method returns None, and the setAttribute method throws an error when receiving it.

Note: XLSXR has a "Read blank cells as" parameter; the user can select "Null" or "Missing".

2) When the schema of the input features does not vary, it's inefficient to create new attribute names for every input feature. But any feature may have missing attributes, so it's still necessary to check all attribute names for every feature.

This is an improved version. Also a little more Pythonic? (FME 2014 Beta build 14223)
-----
import fmeobjects, re

class XlsColumnNamesReplacer(object):
    def __init__(self):
        self.mapper = []
        self.oldNames = set([])

    def input(self, feature):
        allNames = [a for a in feature.getAllAttributeNames() if re.match('^[A-Z]+$', a)]
        for name in set(allNames) - self.oldNames:
            num, m = 0, 1
            for i in [ord(c) - 64 for c in name[::-1]]:
                num += i * m
                m *= 26
            self.mapper.append((name, 'col_%d' % num))
            self.oldNames.add(name)
        for oldName, newName in self.mapper:
            isNull, isMissing, type = feature.getAttributeNullMissingAndType(oldName)
            if isNull:
                feature.setAttributeNullWithType(newName, type)
            elif not isMissing:
                feature.setAttribute(newName, feature.getAttribute(oldName))
            feature.removeAttribute(oldName)
        self.pyoutput(feature)
     
    def close(self):
        pass
-----
fmeobjects.FMEFeature.getAttributeNullMissingAndType and setAttributeNullWithType are new methods added in FME 2014. > Null in FME 2014: Handling Null with Python / Tcl

=====
2013-12-23: The TclCaller can also be used. Since FME pre-defines the FME_RenameAttribute procedure, it's not necessary to care about <null> and <missing> when renaming attributes.
-- Why doesn't Python API provide a method to rename attributes?
Anyway, this is a Tcl script example.
-----
set oldNames {}
set newNames {}

proc replaceXlsColumnNames {} {
  global oldNames newNames
  foreach name [FME_AttributeNames] {
    if {[lsearch -exact $oldNames $name] < 0 && [regexp {^[A-Z]+$} $name]} {
      set num 0
      set m 1
      foreach ch [lreverse [split $name {}]] {
        set num [expr $num + ([scan $ch %c] - 64) * $m]
        set m [expr $m * 26]
      }
      lappend oldNames $name
      lappend newNames "col_$num"
    }
  }
  for {set i 0} {$i < [llength $oldNames]} {incr i} {
    FME_RenameAttribute [lindex $newNames $i] [lindex $oldNames $i]
  }
}
-----
For processing <null> attributes in Tcl script, FME_IsAttributeNull and FME_SetAttributeNull procedures have been added in FME 2014. > Null in FME 2014: Handling Null with Python / Tcl

Null in FME 2014: Converting Null to Non-Null

(FME 2014 Beta Build 14223)

The NullAttributeMapper transformer has been added to FME 2014.
It can be used to convert between <null> and non-null; the functionality of the NullAttributeReplacer transformer in FME 2013 has been integrated into it.

This workspace example creates a feature having these attributes:
attrValue = NULL  (a string value, not <null>)
attrNull = <null>
attrEmpty = <empty>
attrMissing = <missing>
Note: Strictly speaking, the feature doesn't have "attrMissing" although it appears on the Canvas. The AttributeExposer exposes attribute names, but it doesn't create any actual attribute content.







The NullAttributeMapper with this setting replaces <null>, <empty> and <missing> with a non-null value - "NULL".









The conversion is reversible. This setting replaces an attribute value with <null> if the attribute holds "NULL" or <empty>, or is <missing>.









One of "Null", "Missing", "Empty String" and "New Value" can be specified as the replacement value, i.e. the "Map To" parameter. When "New Value" is specified, the "New Value" parameter has to be set as well.
Both the "Or If Attribute Value Is" and "New Value" parameters can also be set to an Attribute Value, String / Math Expression, Parameter or Conditional Value.
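
To make the two conversions above concrete, here is a toy model in plain Python. It is not FME code and does not reflect how the transformer is implemented: the function and argument names are hypothetical, a dict stands in for a feature, None stands for <null>, '' for <empty>, and an absent key for <missing>. Mapping to "Missing" is not modelled.

```python
NULL, EMPTY, MISSING = 'null', 'empty', 'missing'

def state_of(attrs, name):
    """Classify an attribute of the toy feature."""
    if name not in attrs:
        return MISSING
    if attrs[name] is None:
        return NULL
    if attrs[name] == '':
        return EMPTY
    return None  # an ordinary value

def map_attributes(attrs, names, if_states, or_if_value=None, map_to='NULL'):
    """Return a copy of attrs with the matching attributes replaced by map_to."""
    result = dict(attrs)
    for name in names:
        state = state_of(attrs, name)
        is_listed_value = (state is None and or_if_value is not None
                           and attrs[name] == or_if_value)
        if state in if_states or is_listed_value:
            result[name] = map_to
    return result

names = ['attrValue', 'attrNull', 'attrEmpty', 'attrMissing']
feature = {'attrValue': 'NULL', 'attrNull': None, 'attrEmpty': ''}

# <null>, <empty> and <missing> become the string "NULL".
forward = map_attributes(feature, names, {NULL, EMPTY, MISSING})
# "NULL", <empty> and <missing> become <null> (None in this model).
backward = map_attributes(feature, names, {EMPTY, MISSING},
                          or_if_value='NULL', map_to=None)
```

In this model all four attributes end up as "NULL" in `forward`, and as None in `backward`.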

The NullAttributeMapper has enough functionality for converting between <null> and non-null. I think it would become one of the most frequently used transformers for workspaces performing <null> operation.

If regular expression could be used as the matching condition, it could become more powerful.

2013-12-19

Null in FME 2014: Setting and Testing

(FME 2014 Beta Build 14220)

FME 2014 starts supporting null as attribute values.
In my understanding, null is a special value that should be distinguished from both an empty string and a missing (non-existent) attribute.

We can set null to an attribute using the AttributeCreator.












And the Tester can be used to test whether an attribute value is null. This test condition passes the input feature if "attrNull" holds null.















The Tester provides "Attribute Is Null", "Attribute Is Empty String" and "Attribute Is Missing" operators, and those can strictly determine whether the Left Value is null, empty string or missing.

Well, "attrEmpty" holds an empty string now. If I set the test condition like this, which port does the feature go to, Passed or Failed?















I thought the feature would go to the Failed port, but it did not. The feature goes to the Passed port. I got a little confused. How should I interpret this result?
Null is a special value, but could it be treated as an empty string in some cases?

I have to keep exploring null.
=====
2013-12-20: I noticed that the feature goes to the Passed port even if "attrEmpty" is missing. Null also seems to be treated as a missing attribute in some cases. Hmm...

=====
2013-12-21: Tai clarified the design concept on comparing null, empty string and missing attribute. See his comment.
The mist in my brain has cleared up. Thanks, Tai.

2013-12-17

Japanese Character Problems in FME 2014 Beta

=====
2014-01-16: FME 2014 (build 14230) has been released.
Almost all the Japanese character problems have been solved. Wonderful!
This article ends up here. Thanks.
=====
2014-01-12: I'm happy that the remaining issue (PR#50788) has been fixed in build 14229 :-)
Release coming soon!
=====
2013-12-22: I tried FME 2014 Beta build 14223 for Mac OS X in Japanese environment.
It works fine. There were some problems related to Japanese characters in build 14205, but those have been solved now, and the workspace files (*.fmw) are compatible between Mac and Windows. Great.
=====
2013-12-19: FME 2014 Beta build 14220 works fine in Japanese Windows. Great!
I confirmed that all the problems I reported last week have been solved. Although garbled characters sometimes appear on Log window, those are not so serious issues for me. I believe that those will be also solved in the near future.
Many many thanks for Safe's efforts.
=====

Previous: Japanese character problem
I tried FME 2014 Beta builds 14212 - 14218. As a result, almost all the Japanese character problems occurring in FME 2013 or earlier have been solved. Wonderful!
Many thanks, Safe and the great developers.

However, I unfortunately found several new problems related to Japanese characters. I guess that some of them are side effects of changing the encoding of the fmw file.
I've already reported the details to Safe via my reseller, and would appreciate some more effort to solve them. Thanks in advance.

2013-12-15

Conditional Execution based on Feature Existence

(FME 2013 SP4 Build 13547)

From this thread. > Community: conditional execution of densifying
To simplify explanation, assume there are two feature types - e.g. TypeA and TypeB; TypeA features have to be processed only when one or more TypeB feature exists. In other words, it's not necessary to process TypeA features when there is no TypeB feature.

My first inspiration was to use unconditional merging like this image.
The Sampler and the AttributeKeeper are not essential but I think those are effective to prevent unnecessary processing in the FeatureMerger as much as possible.
About unconditional merging, see "Join On" Parameter of the FeatureMerger.










A PythonCaller followed by a FeatureTypeFilter would also be an option.
-----
# Python Script Example
# Hold all features; output them only when a TypeB feature exists.
# If there is no TypeB feature, output a Signal feature.
import fmeobjects

class FeatureDispatcher(object):
    def __init__(self):
        self.typeAFeatures = []
        self.typeBFeatures = []
     
    def input(self, feature):
        type = feature.getAttribute('fme_feature_type')
        if type == 'TypeA':
            self.typeAFeatures.append(feature)
        elif type == 'TypeB':
            self.typeBFeatures.append(feature)
     
    def close(self):
        if 0 < len(self.typeBFeatures):
            # Explicit loop: map() is lazy in Python 3 and would output nothing.
            for f in self.typeAFeatures + self.typeBFeatures:
                self.pyoutput(f)
        else:
            logger = fmeobjects.FMELogFile()
            logger.logMessageString('No TypeB Feature.', fmeobjects.FME_WARN)
            signal = fmeobjects.FMEFeature()
            signal.setAttribute('fme_feature_type', 'Signal')
            self.pyoutput(signal)
-----








Not only for this case, I think that various transformers for flow control (maybe within the Workflow category) could be considered.

Note: The fmeobjects.FMEFeature class (FME Objects Python API) has "getFeatureType" and "setFeatureType" methods. I thought I could use them to get / set the "fme_feature_type" attribute value (i.e. the feature type name), but that was not always so in my testing. I couldn't confirm the exact functions of those methods.

2013-12-14

Convert Simple List to Complex List

(FME 2013 SP4 Build 13547)

List attributes often appear in FME workspaces, and in many cases play an important role in achieving the purpose of the project.

"simple list" (or just called "list") and "complex list" are types of list attribute.
I don't know whether those type names are official terminologies, but "complex list" is used in this documentation. David pointed it out before, thanks.
> FMEpedia: List Attributes

Simple List Example:
_list{0}, _list{1}, _list{2}, ...

Complex List Example:
_list{0}.foo, _list{1}.foo, _list{2}.foo, ...
_list{0}.bar, _list{1}.bar, _list{2}.bar, ...

Note: There is one more list type - "nested list", but I don't touch it in this article.
=====
2013-12-18: "complex list" is also called "structured list" in other documentation.
"The function also accepts a "structured list" specification, such as "attrInfo{}.name", ..." 
-- description about FMEFeature.getAttribute function, FME Objects Python API, [FME_HOME]/fmeobjects/python/apidoc/index.html (FME 2014 Beta build 14218)
=====

Well, there are two feature types named "foo" and "bar". Both of them have a simple list attribute named "_list{}", and also have a merging key attribute named "_key".
Assume that both "foo" and "bar" don't have list attributes other than "_list{}".

Consider merging those features.  Merged features should have a complex list attribute which retains every element of the original list attributes from "foo" and "bar" features.
i.e.
_list{i} of "foo" should be converted to _list{i}.foo
_list{i} of "bar" should be converted to _list{i}.bar

I found the BulkAttributeRenamer can be used to do that (FME 2013 Build 13547).
This workflow does it. The functionality is similar to "zip" function of Python.
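
For readers less familiar with Python, the analogy can be sketched as follows. The lists and values here are made up for illustration, and a plain dict stands in for the merged feature. Note one difference: Python's zip stops at the shorter input, whereas the workflow described here is meant to keep every element of both lists.

```python
# Made-up sample values standing in for the two simple lists.
foo_list = ['a0', 'a1', 'a2']   # _list{} of the "foo" feature
bar_list = ['b0', 'b1', 'b2']   # _list{} of the "bar" feature

# Pair up the elements by index and store them under complex list names.
merged = {}
for i, (foo, bar) in enumerate(zip(foo_list, bar_list)):
    merged['_list{%d}.foo' % i] = foo
    merged['_list{%d}.bar' % i] = bar
```

After this, `merged['_list{1}.foo']` is 'a1' and `merged['_list{1}.bar']` is 'b1'.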















The "zip trick" with the BulkAttributeRenamer and the FeatureMerger occurred to me in this thread.
Although the SchemaMapper would be suitable in this case, I think there should be some cases where the "zip trick" is effective.

Moreover, I expect that the ListRenamer transformer will be upgraded to have new options for conversion between simple list and complex list.

=====
2013-12-17: A complex list (e.g. _list{}.foo) can be converted to a simple list (e.g. _list{}) using a BulkAttributeRenamer with this setting as well, for what it's worth.
-----
Rename: All Attributes
Action: Regular Expression Replace
Text to Find: }.foo$
String: }
-----
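
The effect of this regular expression replacement on attribute names can be tried in plain Python. This is only an illustration with made-up attribute values, using re.sub instead of the BulkAttributeRenamer. In the sketch the brace and the dot are escaped so that they match literally; in the setting above the unescaped dot matches any character.

```python
import re

# A dict with made-up values stands in for the attributes of a feature.
attrs = {'_list{0}.foo': 'a0', '_list{1}.foo': 'a1', '_key': 'k'}

# Strip the ".foo" component name from every list element name.
renamed = {re.sub(r'\}\.foo$', '}', name): value
           for name, value in attrs.items()}
```

Here `renamed` has the keys "_list{0}", "_list{1}" and "_key"; names that don't end with "}.foo" are left unchanged.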
=====
2014-01-27: The BulkAttributeRenamer of FME 2014 (build 14234) also works for converting type of list with the way mentioned above. However, I noticed that the list name shown on the Canvas will not change, even though internal list name has been changed.
As an interim workaround, the AttributeExposer can be used to expose the correct list name after renaming. But the original simple list name ("_list{}") cannot be hidden :-(









=====
2014-01-28: I noticed the PythonCaller can be used to expose and hide list names. Just specify the "Attributes to Expose" and "Lists to Hide" parameters. The script doesn't need to do anything.
-----
def processFeature(feature):
    # do nothing
    pass
-----
Although it's a weird usage, it can be effective.
Of course using the AttributeExposer or the PythonCaller is a temporary workaround. I expect the BulkAttributeRenamer will be fixed in the near future.

2013-12-13

Date/Time Calculation with DateFormatter

I didn't know that the DateFormatter transformer can be used to perform date/time calculation based on the current date/time using various representations.
> Community: Current Date function
> Community: Help with testing for date?
> FMEpedia: Formating Dates Using the DateFormatter Transformer

The following are valid representation examples. An attribute value matching one of these representations can be replaced with a formatted date (time) string using the DateFormatter.
(tested in FME 2013 SP4 Build 13547)
-----
now
today
tomorrow
yesterday
last Sunday
next Sunday

next week
next month
next year

last week
last month
last year

2 days
2 weeks
2 months
2 years
2 years 2 months 2 days

2 days ago
2 weeks ago
2 months ago
2 years ago
2 years 2 months 2 days ago

2 seconds
2 minutes
2 hours
2 hours 2 minutes 2 seconds

2 seconds ago
2 minutes ago
2 hours ago
2 hours 2 minutes 2 seconds ago
-----
Interesting. There could be more variations.

Well, today is Friday 2013-12-13 (Friday the 13th!).
"last Sunday" was replaced with "2013-12-08", it's good.
"next Sunday" was "2013-12-22", but in Japanese, 「次の日曜日」 (literally "next Sunday") usually points to "2013-12-15". Is this a cultural difference?
To my Chinese friends: 「下星期日」 is also "2013-12-15", isn't it?

=====
2013-12-14: Mark gave me an answer to my last question. See his comment.
The English I learned in junior high was maybe British. FME was born in Canada, so its mother tongue would be North American English.
In the DateFormatter, just "Sunday" will be replaced with "2013-12-15".
Today is Saturday, 2013-12-14. I got the following results.
-----
Representation | Formatted
last Saturday | 2013-12-07
last Sunday | 2013-12-08
last Monday | 2013-12-09
...
Saturday | 2013-12-14
Sunday | 2013-12-15
Monday | 2013-12-16
...
next Saturday | 2013-12-21
next Sunday | 2013-12-22
next Monday | 2013-12-23
-----
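
The results above are consistent with a simple rule: a bare weekday name is the first such day on or after today, "last" subtracts 7 days from it, and "next" adds 7 days to it. The sketch below reproduces the observed results under that assumption. The rule is only my inference from these results, not the actual implementation of the DateFormatter, and the function name `resolve` is hypothetical.

```python
import datetime

WEEKDAYS = ['Monday', 'Tuesday', 'Wednesday', 'Thursday',
            'Friday', 'Saturday', 'Sunday']

def resolve(expression, today):
    """Resolve 'Sunday', 'last Sunday' or 'next Sunday' relative to today."""
    words = expression.split()
    target = WEEKDAYS.index(words[-1])
    # A bare weekday name: the first such day on or after today.
    base = today + datetime.timedelta(days=(target - today.weekday()) % 7)
    if words[0] == 'last':
        return base - datetime.timedelta(days=7)
    if words[0] == 'next':
        return base + datetime.timedelta(days=7)
    return base

saturday = datetime.date(2013, 12, 14)
next_sunday = resolve('next Sunday', saturday)
```

With today set to Saturday 2013-12-14, `next_sunday` is 2013-12-22, the same as in the results above.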
I will probably use those representations rarely in practical workspaces, but a rarer case should be treated more carefully. Anyway, we have to be aware of it when making an appointment in North America.
Thanks, Mark.

2013-12-07

Tcl is Useful for Geometric Operations?

Through learning about the basic usage of Tcl in FME, I understand that Tcl is very useful for string processing. But, comparing with Python, I didn't think Tcl in FME has much capability for geometric operations. Really so?

Consider creating a line geometry based on a list attribute containing coordinates.
Assume that the input feature has no geometry but has a list attribute like this.
_coord{}.x
_coord{}.y
_coord{}.z

A TclCaller with this compact script creates a 2D line geometry from the list.
-----
proc createLine {} {
  for {set i 0} {[FME_AttributeExists "_coord{$i}.x"] == 1} {incr i} {
    set x [FME_GetAttribute "_coord{$i}.x"]
    set y [FME_GetAttribute "_coord{$i}.y"]
    FME_Coordinates addCoord  $x $y
  }
}
-----
Fine.

Then, create a 3D line. I expected this script would work fine as well.
-----
proc create3DLine {} {
  FME_Coordinates dimension 3
  for {set i 0} {[FME_AttributeExists "_coord{$i}.x"] == 1} {incr i} {
    set x [FME_GetAttribute "_coord{$i}.x"]
    set y [FME_GetAttribute "_coord{$i}.y"]
    set z [FME_GetAttribute "_coord{$i}.z"]
    FME_Coordinates addCoord  $x $y $z
  }
}
-----
But unfortunately it failed. The created line was still 2D; the Z values were missing.
It seems that the "dimension" option of the FME_Coordinates command has no effect when the feature has no geometry. If I set the dimension to 3 after creating the line, the line becomes 3D but every Z value becomes 0. That works like the 3DForcer transformer, which is not the functionality I expected.

A workaround I found is to add a dummy coordinate, set the dimension to 3, and remove the dummy coordinate before creating the line.
Note: a more appropriate workaround is described below (2013-12-14).
-----
proc create3DLine {} {
  FME_Coordinates addCoord 0 0 0
  FME_Coordinates dimension 3
  FME_Coordinates resetCoords
  for {set i 0} {[FME_AttributeExists "_coord{$i}.x"] == 1} {incr i} {
    ...
  }
}
-----
It worked, but I don't like wasting steps on the dummy coordinate.
I expect the geometry to become 3D automatically when Z is given as the third argument for "FME_Coordinates addCoord".

=====
2013-12-14
I asked Safe support about this problem, and they provided a more appropriate workaround: set the geometry type beforehand.
If I set both the geometry type and the dimension like this, a 3D line is created as expected. It seems that the "dimension" option takes effect once the geometry type has been set.
This is a more desirable script. Thanks, Dan@Safe.
-----
proc create3DLine {} {
  FME_Coordinates geomType fme_line
  FME_Coordinates dimension 3
  for {set i 0} {[FME_AttributeExists "_coord{$i}.x"] == 1} {incr i} {
    ...
  }
}
-----
=====

For geometric operations, the FME_Execute procedure can also be used. FME_Execute calls an FME Function directly.
The "Close" function, for example, can be used to change a line to a polygon when the line consists of 3 or more coordinates. It's similar to the LineCloser transformer.
Just append a line to the procedure like this.
-----
proc createPolygon {} {
  for { ...
    ...
  }
  FME_Execute Close
}
-----
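
What "closing" means geometrically can be sketched in plain Python, independent of FME: the first coordinate is repeated at the end so that the line becomes a ring. This is only an illustration of the idea under that assumption; the helper name close_ring is hypothetical and the sketch does not claim to reproduce how the Close function works internally.

```python
def close_ring(coords):
    """Return coords as a closed ring (hypothetical helper, not an FME API)."""
    if len(coords) < 3:
        raise ValueError("a polygon boundary needs at least 3 coordinates")
    ring = list(coords)
    if ring[0] != ring[-1]:
        ring.append(ring[0])  # repeat the first vertex to close the boundary
    return ring

print(close_ring([(0, 0), (4, 0), (4, 3)]))
```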

There are many FME Functions for geometric operations. If we could see detailed documentation about them, Tcl could be used much more effectively.
I remember that the documentation "FME Functions and Factories" was available a few years ago, but it cannot be accessed now (2013-12-07).
FMEpedia > Documentation: FME Functions and Factories

...so, my answer to the title is maybe a little negative, currently.
(FME 2013 SP4 Build 13547)
=====
2014-03-21: I discovered that the documentation is available again, so I now change my answer to the title to "positive"!
> FME Factory and Function Documentation

=====
2013-12-14: Found another way to change an unclosed line to a polygon.
The FME_Coordinates procedure with the "geomType" option can also be used to do that.
If I explicitly set the geometry type to "fme_polygon" before or after creating an unclosed line, the line is closed automatically and the resulting geometry is a polygon.
-----
  FME_Coordinates geomType fme_polygon
-----
There seem to be many things that are not covered in the published documentation.

=====
2013-12-08: For comparison...
-----
# Python Script Example: Create 2D Line
import fmeobjects
def createLine(feature):
    xs = feature.getAttribute('_coord{}.x')
    ys = feature.getAttribute('_coord{}.y')
    coords = [(float(x), float(y)) for x, y in zip(xs, ys)]
    feature.setGeometry(fmeobjects.FMELine(coords))
-----
# Python Script Example: Create 3D Line
import fmeobjects
def create3DLine(feature):
    xs = feature.getAttribute('_coord{}.x')
    ys = feature.getAttribute('_coord{}.y')
    zs = feature.getAttribute('_coord{}.z')
    coords = [(float(x), float(y), float(z)) for x, y, z in zip(xs, ys, zs)]
    feature.setGeometry(fmeobjects.FMELine(coords))
-----
# Python Script Example: Create 3D Polygon
import fmeobjects
def createPolygon(feature):
    xs = feature.getAttribute('_coord{}.x')
    ys = feature.getAttribute('_coord{}.y')
    zs = feature.getAttribute('_coord{}.z')
    coords = [(float(x), float(y), float(z)) for x, y, z in zip(xs, ys, zs)]
    boundary = fmeobjects.FMELine(coords)
    feature.setGeometry(fmeobjects.FMEPolygon(boundary))
-----

Of course, FME transformers can do those operations without any scripting.
I think a general approach would be like this.
ListExploder --> 2D/3DPointReplacer --> PointConnector (--> LineCloser)
2013-12-15: corrected a typo of "LineCloser". sorry.

Middle Point of Curve

This is a frequent question.
There are probably several ways, but I think the following is the easiest.

1. General - Curve (Line, Arc, Path)
Calculate half the length of the curve, then use the Snipper to create the middle point.
-----
Snipping Mode: Distance (Value)
Starting Location: <half length>
Ending Location: <half length>
-----
The Snipper outputs a point when "Ending Location" is equal to "Starting Location".
To calculate the length of a curve, the LengthCalculator or the @Length function (from FME Feature Functions) can generally be used.
=====
2014-09-17
There is also a "Distance (Percentage)" mode, so you don't need to calculate the length.
-----
Snipping Mode: Distance (Percentage)
Starting Location: 50
Ending Location: 50
-----
=====
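
To make the idea concrete, here is a minimal plain-Python sketch of "the point at 50% of the length". It assumes a 2D polyline made of straight segments only (arcs and paths are not handled), and the function name point_at_fraction is hypothetical. It illustrates the calculation and is not the Snipper's implementation.

```python
import math

def point_at_fraction(coords, fraction=0.5):
    """Return the 2D point at the given fraction of a polyline's length."""
    segments = list(zip(coords, coords[1:]))
    lengths = [math.hypot(x2 - x1, y2 - y1) for (x1, y1), (x2, y2) in segments]
    remaining = sum(lengths) * fraction  # distance still to walk along the line
    for ((x1, y1), (x2, y2)), seg_len in zip(segments, lengths):
        if seg_len > 0 and remaining <= seg_len:
            t = remaining / seg_len  # position inside this segment, 0..1
            return (x1 + t * (x2 - x1), y1 + t * (y2 - y1))
        remaining -= seg_len
    return coords[-1]  # fallback when rounding pushes past the last segment

print(point_at_fraction([(0, 0), (2, 0), (2, 2)]))
```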

2. Line Segment
A line segment is a special curve, i.e. a straight line having only 2 end nodes; the CenterPointReplacer can also be used to get its midpoint.
The CenterPointReplacer creates the center point of the bounding box, which is equivalent to the midpoint if the original geometry is a straight line.
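
A small plain-Python check of why this holds only for a straight segment (the helper name bbox_center is hypothetical): for two end nodes the bounding box center is the average of the two points, but for a bent line it generally does not lie on the line at all.

```python
def bbox_center(coords):
    """Center of the 2D bounding box of a coordinate list."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return ((min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2)

segment = [(0, 0), (6, 4)]
bent = [(0, 0), (6, 0), (6, 4)]

# For the 2-node segment this is the midpoint of the line.
print(bbox_center(segment))
# Same bounding box, but this point is off the bent line,
# whose midpoint by length is (5, 0).
print(bbox_center(bent))
```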

When testing the Snipper method, I noticed that the resulting coordinates could contain a slight computational error (very, very slight: on the order of 10^-14 to 10^-15 in my testing). My guess is that the fact that "FME stores all attributes as character strings" causes such an error.
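
As a general illustration of how a round trip through text can introduce an error of roughly this size, here is a small Python sketch. The choice of 15 significant digits is purely an assumption for the example; nothing here states how FME actually formats numbers internally.

```python
x = 1.0 / 3.0
as_text = format(x, ".15g")   # write the value as text with 15 significant digits
restored = float(as_text)     # read the text back as a floating-point number
error = abs(restored - x)

print(as_text)
print(error)  # tiny but not zero
```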

I think such a slight error will not be an issue in almost all cases. But if it is an issue, consider using a PythonCaller. Python could generate a higher-precision result (though not a mathematically exact one; computational error cannot be avoided).
-----
# Python Script Example: Replace Curve with Middle Point
# When input geometry is not a curve, do nothing.
# Measure length in 2D.
import fmeobjects
def replaceCurveWithMidpoint(feature):
    geom = feature.getGeometry()
    if isinstance(geom, fmeobjects.FMECurve):
        measure3D = False
        halfLength = geom.getLength(measure3D) * 0.5
        geom.snip(fmeobjects.SNIP_DISTANCE, measure3D, halfLength, -1)
        feature.setGeometry(geom.getStartPoint())
-----

(FME 2013 SP4 Build 13547)

2013-12-06

FeatureMerger vs. InlineQuerier for CROSS JOIN

The FeatureMerger is one of the most frequently used transformers; its basic functionality can be said to be similar to JOIN in SQL.

There are these JOIN types.
INNER JOIN
LEFT / RIGHT OUTER JOIN
CROSS JOIN

Consider sets of REQUESTOR and SUPPLIER features as database tables.
With the usual parameter settings, the set of MERGED features corresponds to the result table of INNER JOIN; the union of MERGED and NOT_MERGED features corresponds to LEFT OUTER JOIN; the union of MERGED and UNREFERENCED features corresponds to RIGHT OUTER JOIN.
=====
2013-12-07: When multiple suppliers can match a requestor, the set of output features is not strictly the same as the result of "JOIN" unless duplicate suppliers are processed.
=====
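
The correspondence between the output ports and the JOIN types can be sketched in plain Python with features as dictionaries. This is only a model of the description above: it assumes supplier keys are unique (so the duplicate-supplier caveat does not arise) and that requestor attributes win on a name conflict, which is an assumption of the sketch. The function name merge_features is hypothetical.

```python
def merge_features(requestors, suppliers, key):
    """Split features into MERGED / NOT_MERGED / UNREFERENCED, as a model only."""
    supplier_by_key = {s[key]: s for s in suppliers}  # assumes unique supplier keys
    merged, not_merged, used = [], [], set()
    for r in requestors:
        s = supplier_by_key.get(r[key])
        if s is None:
            not_merged.append(r)
        else:
            merged.append({**s, **r})  # requestor plus supplier attributes
            used.add(r[key])
    unreferenced = [s for s in suppliers if s[key] not in used]
    return merged, not_merged, unreferenced

requestors = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
suppliers = [{"id": 2, "value": "x"}, {"id": 3, "value": "y"}]
merged, not_merged, unreferenced = merge_features(requestors, suppliers, "id")

inner_join = merged
left_outer_join = merged + not_merged
right_outer_join = merged + unreferenced
print(inner_join, left_outer_join, right_outer_join, sep="\n")
```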

Well, how about CROSS JOIN?
To perform an operation like CROSS JOIN, the following parameter settings are necessary.
1. Merge features unconditionally
Specify the same constant (e.g. "1") for "Join On" on both Requestor and Supplier.
2. Process duplicate suppliers
Set "Process Duplicate Suppliers" to "Yes".
3. Create list attribute
Specify a list name for "Supplier List Name".

Then every REQUESTOR feature goes to the MERGED port and has a complex list attribute that contains all attributes of every SUPPLIER feature. We can change the list elements to non-list attributes using the ListExploder if necessary.
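
The effect of these settings can be modeled in plain Python: each requestor receives a list holding every supplier, and exploding the list yields one feature per (requestor, supplier) pair. The function names and the list name "_supplier" are hypothetical; this is a sketch of the data flow, not of the transformers themselves.

```python
def cross_join_via_list(requestors, suppliers, list_name="_supplier"):
    """Attach all suppliers to every requestor as a list attribute."""
    merged = []
    for r in requestors:
        feature = dict(r)
        feature[list_name] = [dict(s) for s in suppliers]  # unconditional merge
        merged.append(feature)
    return merged

def explode(features, list_name="_supplier"):
    """Turn each list element into its own feature with non-list attributes."""
    out = []
    for f in features:
        base = {k: v for k, v in f.items() if k != list_name}
        for element in f[list_name]:
            out.append({**base, **element})
    return out

requestors = [{"rid": 1}, {"rid": 2}]
suppliers = [{"sid": "a"}, {"sid": "b"}, {"sid": "c"}]
pairs = explode(cross_join_via_list(requestors, suppliers))
print(len(pairs), pairs[0])
```

Every requestor carries a copy of all suppliers before anything is filtered, so the amount of data grows with the product of the two feature counts.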

As described above, it's possible to perform CROSS JOIN with the FeatureMerger. However, it might be inefficient in certain cases.
For example, if the purpose of the CROSS JOIN is to select just a few matching suppliers with a condition based on attributes of the requestor, the process of creating and exploding list attributes will waste memory and time on many mismatched features.

Use the InlineQuerier instead of the FeatureMerger in such a case.
Since the InlineQuerier uses SQLite internally, it takes some time to create a temporary database. But the querying could be much faster than other ways, so it could be an efficient solution overall.
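
The kind of query this refers to can be tried outside FME with Python's built-in sqlite3 module. The table names, column names and the range condition below are all hypothetical, chosen only to show a CROSS JOIN filtered by a condition on requestor attributes; it is a sketch, not the workspace from the Community thread.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE requestor (rid INTEGER, lo REAL, hi REAL)")
con.execute("CREATE TABLE supplier (sid INTEGER, value REAL)")
con.executemany("INSERT INTO requestor VALUES (?, ?, ?)",
                [(1, 0, 10), (2, 20, 30)])
con.executemany("INSERT INTO supplier VALUES (?, ?)",
                [(100, 5), (101, 25), (102, 50)])

# Pair every requestor with every supplier, keeping only matching pairs.
rows = con.execute(
    """
    SELECT r.rid, s.sid
    FROM requestor AS r CROSS JOIN supplier AS s
    WHERE s.value BETWEEN r.lo AND r.hi
    ORDER BY r.rid, s.sid
    """
).fetchall()
con.close()
print(rows)
```

Only the matching pairs come back as rows; no list attribute has to be built and exploded for the mismatched ones.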

I posted a concrete example here.
> Community: Python Exception <error>: unbalanced parenthesis
Yes, I should have noticed the InlineQuerier first... orz

(FME 2013 SP4 Build 13547)

2013-12-01

Create Line Segments from Coordinates List

Consider creating line segments from a coordinates list like this.
id  |  x  |  y  |  attr
0  |  0  |  0  |  val0
1  |  2  |  2  |  val1
2  |  5  |  2  |  val2
3  |  6  |  0  |

The resulting line segments should be (id 0 - id 1), (id 1 - id 2) and (id 2 - id 3), and they should also have the attributes of the start node ("id" and "attr").
If the goal is just to create line segments, connect all points to create a polyline (2DPointReplacer + PointConnector), and then chop it into individual line segments (Chopper). That would be the quickest way, but it will be necessary to recover the attributes for every segment after chopping.
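
The expected result can be written down in plain Python by pairing each row with the next one and keeping the start node's attributes. The function name make_segments is hypothetical, and the empty "attr" of the last row is represented as None here, which is an assumption of the sketch.

```python
rows = [
    {"id": 0, "x": 0, "y": 0, "attr": "val0"},
    {"id": 1, "x": 2, "y": 2, "attr": "val1"},
    {"id": 2, "x": 5, "y": 2, "attr": "val2"},
    {"id": 3, "x": 6, "y": 0, "attr": None},  # no "attr" value in the table
]

def make_segments(rows):
    """Build one segment per consecutive pair, with the start node's attributes."""
    segments = []
    for start, end in zip(rows, rows[1:]):
        segments.append({
            "id": start["id"],
            "attr": start["attr"],
            "coords": [(start["x"], start["y"]), (end["x"], end["y"])],
        })
    return segments

for segment in make_segments(rows):
    print(segment)
```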

About one year ago, I provided a workflow like this image for a similar subject in the Community. > Community: connect the dots

I think it was a reasonably elegant solution. However, since FME 2013 SP2+ lets us use both "Multiple Feature Attribute Support" and "Conditional Value" in the AttributeCreator, there could now be more elegant and/or efficient solutions.
(FME 2013 SP4 Build 13547)

Example 1: Use the 2DPointReplacer and the 2DPointAdder
I think this is one of the plainest ways.

Example 2: Use @XValue and @YValue from FME Feature Functions
In this example I used the FMEFunctionCaller to call @XValue and @YValue. Those functions can also be called in the AttributeCreator etc.

Example 3: Create XML describing geometry and use the GeometryReplacer
Excessive?

Anyway, the point is the "Multiple Feature Attribute Support" functionality of the AttributeCreator.

=====
2013-12-02
Example 4: Call @XValue and @YValue in the AttributeCreator
This is just an experiment. It works, but it may not be standard usage.

Example 5: Python script
And of course a PythonCaller can do that. No other transformers are needed.
-----
import fmeobjects

class LineSegmentCreator(object):
    def __init__(self):
        self.prior = None

    def input(self, feature):
        xy = lambda f: (float(f.getAttribute('x')), float(f.getAttribute('y')))
        if self.prior:
            line = fmeobjects.FMELine([xy(self.prior), xy(feature)])
            self.prior.setGeometry(line)
            self.pyoutput(self.prior)
        self.prior = feature

    def close(self):
        pass
-----

Example 6: Tcl script
I don't know whether Tcl is suitable in this case, but it's not impossible.
For example, the two FMEFunctionCallers in Example 2 can be replaced with a TclCaller that calls this procedure.
-----
proc createLineSegment {} {
  FME_Execute XValue "_x{}"
  FME_Execute YValue "_y{}"
}
-----
Naturally, the 2DPointReplacer and the 2DPointAdder in Example 1 can also be replaced with a TclCaller.
-----
proc createLineSegment {} {
  foreach {x y} {"x" "y" "_x1" "_y1"} {
    FME_Execute XValue [FME_GetAttribute $x]
    FME_Execute YValue [FME_GetAttribute $y]
  }
}
-----

There could be more variations, but I'll end here.