2013-12-06

FeatureMerger vs. InlineQuerier for CROSS JOIN

The FeatureMerger is one of the most frequently used transformers, it can be said that the basic functionality is similar to JOIN in SQL.

There are these JOIN types.
INNER JOIN
LEFT / RIGHT OUTER JOIN
CROSS JOIN

Consider sets of REQUESTOR and SUPPLIER features as database tables.
In the general parameter settings, set of MERGED features corresponds to the result table of INNER JOIN; union of MERGED and NOT_MERGED features corresponds to LEFT OUTER JOIN; union of MERGED and UNREFERENCED features corresponds to RIGHT OUTER JOIN.
=====
2013-12-07: When multiple suppliers can match with a requester, set of output features is not strictly same as the result of "JOIN" unless processing duplicate suppliers.
=====

Well, how's CROSS JOIN?
To perform the operation like CROSS JOIN, the following parameter settings will be necessary.
1. Merge features unconditionally
Specify the same constant (e.g. "1") to "Join On" for both Requestor and Supplier.
2. Process duplicate suppliers
Specify "Yes" to "Process Duplicate Suppliers".
3. Create list attribute
Specify a list name to "Supplier List Name".

Then, every REQUESTOR feature goes to MERGED port, and will have a complex list attribute which contains all attributes of every SUPPLIER feature. We can change the list elements to non-list attributes using the ListExploder if necessary.

As mentioned above, it's also possible to perform CROSS JOIN with the FeatureMerger. However, it might be inefficient in certain cases.
For example, if the purpose of CROSS JOIN is to select just a few matched suppliers with a condition based on attributes of the requester, the process of creating and exploding list attributes will waste memory and time for many mismatched features.

Use the InlineQuerier instead of the FeatureMerger in such a case.
Since the InlineQuerier uses SQLite internally, it consumes certain time for creating temporary database. But the querying could be much faster than other ways, so it could be efficient solution as a result.

I posted a concrete example here.
> Community: Python Exception <error>: unbalanced parenthesis
Yes, I should have noticed the InlineQuerier first... orz

(FME 2013 SP4 Build 13547)

No comments:

Post a Comment