Using settings for case-sensitivity of text comparisons

Upper and lower case lettering affect comparison of written English values during query selection and result ordering. Often, you want to ignore the difference between “A” and “a” when queries select text values.

Languages other than English have other character modifications that affect comparison for selection and ordering. For example, many written European languages have characters with accents and other diacritical marks on a base letter. Sometimes you want to ignore the differences between base letters with and without diacritics, such as the differences between “A”, “À”, and “Á”. Asian languages have other concerns, such as single-byte and double-byte characters or between katakana and hirigani characters in Japanese.

The result of comparing the value of a character field with another character value differs depending on the language, search collation strength, and database that your PolicyCenter uses. For example, case-insensitive comparisons produce the same results as case-sensitive comparisons if PolicyCenter has a linguistic search strength of primary.

Some query API methods support specifying whether to ignore differences between letter case. When you specify that you want to ignore case, PolicyCenter uses your localization settings to control the results. For each locale configured in your PolicyCenter instance, you specify which dimensions of difference you want to ignore. For example, you can configure a locale to ignore case only, to ignore case and accents, or to ignore other dimensions of difference that are relevant to the locale.

Query performance can suffer when you ignore case in query predicate methods. To improve query performance if you typically need to ignore case in text comparisons, set the SupportsLinguisticSearch attribute on column elements in entity definitions. For a column that has the SupportsLinguisticSearch set, PolicyCenter creates a corresponding denormalization column in addition to the standard column that stores the field values. When PolicyCenter stores a value in the standard column, PolicyCenter also stores a value in the denormalization column. PolicyCenter saves a denormalized value by converting the regular value to one that ignores character differences, based on the dimensions of difference that you specified in your localization settings. When a query applies a predicate that specifies ignoring the case of a field, PolicyCenter uses the same algorithm that converted values to store in the denormalization column. The relational database compares the converted bound value to the values in the denormalization column, not the values in the standard column.

For example, you specify the following query:

uses gw.api.database.Query

// Query the Person instances for a specific last name.
var query = Query.make(Person).withLogSQL(true)
query.compare(Person#LastName, Equals, "smith")

Print the rows that the query returns by adding the following lines:

// Fetch the data and print it.
var result = query.select()
for (person in result) {
  print (person.DisplayName)
}

Gosu creates a query that looks like the following one.

SELECT /* KeyTable:pc_contact; */ 
  gRoot.ID col0, 
  gRoot.Subtype col1 
FROM pc_contact gRoot 
WHERE gRoot.LastName = ? 
AND gRoot.Retired = 0 [smith (lastname)]

Depending on your PolicyCenter and database settings, you see either no output or output that looks like:

Steve Smith
Kerry Smith
Alice Smith
John Smith
...

To do a case-insensitive text comparison, use the compareIgnoreCase method. This method uses the same parameters as the compare method. To see the effect on the SQL query of using the compareIgnoreCase method, change the comparison line in the code to:

query.compareIgnoreCase(Person#LastName, Equals, "smith")

Gosu creates a query that looks like the following one.

SELECT /* KeyTable:pc_contact; */ 
  gRoot.ID col0, 
  gRoot.Subtype col1 
FROM pc_contact gRoot 
WHERE gRoot.LastNameDenorm = ? 
AND gRoot.Retired = 0 [smith (lastname {linguistic=true})]

The standard column has values like “Smith” and “Menzies”. The denormalization column has corresponding values like “smith” and “menzies”. The preceding query returns Person entity instances with the last name “Smith” in the standard column, because it compared the bound value “smith” to the value in the denormalization column.

You see output that looks like:

Steve Smith
Kerry Smith
Alice Smith
John Smith
...