Custom Search Config

Changes to DataHub's search behaviour have highlighted the need for control over the search configuration, so that we can continue to provide an optimal search experience.

For example, the 0.14.0.2 release disabled prefix matching for quoted queries and for single-term queries containing periods, underscores or hyphens. As a result, partial queries of those kinds no longer returned results.

- queryRegex: >-
    ^["'].+["']$|^[a-zA-Z0-9]\S+[_.-]\S+[a-zA-Z0-9]$
  simpleQuery: false
  prefixMatchQuery: false
  exactMatchQuery: true
  functionScore:
    functions:
      - filter:
          term:
            deprecated:
              value: true
        weight: 0.25
    score_mode: multiply
    boost_mode: multiply
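
The functionScore block in the rule above can be read as simple arithmetic. The following is a minimal Python sketch, assuming the settings behave like an Elasticsearch-style function_score query: a function contributes only when its filter matches, matching weights are multiplied together (score_mode), and the result is multiplied into the query score (boost_mode). The function name `apply_function_score` and the example score of 8.0 are illustrative assumptions, not part of DataHub.

```python
from math import prod


def apply_function_score(query_score, doc, functions):
    """Return the adjusted score for one document.

    `functions` is a list of (filter_predicate, weight) pairs.
    This helper is hypothetical and only models the arithmetic.
    """
    # Only functions whose filter matches the document contribute.
    weights = [weight for matches, weight in functions if matches(doc)]
    # score_mode: multiply -> combine the matching weights by multiplication.
    # With no matching function the combined factor is neutral (1).
    function_factor = prod(weights)
    # boost_mode: multiply -> multiply the query score by that factor.
    return query_score * function_factor


# Mirrors the single function in the rule: deprecated == true, weight 0.25.
FUNCTIONS = [(lambda doc: doc.get("deprecated") is True, 0.25)]

deprecated_score = apply_function_score(8.0, {"deprecated": True}, FUNCTIONS)
current_score = apply_function_score(8.0, {"deprecated": False}, FUNCTIONS)
print(deprecated_score, current_score)
```

Under these assumptions a deprecated entity keeps a quarter of its original score, while other entities are unaffected.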

We have enabled a custom search configuration by overriding the base Helm values. Each rule in the configuration uses a regular expression to identify a class of query string and applies its own treatment to matching queries.

This has allowed us to reverse the effect of the 0.14.0.2 release by re-enabling prefix matching, and to split the categorisation into a separate rule for each type of query, as follows.

# Criteria for single-term queries containing underscores, periods or hyphens
- queryRegex: >-
    ^[a-zA-Z0-9]\S+[_.-]\S+[a-zA-Z0-9]$
  simpleQuery: false
  prefixMatchQuery: true
  exactMatchQuery: true
  functionScore:
    functions:
      - filter:
          term:
            deprecated:
              value: true
        weight: 0.25
    score_mode: multiply
    boost_mode: multiply

# Criteria for quoted searches
- queryRegex: >-
    ^["'].+["']$
  simpleQuery: false
  prefixMatchQuery: true
  exactMatchQuery: true
  functionScore:
    functions:
      - filter:
          term:
            deprecated:
              value: true
        weight: 0.25
    score_mode: multiply
    boost_mode: multiply
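
To check which rule a given query would fall under, the two patterns above can be tried against sample queries. This is a minimal Python sketch, assuming rules are evaluated in order and the first match wins, and that Python's `re` module treats these particular patterns the same way as the search service's regex engine. The rule labels, the `classify` function and the sample queries are hypothetical.

```python
import re

# Patterns copied from the two rules above; the labels are illustrative only.
RULES = [
    ("delimited", re.compile(r"^[a-zA-Z0-9]\S+[_.-]\S+[a-zA-Z0-9]$")),
    ("quoted", re.compile(r"""^["'].+["']$""")),
]


def classify(query):
    """Return the label of the first rule whose pattern matches, else None."""
    for label, pattern in RULES:
        if pattern.search(query):
            return label
    return None


if __name__ == "__main__":
    samples = ["customer_ord", "prod.sales.orders", '"customer orders"', "customer orders"]
    for sample in samples:
        print(f"{sample!r}: {classify(sample)}")
```

A query that matches neither pattern, such as an unquoted phrase containing a space, returns None here and would be left to whatever other rules the configuration defines.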

This page was last reviewed on 1 October 2024. It needs to be reviewed again on 1 April 2025 by the page owner #data-catalogue.