
Ingestion failures

This runbook documents failures we have encountered with the ingestion process.

Null pointer exceptions

We have occasionally seen failures where DataHub rejects a small number of models with a NullPointerException.

For example:

2024-10-23T08:23:53.5205940Z  'failures': [{'error': 'Unable to emit metadata to DataHub GMS: java.lang.NullPointerException',
2024-10-23T08:23:53.5206850Z                'info': {'exceptionClass': 'com.linkedin.restli.server.RestLiServiceException',
2024-10-23T08:23:53.5207654Z                         'message': 'java.lang.NullPointerException',
2024-10-23T08:23:53.5208145Z                         'status': 500,
2024-10-23T08:23:53.5209032Z                         'urn': 'urn:li:dataset:(urn:li:dataPlatform:athena,athena_cadet.awsdatacatalog.avature_stg.stg_people_fields_pivoted_wide,PROD)',
2024-10-23T08:23:53.5210539Z                         'workunit_id': 'urn:li:dataset:(urn:li:dataPlatform:athena,athena_cadet.awsdatacatalog.avature_stg.stg_people_fields_pivoted_wide,PROD)-upstreamLineage'}},
2024-10-23T08:23:53.5211753Z               {'error': 'Unable to emit metadata to DataHub GMS: java.lang.NullPointerException',
2024-10-23T08:23:53.5212614Z                'info': {'exceptionClass': 'com.linkedin.restli.server.RestLiServiceException',
2024-10-23T08:23:53.5213333Z                         'message': 'java.lang.NullPointerException',
2024-10-23T08:23:53.5213814Z                         'status': 500,
2024-10-23T08:23:53.5214679Z                         'urn': 'urn:li:dataset:(urn:li:dataPlatform:athena,athena_cadet.awsdatacatalog.avature_stg.stg_job_fields_pivoted_wide,PROD)',
2024-10-23T08:23:53.5216091Z                         'workunit_id': 'urn:li:dataset:(urn:li:dataPlatform:athena,athena_cadet.awsdatacatalog.avature_stg.stg_job_fields_pivoted_wide,PROD)-upstreamLineage'}}],

The cause is not yet known, but these failures do not prevent the rest of the ingestion from completing.
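To see which models were rejected, it can help to pull the affected URNs out of the job log. The following is a minimal sketch, assuming the failures are printed in the same format as the excerpt above; the function name `extract_failed_urns` is hypothetical and not part of DataHub.

```python
import re

# Matches the 'urn' key of each failure entry. The 'workunit_id' key is not
# matched, because the pattern requires the literal key name 'urn'.
URN_PATTERN = re.compile(r"'urn': '([^']+)'")


def extract_failed_urns(log_text):
    """Return the distinct URNs found in failure entries, in first-seen order."""
    # dict.fromkeys removes duplicates while preserving order.
    return list(dict.fromkeys(URN_PATTERN.findall(log_text)))
```

Passing the full text of the job log to `extract_failed_urns` returns one entry per rejected dataset, which makes it easier to spot whether the same models fail on every run.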

Server version mismatch

When the DataHub server version changes, the acryl-datahub Python library should also be updated to a compatible version.

If the two get out of sync, you will see errors such as “Unable to emit metadata to DataHub GMS”, likely because the server version is too old relative to the client.

You will need to upgrade or downgrade the Python library to match the server before the ingestion will work again.
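A quick way to spot drift is to compare the installed client version with the server version before running the ingestion. This is a sketch only: comparing the first three version components is an assumed rule of thumb, not a documented compatibility guarantee, and the server version string must be supplied by you (for example from your deployment configuration). The helper names are hypothetical.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def release_prefix(version: str, parts: int = 3) -> Tuple[int, ...]:
    """Return the leading numeric components of a version such as 'v0.14.1'."""
    return tuple(int(n) for n in re.findall(r"\d+", version)[:parts])


def versions_in_step(client_version: str, server_version: str) -> bool:
    """Assumption: client and server are in step if their first three components match."""
    return release_prefix(client_version) == release_prefix(server_version)


def installed_client_version() -> Optional[str]:
    """Return the installed acryl-datahub version, or None if it is not installed."""
    try:
        return metadata.version("acryl-datahub")
    except metadata.PackageNotFoundError:
        return None
```

For example, `versions_in_step(installed_client_version(), server_version)` could be run as a pre-flight check, once you have confirmed that a client is installed and you know the server version.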

An error occurred (NoSuchKey) when calling the GetObject operation

The CaDeT ingestion expects to find dbt catalog/manifest files in an S3 bucket: s3://mojap-derived-tables/prod/run_artefacts/latest/target.

Normally these files are written by the “dbt docs” job in create-a-derived-table.

If that job fails to run properly, the catalog/manifest files may be removed by the bucket’s retention policy, and the ingestion will fail with NoSuchKey until dbt docs has successfully run again.
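To confirm whether the files are actually missing, you can list the expected keys in the bucket. The sketch below assumes the artefacts use dbt's default file names (`manifest.json` and `catalog.json`) directly under the prefix given above; that key layout is an assumption, and `missing_artefacts` is a hypothetical helper. It takes any client object exposing boto3's `list_objects_v2` call.

```python
BUCKET = "mojap-derived-tables"
PREFIX = "prod/run_artefacts/latest/target"
# Assumption: dbt's default artefact file names are used unchanged.
EXPECTED_FILES = ("manifest.json", "catalog.json")


def missing_artefacts(s3_client, bucket=BUCKET, prefix=PREFIX, names=EXPECTED_FILES):
    """Return the expected S3 keys that are not present in the bucket."""
    missing = []
    for name in names:
        key = f"{prefix}/{name}"
        # S3 lists keys in lexicographic order, so an exact match for `key`
        # is always the first result returned for this prefix.
        response = s3_client.list_objects_v2(Bucket=bucket, Prefix=key, MaxKeys=1)
        found = [obj["Key"] for obj in response.get("Contents", [])]
        if key not in found:
            missing.append(key)
    return missing


# Usage (requires boto3 and AWS credentials with read access to the bucket):
#   import boto3
#   print(missing_artefacts(boto3.client("s3")))
```

An empty list means both files are present; otherwise the returned keys show which artefacts the dbt docs job needs to regenerate.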

Recursion errors

If the DataHub ingestion process exceeds Python’s recursion limit, it may interfere with ingestion. For example:

2024-07-03T11:05:37.3320639Z [2024-07-03 11:05:37,331] INFO     {datahub.ingestion.source.dbt.dbt_common:1161} - Failed to parse compiled code for snapshot.mojap_derived_tables.derived_delius_dim__snapshot_sentences_at_disp: maximum recursion depth exceeded

We have seen this happen when calling sqlglot, which formats the compiled SQL code we ingest from dbt.

Whether the recursion limit is reached depends on:

  • the SQL used in Create a Derived Table (only very long expressions will trigger the issue)
  • the compiled_code being output in the dbt manifest (it’s possible this depends on the exact dbt command that generated the manifest)
  • the runtime behaviour of DataHub (exactly when the recursion limit is reached depends on the runtime call stack)
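The effect of expression length can be illustrated with a toy example. This is not sqlglot's code: it is a minimal sketch of a recursive walk over a left-nested expression tree, of the shape a long chain such as `a + a + a + ...` would produce, and all names in it are hypothetical.

```python
import sys


def count_terms(expr):
    """Recursively count the leaves of a binary expression tree of nested tuples."""
    if isinstance(expr, tuple):
        left, right = expr
        return count_terms(left) + count_terms(right)
    return 1


def build_chain(n_terms):
    """Build (((a, a), a), ...) iteratively, so that only the walk recurses."""
    expr = "a"
    for _ in range(n_terms - 1):
        expr = (expr, "a")
    return expr


def walks_without_error(n_terms):
    """Return True if the recursive walk finishes within the recursion limit."""
    try:
        count_terms(build_chain(n_terms))
        return True
    except RecursionError:
        return False
```

A short chain is walked without trouble, while a chain with more terms than `sys.getrecursionlimit()` raises RecursionError. The exact threshold also depends on how deep the call stack already is when the walk starts, which matches the third point above.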

To mitigate this, we switched off include_compiled_code and include_column_lineage for the dbt ingestion source. However, recent versions of DataHub should handle this gracefully, and fixes have since gone into sqlglot, so we don’t expect the problem to recur if either of these settings is switched back on.
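For reference, the mitigation corresponds to a source configuration along these lines. This is an illustrative fragment expressed as a Python dictionary, not the project's actual recipe: the file names under the S3 prefix and the `target_platform` value are assumptions (the latter inferred from the athena URNs in the logs above), and the sink section is omitted.

```python
# Sketch of the dbt source section of a DataHub ingestion recipe.
dbt_source = {
    "type": "dbt",
    "config": {
        # Assumption: dbt's default artefact names under the prefix used by the ingestion.
        "manifest_path": "s3://mojap-derived-tables/prod/run_artefacts/latest/target/manifest.json",
        "catalog_path": "s3://mojap-derived-tables/prod/run_artefacts/latest/target/catalog.json",
        # Assumption: inferred from the athena URNs in the ingestion logs.
        "target_platform": "athena",
        # The two settings switched off to avoid the recursion errors.
        "include_compiled_code": False,
        "include_column_lineage": False,
    },
}
```

Switching either flag back to `True` re-enables the corresponding feature; if the recursion errors reappear after doing so, reverting the flag is the quickest rollback.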

This page was last reviewed on 2 December 2024. It needs to be reviewed again on 2 June 2025 by the page owner #data-catalogue.