Elasticsearch Ingestion: Issue to Detect Sub-Properties in 'nested' Type Mapping #10911

Open
Bumyu opened this issue Jul 15, 2024 · 0 comments · May be fixed by #11338
Labels
bug Bug report

Bumyu commented Jul 15, 2024

Describe the bug
When ingesting an Elasticsearch source into DataHub, if the index mapping includes a nested property with multiple sub-properties, DataHub fails to detect the sub-properties of the nested field after ingestion. For example, with the following mapping:

{
  "example-index": {
    "mappings": {
      "properties": {
        "addresses": {
          "type": "nested",
          "properties": {
            "address": {
              "type": "keyword"
            },
            "type": {
              "type": "keyword"
            }
          }
        },
        "id": {
          "type": "keyword"
        },
        "name": {
          "type": "keyword"
        }
      }
    }
  }
}

In this case, DataHub does not detect the address and type properties within the addresses nested field.

To Reproduce
Steps to reproduce the behavior:

  1. Set up the Elasticsearch mapping provided in the example:
PUT /example-index
{
  "mappings": {
    "properties": {
      "addresses": {
        "type": "nested",
        "properties": {
          "address": {
            "type": "keyword"
          },
          "type": {
            "type": "keyword"
          }
        }
      },
      "id": {
        "type": "keyword"
      },
      "name": {
        "type": "keyword"
      }
    }
  }
}
  2. Go to 'Elasticsearch Source Configuration' in DataHub.
  3. Click on 'Ingest Elasticsearch Index'.
  4. Scroll down to the section with the nested properties in the index mapping.
  5. Observe the problem: the sub-properties of the nested field are not detected.
    image

Expected behavior
The sub-properties of the nested field should be detected and ingested properly by DataHub, allowing them to be visible and usable within the DataHub interface.
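
To make the expected result concrete, here is a minimal sketch of a recursive walk over the mapping from the example above. It is not DataHub's actual implementation and not the fix attached to this issue; the function name `iter_mapping_fields` and the dotted-path output format are assumptions chosen for illustration only.

```python
from typing import Any, Dict, Iterator, Tuple


def iter_mapping_fields(
    properties: Dict[str, Any], prefix: str = ""
) -> Iterator[Tuple[str, str]]:
    """Yield (dotted_path, es_type) for every field, descending into sub-properties.

    Hypothetical helper for illustration; not part of DataHub.
    """
    for name, definition in properties.items():
        path = f"{prefix}.{name}" if prefix else name
        # Elasticsearch omits "type" for plain object fields, so default to "object".
        es_type = definition.get("type", "object")
        yield path, es_type
        sub_properties = definition.get("properties")
        if isinstance(sub_properties, dict):
            # Recurse whenever sub-properties exist, whether the field
            # is declared as "object" or as "nested".
            yield from iter_mapping_fields(sub_properties, path)


# The mapping from the bug report.
mapping = {
    "example-index": {
        "mappings": {
            "properties": {
                "addresses": {
                    "type": "nested",
                    "properties": {
                        "address": {"type": "keyword"},
                        "type": {"type": "keyword"},
                    },
                },
                "id": {"type": "keyword"},
                "name": {"type": "keyword"},
            }
        }
    }
}

fields = list(
    iter_mapping_fields(mapping["example-index"]["mappings"]["properties"])
)
for path, es_type in fields:
    print(f"{path}: {es_type}")
```

With a walk like this, `addresses.address` and `addresses.type` are reported alongside `id` and `name`, which is the behavior this issue asks for. The key point is that recursion depends on the presence of `properties`, not on the absence of an explicit `type`.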

Desktop (please complete the following information):

  • OS: Ubuntu
  • Browser: Chrome
  • Version: 126.0.6478.126 (Official Build) (64-bit)

Additional context
The code responsible for this issue is here: https://github.com/datahub-project/datahub/blob/master/metadata-ingestion/src/datahub/ingestion/source/elastic_search.py

I put together a possible fix that worked in my case:
image

@Bumyu Bumyu added the bug Bug report label Jul 15, 2024
Bumyu pushed a commit to Bumyu/datahub that referenced this issue Sep 9, 2024
…mapping

Addresses the issue with Elasticsearch ingestion where sub-properties within the 'nested' type are not being detected during the mapping process, leading to incomplete ingestion results.

Fixes: datahub-project#10911
Bumyu pushed a commit to Bumyu/datahub that referenced this issue Sep 9, 2024
…mapping

Fixes an issue where sub-properties in 'nested' type mappings are not detected
during Elasticsearch ingestion, leading to incomplete ingestion of data.

Fixes: datahub-project#10911
Bumyu pushed a commit to Bumyu/datahub that referenced this issue Sep 10, 2024
…mapping

fix issue where sub-properties in 'nested' type mappings are not detected during
Elasticsearch ingestion, resulting in incomplete ingestion of data. Prior to this
fix, only top-level properties were recognized, missing critical sub-fields.

Fixes: datahub-project#10911
Bumyu pushed a commit to Bumyu/datahub that referenced this issue Sep 13, 2024
…mapping

fix issue where sub-properties in 'nested' type mappings are not detected during
Elasticsearch ingestion, resulting in incomplete ingestion of data. Prior to this
fix, only top-level properties were recognized, missing critical sub-fields.

Fixes: datahub-project#10911