aws-kinesisstreams-gluejob: Unknown Classification, classification instead set as table property #184

Closed
FinnIckler opened this issue May 18, 2021 · 2 comments · Fixed by #185
Labels: bug, needs-triage

@FinnIckler

When you create a Glue table using the KinesisstreamsToGluejob construct, the table's classification is shown as unknown. Instead, a table property "classification" : "json" is set.

Here is a screenshot of the resulting stream
[Image: Screenshot 2021-05-18 at 18 06 01]

Reproduction Steps

I created this with an existing stream, but I am not sure whether that caused the bug or whether it would also occur without one.

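# Imports assumed for CDK v1 and the Python package of AWS Solutions Constructs;
# adjust the module paths if your installed versions differ.
from aws_cdk import core, aws_kinesis as kinesis, aws_s3_assets as assets
from aws_solutions_constructs.aws_kinesis_streams_gluejob import KinesisstreamsToGluejob

# Column definitions for the Glue table, written as plain dicts.
SCHEMA = [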
    {
        "name": "user",
        "type": "string",
        "comment": "Name of the user"
    },
    {
        "name": "package_id",
        "type": "string",
        "comment": "A unique package id"
    },
    {
        "name": "product_id",
        "type": "int",
        "comment": "A list of product ids that are in the package"
    },
    {
        "name": "origin",
        "type": "string",
        "comment": "Which FC the package came from"
    },
    {
        "name": "destination",
        "type": "string",
        "comment": "The address where the package is delivered to"
    },
    {
        "name": "weight",
        "type": "float",
        "comment": "The weight of the package"
    }
]

# The statements below run inside a core.Stack constructor, so `self` is the stack.
input_data_stream = kinesis.Stream(
    self, "input-stream",
    stream_name="input-data-stream",
    shard_count=2,
    retention_period=core.Duration.hours(24),
)

streaming_job = KinesisstreamsToGluejob(
    self, "streaming-job",
    glue_job_props={"command": {
        "name": "gluestreaming",
        "pythonVersion": "3",
        "scriptLocation": assets.Asset(self, "SparkCode", path="spark_code/transform.py").s3_object_url,
    }},
    field_schema=SCHEMA,
    existing_stream_obj=input_data_stream,
)

Error Log

I can't execute any jobs on the table because the classification is set to unknown.
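
To narrow down where the "classification" value ends up, one option is to inspect the synthesized CloudFormation template. The sketch below is only a diagnostic aid, not part of the construct: the helper name `find_classification` and the sample template are hypothetical, and it simply reports each place on an `AWS::Glue::Table` resource where a `classification` key appears. It makes no claim about which of those locations the Glue console or a Glue job actually reads.

```python
from typing import Any, Dict, List


def find_classification(template: Dict[str, Any]) -> List[str]:
    """Report every 'classification' key found on AWS::Glue::Table resources."""
    hits: List[str] = []
    for logical_id, resource in template.get("Resources", {}).items():
        if resource.get("Type") != "AWS::Glue::Table":
            continue
        table_input = resource.get("Properties", {}).get("TableInput", {})
        storage = table_input.get("StorageDescriptor", {})
        # The three parameter maps that CloudFormation allows on a Glue table.
        candidates = {
            "TableInput.Parameters": table_input.get("Parameters", {}),
            "TableInput.StorageDescriptor.Parameters": storage.get("Parameters", {}),
            "TableInput.StorageDescriptor.SerdeInfo.Parameters":
                storage.get("SerdeInfo", {}).get("Parameters", {}),
        }
        for path, params in candidates.items():
            if isinstance(params, dict) and "classification" in params:
                hits.append(f"{logical_id}: {path} = {params['classification']}")
    return hits


# Hypothetical template fragment for illustration only;
# it is NOT the construct's real synthesized output.
sample_template = {
    "Resources": {
        "DemoTable": {
            "Type": "AWS::Glue::Table",
            "Properties": {
                "TableInput": {
                    "Parameters": {"classification": "json"},
                    "StorageDescriptor": {"Columns": []},
                }
            },
        },
        "DemoStream": {"Type": "AWS::Kinesis::Stream"},
    }
}

for line in find_classification(sample_template):
    print(line)
```

To run it against a real stack, load the JSON template that `cdk synth` writes (for example with `json.load`) and pass the resulting dict instead of `sample_template`.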

Environment

  • CDK CLI Version: 1.104
  • CDK Framework Version: 1.99
  • AWS Solutions Constructs Version: 1.99
  • OS: macOS
  • Language: Python

Other


This is a 🐛 Bug Report

FinnIckler added the bug and needs-triage labels on May 18, 2021
@FinnIckler
Author

This might be a Glue issue; I am speaking to the service team about it.

@biffgaut
Contributor

Thanks - let us know what you hear.

biffgaut pushed a commit that referenced this issue May 19, 2021
* initial commit for construct

* initial commit for construct

* adding a new construct

* updates to construct default glue attributes

* updating package.json

* updating ts files

* adding unit tests and integ tests

* adding unit tests and integ tests

* adding unit tests and helper methods to complete the construct

* adding unit tests

* adding unit tests

* adding unit tests

* fix for linting error

* adding glue job example

* update readme.md

* update snapshots and kms arn

* update description in package.json

* integ test cases

* added simulator for kinesis stream

* added simulator for kinesis stream

* integ test cases

* updating construct for scriptlocation

* updating python generator file

* updating example code

* updating example code

* updating example code

* updating example code

* updating example code

* updating example code

* updating example code

* update README and Architecture

* updating README

* updating README

* updating construct

* updating construct

* updating construct

* updating construct

* updating construct

* updating construct

* updating construct

* updating construct

* updating construct

* updating sample

* updating construct

* updating construct

* updating construct

* updating construct

* updating sample

* updating sample

* updating construct kinesis policy

* updating construct kinesis policy

* updating construct kinesis policy

* updating construct kinesis policy

* updating construct kinesis policy

* updating construct kinesis policy

* updating construct kinesis policy

* updating construct

* updating construct kinesis policy

* updating construct kinesis policy

* updating construct kinesis policy

* updating construct kinesis policy

* updating construct policy

* updating readme

* updated README

* updated README

* updated README

* updated README

* updated README

* updated README

* updated README

* updated README

* updated README

* updates based on review

* incoporating review comments

* incoporating review comments

* incoporating review comments

* incoporating review comments

* incoporating review comments

* update to construct

* update to construct

* update to construct

* update to construct

* update to construct

* update to construct

* refactoring code to match construct patterns

* refactoring code to match construct patterns

* refactoring code to match construct patterns

* refactoring code to match construct patterns

* refactoring code to match construct patterns

* refactoring code to match construct patterns

* refactoring code to match construct patterns

* fix for readme file

* eslint fixes

* update sample after refactoring construct

* update sample after refactoring construct

* remove _ from variable names

* updating glue version 2.0 as recommended in Glue service documentation

* removed snapshot (that was not required) to fix build failure

* update viperlight to fix build issue

* update viperlight to fix build issue

* cfn_nag fix for build issues

* updating integ snapshots

* updating integ snapshots

* update header

* update header

* incorporate publisher review comments

* incorporate publisher review comments

* incorporate publisher review comments

* reorganization unitt tests

* reorganization unitt tests

* reorganization unitt tests

* reorganization unitt tests

* reorganization unitt tests

* reorganization unitt tests

* incorporate review comments

* fix for output  s3 bucket

* fix for output  s3 bucket

* fix for output  s3 bucket

* update README with the correct class name

* updating README

* updates to readme

* updates to readme

* updates to readme

* updates to readme

* removing duplicate words from README

* fix for bug #184

* fix for bug #184

* fix for bug #184

Co-authored-by: nihitkas <[email protected]>