AWS Kinesis Firehose and Teradata Vantage

Many Teradata customers are interested in integrating Vantage with AWS first-party services. This getting started guide helps you connect Vantage with the AWS Kinesis service.

Wenjie Tehan
September 22, 2021, 3 min read
AWS and Teradata

Many Teradata customers are interested in integrating Teradata Vantage with AWS first-party services. This getting started guide helps you connect Teradata Vantage with the AWS Kinesis service.

Although this approach has been implemented and tested internally, it is offered on an as-is basis. Neither AWS nor Teradata provides validation of Teradata Vantage with AWS services.

We encourage your feedback. We want to understand what you found useful and how we can improve this guide.  

Please send your feedback to shamira.joshua@teradata.com and wenjie.tehan@teradata.com

Disclaimer: This guide includes content from both AWS and Teradata product documentation. 

Overview 

AWS Kinesis is a streaming service that makes it easy to collect, process, and analyze real-time, streaming data. 

The Kinesis streaming data platform offers Kinesis Data Streams, Kinesis Data Firehose, Kinesis Data Analytics, and Kinesis Video Streams. Kinesis Data Streams is manually managed and can store data in the stream for up to seven days, during which the data can be transformed. Kinesis Data Firehose is fully managed; it collects data and delivers it to destinations such as Amazon S3, Amazon Redshift, Splunk, and Elasticsearch. Kinesis Video Streams is used to stream live video, and Kinesis Data Analytics can process and analyze streaming data using standard SQL.

Teradata Vantage Native Object Store (NOS) makes it easy to explore data in external object stores such as Amazon S3 using standard SQL and application interfaces such as ODBC, JDBC, .NET, Python, and R native drivers. No special object storage-side compute infrastructure is required to use NOS. You can explore data located in an Amazon S3 bucket by creating a NOS table definition that points to a bucket you are authorized to access.

This guide describes how to stream data from a source to Amazon S3 via AWS Kinesis Data Firehose, transform it to JSON format with an AWS Glue ETL job, and then use Teradata NOS to access the data in Amazon S3. Lambda functions and a CloudWatch event rule are also created to automate the whole process.

Prerequisites

You are expected to be familiar with the AWS Kinesis, Lambda, and CloudWatch services and with Teradata Vantage.
You will need the following accounts and systems:

•    An AWS account
•    A Teradata Vantage instance with SQLE 17.0+
•    An Amazon S3 bucket to store streaming data
•    An Amazon S3 bucket to store JSON files
•    IAM roles for the Glue crawler, the Glue ETL job, and the Lambda service
•    AccessKeyId and SecretAccessKey

Getting Started

Create Amazon S3 buckets
Amazon S3 buckets can be created by following the instructions in the AWS documentation. Two buckets are needed in this example: one to store streaming data (e.g., ptctstoutput), and another to store the JSON files after transformation (e.g., awspilbucket).
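As an alternative to the console, the two buckets could also be created with the AWS SDK for Python (boto3). The following is a minimal sketch, not the method used in this guide. The helper name `bucket_request` is hypothetical and the region `us-west-2` is an assumption; S3 bucket names are globally unique, so the example names may need to be changed.

```python
def bucket_request(name, region):
    """Build keyword arguments for boto3's S3 create_bucket call."""
    kwargs = {"Bucket": name}
    # us-east-1 is the default location and must not be sent as a constraint.
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs


# Bucket names from this guide; the region is an assumed example value.
REGION = "us-west-2"
REQUESTS = [bucket_request(name, REGION) for name in ("ptctstoutput", "awspilbucket")]

# With boto3 installed and AWS credentials configured:
#   import boto3
#   s3 = boto3.client("s3", region_name=REGION)
#   for request in REQUESTS:
#       s3.create_bucket(**request)
```

Keeping the request building separate from the API call makes it easy to check the arguments before anything is created in the account.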

Create IAM role
AWS services require you to use roles to allow a service to access resources in other services on your behalf. In this example, three roles are needed: a role for Kinesis Firehose, a role for Glue, and a role for Lambda.
The Kinesis Firehose role will be created on the fly. The instructions below create the roles for the Glue and Lambda services.
[Screenshots: IAM console steps for creating the Glue and Lambda roles]
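
For readers who prefer scripting, the trust relationship behind such roles can be sketched in Python. This is an illustration under assumptions rather than the guide's own procedure: the role names are hypothetical, and the permission policies (for example S3 and Glue access) still have to be attached separately. The service principals `glue.amazonaws.com` and `lambda.amazonaws.com` are the standard ones for these services.

```python
import json


def trust_policy(service_principal):
    """Return a trust policy that lets one AWS service assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": service_principal},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def role_request(role_name, service_principal):
    """Build keyword arguments for boto3's IAM create_role call."""
    return {
        "RoleName": role_name,
        "AssumeRolePolicyDocument": json.dumps(trust_policy(service_principal)),
    }


# Hypothetical role names, chosen only for this example.
GLUE_ROLE = role_request("GlueServiceRoleDemo", "glue.amazonaws.com")
LAMBDA_ROLE = role_request("LambdaGlueRoleDemo", "lambda.amazonaws.com")

# With boto3 installed and AWS credentials configured:
#   import boto3
#   iam = boto3.client("iam")
#   iam.create_role(**GLUE_ROLE)
#   iam.create_role(**LAMBDA_ROLE)
```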

Create Firehose Delivery Stream

[Screenshots: Kinesis Data Firehose console steps for creating the delivery stream]
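
Once a delivery stream exists, a producer can push records to it. The sketch below shows one possible way to do that with boto3's `put_record` call. The stream name and the sample line are hypothetical, and the guide does not state which producer or record format it uses, so treat this purely as an illustration.

```python
def firehose_record(line):
    """Encode one text line as a Firehose record."""
    # Firehose concatenates records as delivered, so a trailing newline
    # keeps one record per line in the resulting S3 object.
    if not line.endswith("\n"):
        line += "\n"
    return {"Data": line.encode("utf-8")}


def send(firehose_client, stream_name, line):
    """Send one record through the client's put_record call."""
    return firehose_client.put_record(
        DeliveryStreamName=stream_name,
        Record=firehose_record(line),
    )


# With boto3 installed and AWS credentials configured:
#   import boto3
#   client = boto3.client("firehose")
#   send(client, "my-delivery-stream", "1,21.5")  # hypothetical stream and data
```

Passing the client in as a parameter keeps the function easy to exercise with a stand-in object before pointing it at a real stream.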

Create Glue ETL Transformation Job
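A Glue ETL job can also be registered programmatically. The following sketch builds the arguments for boto3's Glue `create_job` call; it is not the job definition used in this guide. The job name, role name, and script location are hypothetical, and the ETL script itself (which would read the streamed data and write JSON to the second bucket) is assumed to exist already at that location.

```python
def glue_job_request(job_name, role, script_s3_path):
    """Build keyword arguments for boto3's Glue create_job call."""
    return {
        "Name": job_name,
        "Role": role,  # name or ARN of the IAM role the job runs as
        "Command": {
            "Name": "glueetl",  # Spark ETL job type
            "ScriptLocation": script_s3_path,
            "PythonVersion": "3",
        },
    }


# All three values below are hypothetical examples.
JOB_REQUEST = glue_job_request(
    "firehose-to-json",
    "GlueServiceRoleDemo",
    "s3://awspilbucket/scripts/to_json.py",
)

# With boto3 installed and AWS credentials configured:
#   import boto3
#   boto3.client("glue").create_job(**JOB_REQUEST)
```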

[Screenshots: AWS Glue console steps for creating the ETL transformation job]

Accessing Streaming Data Using NOS
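NOS access typically involves an authorization object that holds the AccessKeyId and SecretAccessKey, and a foreign table that points at the bucket. The sketch below generates such SQL statements from Python. The object names `DefAuth_S3` and `firehose_json` are hypothetical, the key values are placeholders, and the exact DDL syntax can differ between Vantage releases, so check the statements against the documentation for your version before running them.

```python
def authorization_sql(auth_name, access_key_id, secret_access_key):
    """SQL for an authorization object holding the S3 credentials."""
    return (
        f"CREATE AUTHORIZATION {auth_name}\n"
        "AS DEFINER TRUSTED\n"
        f"USER '{access_key_id}'\n"
        f"PASSWORD '{secret_access_key}';"
    )


def foreign_table_sql(table_name, auth_name, bucket, prefix=""):
    """SQL for a foreign table over an S3 bucket (and optional prefix)."""
    location = f"/s3/{bucket}.s3.amazonaws.com/{prefix}"
    return (
        f"CREATE FOREIGN TABLE {table_name},\n"
        f"EXTERNAL SECURITY DEFINER TRUSTED {auth_name}\n"
        f"USING (LOCATION('{location}'));"
    )


def sample_query_sql(table_name, rows=5):
    """SQL that previews a few rows of the foreign table."""
    return f"SELECT TOP {rows} * FROM {table_name};"


# Placeholder credentials; never hard-code real keys.
STATEMENTS = [
    authorization_sql("DefAuth_S3", "<AccessKeyId>", "<SecretAccessKey>"),
    foreign_table_sql("firehose_json", "DefAuth_S3", "awspilbucket"),
    sample_query_sql("firehose_json"),
]
```

The generated statements can be pasted into any SQL client connected to Vantage, or executed through a driver such as `teradatasql`.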

[Screenshots: accessing the streaming data with NOS]

Create Lambda functions, Trigger, and CloudWatch event
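One common way to automate a pipeline like this is to have one Lambda function start a Glue crawler when new data arrives, and a second Lambda function start the ETL job once a CloudWatch event rule reports that the crawler has succeeded. The sketch below follows that pattern; it is an assumption about the wiring, not a copy of the functions used in this guide. The crawler and job names are hypothetical, and the optional `glue_client` parameter exists only so the handlers can be tried without AWS access.

```python
CRAWLER_NAME = "firehose-raw-crawler"  # hypothetical crawler name
JOB_NAME = "firehose-to-json"          # hypothetical Glue job name

# Assumed event pattern for a rule that fires when the crawler succeeds.
CRAWLER_SUCCEEDED_PATTERN = {
    "source": ["aws.glue"],
    "detail-type": ["Glue Crawler State Change"],
    "detail": {"crawlerName": [CRAWLER_NAME], "state": ["Succeeded"]},
}


def _glue(client=None):
    """Return the given client, or create a boto3 Glue client on demand."""
    if client is None:
        import boto3  # available by default in the Lambda Python runtime
        client = boto3.client("glue")
    return client


def start_crawler_handler(event, context, glue_client=None):
    """Lambda handler: start the crawler, for example from an S3 trigger."""
    _glue(glue_client).start_crawler(Name=CRAWLER_NAME)
    return {"started": CRAWLER_NAME}


def start_job_handler(event, context, glue_client=None):
    """Lambda handler: start the ETL job, for example from the event rule."""
    response = _glue(glue_client).start_job_run(JobName=JOB_NAME)
    return {"JobRunId": response["JobRunId"]}
```

Lambda invokes a handler with only `event` and `context`, so the third parameter falls back to a real Glue client in production.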

[Screenshots: creating the Lambda functions, trigger, and CloudWatch event rule]

Run

[Screenshots: running the pipeline]

About Wenjie Tehan

Wenjie is a Technical Consulting Manager, currently working with the Teradata Global Alliances team. 
 
With over 20 years in the IT industry, Wenjie has worked as a developer, tester, business analyst, solution designer, and project manager. This breadth of roles makes her well suited to her current role, which requires understanding how the business needs data and how that data can be managed to meet those needs.
 
Wenjie has a BS in computer science from the University of California, San Diego, and an ME in computer engineering from Cornell University. Wenjie is also certified on both Teradata and AWS.
