Zero-ETL integrations let you unify data across applications and data sources for holistic insights, and they help break down data silos. They provide a fully managed, no-code, near real-time solution that makes petabytes of transactional data written to Amazon Relational Database Service (Amazon RDS) for MySQL available in Amazon Redshift within seconds.
Because you don't have to build your own ETL jobs, you can simplify data ingestion, reduce operational overhead, and potentially lower your overall data processing costs. Last year, AWS announced the general availability of zero-ETL integration with Amazon Redshift for Amazon Aurora MySQL-Compatible Edition, along with previews for Aurora PostgreSQL-Compatible Edition, Amazon DynamoDB, and RDS for MySQL.
AWS is pleased to announce that Amazon RDS for MySQL zero-ETL integration with Amazon Redshift is now generally available. This release also includes data filtering, support for multiple integrations, and the ability to configure zero-ETL integrations in your AWS CloudFormation template.
Data filtering
Most companies, regardless of size, can benefit from adding filtering to their ETL jobs. A common use case is reducing data processing and storage costs by selecting only the subset of data that needs to be replicated from production databases. Another is excluding personally identifiable information (PII) from a report's dataset. For example, a healthcare organization might exclude sensitive patient information when replicating data to build aggregate reports on recent patient cases. Similarly, an online store might give its marketing department access to customer purchasing patterns while keeping all personally identifiable information private. There are also cases where filtering should not be used, such as when providing data to fraud detection teams that need all of the data in near real time to make decisions. These are just a few examples; we encourage you to explore other use cases that may apply to your organization.
Adding filters to a zero-ETL integration
You have two options for adding filtering to your zero-ETL integrations: apply it when you first create the integration, or modify an existing integration. In either case, you will find this option in the "Source" step of the zero-ETL creation wizard. Filter expressions are entered in the format database.table and let you include or exclude databases or tables from the dataset. You can add more than one expression, and they are evaluated in order from left to right.
If you're modifying an existing integration, the new filtering rules take effect as soon as you confirm your changes, and Amazon Redshift drops tables that are no longer included by the filter.
To learn more, we recommend reading this blog post, which covers setting up data filters for Amazon Aurora zero-ETL integrations in depth. The concepts and steps are very similar.
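To make the left-to-right evaluation concrete, here is a minimal Python sketch that simulates how an ordered list of include and exclude expressions could decide whether a table is replicated. It is an illustration only, not the service's implementation. The function name `is_replicated`, the use of `*` as a wildcard, and the default of replicating a table that no expression matches are all assumptions made for this example.

```python
from fnmatch import fnmatchcase


def is_replicated(table, filters):
    """Return True if `table` (written as "database.table") would be replicated.

    `filters` is an ordered list of (action, pattern) pairs, where action is
    "include" or "exclude" and pattern uses the database.table format.
    Expressions are applied left to right, so a later match overrides an
    earlier one.
    """
    # Assumption: a table that matches no expression is replicated.
    replicated = True
    for action, pattern in filters:
        # Assumption: "*" is a wildcard. fnmatchcase also treats "?" and
        # "[...]" as special characters, which this sketch does not rely on.
        if fnmatchcase(table, pattern):
            replicated = (action == "include")
    return replicated


# Hypothetical rules: exclude everything, then include the sales database.
only_sales = [("exclude", "*.*"), ("include", "sales.*")]

# Hypothetical rules: include everything, then exclude one sensitive table.
no_patients = [("include", "*.*"), ("exclude", "clinic.patients")]
```

With these rules, `is_replicated("sales.orders", only_sales)` is true and `is_replicated("hr.salaries", only_sales)` is false. Because order matters, reversing `only_sales` would exclude every table, since the final `exclude` on `*.*` overrides the earlier `include`.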
Create multiple zero-ETL integrations from the same database
You can now also configure integrations from a single RDS for MySQL database to up to five Amazon Redshift data warehouses. The only requirement is that you wait for the first integration to finish setting up successfully before adding others. This lets you share transactional data with different teams while giving them ownership of their own data warehouses for their specific use cases. For example, you can combine this with data filtering to distribute different sets of data to development, staging, and production Amazon Redshift clusters from the same Amazon RDS production database.
Another interesting use case is consolidating Amazon Redshift clusters by using zero-ETL to replicate to different warehouses. You can also use Amazon Redshift materialized views to explore your data, power your dashboards, share data, and train jobs in Amazon SageMaker.
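The development, staging, and production scenario can be sketched as a small planning helper in Python. This is an illustrative sketch, not an AWS tool: `plan_integrations`, the placeholder ARNs, and the filter strings are hypothetical, and the `include:` / `exclude:` filter syntax is an assumption that should be checked against the data filtering documentation. The only facts taken from the text are the limit of five integrations per source and the idea of pairing each target with its own filter.

```python
MAX_INTEGRATIONS_PER_SOURCE = 5  # limit stated in the text above


def plan_integrations(source_arn, targets):
    """Build one request description per target warehouse.

    `targets` maps an integration name to a (target_arn, data_filter) pair.
    Raises ValueError if the plan exceeds the per-source limit.
    """
    if len(targets) > MAX_INTEGRATIONS_PER_SOURCE:
        raise ValueError(
            f"{len(targets)} integrations planned, but a source supports "
            f"at most {MAX_INTEGRATIONS_PER_SOURCE}"
        )
    return [
        {
            "IntegrationName": name,
            "SourceArn": source_arn,
            "TargetArn": target_arn,
            "DataFilter": data_filter,
        }
        for name, (target_arn, data_filter) in targets.items()
    ]


# Hypothetical plan: every ARN and filter below is a placeholder.
plan = plan_integrations(
    "<production-rds-mysql-arn>",
    {
        "dev": ("<dev-redshift-arn>", "include: shop.products"),
        "staging": ("<staging-redshift-arn>", "include: shop.*"),
        "prod": ("<prod-redshift-arn>", "include: *.*, exclude: shop.audit_log"),
    },
)
```

Because the first integration has to finish setting up before others can be added, a plan like this would be applied one entry at a time rather than all at once.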
Conclusion
RDS for MySQL zero-ETL integrations with Amazon Redshift let you replicate data for near real-time analytics without building and managing complex data pipelines. The feature is now generally available, and you can use filter expressions to include or exclude databases and tables from the replicated datasets. You can also create multiple integrations from the same source RDS for MySQL database to different Amazon Redshift warehouses, or create integrations from different sources to consolidate data into a single data warehouse.
This zero-ETL integration is available for RDS for MySQL versions 8.0.32 and later, Amazon Redshift Serverless, and Amazon Redshift RA3 instance types in supported AWS Regions.
In addition to the AWS Management Console, you can set up a zero-ETL integration with the AWS Command Line Interface (AWS CLI) or with boto3, the official AWS SDK for Python.
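As a rough sketch of the boto3 route, the helper below assembles the arguments for the RDS client's `create_integration` call and sends them through whatever client is passed in. The helper names, the placeholder ARNs, and the example filter string are assumptions made for illustration, and the parameter names should be verified against the current boto3 documentation before use.

```python
def build_integration_request(source_arn, target_arn, name, data_filter=None):
    """Assemble keyword arguments for the RDS create_integration call."""
    params = {
        "SourceArn": source_arn,    # the RDS for MySQL database
        "TargetArn": target_arn,    # the Amazon Redshift data warehouse
        "IntegrationName": name,
    }
    if data_filter:
        # Optional: omit the key entirely when no filter is requested.
        params["DataFilter"] = data_filter
    return params


def create_zero_etl_integration(rds_client, **kwargs):
    """Send the request with an RDS client such as boto3.client("rds")."""
    return rds_client.create_integration(**build_integration_request(**kwargs))


# Example call (needs boto3, AWS credentials, and real ARNs in place of the
# hypothetical placeholders):
#
#   import boto3
#   create_zero_etl_integration(
#       boto3.client("rds"),
#       source_arn="<rds-mysql-db-arn>",
#       target_arn="<redshift-namespace-arn>",
#       name="my-zero-etl-integration",
#       data_filter="include: mydb.mytable",  # assumed filter syntax
#   )
```

Passing the client in as an argument keeps the request-building logic separate from the network call, so it can be checked with a stub before it touches a real account.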