Library

Online Tools Collection

47 results

Flink SQL: Queries, Windows, and Time - Part 2

In the previous article, we covered some aspects of time windows and time attributes that you should consider when planning your data collection strategy. This article will provide a more in-depth look at how to create a time window.

flink

52 Technology lddgo shared on 2024-01-23

Flink SQL: How to detect patterns with MATCH_RECOGNIZE

Flink SQL has emerged as the de facto standard for low-code data analytics. It has managed to unify batch and stream processing while simultaneously staying true to the SQL standard. In addition, it provides a rich set of advanced features for real-time use cases. In a nutshell, Flink SQL is the best of both worlds: it gives you the ability to process streaming data using SQL, but it also supports batch processing.

flink

58 Technology lddgo shared on 2024-01-23

How to test your Flink SQL Application

Testing your Apache Flink SQL code is a critical step in ensuring that your application is running smoothly and provides the expected results. Flink SQL applications are used for a wide range of data processing tasks, from complex analytics to simple SQL jobs. A comprehensive testing process can help identify potential issues early in the development process and ensure that your application works as expected. This post will go through several testing possibilities for your Flink SQL

flink

53 Technology lddgo shared on 2024-01-23

Streaming modes of Flink-Kafka connectors

This blog post will guide you through the Kafka connectors that are available in the Flink Table API. By the end of this blog post, you will have a better understanding of which connector is more suitable for a specific application. The Flink DataStream API provides a Kafka connector, which works in append mode and can be used by Flink programs written with the Scala/Java API. Besides that, Flink has the Table API, which offers two Kafka connectors:

flink

55 Technology lddgo shared on 2024-01-23

Generic Log-based Incremental Checkpoint --- Performance Evaluation & Analytics

Generic Log-based Incremental Checkpoint (GIC for short in this article) has been a production-ready feature since the Flink 1.16 release. We previously discussed the fundamental concept and underlying mechanism of GIC in our blog post titled "Generic Log-based Incremental Checkpoints I" [1]. In this blog post, we aim to provide a comprehensive analysis of GIC's advantages and disadvantages through thorough experiments and analysis.

flink

52 Technology lddgo shared on 2024-01-23

How-to guide: Synchronize MySQL sub-database and sub-table using Flink CDC

This tutorial will show you how to use Flink CDC to build a real-time data lake for the above-presented scenario. The examples in this article are all based on Docker and use Flink SQL. There is no need to write a single line of Java/Scala code or to install an IDE. The entire content of this guide contains the docker-compose file.

flink

56 Technology lddgo shared on 2024-01-23

Flink SQL Secrets: Mastering the Art of Changelog Event Out-of-Orderness

Alice is a data engineer taking care of real-time data processing in her company. She found that Flink SQL can sometimes produce update events (with regard to keys). But, with early versions of Flink, those events cannot be written to Kafka directly because Kafka is essentially an append-only messaging system. Fortunately, the Flink community released the upsert-kafka connector in a later version, which supports writing update events. Later, she found that the Flink SQL jobs

flink

53 Technology lddgo shared on 2024-01-23

Flink's Test Harnesses Uncovered

When working with Apache Flink, developers often face challenges while testing user-defined functions (UDFs) that utilize state and timers. In this article we will answer the question "How to test user-defined functions (UDFs) using Flink's test harnesses".

flink

62 Technology lddgo shared on 2024-01-23

Joining Highly Skewed Streams in Flink SQL

Flink SQL is a powerful tool that unifies batch and stream processing. It provides low-code data analytics while complying with the SQL standard. In production systems, our customers found that as the workload scales, SQL jobs that used to work well may slow down significantly, or even fail. Data skew is a common and important cause. Data skew refers to the asymmetry of the probability distribution of a variable about its mean. In other words

flink

55 Technology lddgo shared on 2024-01-23

Performance Analysis and Tuning Guides for Hybrid Shuffle Mode

The Apache Flink community introduced the Hybrid Shuffle Mode[1] in 1.16, which combines traditional Batch Shuffle with Pipelined Shuffle from stream processing to give Flink batch processing more powerful capabilities. The core idea of Hybrid Shuffle is to break scheduling constraints and decide whether downstream tasks need to be scheduled based on the availability of resources, while supporting in-memory data exchange without spilling to disk when conditions permit.

flink

52 Technology lddgo shared on 2024-01-23
