Library

Online Tools Collection

47 results

Stream Processing Scalability: Challenges and Solutions

Stream processing is a programming paradigm which views data streams, or sequences of events in time, as the central input and output objects of computation. This enables organizations to harness the value of data immediately, making it a valuable tool for time-sensitive applications and scenarios requiring up-to-the-minute insights. Stream processing systems excel at handling high-velocity, unbounded data streams, such as click streams, log streams, live sensor data, social media feeds

flink

52 Technology lddgo shared on 2024-01-23

All You Need to Know About PyFlink

PyFlink serves as a Python API for Apache Flink, providing users with a medium to develop Flink programs in Python and deploy them on a Flink cluster. In this post, we will introduce PyFlink from the following aspects: the structure of a fundamental PyFlink job and some basic knowledge surrounding it; the operational mechanisms of PyFlink jobs, the high-level architecture, and its internal workings; essential performance optimization strategies for PyFlink; and future projections for PyFlink.

flink

57 Technology lddgo shared on 2024-01-23

Stream Enrichment in Flink

Imagine a photo without its vibrant colors; intriguing but lacking depth. Stream enrichment works similarly for data. It infuses raw data streams with added context, transforming them from grayscale to full color. Going beyond the simple transmission of information, stream enrichment breathes life into data, augmenting it with additional context and details. By embedding supplementary data into an existing data stream, businesses and organizations can paint a clearer picture, driving enhanced

flink

198 Technology lddgo shared on 2024-01-23

Batch Processing vs Stream Processing

Batch processing and stream processing are two very different models for processing data. Both have their strengths but suit different use cases. In this post we cover the differences, provide examples of use cases, and look at the ways the two models can work together.

flink

56 Technology lddgo shared on 2024-01-23

Bootstrap Data Pipeline via Flink HybridSource

A common requirement in data engineering is to process existing historical data first and then continuously process live data. Processing existing data first is also referred to as bootstrapping the system. How can this be achieved easily with Apache Flink? In this blog post we will look at Flink's HybridSource, which is specifically designed for such a task. If you want to clone the repository with the code from this blog post, use

flink

208 Technology lddgo shared on 2024-01-23

Streamhouse Unveiled

Every year, Apache Flink® sets new records in its development journey. Standing as a testament to its growing popularity, Flink now boasts over 1.6k contributors, 21k GitHub stars, and 1.4M downloads. In operational environments, Flink clusters are reaching impressive scales, with some individual clusters surpassing 2000 nodes. The largest known Flink infrastructure in production boasts over 4 million CPU cores, processing a staggering 4.1B events per second. If scalability is a concern

flink

51 Technology lddgo shared on 2024-01-23

Streamhouse: Data Processing Patterns

In October, at Flink Forward 2023, Streamhouse was officially introduced by Jing Ge, Head of Engineering at Ververica. In his keynote, Jing highlighted the need for Streamhouse, including how it sits as a layer between real-time stream processing and Lakehouse architectures, and discussed the business value it provides.

flink

55 Technology lddgo shared on 2024-01-23

Building real-time data views with Streamhouse

In this blog post, you will learn how to build a real-time data view on top of your Streamhouse using the Apache Paimon table format. If you come from the data management world, you might know that data engineers are generally concerned with implementing a data analytics pipeline, minimizing compute-infrastructure cost, and achieving the smallest end-to-end latency for the target users.

flink

52 Technology lddgo shared on 2024-01-23

Stateful Functions Internals: Behind the scenes of Stateful Serverless

Stateful Functions (StateFun) simplifies the building of distributed stateful applications by combining the best of two worlds: the strong messaging and state consistency guarantees of stateful stream processing, and the elasticity and serverless experience of today’s cloud-native architectures and popular event-driven FaaS platforms. Typical StateFun applications consist of functions deployed behind simple services using these modern platforms, with a separate StateFun cluster playing the role

flink

71 Technology lddgo shared on 2022-09-14

From Aligned to Unaligned Checkpoints - Part 1: Checkpoints, Alignment, and Backpressure

Apache Flink’s checkpoint-based fault tolerance mechanism is one of its defining features. Because of that design, Flink unifies batch and stream processing, can easily scale to both very small and extremely large scenarios, and provides support for many operational features like stateful upgrades with state evolution, rollbacks, and time travel.

flink

101 Technology lddgo shared on 2022-09-14
