Online Tools Collection

47 results

Exploring fine-grained recovery of bounded data sets on Flink

Apache Flink is a very versatile tool for all kinds of data processing workloads. It can process incoming data within a few milliseconds or crunch through petabytes of bounded datasets (also known as batch processing).

flink

84 Technology lddgo shared on 2022-09-14

Using RocksDB State Backend in Apache Flink: When and How

To best understand state and state backends in Flink, it’s important to distinguish between in-flight state and state snapshots. In-flight state, also known as working state, is the state a Flink job is working on. It is always stored locally in memory (with the possibility to spill to disk) and can be lost when jobs fail without impacting job recoverability. State snapshots, i.e., checkpoints and savepoints, are stored in remote durable storage, and are used to restore the local state

flink

113 Technology lddgo shared on 2022-09-14
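
The distinction drawn in the excerpt above, between in-flight state and state snapshots, can be illustrated with a toy model. This is a minimal sketch in plain Python, not Flink's API: the `ToyJob` class, its methods, and the dictionary standing in for remote durable storage are all hypothetical names chosen for illustration.

```python
import copy


class ToyJob:
    """Toy stand-in for a stateful job (hypothetical, not Flink's API)."""

    def __init__(self, durable_store):
        # In-flight (working) state: lives only in local memory.
        self.working_state = {}
        # A plain dict standing in for remote durable storage (assumption).
        self.durable_store = durable_store

    def process(self, key):
        # Update the local working state: count events per key.
        self.working_state[key] = self.working_state.get(key, 0) + 1

    def snapshot(self):
        # Copy the working state to the durable store (checkpoint analogue).
        self.durable_store["latest"] = copy.deepcopy(self.working_state)

    def fail(self):
        # A failure loses the local working state.
        self.working_state = {}

    def restore(self):
        # Rebuild the local state from the last snapshot.
        self.working_state = copy.deepcopy(self.durable_store.get("latest", {}))


store = {}
job = ToyJob(store)
for key in ["a", "b", "a"]:
    job.process(key)
job.snapshot()
job.process("a")  # processed after the snapshot, so not covered by it
job.fail()
job.restore()
print(job.working_state)  # {'a': 2, 'b': 1}
```

The event processed after the snapshot is lost on failure, while everything captured in the snapshot is recovered. That is the sense in which losing working state does not affect recoverability.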

How to natively deploy Flink on Kubernetes with High-Availability (HA)

Flink has supported resource management systems like YARN and Mesos since the early days; however, these were not designed for the fast-moving cloud-native architectures that are increasingly gaining popularity these days, or the growing need to support complex, mixed workloads (e.g. batch, streaming, deep learning, web services). For these reasons, more and more users are using Kubernetes to automate the deployment, scaling and management of their Flink applications.

kubernetes flink

122 Technology lddgo shared on 2022-09-14

Stateful Functions 3.0.0: Remote Functions Front and Center

This new release brings remote functions to the front and center of StateFun, making the disaggregated setup that separates the application logic from the StateFun cluster the default. It is now easier, more efficient, and more ergonomic to write applications that live in their own processes or containers. With the new Java SDK this is now also possible for all JVM languages, in addition to Python.

flink

77 Technology lddgo shared on 2022-09-14

Scaling Flink automatically with Reactive Mode

Streaming jobs that run for several days or longer usually experience variations in workload during their lifetime. These variations can originate from seasonal spikes, such as day vs. night, weekdays vs. weekends or holidays vs. non-holidays, sudden events or simply the growing popularity of your product.

flink

85 Technology lddgo shared on 2022-09-14

Implementing a Custom Source Connector for Table API and SQL - Part One

Part one of this tutorial will teach you how to build and run a custom source connector to be used with Table API and SQL, two high-level abstractions in Flink. The tutorial comes with a bundled docker-compose setup that lets you easily run the connector. You can then try it out with Flink’s SQL client.

flink

72 Technology lddgo shared on 2022-09-13

Sort-Based Blocking Shuffle Implementation in Flink - Part Two

In part two of this blog post, we will give you insight into some core design considerations and implementation details of the sort-based blocking shuffle in Flink and list several ideas for future improvement.

flink

211 Technology lddgo shared on 2022-09-13

Sort-Based Blocking Shuffle Implementation in Flink - Part One

Part one of this blog post will explain the motivation behind introducing sort-based blocking shuffle, present benchmark results, and provide guidelines on how to use this new feature.

flink

68 Technology lddgo shared on 2022-09-13

How We Improved Scheduler Performance for Large-scale Jobs - Part Two

Part one of this blog post briefly introduced the optimizations we’ve made to improve the performance of the scheduler; compared to Flink 1.12, the time cost and memory usage of scheduling large-scale jobs in Flink 1.14 are significantly reduced. In part two, we will elaborate on the details of these optimizations.

flink

92 Technology lddgo shared on 2022-09-13

How We Improved Scheduler Performance for Large-scale Jobs - Part One

When scheduling large-scale jobs in Flink 1.12, a lot of time is required to initialize jobs and deploy tasks. The scheduler also requires a large amount of heap memory in order to store the execution topology and host temporary deployment descriptors. For example, for a job with a topology that contains two vertices connected with an all-to-all edge and a parallelism of 10k (which means there are 10k source tasks and 10k sink tasks and every source task is connected to all sink tasks)

flink

60 Technology lddgo shared on 2022-09-13
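
To get a sense of scale for the all-to-all example in the excerpt above, the number of task-to-task connections can be worked out directly. This is a rough sketch of the arithmetic only; the function name `all_to_all_edge_count` is hypothetical, and the excerpt does not say how Flink represents these connections internally.

```python
def all_to_all_edge_count(upstream_parallelism: int, downstream_parallelism: int) -> int:
    """Connections when every upstream task links to every downstream task."""
    return upstream_parallelism * downstream_parallelism


parallelism = 10_000  # 10k source tasks and 10k sink tasks, as in the excerpt
edges = all_to_all_edge_count(parallelism, parallelism)
print(edges)  # 100000000
```

With a parallelism of 10k on both sides there are 100 million connections to account for, which suggests why scheduling time and heap usage become a concern at this scale.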
