• 文库
  • 字符
  • 转换
  • 加密
  • 网络
  • 更多
    图表
    数学
    坐标
    图片
    文件
  • 文库
    字符
    转换
    加密
    网络
    更多
    图表
    数学
    坐标
    图片
    文件
logo 在线工具大全
所有 中文 英语 最新 热度
47 条查询结果

Apache Flink is a very versatile tool for all kinds of data processing workloads. It can process incoming data within a few milliseconds or crunch through petabytes of bounded datasets (also known as batch processing).

73 技术 lddgo 分享于 2022-09-14

To best understand state and state backends in Flink, it’s important to distinguish between in-flight state and state snapshots. In-flight state, also known as working state, is the state a Flink job is working on. It is always stored locally in memory (with the possibility to spill to disk) and can be lost when jobs fail without impacting job recoverability. State snapshots, i.e., checkpoints and savepoints, are stored in a remote durable storage, and are used to restore the local state

102 技术 lddgo 分享于 2022-09-14

Flink has supported resource management systems like YARN and Mesos since the early days; however, these were not designed for the fast-moving cloud-native architectures that are increasingly gaining popularity these days, or the growing need to support complex, mixed workloads (e.g. batch, streaming, deep learning, web services). For these reasons, more and more users are using Kubernetes to automate the deployment, scaling and management of their Flink applications.

118 技术 lddgo 分享于 2022-09-14

This new release brings remote functions to the front and center of StateFun, making the disaggregated setup that separates the application logic from the StateFun cluster the default. It is now easier, more efficient, and more ergonomic to write applications that live in their own processes or containers. With the new Java SDK this is now also possible for all JVM languages, in addition to Python.

70 技术 lddgo 分享于 2022-09-14

Streaming jobs which run for several days or longer usually experience variations in workload during their lifetime. These variations can originate from seasonal spikes, such as day vs. night, weekdays vs. weekend or holidays vs. non-holidays, sudden events or simply the growing popularity of your product.

78 技术 lddgo 分享于 2022-09-14

Part one of this tutorial will teach you how to build and run a custom source connector to be used with Table API and SQL, two high-level abstractions in Flink. The tutorial comes with a bundled docker-compose setup that lets you easily run the connector. You can then try it out with Flink’s SQL client.

66 技术 lddgo 分享于 2022-09-13

In part two of this blog post, we will give you insight into some core design considerations and implementation details of the sort-based blocking shuffle in Flink and list several ideas for future improvement.

205 技术 lddgo 分享于 2022-09-13

Part one of this blog post will explain the motivation behind introducing sort-based blocking shuffle, present benchmark results, and provide guidelines on how to use this new feature.

61 技术 lddgo 分享于 2022-09-13

Part one of this blog post briefly introduced the optimizations we’ve made to improve the performance of the scheduler; compared to Flink 1.12, the time cost and memory usage of scheduling large-scale jobs in Flink 1.14 is significantly reduced. In part two, we will elaborate on the details of these optimizations.

85 技术 lddgo 分享于 2022-09-13

When scheduling large-scale jobs in Flink 1.12, a lot of time is required to initialize jobs and deploy tasks. The scheduler also requires a large amount of heap memory in order to store the execution topology and host temporary deployment descriptors. For example, for a job with a topology that contains two vertices connected with an all-to-all edge and a parallelism of 10k (which means there are 10k source tasks and 10k sink tasks and every source task is connected to all sink tasks)

52 技术 lddgo 分享于 2022-09-13