
Architecture & Design

Notes on architecture and design, covering Enterprise Architecture, Scalability/Performance, Design, Microservice, Service Mesh, Patterns, and Security.

1: Model Inference Architecture
2: Cloud Native Patterns
3: Message Queue Resources
4: Prometheus Practice
5: Notes for Patterns of Enterprise Application
6: All About GPUs
7: PlantUML + ArchiMate Notes
8: Reliability Engineering
9: About CNCF Projects
10: Microservice Architecture Notes
11: Big Data Arch
12: RPC vs HTTP
13: GCP Study
14: All About Rate Limiting
15: Best Deployment Approaches for Microservice Architectures
16: Cloud Computing Service Modeling
17: Controlling Access to Resources
18: APM Resources
19: API Trends
20: API Compatibility Design
21: Web API Spec Management Platform
22: Migration Plan for Online Services

1 - Model Inference Architecture

Concepts

Squared loss

A popular loss function, also known as L2 loss.

Mean square error (MSE) is the average squared loss per example over the whole dataset.
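
As a small illustration of the two definitions above, the following sketch computes the squared loss of one example and the MSE over a dataset. The function names are chosen here for illustration only.

```python
def squared_loss(y_true: float, y_pred: float) -> float:
    """Squared (L2) loss for a single example."""
    return (y_true - y_pred) ** 2


def mean_squared_error(ys_true, ys_pred) -> float:
    """Average squared loss per example over the whole dataset."""
    if len(ys_true) != len(ys_pred) or len(ys_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(squared_loss(t, p) for t, p in zip(ys_true, ys_pred))
    return total / len(ys_true)
```

For example, labels `[1.0, 2.0, 3.0]` and predictions `[1.0, 2.0, 5.0]` give squared losses 0, 0, and 4, so the MSE is 4/3.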

Gradient Descent

SGD(Stochastic Gradient Descent) & Mini-Batch Gradient Descent.
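
A minimal sketch of mini-batch gradient descent, assuming a one-variable linear model `y = w * x + b` trained with MSE. The function name, learning rate, batch size, and epoch count are illustrative choices, not values from these notes; setting `batch_size=1` gives plain SGD.

```python
import random


def minibatch_sgd(xs, ys, lr=0.1, batch_size=4, epochs=1000, seed=0):
    """Fit y = w * x + b by mini-batch gradient descent on the MSE."""
    rng = random.Random(seed)          # fixed seed keeps the run reproducible
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)           # new random batches every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            grad_w = grad_b = 0.0
            for i in batch:
                error = (w * xs[i] + b) - ys[i]
                grad_w += 2 * error * xs[i]   # d(error^2)/dw
                grad_b += 2 * error           # d(error^2)/db
            # Step against the average gradient of this batch only.
            w -= lr * grad_w / len(batch)
            b -= lr * grad_b / len(batch)
    return w, b
```

Each update looks at only one batch, which is what distinguishes this from full-batch gradient descent.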

Predict Protocol

This document proposes a predict/inference API independent of any specific ML/DL framework and model server.

This protocol is endorsed by NVIDIA Triton Inference Server, TensorFlow Serving, and ONNX Runtime Server.

Seldon has collaborated with the NVIDIA Triton Server Project and the KServe Project to create a new ML inference protocol. The core idea behind this joint effort is that this new protocol will become the standard inference protocol and will be used across multiple inference services.

For details, see: [Predict Protocol - Version 2][2]

Reference solution - Run computer vision inference on large videos

Real-time Inference

Real-time inference is ideal for inference workloads where you have real-time, interactive, low latency requirements.

TensorRT through NVIDIA Triton

TensorRT is a C++ library for high performance inference on NVIDIA GPUs and deep learning accelerators.

Tutorial: https://github.com/NVIDIA/TensorRT/blob/main/quickstart/SemanticSegmentation/tutorial-runtime.ipynb

Load TensorRT Engine
Run inference

TensorRT to Triton: https://github.com/NVIDIA/TensorRT/tree/main/quickstart/deploy_to_triton

Asynchronous inference

An Example for SageMaker: Amazon SageMaker Processing Video2frame Model Inference [3]

end-to-end flow with Asynchronous inference endpoint

Prerequisites

Hands-on guide: https://docs.aws.amazon.com/sagemaker/latest/dg/async-inference-create-endpoint-prerequisites.html

Create an IAM role for Amazon SageMaker.
Add Amazon SageMaker, Amazon S3 and Amazon SNS Permissions to your IAM Role.
Upload your inference data (e.g., machine learning model, sample data) to Amazon S3.
Select a prebuilt Docker inference image or create your own Inference Docker Image.
- Use Your Own Inference Code
Create an Amazon SNS topic (optional)
- Check Your S3 Bucket: https://docs.aws.amazon.com/sagemaker/latest/dg/async-inference-check-predictions.html

Use Your Own Inference Code with Hosting Services

https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-inference-main.html

Describes how Amazon SageMaker interacts with a Docker container that runs your own inference code for hosting services.

How SageMaker Loads Your Model Artifacts

How Containers Serve Requests:

Containers need to implement a web server that responds to /invocations and /ping on port 8080.

How Your Container Should Respond to Inference Requests:

A customer’s model containers must respond to requests within 60 seconds.
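
The container contract above can be sketched with the Python standard library alone. This is a minimal illustration, not SageMaker's reference implementation: `predict`, the `instances`/`predictions` JSON shape, and `make_server` are hypothetical names invented here, and a real container would load its model artifacts and use a production web server.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(payload: dict) -> dict:
    """Hypothetical model: returns the sum of each input row."""
    return {"predictions": [sum(row) for row in payload.get("instances", [])]}


class InferenceHandler(BaseHTTPRequestHandler):
    def _send(self, status: int, body: bytes = b"") -> None:
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # Health check: an empty 200 response on /ping.
        self._send(200 if self.path == "/ping" else 404)

    def do_POST(self):
        if self.path != "/invocations":
            self._send(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
            self._send(200, json.dumps(predict(payload)).encode())
        except (ValueError, TypeError, AttributeError) as exc:
            self._send(400, json.dumps({"error": str(exc)}).encode())

    def log_message(self, *args):
        pass  # keep the sketch quiet


def make_server(port: int = 8080) -> HTTPServer:
    """Builds a server listening on the port the contract expects."""
    return HTTPServer(("0.0.0.0", port), InferenceHandler)
```

Calling `make_server().serve_forever()` starts the server; keeping `predict` fast matters because of the 60-second response limit.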

Adapting Your Own Inference Container

https://docs.aws.amazon.com/sagemaker/latest/dg/adapt-inference-container.html

A detailed walkthrough of customizing the inference process on SageMaker: https://sagemaker-examples.readthedocs.io/en/latest/frameworks/pytorch/get_started_mnist_deploy.html

Create

Create a model in SageMaker with CreateModel.
Create an endpoint configuration with CreateEndpointConfig.
Create an HTTPS endpoint with CreateEndpoint.
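
The three calls above map onto the boto3 SageMaker client methods `create_model`, `create_endpoint_config`, and `create_endpoint`. The sketch below takes the client as a parameter so it can be exercised with a stub. The helper name `deploy_endpoint`, the `-config`/`-endpoint` naming scheme, the variant name, and the default instance type are assumptions made for illustration.

```python
def deploy_endpoint(sm_client, name, image_uri, model_data_url, role_arn,
                    instance_type="ml.m5.large"):
    """Create a model, an endpoint configuration, and an HTTPS endpoint.

    sm_client is expected to behave like boto3.client("sagemaker").
    """
    # 1. Register the container image and the model artifacts in S3.
    sm_client.create_model(
        ModelName=name,
        PrimaryContainer={"Image": image_uri, "ModelDataUrl": model_data_url},
        ExecutionRoleArn=role_arn,
    )
    # 2. Describe how the model should be hosted.
    config_name = f"{name}-config"
    sm_client.create_endpoint_config(
        EndpointConfigName=config_name,
        ProductionVariants=[{
            "VariantName": "AllTraffic",
            "ModelName": name,
            "InstanceType": instance_type,
            "InitialInstanceCount": 1,
        }],
    )
    # 3. Create the endpoint from that configuration.
    return sm_client.create_endpoint(
        EndpointName=f"{name}-endpoint",
        EndpointConfigName=config_name,
    )
```

With real credentials this would be called as `deploy_endpoint(boto3.client("sagemaker"), ...)`; endpoint creation is asynchronous, so the endpoint is not ready as soon as the call returns.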

Prebuilt SageMaker Docker Images for Deep Learning

https://docs.aws.amazon.com/sagemaker/latest/dg/pre-built-containers-frameworks-deep-learning.html

https://aws.amazon.com/cn/blogs/machine-learning/bring-your-own-pre-trained-mxnet-or-tensorflow-models-into-amazon-sagemaker/

Entire process:

Step 1: Model definitions are written in a framework of choice.
Step 2: The model is trained in that framework.
Step 3: The model is exported and model artifacts that can be understood by Amazon SageMaker are created.
Step 4: Model artifacts are uploaded to an Amazon S3 bucket.
Step 5: Using the model definitions, artifacts, and the Amazon SageMaker Python SDK, a SageMaker model is created.
Step 6: The SageMaker model is deployed as an endpoint.

TensorFlow BYOM: https://github.com/aws/amazon-sagemaker-examples/blob/main/advanced_functionality/tensorflow_iris_byom

MXNet BYOM: https://github.com/aws/amazon-sagemaker-examples/tree/main/advanced_functionality/mxnet_mnist_byom

Pytorch: https://github.com/aws/amazon-sagemaker-examples/tree/main/advanced_functionality/pytorch_bring_your_own_gan

Batch Transform

Serverless Inference

Amazon SageMaker Serverless Inference is a purpose-built inference option that makes it easy for you to deploy and scale ML models. Serverless Inference is ideal for workloads which have idle periods between traffic spurts and can tolerate cold starts.

Reference solution - Machine Learning Platform for AI - EAS

One-click deployment
High performance
Blue-green deployment
Elastic scaling
Compilation optimization

References

·End·

2 - Cloud Native Patterns

As cloud native infrastructure becomes mainstream, it changes how the applications on top of it are developed:

Refs

Cloud Native Pattern: https://github.com/ContainerSolutions/cloud-native-patterns
Cornelia Davis: Cloud Native Patterns: Designing change-tolerant software
Pini Reznik, Jamie Dobson & Michelle Gienow: Cloud Native Transformation: Practical Patterns for Innovation

·End·

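3 - Message Queue Resources

A collection of material gathered on message queues, including RabbitMQ, Kafka, and Pulsar.

Basic concepts

Pattern	Description
Simple prototype	Producers put messages on the queue and consumers take them off, one at a time. Drawback: the maximum consumption rate within a LAN is 2000 QPS.
Batch processing	Several messages can be fetched and processed at a time.
Multiple consumers - push model	Push copes poorly with consumers that consume at different rates, because the broker decides the delivery rate. Push aims to deliver messages as fast as possible, which easily overwhelms consumers; typical symptoms are denial of service and network congestion. The central node has to listen for each consumer's ACK to decide whether a message was processed successfully, and handle the current message accordingly.
Multiple consumers - pull model	Pull lets each consumer fetch messages at a rate that matches its capacity. A benefit is that consumers need not return an ACK: when a consumer asks for the next message, the previous one can be assumed processed. There is also no timeout to handle; a consumer can take as long as it likes. Open question: how do we make sure multiple consumers do not process the same message twice?
Multiple queues - pull model	The Kafka model.

For details, see: Stream Computing - Kafka (1)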

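Case studies

Inside Zhihu's high-performance long-connection gateway for tens of millions of connections

How we designed the communication protocol
- Business decoupling
  - Based on the classic publish-subscribe pattern
  - Messages are transmitted as pure binary data, so the gateway does not need to know each business team's protocol specification or serialization format.
- Permission design
  - Callback-based authentication
  - Topic template variables
- Message reliability guarantees
  1. Acknowledgements and retransmission
  2. Receiving and sending through a message queue
How we designed the system architecture
How we built the long-connection gateway
- Access layer
  - Load balancing
    - Why not IP hash: the distribution is uneven, and it cannot accurately identify a client.
    - Layer-7 load balancing using Nginx's preread mechanism, with consistent hashing on the client's unique identifier.
- Subscription
- Publishing
- Sessions
- Persistence
- Sliding window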
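
The load-balancing item above relies on consistent hashing keyed by a client's unique identifier. The following is a generic sketch of a hash ring with virtual nodes, not the gateway's actual implementation; the class name, the MD5-based hash, and the replica count are illustrative assumptions.

```python
import bisect
import hashlib


class ConsistentHashRing:
    """Maps client identifiers to nodes using a hash ring with virtual nodes."""

    def __init__(self, nodes, replicas=100):
        points = []
        for node in nodes:
            # Several virtual points per node smooth out the distribution.
            for i in range(replicas):
                points.append((self._hash(f"{node}#{i}"), node))
        points.sort()
        self._hashes = [h for h, _ in points]
        self._nodes = [n for _, n in points]

    @staticmethod
    def _hash(key: str) -> int:
        return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

    def node_for(self, client_id: str) -> str:
        # First point clockwise from the client's hash, wrapping around.
        idx = bisect.bisect(self._hashes, self._hash(client_id)) % len(self._hashes)
        return self._nodes[idx]
```

The useful property is that removing a node only remaps the clients that were on that node; every other client keeps its backend, which matters when each backend holds long-lived connection state.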

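Real-time data warehouse for news scenarios based on Flink

How to integrate Kafka with Flink: use Message Queue for Apache Kafka and Realtime Compute for Apache Flink to implement real-time ETL and data streams.

References

[1] Alicloud: Real-time data warehouse for news scenarios based on Flink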

·End·

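4 - Prometheus Practice

Background

Why self-host Prometheus when the managed offerings from cloud providers are already quite good?

Glossary

Still new to this, so there are plenty of unfamiliar terms.

Term	Explanation
Exporter	Exposes an endpoint from which the Prometheus server collects data; Prometheus polls these exporters, then collects and stores the data.
ServiceMonitor	A ServiceMonitor describes the set of targets to be monitored by Prometheus.
Prometheus Operator	Makes it simple to run Prometheus on Kubernetes while preserving Kubernetes-native configuration options. See: Prometheus Operator design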

Operator

An Operator is an application-specific controller that extends the Kubernetes API to create, configure, and manage instances of complex stateful applications on behalf of a Kubernetes user.

It builds upon the basic Kubernetes resource and controller concepts but includes domain or application-specific knowledge to automate common tasks.

An Operator is software that encodes this domain knowledge and extends the Kubernetes API through the third party resources mechanism, enabling users to create, configure, and manage applications.

The difference between an Operator and a Controller:
A Controller with the following characteristics qualifies as an Operator:
1. Contains workload-specific knowledge
2. Manages workload lifecycle
3. Offers a CRD

Prometheus

Monitoring Stacks consist of a collector, a time-series database to store metrics, and a visualization layer.

A popular open-source stack is Prometheus, used along with Grafana as the visualization tool to create rich dashboards.

Reference Prometheus architecture.

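Device plugin

Kubernetes provides a device plugin framework that you can use to advertise hardware resources to the kubelet. Benefits:

Vendors can implement a device plugin and deploy it manually or as a DaemonSet, without customizing the code of Kubernetes itself. Target devices include GPUs, high-performance NICs, FPGAs, and InfiniBand adapters.

GPU Telemetry using dcgm-exporter in Kubernetes

The figure above makes one point: the pod-resources socket that sits alongside the device plugin is used to determine which devices are associated with a given pod.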

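Metrics

GPU Grafana dashboard configuration: GPU NODES V2

Metric	Description
DCGM_FI_DEV_FB_USED	GPU memory (framebuffer) used
DCGM_FI_DEV_FB_FREE	GPU memory (framebuffer) free
DCGM_FI_DEV_GPU_UTIL	GPU utilization
DCGM_MEM_COPY_UTILIZATION	See the memory utilization comparison

Getting started

Using dcgmproftester

Generating a load

Impressions

Kubernetes CRDs are everywhere.

FAQ

Why does the DCGM_FI_DEV_FB_FREE value not match the actual memory?
- A: The GPU in use is a T4; the instance type is ecs.gn6i-c16g1.4xlarge, with 16 GB of GPU memory and 62 GB of system memory. The DCGM_FI_DEV_FB_USED metric from NVIDIA dcgm-exporter reports GPU memory, whereas system memory is monitored through the node_memory_MemTotal_bytes, node_memory_Buffers_bytes, node_memory_Cached_bytes, and node_memory_MemFree_bytes metrics (from GPU NODES V2).
- A: When memory is mentioned in a GPU context, be clear about whether it means GPU memory or system memory.
DCGM_MEM_COPY_UTILIZATION: memory utilization (see the memory utilization comparison)
- Utilization = (time over the past sample period during which global (device) memory was being read or written) / (sample period) * 100%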

References

Monitoring GPUs in Kubernetes with DCGM

Introducing Operators: Putting Operational Knowledge into Software

Prometheus Operator User Guide

Controllers and Operators, https://octetz.com/docs/2019/2019-10-13-controllers-and-operators/

Prometheus Book (Chinese)

NVIDIA GPU monitoring tools on Linux

Integrating GPU Telemetry into Kubernetes

Prometheus dynamic target discovery and relabeling in practice

GPU-Nodes-Metrics 12027 settings

·End·

5 - Notes for Patterns of Enterprise Application

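Data Source Architectural Patterns

Table Data Gateway

A Table Data Gateway holds all the SQL for accessing a single table or view: selects, inserts, updates, and deletes. Other code calls its methods for all interaction with the database.

Each method maps its input parameters to a SQL call and executes the statement on a database connection. Because a Table Data Gateway only reads and writes data, it is usually stateless.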
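
A minimal Table Data Gateway sketch using `sqlite3` from the standard library. The `person` table, its columns, and the class name `PersonGateway` are assumptions made for illustration; the gateway keeps no state beyond the connection it is given.

```python
import sqlite3


class PersonGateway:
    """Holds all SQL for the (hypothetical) person table."""

    def __init__(self, conn: sqlite3.Connection):
        self._conn = conn

    def find(self, person_id: int):
        cur = self._conn.execute(
            "SELECT id, name FROM person WHERE id = ?", (person_id,))
        return cur.fetchone()  # (id, name) or None

    def insert(self, name: str) -> int:
        cur = self._conn.execute(
            "INSERT INTO person (name) VALUES (?)", (name,))
        self._conn.commit()
        return cur.lastrowid

    def update(self, person_id: int, name: str) -> None:
        self._conn.execute(
            "UPDATE person SET name = ? WHERE id = ?", (name, person_id))
        self._conn.commit()

    def delete(self, person_id: int) -> None:
        self._conn.execute("DELETE FROM person WHERE id = ?", (person_id,))
        self._conn.commit()
```

Callers work with plain rows and never see SQL, which is the point of the pattern.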

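Row Data Gateway

An object that acts as a gateway to a single record in a data source. There is one instance per row.

Active Record

An object that wraps a row in a database table or view, encapsulates the database access, and adds domain logic on that data.

In essence it is a domain model.

Pros

Easy to create and easy to understand.

Cons

Couples the object design tightly to the database design.
Not a good fit when the business logic is complex.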

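Data Mapper

A layer of mappers that moves data between objects and a database while keeping them independent of each other and of the mapper itself.

Objects and relational databases have different mechanisms for structuring data. Many parts of an object, such as collections and inheritance, do not exist in relational databases.

Handling finders

Separated Interface solves this dilemma, namely the dependency from domain objects to the Data Mapper: put every finder method the domain code needs into an interface class that can live in the domain package.

Define an interface in one package and implement it in another, separate package.

Mapping data to domain fields

Mappers need access to the fields of the domain objects. This is often a problem, because it requires public methods that support the mappers but that the domain logic does not need.

Metadata-based mapping

How to store the information about how fields in domain objects map to database columns.

Implement it with explicit code, or
store the metadata as data, in a class or in a separate file. This is Metadata Mapping.
- The benefit is that all variations in the mappers are handled through data rather than through more source code, using either code generation or reflective programming.

Pros

Decouples the database from the domain objects.

Cons

Introduces an extra layer.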

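Distribution Patterns

Remote Facade

Provides a coarse-grained facade on fine-grained objects to improve efficiency over a network.

A Remote Facade is a coarse-grained facade over a web of fine-grained objects. None of the fine-grained objects has a remote interface, and the Remote Facade contains no domain logic. All the Remote Facade does is translate coarse-grained methods onto the underlying fine-grained objects.

A thin skin between the coarse-grained and the fine-grained objects.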

@startuml
participant a as "an address facade"
participant b  as "an address" 
[-> a: getAddressData 
a -> b: getCity 
a -> b: getState
a -> b: getZip
@enduml

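A Remote Facade:

Provides a coarse-grained interface
Performs security checks
Provides transaction control: it begins a transaction and commits it after a batch of work is done.

A Remote Facade contains no domain logic.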
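
The address example from the sequence diagram can be sketched as follows. The class and method names (`Address`, `AddressData`, `AddressFacade`, `get_address_data`) are illustrative; the facade makes several fine-grained local calls and returns one serializable bundle, which plays the role of a Data Transfer Object.

```python
from dataclasses import asdict, dataclass


class Address:
    """Fine-grained domain object; has no remote interface."""

    def __init__(self, city: str, state: str, zip_code: str):
        self._city, self._state, self._zip = city, state, zip_code

    def get_city(self) -> str:
        return self._city

    def get_state(self) -> str:
        return self._state

    def get_zip(self) -> str:
        return self._zip


@dataclass(frozen=True)
class AddressData:
    """Flat, serializable bundle sent over the wire in a single call."""
    city: str
    state: str
    zip: str


class AddressFacade:
    """Coarse-grained remote entry point; contains no domain logic."""

    def __init__(self, addresses: dict):
        self._addresses = addresses

    def get_address_data(self, address_id: int) -> dict:
        address = self._addresses[address_id]
        # Three fine-grained local calls become one remote round trip.
        data = AddressData(address.get_city(), address.get_state(),
                           address.get_zip())
        return asdict(data)
```

A remote client makes one call to `get_address_data` instead of three calls to the getters.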

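Data Transfer Object

An object that carries data between processes in order to reduce the number of method calls.

The value of a Data Transfer Object is that it lets you move several pieces of information in a single call.

A Data Transfer Object usually carries data from more than one server object.

You usually cannot transfer objects from the domain model directly, because:

The objects are usually connected in a complex web that is difficult, if not impossible, to serialize.
You usually do not want the domain object classes on the client; instead, transfer a simplified form of the data from the domain objects.

Common forms of a Data Transfer Object:

A record set, that is, a set of tabular records.
A collection data structure.

When to use it

Use a Data Transfer Object when you need to transfer multiple items of data between two processes in a single method call.
As a common source of data for various components in different layers.

FAQ

Should a single Data Transfer Object handle the whole interaction, or should different requests use different ones?
Should the request and the response each have their own Data Transfer Object, or should a single one carry both?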

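Offline Concurrency Patterns

Optimistic Offline Lock, Pessimistic Offline Lock, Coarse-Grained Lock, Implicit Lock

Optimistic Offline Lock

Ensures data consistency by validating that no other session has changed a record since the current session read it.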
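
One common way to implement this check is a version column that every update increments. The sketch below uses `sqlite3`; the `customer` table, its columns, and the function names are assumptions made for illustration.

```python
import sqlite3


class ConcurrentUpdateError(Exception):
    """Raised when another session changed the row after it was read."""


def load_customer(conn: sqlite3.Connection, customer_id: int) -> dict:
    row = conn.execute(
        "SELECT id, name, version FROM customer WHERE id = ?",
        (customer_id,)).fetchone()
    return {"id": row[0], "name": row[1], "version": row[2]}


def save_customer(conn: sqlite3.Connection, customer: dict) -> None:
    # The UPDATE only matches if the version is still the one we read.
    cur = conn.execute(
        "UPDATE customer SET name = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (customer["name"], customer["id"], customer["version"]))
    if cur.rowcount != 1:
        conn.rollback()
        raise ConcurrentUpdateError(f"customer {customer['id']} was modified")
    conn.commit()
    customer["version"] += 1
```

Two sessions can both read version 1; the first save bumps it to 2, and the second save matches no row and fails instead of silently overwriting.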

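Pessimistic Offline Lock

Prevents conflicts between concurrent business transactions by allowing only one business transaction at a time to access the data.

How it works

Implement a Pessimistic Offline Lock in three steps: decide which lock type you need, build a lock manager, and define the procedures by which a business transaction uses locks.

Lock types:

Exclusive write lock
Exclusive read lock
Read/write lock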

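Distributed locks

A lock on a globally unique resource in a distributed environment. It serializes requests, behaves as a mutex, and helps solve business idempotency problems.

Strong consistency and high availability of the lock service itself are the most basic requirements. Others include automatic lease renewal, automatic release, a high-level abstraction that is simple to adopt, visualization, and manageability.

Redis-based distributed lock^[1]
- A single instance is a single point of failure, and once a Redis cluster is involved the same lock can be acquired more than once.
- A timeout-based lock cannot be renewed. A random value per holder solves the problem of a lock being released by another task (compare the fencing token^[2], a monotonically increasing number checked by the protected resource), but it does not solve the problem of a lock being released because it timed out. Redisson addresses this with a watchdog: a background thread periodically checks how long the lock has left before it expires and renews the lease.
- Asynchronous master-replica replication issues
Reliable solutions built on a consistent storage layer, such as ZooKeeper or etcd

Session State Patterns

Client Session State, Server Session State, Database Session State

When server session state also needs to be persisted, the difference between Server Session State and Database Session State is whether the data in the server session state is converted into tabular form.

References

[1] Somersames: Is a distributed lock implemented with Redis perfect?

[2] Martin Kleppmann: How to do distributed locking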

·End·

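6 - All About GPUs

Concepts

MapReduce

A model for offline batch processing, split into two phases: Map (task distribution) and Reduce (result collection). It can process big-data jobs from different data sources in the same way.

By data source and job characteristics, computing is divided into:

Batch computing
Stream computing: processes continuous, large-scale data streams by splitting the unbounded stream into bounded batch subsets of fixed size
In-memory computing: for intermediate data that needs heavy iterative processing, keeps the intermediate data of Map tasks in memory.
Graph computing: an iterative form of MapReduce over multiple subtasks with dependencies.
Interactive computing: latency-sensitive jobs, or complex jobs that combine several computing modes.

Spark and Flink

Big-data processing frameworks that provide batch, stream, in-memory, and graph computing on top of the MapReduce model, to handle big-data jobs with different forms of input.

Flink

Although I have not used Flink, it is worth knowing what Flink can do.

Event-driven applications
Data analytics applications
Data pipeline applications

For details, see Flink use cases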

Spark

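BSP

Bulk Synchronous Parallel, a parallel computing model that expresses a computation as a sequence of supersteps. In each superstep, every processor performs local computation, then exchanges data point-to-point, and finally a global synchronization checks that all processors have finished.

DOT model

Building on BSP, researchers extended it into a computing model for big data.

It uses matrices to formally describe the computation and communication behavior of big-data processing, and divides the computation into three layers: the data layer, the operation layer, and the transformation layer.

DOTA model

Proposed by Xu Zhiwei's team at the Chinese Academy of Sciences. It adds an aggregation layer (A-Layer) to the DOT model to aggregate the intermediate data from the transformation layer and produce the final result.

p-DOT model

Following the BSP approach, this model treats a big-data computing task as a DOT model with p phases. Within the iterated computing phases, the p-DOT model consists of a data layer, a computation layer, and a communication layer.

Parallel computing

From the book "Parallel Computing" (《并行计算》)

MPI

MPI is a language-independent communication protocol that supports efficient and convenient point-to-point, broadcast, and multicast communication.

MPI belongs in layer 5 or higher of the OSI reference model; implementations may cover most of the layers, with sockets and TCP used at the transport layer.

The MPI standard keeps evolving; the MPI-1 model has no shared-memory concept.

MPI has many implementations, such as MPICH and Open MPI.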

MPS

The Multi-Process Service (MPS) is an alternative, binary-compatible implementation of the CUDA Application Programming Interface (API).

The MPS runtime architecture is designed to transparently enable co-operative multi-process CUDA applications, typically MPI jobs, to utilize Hyper-Q capabilities on the latest NVIDIA (Kepler-based) GPUs.

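Using GPU resources on Kubernetes

Can multiple jobs run on a single GPU? Following the approach in the articles below, it is achievable.

Community discussion

GPU sharing on TKE: judging by the test data and the GIGAStack product, this approach already runs in production.

GPU Sharing Scheduler Extender in Kubernetes: GPU sharing on Alibaba Cloud

GPU Sharing Scheduler Extender Now Supports Fine-Grained Kubernetes Clusters
GPU Sharing in Kubernetes: the design of GPU sharing on Alibaba Cloud.

KubeShare - Share GPU between Pods in Kubernetes: a university experiment; it is unclear whether it runs stably in production.

Supporting MIG in Kubernetes: k8s-device-plugin supports MIG (Multi-Instance GPU) starting with version 0.7.0.

GPU virtualization approaches

To allocate GPU resources sensibly, first virtualize the device.

The article on multi-GPU usage and analysis on Kubernetes virtualizes the GPU node first: it changes the DeviceIDs reported to the kubelet, and when the kubelet calls Allocate() it translates the virtual DeviceID back to the corresponding real DeviceID.

[GPU Virtualization Technology (4) - GPU Slicing Virtualization](http://cloud.it168.com/a2018/0611/3208/000003208253.shtml?1) defines slicing along two dimensions:

Slicing the GPU in time: the compute engine of one physical GPU is shared among several vGPUs, with scheduling time slices typically around 1 ms to 10 ms.
Slicing the GPU's resources, mainly GPU memory: for security isolation, each vGPU has exclusive use of the memory assigned to it and does not share it with other vGPUs.

The article also describes a more in-depth technical framework for GPU slicing.

Non-virtualized GPU approaches

TODO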

NVIDIA GPU OPERATOR

The device plugin section of Prometheus Practice describes how Kubernetes supports new hardware.

Configuring and managing nodes with these hardware resources requires configuration of multiple software components such as drivers, container runtimes, or other libraries, which is difficult and prone to errors.

The NVIDIA GPU Operator uses the operator framework within Kubernetes to automate the management of all NVIDIA software components needed to provision GPUs. These components include the NVIDIA drivers (to enable CUDA), the Kubernetes device plugin for GPUs, the NVIDIA Container Toolkit, automatic node labelling using GFD, DCGM-based monitoring, and others.

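References

[1] xtaohub.com: MPI, the do-everything-yourself framework

[2] stackoverflow: How do I use Nvidia Multi-process Service (MPS) to run multiple non-MPI CUDA applications?

[3] Enward: NVIDIA MPS summary

[4] NVIDIA: CUDA_Multi_Process_Service_Overview

NVIDIA GPU OPERATOR

A deep dive into GPU hardware architecture and execution (a very comprehensive introduction)

Online course on parallel computing (a must-read)

Paper: A GPU cooperative computing model for complex big-data applications

Flink use cases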

·End·

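7 - PlantUML + ArchiMate Notes

Background

Drawing a diagram that gets its content across is hard; drawing a good-looking one is harder. A good tool gets you halfway there; the other half is up to the author.

Tools

PlantUML: tracks every change and renders good-looking diagrams.
VS Code: its PlantUML extension previews diagrams live.
ArchiMate: a mature approach to all kinds of architecture diagrams that covers a great many cases.

Details

References

ArchiMate Cookbook (read in the Kami app)
Overview of ArchiMate diagram types: https://www.hosiaisluoma.fi/blog/category/archimate/
"Enterprise Architecture Modeling - The ArchiMate Language": https://www.slideshare.net/zhoujg/archimate
TOGAF study notes: https://www.cnblogs.com/zhoujg/
PlantUML syntax quick reference
PlantUML parameter settings reference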

·End·

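8 - Reliability Engineering

What exactly does reliability engineering mean? When we talk about building for availability, what actually needs to be built?

Industry benchmarks

In 2019 I learned about the reliability engineering being built by Google's SRE team; also in 2019, Baidu published the state of its availability engineering.

Baidu availability engineering

Availability problems

Requirements for availability engineering

Technical standards for availability engineering

Preventing faults
- Program: code, testing, and change standards
- Operations: operating procedures, operation auditing
- Running the service: capacity planning
- Attack prevention
- Platform / third-party services: service SLA
- Infrastructure: infrastructure SLA
Preventing faults from spreading
Detecting faults quickly
Locating faults and limiting the damage
Disaster recovery

Rollout

How to roll it out

References

Baidu's service availability engineering, which produced a set of technical standards for availability engineering. https://www.infoq.cn/article/C4PddPgiGNFGTqD6pZAK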

·End·

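9 - About CNCF Projects

CNCF Project vs CNCF Member Project: what is the difference?

Projects

OpenTelemetry

OpenTelemetry emerged in response to the following problems, which could not otherwise be solved.

Applications are locked into the instrumentation of a specific solution
Solution-specific instrumentation for open-source software is basically impossible

So an observability system had to be designed to solve these problems, and its design has to meet some basic requirements^[4]:

Requirement: independent instrumentation, telemetry, and analysis
Requirement: zero dependencies
Requirement: strict backward compatibility and long-term support

Concepts

Signal

The different types of telemetry are called signals; the primary signal is tracing.

OpenTelemetry is a cross-cutting concern.

Instrumentation

Rendered as 仪表 ("instruments") in the Chinese translation 《OpenTelemetry 可观测性的未来》, which reads oddly.

However, Google's dictionary definition reads: the particular instruments used in a piece of music / measuring instruments regarded collectively.

OTLP

The OpenTelemetry Protocol (OTLP) defines the protobuf formats for tracing, metrics, and logging in OpenTelemetry, for example for tracing.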

Propagators and Context

Propagators: Used to serialize and deserialize specific parts of telemetry data such as span context and Baggage in Spans.

Traces can extend beyond a single process. This requires context propagation, a mechanism where identifiers for a trace are sent to remote processes.

otel.SetTextMapPropagator(propagation.TraceContext{})

TextMapPropagator injects values into and extracts values from carriers as text.

Carrier

A carrier is the medium used by Propagators to read values from and write values to.
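
To make the propagator and carrier roles concrete, here is a hand-rolled sketch of the W3C Trace Context `traceparent` header, the format that `propagation.TraceContext{}` implements. It deliberately does not use the OpenTelemetry SDK: `SpanContext`, `inject`, and `extract` are names invented for this illustration, and the carrier is a plain `dict` of HTTP headers.

```python
import re
from typing import NamedTuple, Optional


class SpanContext(NamedTuple):
    trace_id: str   # 32 lowercase hex characters
    span_id: str    # 16 lowercase hex characters
    sampled: bool


_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def inject(ctx: SpanContext, carrier: dict) -> None:
    """Writes the context into the carrier (e.g. outgoing HTTP headers)."""
    flags = "01" if ctx.sampled else "00"
    carrier["traceparent"] = f"00-{ctx.trace_id}-{ctx.span_id}-{flags}"


def extract(carrier: dict) -> Optional[SpanContext]:
    """Reads the context back out of the carrier on the receiving side."""
    match = _TRACEPARENT.match(carrier.get("traceparent", ""))
    if match is None:
        return None
    trace_id, span_id, flags = match.groups()
    if trace_id == "0" * 32 or span_id == "0" * 16:
        return None  # all-zero identifiers are invalid
    return SpanContext(trace_id, span_id, bool(int(flags, 16) & 1))
```

The sending process calls `inject` before the request leaves; the receiving process calls `extract` and continues the same trace.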

Practice with New Relic tracing

OpenTelemetry can be combined with New Relic, connected through opentelemetry-go.

It can also be done through opentelemetry-collector (e.g. as a binary, sidecar, or DaemonSet).

Practice with Alibaba Tracing Analysis (ARMS)

What is Tracing Analysis: judging from its architecture, it supports OpenTracing-based SDKs.

Tracing Analysis is compatible with SDKs from various open source communities and supports the OpenTracing standard.

Apache DolphinScheduler

A distributed and easy-to-extend visual workflow scheduler system, undergoing incubation at ASF.

MegaEase

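An open-source, self-controllable, low-cost, highly available cloud native platform

Service orchestration and service governance
Traffic scheduling and traffic management
Application observability & DevOps
Operation and management of key middleware
Basic resource scheduling

References

OpenTelemetry: a new era of observability
CNCF projects or member projects
OpenTelemetry: Propagators API
Ted Young (translated by Jimmy Song): OpenTelemetry Observability Guide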
Uptrace: OpenTelemetry instrumentations for Go

·End·

10 - Microservice Architecture Notes

image via: https://www.infoq.com/presentations/uber-microservices-distributed-tracing/

Pattern: microservice architecture

“An architectural style that structures an application as a set of deployable/executable units, a.k.a. services”

Highly maintainable and testable
Minimal lead time (time from commit to deploy)
Loosely coupled
Independently deployable
Implements a business capability
Owned/developed/tested/deployed by a small team

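Industry architecture best practices

Best practices framework for Oracle Cloud Infrastructure TODO

Industry architecture talks

A running collection of architecture talks from the industry, with analysis and summaries.

Designing loosely coupled services [Slides]

By Chris Richardson. Covers several types of coupling, their drawbacks, and how to design loosely coupled microservices.

Runtime coupling: the order service cannot respond until the customer service returns, which reduces availability.

Design-time coupling: when the customer service changes, the order service has to change with it, which reduces development independence.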

Minimizing design time coupling

DRY
Consume as little as possible
Icebergs: expose as little as possible
Using a database-per-service

Reducing runtime coupling

Use resilience patterns for synchronous communication
Self-contained service
Improving availability: replace service with module
Use asynchronous messaging
Improving availability: sagas
Improving availability: move responsibility + CQRS

Avoiding infrastructure coupling

Use private infrastructure: minimizes resource contention and blast radius
Use “Private” message brokers
Fault isolated swim lanes

模式 PATTERNS

Data Management

Database per Microservice

CQRS

Event Sourcing

Materialized View Patterns

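Generate prepopulated views over the data in one or more data stores when the data isn't ideally formatted for required query operations. This can help support efficient querying and data extraction, and improve application performance.

Context and problem

How data is stored is usually driven by the format of the data itself, its size, data integrity requirements, and the kind of store in use. This can hurt queries, however: a query that needs only a subset of the data, such as an order summary for several customers, has to fetch all of the related data.

Solution

To support efficient querying, a common solution is to generate, in advance, a view that materializes the data in a format suited to the required results set.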

Messaging

Design and Implementation

BFF

API GateWay

Strangler

Consumer-Driven Contract Testing

Externalized Configuration

Facilitators

Facilitators^[5] are simply a new type that has access to the type you wished you had generic methods on. For example, if you are designing an ORM framework and want to provide methods for querying tables, you provide an intermediate type (Querier) that lets you write generic querying functions.

Resilience Patterns - Sagas

Saga distributed transactions: a way to manage data consistency across microservices in distributed transaction scenarios.

Choreography
Orchestration
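
A minimal sketch of the orchestration style, assuming each step is a pair of callables: an action and the compensation that undoes it. The function name `run_saga` and the step representation are illustrative. In a real system each action would be a call to another service, and compensations would need to be retried until they succeed.

```python
def run_saga(steps) -> bool:
    """Runs (action, compensation) pairs in order.

    If an action fails, the compensations of all completed steps run in
    reverse order and the saga reports failure.
    """
    completed = []
    for action, compensate in steps:
        try:
            action()
        except Exception:
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensate)
    return True
```

For example, if the steps are "reserve stock", "create order", and "charge payment" and the payment fails, the orchestrator cancels the order and then releases the stock.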

Resilience Patterns - Circuit Breaker

"Used to limit the amount of requests to a service based on configured thresholds, helping to prevent the service from being overloaded": the circuit breaker.

It also monitors how many requests have failed, in order to stop further requests from reaching the service.

The CircuitBreaker uses a sliding window to store and aggregate the outcomes of calls. The window can be count-based or time-based.
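
A count-based sliding window can be sketched with a bounded `deque`. This is a simplified illustration rather than Resilience4j's implementation: the parameter names are made up here, and the half-open state is reduced to "after the wait, clear the window and let calls through again".

```python
import time
from collections import deque


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, window_size=10, failure_rate_threshold=0.5,
                 open_seconds=30.0, clock=time.monotonic):
        self._window = deque(maxlen=window_size)  # True marks a failed call
        self._threshold = failure_rate_threshold
        self._open_seconds = open_seconds
        self._clock = clock
        self._opened_at = None

    def call(self, fn, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._open_seconds:
                raise CircuitOpenError("circuit is open")
            # Wait is over: start again with an empty window.
            self._opened_at = None
            self._window.clear()
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._record(failed=True)
            raise
        self._record(failed=False)
        return result

    def _record(self, failed: bool) -> None:
        self._window.append(failed)  # the oldest outcome drops out when full
        full = len(self._window) == self._window.maxlen
        if full and sum(self._window) / len(self._window) >= self._threshold:
            self._opened_at = self._clock()
```

While the circuit is open, callers fail fast with `CircuitOpenError` instead of adding load to a service that is already struggling.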

Circuit Breakers in Go: an implementation in Go.

Image via: https://docs.microsoft.com/en-us/azure/architecture/patterns/circuit-breaker

Resilience Patterns - Bulkhead

"Isolates services and consumers via partitions": the bulkhead pattern. In shipping, a bulkhead is a part of the ship that, once its hatch is closed, protects the rest of the ship.

SemaphoreBulkhead, works well across a variety of threading and IO models. It is based on a semaphore.
ThreadPoolBulkhead, uses a bounded queue and a fixed thread pool.
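
The semaphore variant can be sketched in a few lines with `threading.BoundedSemaphore`. The class mirrors the idea only, not Resilience4j's API; `BulkheadFullError` and the reject-immediately policy are choices made for this illustration.

```python
import threading


class BulkheadFullError(Exception):
    """Raised when the partition has no free slot."""


class SemaphoreBulkhead:
    """Caps the number of concurrent calls into one dependency."""

    def __init__(self, max_concurrent_calls: int):
        self._slots = threading.BoundedSemaphore(max_concurrent_calls)

    def call(self, fn, *args, **kwargs):
        # Reject immediately instead of queueing when the partition is full.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFullError("bulkhead is full")
        try:
            return fn(*args, **kwargs)
        finally:
            self._slots.release()
```

Giving each backend its own bulkhead means a slow backend can exhaust only its own slots, not every worker in the application.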

Prevents cascading failures, but the pattern adds overhead to the application.

image via: https://www.jrebel.com/blog/microservices-resilience-patterns

When to use it:

Isolate resources used to consume a set of backend services, especially if the application can provide some level of functionality even when one of the services is not responding.
Isolate critical consumers from standard consumers
Protect the application from cascading failures

Request_id

Better Logging Approach For Microservices: print the request_id in logs; it is generated by the requesting side.

According to "What is the X-REQUEST-ID http header?", the recommendation is for the client to generate the x-request-id.

Resilience Patterns - RateLimiter

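Basic rate-limiting algorithms

Leaky bucket
- A leaky bucket is usually implemented with a queue. An arriving request is put on the queue if the queue is not full, and a processor takes requests from the head of the queue at a fixed rate. Under heavy traffic the queue fills up and new requests are dropped.

Token bucket
- A bucket with a fixed capacity holds tokens, and tokens are added at a fixed rate. The number of tokens has an upper limit; tokens added beyond it are discarded. Every arriving request must obtain a token. If it gets one, it is processed immediately and the token is removed from the bucket. If it does not, the request is rate limited: it is either dropped or made to wait in a buffer.
- In the long run, the average admitted request rate equals rate (r, the number of tokens added to the bucket per second).
- Let M be the request rate actually achieved. The maximum number of requests admitted in any one second is M = b + r (where b is the capacity of the token bucket).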
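
A token bucket sketch that refills lazily, based on the time elapsed since the previous call, rather than with a background timer. The class name and the injectable `clock` parameter are assumptions made so that the behavior can be checked deterministically.

```python
import time


class TokenBucket:
    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self._rate = rate              # r: tokens added per second
        self._capacity = capacity      # b: maximum tokens the bucket holds
        self._tokens = float(capacity)
        self._clock = clock
        self._last = clock()

    def allow(self, tokens: float = 1.0) -> bool:
        now = self._clock()
        # Refill for the elapsed time; tokens beyond the capacity are lost.
        self._tokens = min(self._capacity,
                           self._tokens + (now - self._last) * self._rate)
        self._last = now
        if self._tokens >= tokens:
            self._tokens -= tokens
            return True
        return False  # caller drops the request or makes it wait
```

With r = 2 and b = 5, a full bucket admits 5 requests at once and 2 more one second later, which is the b + r bound on a one-second burst.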

References

What is the X-REQUEST-ID http header? https://stackoverflow.com/questions/25433258/what-is-the-x-request-id-http-header

Resilience4j is a fault tolerance library designed for Java8 and functional programming https://github.com/resilience4j/resilience4j

Azure Cloud Design Patterns, used in the cloud for building reliable, scalable, secure applications https://docs.microsoft.com/en-us/azure/architecture/patterns/index-patterns

.NET Microservices: Architecture for Containerized .NET Applications (PDF in Kami) https://docs.microsoft.com/en-us/dotnet/architecture/microservices/

[5] JBD: Generics facilitators in Go

·End·

11 - Big Data Arch

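Background

Industry cases

Tencent Games big-data applications

The value behind big data - Tencent Games big-data applications

Big data in practice = data + systems + algorithms + application scenarios
Tencent Games tiered user-data model
Tencent Games data processing system architecture

Ele.me data warehouse governance and data applications

The value behind big data - Ele.me data warehouse governance and data applications

Building the data warehouse

Standardization and normalization

Unified log collection framework

Principles

Subject-area partitioning
Data consistency
Dimension building

TODO

Data permission management
Data usage records
Open data platform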

Uber Freight Carrier Metrics With Near-Real-Time Analytics

Uber Final System Design

Introduction
How We Did It
- Backend Requirement
- Potential Solution Considered
- Final System Design
- Data Schema
- Flink stateful Stream Process
- Hybrid Pinot Table
- Golang GRPC Service
Impact
Conclusion

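General solutions

Storage reads and writes

The article "Data Lake series: do you know about the EMRFS S3-optimized committer?" compares it with FileOutputCommitter.

The GitHub repo s3committer also introduces the multipart upload API, which can be used to deal with slow uploads of large files.

But for uploading many small files, is uploading them concurrently enough? No.

Approach: upload a compressed archive, then decompress it with AWS Lambda once the upload completes. Streaming gets around the limit of only 500 MB of disk space per instance^[1], but execution time is capped at 15 minutes, which is still a problem for very large files.

References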

[1] John Paul Hayes: How to extract a HUGE zip file in an Amazon S3 bucket by using AWS Lambda and Python

·End·

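12 - RPC vs HTTP

While evaluating a 4G SDK, I found that a colleague on the embedded side had used RPC to talk to the server. That struck me as odd, because our web servers normally communicate over HTTPS.

RPC

Wikipedia's List of network protocols (OSI model)

RPC belongs to the session layer.

HTTP vs RPC

RPC is remote procedure call; an RPC protocol usually comprises a transport protocol and a serialization protocol.

Transport protocols include HTTP/2, as used by the well-known gRPC (grpc.io), and custom-framed TCP protocols such as Dubbo's. Serialization protocols include text-based encodings such as XML and JSON, and binary encodings such as protobuf and Hessian.

HTTP long-lived connections

Reference ^[2] describes how httpServer handles long-lived connections: httpServer creates a goroutine, or more precisely one goroutine for each new TCP connection. See the source code in the article for details.

Compare gRPC services with HTTP APIs

gRPC is designed for HTTP/2, which compared with HTTP 1.x offers:

Binary framing and compression.
Multiplexing of multiple HTTP/2 calls over a single TCP connection. Multiplexing eliminates head-of-line blocking at the HTTP layer.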

Feature	gRPC	HTTP APIs with JSON
Contract	Required (`.proto`)	Optional (OpenAPI)
Protocol	HTTP/2	HTTP
Payload	Protobuf (small, binary)	JSON (large, human readable)
Prescriptiveness	Strict specification	Loose. Any HTTP is valid.
Streaming	Client, server, bi-directional	Client, server
Browser support	No (requires grpc-web)	Yes
Security	Transport (TLS)	Transport (TLS)
Client code-generation	Yes	OpenAPI + third-party tooling

Table from https://docs.microsoft.com/en-us/aspnet/core/grpc/comparison

Head-of-line Blocking

A detailed description of HOL blocking: https://engineering.cred.club/head-of-line-hol-blocking-in-http-1-and-http-2-50b24e9e3372

Key Points:

Frame
Message
Stream

The HOL Blocking issue is resolved at the HTTP layer in HTTP/2, but it now moves to the TCP layer.

HTTP/3 or QUIC solves HOL Blocking at TCP layer by leveraging UDP instead of TCP as the transport protocol.

References

[1] Why do we still need RPC when we have HTTP? https://www.zhihu.com/question/41609070
[2] How Golang httpServer handles KeepAlive long-lived connections https://blog.csdn.net/jeffrey11223/article/details/81222774

·End·

13 - GCP Study

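Problems

Scenario 1: A user uploads data from client A to data processing service X, which processes it and produces a result. The user can then fetch service X's result from client B. (Data processing service X is transparent to this user.)

Scenario 2: On a data operations management platform, a logged-in user must be authorized when directly calling each service's data API.

Glossary

API surface: the public interface of an API. The API surface consists of the methods, and the parameters and return types used in those methods.

Service Management: a Google Cloud infrastructure service for creating and managing APIs and services.

Extensible Service Proxy (ESP): an NGINX-based service proxy, similar in approach to an Istio service mesh.

Cloud Endpoints architecture

It is an API management system, delivered through either the Extensible Service Proxy (ESP) or Endpoints Frameworks.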

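ESP (Extensible Service Proxy)

endpoint-introduce

Provider side:

Configure Endpoints: describe the API surface in an OpenAPI configuration file and configure Endpoints features such as API keys or authentication rules.
Deploy the Endpoints configuration: once the API is defined, use the Cloud SDK to deploy it to Service Management.
Deploy the API backend: deploy ESP and the API backend to a supported Google Cloud backend such as Compute Engine. ESP works with the Endpoints backend services to secure and monitor your API at runtime.

Endpoints architecture

Components:

ESP
Service Control
Cloud SDK
Google Cloud Console

K8S ESP

This figure shows the Endpoints architecture more clearly.

Endpoints Frameworks

To build a RESTful service on GCP, you use Endpoints Frameworks. What problem does it solve? It has a built-in API gateway that intercepts every request and performs the necessary checks (such as authentication) before forwarding the request to the API backend. After the backend responds, it collects and reports telemetry.

Endpoints Frameworks

All of the following features are handled by Endpoints Management.

Authentication
API keys
Monitoring
Logging
Quotas
Developer portal
- Example: Google Cloud Endpoints parses the OpenAPI file and renders the API pages. Our API platform could borrow from this.

A Python framework for building RESTful APIs on Google App Engine. Cloud Endpoints for Go is DEPRECATED, but it is still worth learning from.

Practice

Choose an authentication method and put authentication into practice in Endpoints. See: Cloud Endpoints authentication and API keys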

Identity and security

Authentication

None
API key
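- Identifies your project using a simple API key to check quota and access
- An API key identifies the calling project (the app or site) making the call to an API, whereas an authentication token identifies the user (the person) using the app or site
- What API keys are for
  - Project identification: identify the app or project that is calling this API
  - Project authorization: check whether the calling app has been granted access to call the API and has enabled the API in its project
OAuth Client ID (authenticate as an end user): requests user consent so your app can access the user's data
- You need to access resources on behalf of an end user of your app, for example when your app needs to access the user's Google BigQuery datasets
- You need to authenticate as the user rather than as your app.
- Two purposes
  - User authentication: securely verify that the calling user is who they claim to be
  - User authorization: check whether the user should have permission to make this request
Service account: enables server-to-server, app-level authentication using robot accounts

GCP APIs use the OAuth 2.0 protocol to authenticate both user accounts and service accounts. The OAuth 2.0 authentication process determines both the principal and the application.

User accounts are managed as Google accounts
Service accounts are managed by Cloud IAM and represent non-human users.

To see which one fits, check Authentication strategies

API keys identify projects; authentication identifies users

API keys are not secure: because clients usually have access to the API key, it is easy for someone to steal it. A stolen key has no expiry, so it can be used indefinitely unless the project owner revokes or regenerates it. API keys are less secure than authentication tokens.

Practice

Cloud Endpoints quickstart guide. Sample code: see the example repository.

Takeaways

Adding API keys and rate limiting requires no change to the backend service's code
API monitoring is available in the Google Cloud Console, which is very convenient.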

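Analysis

For scenario 1

If the clients have their own business backends (call client A's backend "backend A" and client B's "backend B"), then backends A and B authenticate to data processing service X server-to-server. The drawback is that authorization is application-level. For accessing a BigQuery dataset across projects, see: Cross project management using service account

Create Service-Account A for backend A and grant it the create, compute, and view roles on service X's resources
Create Service-Account B for backend B and grant it the view and compute roles on service X's resources
~~The problem is that resources should be isolated between the two service accounts, so there is no way for the same user to get the data.~~
In the Console, add Service-Account B to backend A and grant it the view and compute roles

If the clients have no business backend of their own in Google Cloud, Firebase handles mobile user authentication. How do Firebase user authentication and IAM combine?

The client A developers create Service-Account A in the unified login service (Cognito) and grant it the create, compute, and view roles on service X's resources
After the user signs in to client A through Cognito, an STS token is generated from Service-Account A and returned to client A
Client A uses the STS token to call data processing service X directly to create, compute, and view resources
The client B developers likewise create Service-Account B in Cognito and grant it the view and compute roles on service X's resources
After the user signs in to client B through Cognito, an STS token is generated from Service-Account B and returned to client B
Client B uses the STS token to call data processing service X directly to create, compute, and view resources. The resources each user accesses are isolated; how should that be handled?

For scenario 2

Sign in with an OAuth Client ID ("authenticate as an end user"). For how your app verifies user identity, see Google Cloud's Firebase authentication. The following steps implement a data operations platform with no backend of its own.

Bring service X online and connect it to the data operations platform; use IAM to grant user A view permission on all resources.
After user A signs in with OAuth 2.0, they are prompted for the service permissions (auth scopes) being requested.
Enter "service X", query its resources, and display them.

References

Cloud Endpoints overview https://cloud.google.com/endpoints/docs/openapi/about-cloud-endpoints?hl=zh-cn
Cloud Endpoints architecture overview https://cloud.google.com/endpoints/docs/images/endpoints_arch.png?hl=zh-cn
Cross project management using service account https://stackoverflow.com/questions/35479025/cross-project-management-using-service-account
Authenticating as an end user https://cloud.google.com/docs/authentication/end-user
Authenticating users on App Engine using Firebase https://cloud.google.com/appengine/docs/standard/python/authenticating-users-firebase-appengine#managing-user-data-in-datastore The example is fairly simple and does not combine with IAM for access control.
IBM OAuth 2.0 workflow https://www.ibm.com/support/knowledgecenter/zh/SSPREK_9.0.2/com.ibm.isam.doc/config/concept/con_oauth20_workflow.html#con_oauth20_workflow
"OAuth 2 in Action" (Chinese edition): Baidu Netdisk -> My Documents, read in Kami
[Authentication & Authorization] https://www.cnblogs.com/linianhui/category/929878.html
[OIDC in Action] A detailed walkthrough of the OIDC flow https://www.cnblogs.com/linianhui/category/1121078.html
Sample for OIDC, the code for the documents above https://github.com/linianhui/oidc.example
Identity Server 4 - Hybrid Flow - MVC client authentication https://www.cnblogs.com/cgzl/p/9253667.html https://www.cnblogs.com/cgzl/tag/OAuth2/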

·End·

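14 - All About Rate Limiting

Background

Limiting the number of API requests is part of network security; a flood of API requests causes high load. While studying rudr, I also saw rate limiting offered as a trait, which makes it very convenient to use: business developers do not need to care about the rate-limiting logic.

Glossary

traffic shaping

Packets are delayed until they conform.

traffic policing

Non-conforming packets may be discarded (dropped) or reduced in priority.

What and The Importance

What Is API Rate Limiting

Cluster flow control: addresses the problem that uneven traffic across a cluster makes overall rate limiting ineffective; per-instance limits alone cannot precisely cap total traffic.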

Best Practices For API Rate Limiting

How to Throttle API Calls

Three Methods of Implementing API Rate-Limiting

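Adaptive System Protection

Load (adaptive)
CPU usage
Average RT
Concurrent threads
Inbound QPS

Request Queues

Throttling

Used by setting up a temporary state, allowing the API to assess each request.

Rate-limiting Algorithms

Leaky Bucket

As a meter: a mirror image of the token bucket algorithm.
- The token bucket adds tokens at a fixed rate; the leaky bucket leaks water at a fixed rate.
- With a token bucket, a request takes a token from the bucket and is limited when none is available; with a leaky bucket, a request drips water into the bucket and is limited when the bucket is full.
As a queue: spaces requests evenly, strictly controlling the interval between the requests that pass.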
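
The "as a meter" variant can be sketched as a water level that drains at a fixed rate and rises by one unit per admitted request. The class name and the injectable `clock` are illustrative assumptions; compare it with a token bucket, where the counter is filled over time and drained by requests instead.

```python
import time


class LeakyBucketMeter:
    def __init__(self, leak_rate: float, capacity: float, clock=time.monotonic):
        self._leak_rate = leak_rate    # units of water drained per second
        self._capacity = capacity      # level at which requests are limited
        self._level = 0.0
        self._clock = clock
        self._last = clock()

    def allow(self) -> bool:
        now = self._clock()
        # Drain for the elapsed time; the level never goes below empty.
        self._level = max(0.0, self._level - (now - self._last) * self._leak_rate)
        self._last = now
        if self._level + 1.0 <= self._capacity:
            self._level += 1.0         # this request adds one drop
            return True
        return False                   # the bucket would overflow
```

With a leak rate of 1 per second and a capacity of 3, three requests pass immediately and later requests pass only as the level drains.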

Fixed Window

Sliding Log

Sliding Window

Impact of rate limiting

References

Everything You need to Know About API Rate Limiting https://nordicapis.com/everything-you-need-to-know-about-api-rate-limiting/
Sentinel: cluster flow control, https://github.com/alibaba/Sentinel/wiki/集群流控

·End·

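15 - Best Deployment Approaches for Microservice Architectures

Background

Upgrading a service, or taking a refactored one live, is inseparable from how the service is deployed smoothly and how traffic is migrated.

References

Flagger
Flagger is a Kubernetes operator that automates the promotion of canary deployments using Istio, Linkerd, App Mesh, Nginx, Contour or Gloo routing for traffic shifting and Prometheus metrics for canary analysis.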

·End·

16 - Cloud Computing Service Modeling

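Introduction

In 2020 it is time to rethink the architecture model: how do we deliver what we offer to the outside world? As a service. Let's look at what kinds of services there are.

Service Model

The common ones are IaaS (host), PaaS (build), SaaS (consume), and FaaS, while the article "Future of Cloud Computing Architecture" mentions more types of service.

Cloud Computing Stack

Each offers a different level of flexibility and control, as shown in the figure below:

SaaS vs PaaS vs IaaS Service Model

Function-as-a-service TODO

Software-as-a-Service

The SaaS model gives your business access to cloud-based web applications without installing new infrastructure.

The Twelve-Factor App

https://12factor.net/ provides a methodology for delivering SaaS:

Use standardized formats for setup automation
Keep a clean boundary with the underlying operating system, for maximum portability between systems
Be suitable for deployment on modern cloud platforms, saving resources on servers and systems administration
Minimize the divergence between development and production, and use continuous delivery for agile development
Scale up without significant changes to tooling, architecture, or development practices.

Authentication and authorization TODO

License Model for SaaS, alias "SaaS License"; it feels similar to AWS Cognito
vs IAM

The article shows the interaction logic between the AWS services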

Platform-as-a-service

With this model, a third-party vendor provides your business with a platform upon which your business can develop and run applications.

Infrastructure-as-a-service

Allows your business to have complete, scalable control over the management and customization of your infrastructure.

Patterns in microservice

architecture trends 2020

EDA

Service composition - anti-pattern

Tightly coupled, because the calling service needs to know the URL, payload, and related details of the service it calls
A change in functionality requires a coordinated effort between multiple teams

Event notifications and event-driven architectures

AsyncAPI #TODO

Data Architecture

Data Mesh

the next enterprise data platform architecture is in the convergence of Distributed Domain Driven Architecture, Self-serve Platform Design, and Product Thinking with Data^[5]. — Zhamak Dehghani

Data Gateways

somewhat like API gateways but focus on the data aspect.

Policy as Code #TODO

Designing for ___

Designing for resilience

Designing for observability

Designing for portability

whether that’s for multi-cloud or hybrid-cloud. In most cases, there are no reasons for architects to design for the lowest common denominator to enable true multi-cloud portability or avoiding vendor lock-in.

Designing for sustainability

This is emerging because people are realizing the software industry is responsible for a level of carbon usage comparable to the aviation industry^[6].

Dapr

It is described as a set of "microservice building blocks for cloud and edge" and is meant to be platform agnostic:

"Dapr is completely platform agnostic, meaning you can run your applications locally, on any Kubernetes cluster, and other hosting environments that Dapr integrates with. This enables developers to build microservice applications that can run on both the cloud and edge with no code changes."

Dapr is a portable, event-driven, runtime for building distributed applications across cloud and edge.

Dapr building blocks

Service Invocation
State management
Publish and subscribe messaging between services
Event-driven resource bindings
Virtual actors – A pattern for stateless and stateful objects that make concurrency simple with method and state encapsulation. Dapr provides many capabilities in its virtual actor runtime including concurrency, state, life-cycle management for actor activation/deactivation and timers and reminders to wake up actors^[7].
Distributed tracing between services
Resiliency

Sidecar architecture and supported infrastructures

Dapr exposes its APIs as a sidecar architecture, either as a container or as a process, not requiring the application code to include any Dapr runtime code.

Dapr running as a side-car process
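Because the sidecar exposes Dapr's building blocks over a local HTTP API, an application can call another service with nothing but an HTTP client. The sketch below builds a service invocation request using only the standard library. It assumes the sidecar listens on HTTP port 3500; the app ID `orders` and method `create` used in the usage note are hypothetical.

```python
import json
import urllib.request

DAPR_HTTP_PORT = 3500  # assumption: the sidecar's HTTP port in this deployment


def invoke_url(app_id, method, port=DAPR_HTTP_PORT):
    """Build the sidecar URL for calling `method` on the service registered as `app_id`."""
    return f"http://localhost:{port}/v1.0/invoke/{app_id}/method/{method}"


def invoke(app_id, method, payload):
    """POST a JSON payload to another service through the local sidecar."""
    req = urllib.request.Request(
        invoke_url(app_id, method),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:  # requires a running sidecar
        return resp.read()
```

For example, `invoke("orders", "create", {"sku": "A-1"})` would ask the sidecar to locate the `orders` service by name and forward the call; the application itself contains no Dapr runtime code.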

Multi-Cloud, open components (bindings, pub-sub, state) from Azure, AWS, GCP

Dapr is completely platform agnostic, meaning you can run your applications locally, on any Kubernetes cluster, and other hosting environments that Dapr integrates with. This enables developers to build microservice applications that can run on both the cloud and edge with no code changes.

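Differences Between Dapr and a Service Mesh

service-mesh vs dapr

In common:

Secure service-to-service communication encrypted with mTLS
Service-to-service metrics collection
Service-to-service distributed tracing
Resiliency through retries on failure

Dapr is developer-centric: it provides service discovery and invocation by name. Dapr also provides other application-level building blocks, such as state management, publish/subscribe, and actors.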

Principles in Microservice

Create an organizational model that provides independence and autonomy to teams
Services are independently deployable
Services are independently scalable
They do not have a single point of failure, only degradation
The design employs asynchronous communication between services
No shared functionality, code, or data exists in the system
Components are easy to understand, and they are small services with clear boundaries

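Reflections

I came across the role of “AWS Solutions Architect”: someone responsible for architecture consulting and design for enterprise customers' applications on AWS, with rich experience in areas such as microservice architecture design and databases.
Is this a technical product role or a technical architect role? And which role plans cloud products such as those of AWS?
It is a technical architect role that leans toward the business side; this is the path of a cloud service architect, which ends in technical architecture consulting.

TODO

[ ] Enterprise software architecture patterns, see kami app

References

[1] Future of Cloud Computing Architecture.pdf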

[2] IBM: IaaS vs. PaaS vs. SaaS, Understand and compare the three most popular cloud computing service models

[3] InfoQ: Software Architecture and Design InfoQ Trends Report—April 2020

[4] InfoQ Trends Report: The Evolution of Technology in Architecture and Design, 2019

[5] Zhamak Dehghani: How to Move Beyond a Monolithic Data Lake to a Distributed Data Mesh

[6] Thomas Betts, Holly Cummins: Software Architecture and Design InfoQ Trends Report – April 2021

[7] Microsoft Open Source Blog: Announcing Distributed Application Runtime (Dapr), an open source project to make it easier for every developer to build microservice applications

·End·

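17 - Controlling Access to Resources

It is the best of times and the most anxious of times: if you stand still, the times will leave you behind without even saying goodbye.

What

Identity & Access Management (IAM): authentication and access control, a solution for providing controlled, secure access to resources. It governs access to AWS resources, that is, which identity is authorized to perform which action on which resource.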

IAM: https://docs.aws.amazon.com/zh_cn/IAM/latest/UserGuide/intro-structure.html

The figure shows three kinds of authorization; after reading the link above, it turns out there are more than three types:

Identity-based policies: provide your users with permission to access the AWS resources in their own account
- Managed policy
  - AWS managed policy
  - Customer managed policy
- Inline policy: embedded in an IAM identity (a user, group, or role)
Resource-based policies: popular for granting cross-account access. Resource-based policies are inline only, not managed.
Other policies: should be used carefully

Q: How do these three differ from the three policy types? A: These three all fall under identity-based policies; for details see: Identity-Based Policies

How

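How to define a Resource elegantly
How to define an Action elegantly
Identity
Policy

Resource

AWS's answer is the ARN (Amazon Resource Name), a naming convention for identifying AWS resources unambiguously.

AWS controls permissions at the level of API calls. Its definitions of the resource formats used in API permissions are very thorough and worth borrowing from.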
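To make the naming rule concrete, here is a minimal sketch that splits an ARN of the general form `arn:partition:service:region:account-id:resource` into its parts. The `Arn` type and `parse_arn` function are names invented for this example, and the sample ARNs in the usage note are illustrative values only.

```python
from typing import NamedTuple


class Arn(NamedTuple):
    partition: str
    service: str
    region: str
    account_id: str
    resource: str


def parse_arn(arn: str) -> Arn:
    """Split an ARN of the form arn:partition:service:region:account-id:resource."""
    parts = arn.split(":", 5)  # the resource part may itself contain colons
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    return Arn(*parts[1:])
```

For example, `parse_arn("arn:aws:iam::123456789012:user/alice")` yields an empty region, because IAM is a global service, while an S3 ARN such as `arn:aws:s3:::my-bucket/key` leaves both the region and the account ID empty.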

Identity

User, Group, Roles

Action

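An Action is an API exposed by a service on AWS. Condition context keys are a feature of AWS IAM: they let you use variables when defining a policy and support complex expressions.

Policy TODO

Policy types: a detailed description of the kinds of policy available, which cover almost every scenario.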

Identity-based policies
Resource-based policies
Permissions boundaries
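Organizations SCPs
Access control lists (ACLs)
Session policies

A policy is written in a description language for authorization rules: it states who may perform which action on which resource under which conditions. It is made up of:

“Version”
Statement: the content of the policy, one or more statements
- Effect: Allow or Deny
- Action: the specific operation; see AWS Service Actions and Condition Context Keys for Use in IAM Policies.
- Resource: the specific resource

The policies attached to one identity can conflict. IAM's resolution strategy can be summed up in two rules: "declare everything explicitly" and "a single veto wins".

Declare everything explicitly: by default, access to a resource is denied; access is allowed only when an Allow permission on the resource is declared explicitly.
A single veto wins: even if one policy grants Allow, any Deny on the resource in another policy results in Deny.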
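The two rules above (default deny, explicit Deny overrides Allow) can be sketched as a small evaluator. This is a deliberately simplified illustration, not AWS's full evaluation logic: it ignores principals, conditions, permissions boundaries, and SCPs, and it matches wildcards case-sensitively with `fnmatchcase`. The two sample policies and the bucket name `reports` are hypothetical.

```python
from fnmatch import fnmatchcase


def _matches(patterns, value):
    """True if value matches any pattern; '*' wildcards are allowed."""
    if isinstance(patterns, str):
        patterns = [patterns]
    return any(fnmatchcase(value, p) for p in patterns)


def is_allowed(policies, action, resource):
    """Default deny; an explicit Deny in any policy overrides every Allow."""
    allowed = False
    for policy in policies:
        for stmt in policy["Statement"]:
            if _matches(stmt["Action"], action) and _matches(stmt["Resource"], resource):
                if stmt["Effect"] == "Deny":
                    return False  # a single veto wins
                allowed = True    # explicitly declared Allow
    return allowed


# Hypothetical policies attached to the same identity.
read_reports = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": "s3:Get*", "Resource": "arn:aws:s3:::reports/*"}
    ],
}
deny_secret = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Deny", "Action": "s3:*", "Resource": "arn:aws:s3:::reports/secret/*"}
    ],
}
```

With both policies attached, reading `reports/q1.csv` is allowed, reading anything under `reports/secret/` is denied despite the Allow, and any action that no statement mentions stays denied by default.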

Open Policy Agent

Getting started: Open Policy Agent: Simplifying Microservice Authorization

OPA defines its own DSL, Rego.
The article above describes one way to implement access control with nginx in a microservice architecture.

OPA getting-started series

Chinese-language material

IAM in Practice TODO

AWS IAM supports a role-based access control (RBAC) paradigm by defining permissions within policies and attaching those to applicable principals (IAM users and roles).

Besides supporting both identity- and resource-based policies, IAM has always supported aspects of attribute-based access control (ABAC) via the optional condition policy element and “expressions in which you use condition operators (equal, less than, etc.) to match the condition in the policy against values in the request”, such as IP address or time of day.

Furthermore, it has supported authorization based on tags.

Implement TODO

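~~How to implement this system, and how each system integrates with it~~ As I dug deeper into IAM, I came across a concept that was new to me, zero trust architecture, which originates from Google's enterprise security paper BeyondCorp: A New Approach to Enterprise Security. Peers in the industry are also putting it into practice, for example ZTO Security's article on the next step of ZTO's IAM architecture design. The next step is to understand this architecture thoroughly.

Every system depends on this IAM access control system. Implementing it takes two steps:

How to implement the system
- Implement OAuth 2.0 and OIDC on top of hydra
- Use keto as a reference for implementing the permission system
How each system integrates with it

All services are implemented in one language, and the IAM functionality is integrated into each service as an SDK. The SDK relies on an elastically scalable data service, and an upper-layer load balancer routes requests to different partitions by AK (access key), with the aim of achieving greater flexibility while preserving performance.

hydra

Pull the hydra source code and start the service locally: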

docker-compose -f quickstart.yml -f quickstart-postgres.yml up --build

TODO

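Read closely: BeyondCorp: A New Approach to Enterprise Security
Read closely: A Brief Discussion of Cloud IAM Design in Support of Zero Trust Security Architecture, to understand the architecture of the whole system
Read closely: Sequence diagrams of OAuth 2.0 in Authlete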

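References

AWS IAM: From Getting Started to Getting Started Again: mentions some useful tools.

IAM Authentication and Access Control: covers the temporary access credential approach.

Alibaba Cloud RAM Policy Notes: a detailed explanation of policies. I do not fully understand the last two steps of the “authorization policy check logic for RAM role identities”: why does it “check whether the primary account that owns the RAM role is authorized”, and why does it check “whether the resource supports cross-account ACL grants”?

Alibaba Cloud Access Control (PDF) product introduction: for the definitions of RAM-User and RAM-Role.

Access Control Approaches for RESTful APIs

An Analysis of Access Control Mechanisms on Current Chinese and International Cloud Computing Platforms: the images are broken.

The 5W1H of Zero Trust: explains what zero trust is and clarifies the concept.

Amazon Cognito allows secure authentication in a world where mobile apps are regularly being accessed by individuals using multiple smart devices: how resources are used by apps.

Exploration and Practice of the Permission Service at Beike Zhaofang: Beike's experience report on its permission service.

AWS Identity and Access Management Gains Tags and Attribute-Based Access Control (2019-02-08): this IAM release adds the ability to embrace attribute-based access control (ABAC) and match AWS resources with IAM principals dynamically to “simplify permissions management at scale”.

Takahiko Kawasaki, co-founder and representative director of Authlete, Inc., working as a software engineer since 1997: the diagrams are very clear.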

·End·

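18 - APM Notes

Background

From using New Relic at work to integrating Sentry into all our applications, it is time to organize this material systematically. As I got to know these tools better, APM came into view. What exactly is APM, where did it come from, and what problem does it solve (or what is it for)? This article addresses these questions based on the material I found.

Glossary

Management model

APM

Short for Application Performance Management, a management model distilled by Gartner.
The APM model has five layers: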

End User Experience

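The first concern is the end user's real experience of application performance. The goal is to help managers understand accurately and in detail what the real user experience looks like.

Runtime Application Architecture

Application architecture mapping, which aims to address the fact that enterprise application architectures are often black or gray boxes.

The complete architecture of the application
The application architecture for a single request

Business Transactions

Application transaction analysis.
GA requires a large number of tracking points; how can we analyze application transactions from the collected data without changing a single line of code?

Establish that the transaction operations in a context belong to the same user
Establish that each step of every transaction operation is a single, unique action

Deep Dive Component Monitoring

Deep application diagnostics:

Obtain runtime performance data without modifying user code
Correlate end-user data, runtime performance data, data metrics, and service runtime metrics effectively
With so many points of interest, how can collectors be deployed conveniently?
Do not degrade the performance of the original application.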
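One common way to collect runtime performance data without touching user code is to wrap functions at load time. The sketch below shows the idea with a plain Python wrapper; it is an illustration of the technique, not how any particular APM agent is implemented, and `instrument`, `timings`, and the `service.handle` function are names invented for the example.

```python
import functools
import time
import types

timings = []  # (function name, elapsed seconds) samples collected by the "agent"


def instrument(owner, func_name):
    """Replace owner.func_name with a timed wrapper, leaving its source untouched.

    owner: the module or object that holds the function.
    """
    original = getattr(owner, func_name)

    @functools.wraps(original)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return original(*args, **kwargs)
        finally:
            # Record the sample even when the call raises.
            timings.append((func_name, time.perf_counter() - start))

    setattr(owner, func_name, wrapper)
    return original


# Hypothetical application code that the agent knows nothing about in advance.
service = types.SimpleNamespace(handle=lambda n: sum(range(n)))
instrument(service, "handle")
```

After `instrument` runs, every call to `service.handle` appends a timing sample to `timings`, while the application code that calls it is unchanged. The wrapper adds only two clock reads per call, which is in line with the requirement not to degrade the original application.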

Analytics / Reporting

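Data must be processed promptly, and in real time when necessary, because problems can occur at any moment.
Analysis reports must be accurate. A large volume of data has no value in itself; it becomes valuable only when analyzed and used for prediction according to a business model.

VS Sentry

“Error log monitoring”, also called “business logic monitoring”, collects and aggregates the error logs produced while a business system runs, and monitors and alerts on them.
It resembles “APM application performance monitoring” but differs from it: an APM system focuses on behavior analysis at the application layer, and the data it collects is mostly operations-oriented.
Sentry, by contrast, collects crash information from the application's underlying code to make code exceptions easier to investigate. In short, it is a troubleshooting tool!
Problems Sentry solves:

Errors cannot be noticed right away
Retrieving error information is relatively inefficient
Log handling is inflexible
Monitoring coverage is limited

Sentry is a modern error logging and aggregation platform that supports all mainstream programming languages and platforms and provides a modern UI.

References

What Is Real APM
DevOps in Practice: Building an Error Log Monitoring System with Sentry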

·End·

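19 - API Trends: Current State

GraphQL

Developed internally at Facebook in 2012, released publicly in 2015, and moved to the newly established GraphQL Foundation in 2018.
It has pros and cons, and is not a good choice for simple APIs.

Avoids the server returning large amounts of redundant data
Cannot make effective use of web caching for queries
The flexibility and richness it brings come with added complexity

The GitHub GraphQL API: GitHub adopted GraphQL to solve two major problems:

Scalability: deliver all the information the client needs in a single request, instead of issuing several requests as with a RESTful API to gather the required data
Collect some meta-information about our endpoint: give different endpoints different OAuth scopes, paginate more flexibly, and fetch only the data the user needs.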
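The "one request instead of several" point can be illustrated by building a single GraphQL request body that asks for a repository's description, star count, and a few open issue titles together. The sketch only constructs the JSON payload and does not send it; the field selection is an assumption about what a client might want, and the owner and repository names in the usage note are placeholders.

```python
import json

# One query gathers data that would otherwise take several REST calls.
QUERY = """
query($owner: String!, $name: String!) {
  repository(owner: $owner, name: $name) {
    description
    stargazerCount
    issues(first: 3, states: OPEN) {
      nodes { title }
    }
  }
}
"""


def build_request(owner, name):
    """Return the JSON body for a GraphQL POST: the query plus its variables."""
    return json.dumps({"query": QUERY, "variables": {"owner": owner, "name": name}})
```

A client would POST `build_request("octocat", "hello-world")` to the GraphQL endpoint with an authorization token; the response then contains exactly the requested fields and nothing else.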

OpenAPI (Swagger API)

OpenAPI evolved from Swagger API 2.0
What’s the difference between Swagger and REST

The net result is that OAS (OpenAPI Specification) is considered to be a standard specification for describing REST APIs. It is not just for developers to consume; it is also intended for use by software.

What it is, and one key characteristic

Versus older architectural styles, the specifics of the REST architectural style — their simplicity, their elegance, and their ability to rely on existing standard networking protocols like the one that makes the World Wide Web work (aka the “Hypertext Transfer Protocol” or “HTTP”) — have made it one of the more enduring and popular architectural styles for networkable APIs. This is one reason that REST APIs are sometimes also called “Web APIs.” Although it is not a requirement, most REST APIs rely on HTTP (the Web’s official protocol) to perform their magic.

This explains why APIs in the REST architectural style are also called “Web APIs”.

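RESTful API

RESTful API Design Guide
Front-end devices communicating with back ends made API architectures popular and even gave rise to the “API First” idea.
RESTful API is currently a fairly mature body of API design theory for internet applications.

Protocol
Domain name
Versioning
Paths
HTTP verbs
Filtering
Status codes
Error handling
Response results
Hypermedia API
Miscellaneous

REST API backward compatibility

Tests
Always add parameters
Do not make optional parameters mandatory
Always add additional HTTP response codes returned by the API
Never delete or modify existing HTTP response code behavior
Change URLs wisely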

WSDL (Web Services Description Language)

It is complementary to other older but still deeply entrenched networkable API architectures like “remote procedure call” or “RPC”.

VS REST API

OAS is complementary to the REST architectural style

References

How to maintain Rest API backward compatibility? https://zubialevich.blogspot.com/2018/09/backward-compatibility.html

·End·

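20 - API Compatibility Design

Introduction

The general goal of backward compatibility is that clients must not break when a service is upgraded to a new minor version or patch.

Glossary

Source Compatibility

Code that compiles against version X of an API will also compile against version Y.

Binary Compatibility

Code that compiled against version X of an API will run correctly in an environment that has version Y of the same API.

Backward-compatible changes

Adding an API interface to an API service
Adding a method to an API interface
Adding an HTTP binding to a method
Adding a field to a request message
Adding a field to a response message
Adding a value to an enum
Adding an output-only resource field

Backward-incompatible changes

Removing or renaming a service, field, or enum value
Changing an HTTP binding
Changing the type of a field
Changing a resource name format
Changing the visible behavior of existing requests
Changing the URL format in the HTTP definition
Adding a read/write field to a resource message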
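Some of the rules above can be checked mechanically. The following sketch compares two versions of a message schema, represented as plain `{field: type}` dictionaries, and reports removed, renamed, or retyped fields while treating added fields as compatible. The schema representation and the function name are assumptions for illustration; behavioral changes, HTTP bindings, and the read/write-field rule are outside what it can detect.

```python
def breaking_changes(old, new):
    """Compare two {field: type} message schemas and list backward-incompatible changes."""
    problems = []
    for field, old_type in old.items():
        if field not in new:
            # A rename looks the same as a removal from the client's point of view.
            problems.append(f"removed or renamed field: {field}")
        elif new[field] != old_type:
            problems.append(f"changed type of {field}: {old_type} -> {new[field]}")
    # Fields that exist only in `new` are additions and are not reported.
    return problems
```

Running such a check in CI against the previously released schema gives an early warning before a minor or patch release breaks existing clients.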

References

Google API Design Guide: Compatibility
Backward Compatibility Guidelines
API Design Guide: Google's general design guide for networked APIs
API Compatibility v2

·End·

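21 - Web API Spec Management Platform

Background

The most important thing a back-end service offers to the outside world is its API. With hundreds or thousands of endpoints, how do we maintain conventions and constraints so that every endpoint follows clear rules across its lifecycle of creation, update, and deprecation?

What conventions govern creating an endpoint?
How can an endpoint be updated with no or minimal impact on consumers?
How should an endpoint be deprecated?

Research

Industry solutions

apiary

Supports API Blueprint and Swagger API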

DNA for your API — powerful, open sourced and developer-friendly. The ease of Markdown combined with the power of automated mock server , tests, validations, proxies, and code samples in your language bindings

Server Mock
Documentation
Traffic Inspector

MuleSoft

the most widely used integration platform

Summary: offers API management, real-time usage metrics, and alerts on sudden traffic changes.

APIMATIC

instant SDK, Code Samples, Test Cases
Continuous Code Generation
API Transformer

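Summary: code generation, continuous code generation, and API transformation are all features we very much need.

Apigee

Google's API management offering for cross-cloud environments.

Web API Design
Mastering Full Lifecycle API Management with Analytics
Securing APIs in the Age of Connected Experiences

Summary: largely answers the three questions raised at the start of this article.

Kong

Combined solutions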

drupal kong api publisher

https://git.drupalcode.org/project/kong_api_publisher/-/tree/1.0.x

kong slack api Governance

Luca Maraschi: API Butler for the Enterprise at Your Commands

How to set up Ops to seamlessly integrate with Slack
How to create an API based on a pre-defined template within Slack
How easy an API registration in Kong can be done via Slack
How to bring command, pipeline and services together seamlessly
Allow for transparent and analytic feedback loops within Slack
How to discover APIs in an easy and immediate way

Kong-Slack-API-Governance

Insomnia

Similar to Postman; it can be combined with Kong Dev Portal through a plugin.

Kong Dev Portal & Workspace

Online editing has no version management; it manages Applications and Services.

Provides a single source of truth for all developers to locate, access, and consume services

https://konghq.com/blog/api-gateway-governance

Others

apigility
Falcon Python web framework
Amazon API Gateway: traffic management, authorization and access control, monitoring, and API version management
SPECCY: a handy toolkit for OpenAPI, with a linter to enforce quality rules, documentation rendering, and resolution

In-house solution

Conclusion

Phase 1

API lifecycle management: API creation and display (For Humans, For Machines), API usage, API deprecation, version management, authorization and access control

TODO LIST

Phase 2

API health management: API traffic, API error rate, and so on. Connect this data with APM.

TODO LIST

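Phase 3

Support more features: online API authoring, warnings for non-conforming APIs, API debugging, API mocking, automated API testing, and code generation (SDKs and samples)

Traffic Inspector
Provides a proxy mode: developers send traffic to a debugging proxy and locate problems by comparing the data against the protocol definition. The idea comes from apiary.io.

TODO LIST

References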

The Web API Checklist – 43 Things To Think About When Designing, Testing, and Releasing your API
Understanding HTTP Idempotence
 API EVANGELIST
API Blueprint
13 Free & Open Source Tools For API Creation, Management & Testing

·End·

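22 - Plans for Migrating Production Services

Background

Migrating servers or upgrading a service must be transparent to consumers.
So how do we achieve a smooth transition between old and new code, or between old and new services? The concrete plan depends on the situation.

Research

Study how major companies carry out service upgrades and migrations, and adopt the approaches that are useful.

Simulate client requests; gray release on the front end

Deploy both the old and new services, complete A/B testing, then roll out gradually (gray release)
How was the database data handled?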
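A gray release needs a stable rule for deciding which users reach the new service. One simple option, sketched here as an assumption rather than as any particular company's approach, is to hash the user ID into a bucket from 0 to 99 and compare it with the rollout percentage. The function name and parameters are invented for the example.

```python
import hashlib


def use_new_service(user_id, rollout_percent):
    """Deterministically place a user in the rollout group by hashing their ID into 0..99."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < rollout_percent
```

Because the bucket depends only on the user ID, a given user always sees the same version at a given percentage, and raising the percentage only ever adds users to the new service. Setting it back to 0 routes everyone to the old service again. The database question above is not addressed by this sketch.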

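Feasible plans

Plans we can execute given our current infrastructure.

Best practices

Finally, what our best practices are: a summary written after putting them into practice.

References

Notes on an API Migration from Rails to Golang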

·End·

3 - 消息队列资料汇总