Skip to content

IPAM: recycle the IP-unused eni in IPRelease handling#4

Draft
AuZhoomLee wants to merge 2 commits into1.15.1-volcenginefrom
ft/lijinzhu/ip-release
Draft

IPAM: recycle the IP-unused eni in IPRelease handling#4
AuZhoomLee wants to merge 2 commits into1.15.1-volcenginefrom
ft/lijinzhu/ip-release

Conversation

@AuZhoomLee
Copy link
Copy Markdown

  1. add ReleaseInterface On NodeOperations that represents releasing interface action if needed when handling ip release
  2. add metrics for release interface static and report at prometheus metric
  3. fully implementation for volcengine CRD Operator: will detect whether is the last private ips release for current interface in every ip release procedure, and then delete the interface which have no any private IPs used if so

Signed-off-by: Jinzhu Lee lijinzhu.azl@bytedance.com

1. add ReleaseInterface On NodeOperations that represents releasing interface action if needed when handling ip release
2. add metrics for release interface static and report at prometheus metric
3. fully implementation for volcengine CRD Operator: will detect whether is the last private ips release for current interface in every ip release procedure, and then delete the interface which have no any private IPs used if so

Signed-off-by: Jinzhu Lee <lijinzhu.azl@bytedance.com>
rename constant action type mark

Signed-off-by: Jinzhu Lee <lijinzhu.azl@bytedance.com>

scopedLog.Info("Deleted the ENI")

// Add the information of the created ENI to the instances manager
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"delete" the ENI from...

func (n *Node) ReleaseInterface(ctx context.Context, release *ipam.ReleaseAction, scopedLog *logrus.Entry) (string, error) {
defer logs(context.TODO(), constant.ModuleNameNodeOps, "CreateInterface")()

var rsc v2.CiliumNode
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rsc是什么的缩写?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resource -> rsc

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resource一般缩写为res

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🆗,看了下 resource 的标准缩写为 res或者是 rsrc, 前面一个还会用于response的缩写,我改为后面一个缩写形式吧,没有歧义

n.manager.metricsAPI.AddIPRelease(string(r.PoolID), int64(len(r.IPsToRelease)))
n.manager.metricsAPI.IncInterfaceRelease(string(r.PoolID))

return true, nil
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

inuse的时候应该返回false

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里是Release成功的情况,代表着该interface已经被释放,当上面err != nil的时候,表示释放时失败了或者不能释放,则表示没有删除该网卡的在资源池中的记录。这里的第一个参数也是对齐了createInterface的返回,但是由于调用的地方,mark released也需要返回ENI的修改状态,所以这里的第一个参数暂时并未被实际的代码使用到,预留以标识网卡是否正确的回收成功。

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

	if rsc.Status.IPAM.IsResourceUsing(release.InterfaceID) {
		return "", nil
	}

inuse的时候返回false是预期的吗?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

是的,如果inuse就是 releasd 为false

}
n.mutex.Unlock()

if _, err := n.releaseInterface(ctx, a.release); err != nil {
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PrepareIPRelease()可以计算出来本次release后是否要删除网卡,是不是可以考虑计算后mark一下,这里就不用每次都调用release了

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里原本的意思是在真正进行released mark后才进行检测该网卡是否要被回收,检测方法是判断一下该网卡是否在Used中还被使用,但是我发现还不能简单的这么判断,因为这个网卡的其他ip也有可能存在released前序的状态,需要同时在released mark后判断,其他的状态字典是否也为空才可以确保所有的ip都没被使用,或者是完全释放(released),才可以考虑被回收。

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

因为这个网卡的其他ip也有可能存在released前序的状态

这里的处理没有并发吧

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

这里没有并发,我指的是,一个网卡的ip下所有的ip,在ip release这个步骤里面,会存在多种ip release的状态,有的ip已经被标记released, 有的可能晚于前面的ip释放目前是 mark for release, 这个时候如果该网卡所有的ip 如果没有完全被包含 released ,就可能在其他状态被使用吧,比如又从mark for release 被扭转成为 do not release。

ruicao93 pushed a commit that referenced this pull request Apr 22, 2024
* IPAM: add allocator provider for Volcengine

support gc tags by user custom or cilium managed by default, but not support VKE (Volcengine K8s Engine) cluster tags detect

Signed-off-by: Jinzhu Lee <lijinzhu.azl@bytedance.com>
ruicao93 pushed a commit that referenced this pull request Apr 22, 2024
* IPAM: add allocator provider for Volcengine

support gc tags by user custom or cilium managed by default, but not support VKE (Volcengine K8s Engine) cluster tags detect

Signed-off-by: Jinzhu Lee <lijinzhu.azl@bytedance.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants