ESXi 5.0 host experiences a purple diagnostic screen with the errors "Failed to ack TLB invalidate" or "no heartbeat"
ESXi 5.0 host experiences a purple diagnostic screen with the errors "Failed to ack TLB invalidate" or "no heartbeat" on HP servers with PCC support (2000091)
Symptoms
- ESXi 5.0 host fails with a purple diagnostic screen
- The purple diagnostic screen or core dump contains messages similar to:
PCPU 39 locked up. Failed to ack TLB invalidate (total of 1 locked up, PCPU(s): 39).
0x41228efc7b88:[0x41800646cd62]Panic@vmkernel#nover+0xa9 stack: 0x41228efe5000
0x41228efc7cb8:[0x4180064989af]TLBDoInvalidate@vmkernel#nover+0x45a stack: 0x41228efc7ce8
@BlueScreen: PCPU 0: no heartbeat, IPIs received (0/1).
...
0x4122c27c7a68:[0x41800966cd62]Panic@vmkernel#nover+0xa9 stack: 0x4122c27c7a98
0x4122c27c7ad8:[0x4180098d80ec]Heartbeat_DetectCPULockups@vmkernel#nover+0x2d3 stack: 0x0
...
NMI: 1943: NMI IPI received. Was eip(base):ebp:cs [0x7eb2e(0x418009600000):0x4122c2307688:0x4010](Src 0x1, CPU140)
Heartbeat: 618: PCPU 140 didn't have a heartbeat for 8 seconds. *may* be locked up
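To check a saved log or core dump excerpt for the messages listed above, a short script can search for them. The following is a minimal sketch: the function name and the regular expressions are illustrative assumptions derived from the sample messages, not an official VMware signature list. A match indicates only that one of these lockup messages is present, not that PCC is the cause.

```python
import re

# Patterns derived from the sample messages in this article (assumption:
# the wording in real logs matches these samples closely enough).
SIGNATURES = {
    "tlb_invalidate": re.compile(r"PCPU (\d+) locked up\. Failed to ack TLB invalidate"),
    "no_heartbeat": re.compile(r"PCPU (\d+): no heartbeat"),
    "heartbeat_warning": re.compile(r"PCPU (\d+) didn't have a heartbeat for \d+ seconds"),
}


def find_lockup_messages(log_text):
    """Return a list of (signature_name, pcpu_number) tuples found in log_text."""
    hits = []
    for line in log_text.splitlines():
        for name, pattern in SIGNATURES.items():
            match = pattern.search(line)
            if match:
                hits.append((name, int(match.group(1))))
    return hits


# Sample lines copied from the Symptoms section.
sample = """PCPU 39 locked up. Failed to ack TLB invalidate (total of 1 locked up, PCPU(s): 39).
@BlueScreen: PCPU 0: no heartbeat, IPIs received (0/1).
Heartbeat: 618: PCPU 140 didn't have a heartbeat for 8 seconds. *may* be locked up"""

for name, pcpu in find_lockup_messages(sample):
    print(name, pcpu)
```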
Cause
On some HP servers, the PCC (Processor Clocking Control or Collaborative Power Control) communication between the VMware ESXi kernel (VMkernel) and the server BIOS does not function correctly.
As a result, one or more PCPUs may remain in SMM (System Management Mode) for many seconds. When the VMkernel notices a PCPU is not available for an extended period of time, a purple diagnostic screen occurs.
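The idea of noticing that a PCPU has been unavailable for too long can be illustrated with a toy heartbeat check. This is a conceptual sketch only and does not represent the VMkernel's actual implementation. The function name, the timestamps, and the data layout are hypothetical. The 8-second value is borrowed from the Heartbeat message quoted in the Symptoms section; this article does not state the threshold at which the purple diagnostic screen is triggered.

```python
def find_stalled_pcpus(last_heartbeat, now, timeout_seconds):
    """Return the sorted PCPU ids whose last heartbeat is older than the timeout.

    last_heartbeat maps a PCPU id to the time (in seconds) it last reported.
    """
    return sorted(
        pcpu
        for pcpu, seen in last_heartbeat.items()
        if now - seen > timeout_seconds
    )


# Hypothetical timestamps: PCPU 39 stopped reporting at t=100.0,
# for example because it is held in SMM.
last_heartbeat = {0: 109.5, 1: 109.8, 39: 100.0}

stalled = find_stalled_pcpus(last_heartbeat, now=110.0, timeout_seconds=8)
print(stalled)  # [39]
```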
Resolution
This issue is resolved in ESXi 5.0 Update 2, in which PCC is disabled by default. For more information, see VMware ESXi 5.0, Patch ESXi500-Update02: VMware ESXi 5.0 Complete Update 2 (2033751) and the ESXi 5.0 Update 2 Release Notes.
To work around this issue in versions prior to ESXi 5.0 U2, disable PCC manually.
To disable PCC:
- Connect to the ESXi host using the vSphere Client.
- Click the Configuration tab.
- In the Software menu, click Advanced Settings.
- Select vmkernel.
- Deselect the vmkernel.boot.usePCC option.
- Restart the host for the change to take effect.
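When auditing several hosts, the rule in this section (manual workaround before ESXi 5.0 Update 2, fixed by default from Update 2 onward) can be written as a small helper. This is a minimal sketch: the function name and the way the version and update level are passed in are assumptions for illustration. A result of False for a version other than 5.0 means only that this article does not cover it.

```python
def needs_manual_pcc_workaround(version, update):
    """Return True for ESXi 5.0 releases earlier than Update 2.

    version: product version string, for example "5.0"
    update:  integer update level (0 for the initial release, 1 for Update 1, ...)
    """
    return version == "5.0" and update < 2


# Hypothetical inventory of (host name, version, update level).
hosts = [
    ("esx-a", "5.0", 0),
    ("esx-b", "5.0", 1),
    ("esx-c", "5.0", 2),
]

for name, version, update in hosts:
    if needs_manual_pcc_workaround(version, update):
        print(name, "disable PCC manually or upgrade to ESXi 5.0 Update 2")
    else:
        print(name, "no manual PCC change required by this article")
```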
Additional Information
For more information, see the HP Customer Advisory article c03543898.
Note: This is a specific case of a Failed to ack TLB invalidate based purple diagnostic screen. For more information about general cases:
- Understanding a Failed to ack TLB invalidate purple diagnostic screen (1020214)
- Interpreting an ESX/ESXi host purple diagnostic screen (1004250)
For more information, see:
- Collecting diagnostic information for VMware products (1008524)
- Filing a Support Request in My VMware (2006985)
See also ESXi hosts that use HP CRU driver fail with a purple diagnostic screen when ECC events occur (2001207).