To celebrate reaching lv4, here is a note moved over from my old blog.
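The Java NIO empty-polling problem
When a Selector has no ready keys, wakeup() has not been called, and no new messages have arrived, Selector.select() should keep blocking. With this bug it returns anyway. Inside a select loop these empty polls spin without pause and CPU usage goes to 100%.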
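Official bug description
The bug only shows up on Linux. There the JDK implements NIO selectors on top of epoll, and that epoll-based implementation contains the bug. Windows does not use epoll, so it is not affected.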
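```java
// Fragment: selector and serverSocketChannel are fields; port, log and receive(key)
// are defined elsewhere in the server class.
selector = Selector.open();
serverSocketChannel = ServerSocketChannel.open();
serverSocketChannel.configureBlocking(false);
serverSocketChannel.socket().setReuseAddress(true);
serverSocketChannel.socket().bind(new InetSocketAddress(port));
// Added: the original fragment never registered the channel with the selector.
serverSocketChannel.register(selector, SelectionKey.OP_ACCEPT);
log.info("Server started, waiting for connections...");
for (;;) {
    // select() normally blocks. If it returns with num == 0, the selected-key set
    // is empty and the while (it.hasNext()) { ... } loop below never runs.
    int num = selector.select();
    if (num == 0) {
        log.error("select wakes up with zero!!!");
    }
    // Equivalent form: while (true) { selector.select(); ... }
    Set<SelectionKey> readyKeys = selector.selectedKeys();
    Iterator<SelectionKey> it = readyKeys.iterator();
    while (it.hasNext()) {
        SelectionKey key = it.next();
        it.remove(); // otherwise the key stays in the selected-key set
        try {
            if (key.isReadable()) {
                receive(key);
            }
            // ... (accept and other event handling elided)
        } catch (Exception e) {
            // The original fragment had no catch block; cancelling the key is one option.
            key.cancel();
        }
    }
}
```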
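Since the problem is tied to the epoll-based selector, it can help to check which selector implementation a given JVM actually uses. The following is a minimal sketch using the standard `SelectorProvider` API. The class name `SelectorProviderCheck` is made up for illustration, and the concrete implementation class names are JDK-internal, so they vary by platform and JDK version.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Selector;
import java.nio.channels.spi.SelectorProvider;

// Hypothetical helper, not part of the original note.
class SelectorProviderCheck {
    // Class name of the JVM-wide default selector provider.
    static String providerName() {
        return SelectorProvider.provider().getClass().getName();
    }

    // Class name of the Selector that Selector.open() creates on this JVM.
    static String selectorName() {
        try (Selector selector = Selector.open()) {
            return selector.getClass().getName();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("os.name  = " + System.getProperty("os.name"));
        System.out.println("provider = " + providerName());
        System.out.println("selector = " + selectorName());
    }
}
```

On a Linux JDK the printed names typically contain `EPoll`; on Windows they typically do not.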
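How Netty works around the bug
- Keep statistics on the Selector's select operations: count every select that completes empty. If N empty polls occur in a row within one period, assume the epoll spin bug has been triggered.
- Rebuild the Selector. If the rebuild request comes from another thread, it is handed to the event loop as a task. Otherwise the existing SocketChannels are deregistered from the old Selector and registered on the new one, and the old Selector is closed.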
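The counting idea in the first bullet can be sketched without Netty. This is a simplified illustration, not Netty's implementation: `EmptyPollDetector` and `record` are hypothetical names, and the reset rules are an assumption chosen to keep the example short.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper illustrating the counting idea; this is not a Netty class.
class EmptyPollDetector {
    private final int threshold; // number of consecutive empty polls that triggers a rebuild
    private int emptyPolls;

    EmptyPollDetector(int threshold) {
        this.threshold = threshold;
    }

    /**
     * Classifies one return of selector.select(timeoutMillis).
     *
     * @return true if the caller should rebuild its selector
     */
    boolean record(int selectedKeys, boolean wokenUp,
                   long startNanos, long endNanos, long timeoutMillis) {
        if (selectedKeys != 0 || wokenUp) {
            emptyPolls = 0; // real events or an explicit wakeup(): not an empty poll
            return false;
        }
        if (endNanos - startNanos >= TimeUnit.MILLISECONDS.toNanos(timeoutMillis)) {
            emptyPolls = 0; // the full timeout elapsed: a legitimate return
            return false;
        }
        emptyPolls++; // returned early with nothing to do
        if (emptyPolls >= threshold) {
            emptyPolls = 0;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        EmptyPollDetector detector = new EmptyPollDetector(3);
        long timeoutMillis = 1000;
        for (int i = 1; i <= 3; i++) {
            // Simulate select() returning 0 keys after only 10 microseconds.
            boolean rebuild = detector.record(0, false, 0L, 10_000L, timeoutMillis);
            System.out.println("premature return #" + i + " -> rebuild = " + rebuild);
        }
    }
}
```

In a real loop, `startNanos` and `endNanos` would be `System.nanoTime()` readings taken immediately before and after the `select(timeoutMillis)` call.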
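Code walkthrough
- 1. selector.select(timeout) is called with a timeout. The selector returns from blocking in one of four cases:
  - 1. An event occurred
  - 2. wakeup was called
  - 3. The timeout elapsed
  - 4. The empty-polling bug

  The first two cases are recognized by a nonzero return value or by the wakenUp flag, and the loop exits. A timeout is recognized by comparing timestamps. Whatever remains is an empty poll, and a dedicated counter is incremented each time one happens. Once the counter reaches 512 (the default threshold), Netty assumes the empty-polling bug has been triggered.
- 2. Once the bug is detected, Netty simply builds a new selector, re-registers the existing channels on it, and closes the old selector. The code is as follows: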
```java
// Abridged from Netty's NioEventLoop. Local declarations (selectCnt, currentTimeNanos,
// timeoutMillis) and the original try/catch error handling are omitted here.
private void select(boolean oldWakenUp) throws IOException {
    for (;;) {
        int selectedKeys = selector.select(timeoutMillis);
        selectCnt ++;
        if (selectedKeys != 0 || oldWakenUp || wakenUp.get() || hasTasks() || hasScheduledTasks()) {
            // - Selected something,
            // - waken up by user, or the task queue has a pending task.
            // - a scheduled task is ready for processing
            break;
        }
        long time = System.nanoTime();
        if (time - TimeUnit.MILLISECONDS.toNanos(timeoutMillis) >= currentTimeNanos) {
            // timeoutMillis elapsed without anything selected.
            // Timed out.
            selectCnt = 1;
        } else if (SELECTOR_AUTO_REBUILD_THRESHOLD > 0 &&
                selectCnt >= SELECTOR_AUTO_REBUILD_THRESHOLD) { // default: 512
            // The code exists in an extra method to ensure the method is not too big to inline as this
            // branch is not very likely to get hit very frequently.
            // Each empty poll increments selectCnt. Once it reaches 512 within one period,
            // assume the empty-polling bug was hit and rebuild the selector.
            selector = selectRebuildSelector(selectCnt);
            selectCnt = 1;
            break;
        }
    }
}

/**
 * Replaces the current {@link Selector} of this event loop with newly created {@link Selector}s to work
 * around the infamous epoll 100% CPU bug.
 * Creates a new selector to work around the empty-polling bug.
 */
public void rebuildSelector() {
    if (!inEventLoop()) {
        execute(new Runnable() {
            @Override
            public void run() {
                rebuildSelector0();
            }
        });
        return;
    }
    rebuildSelector0();
}

private void rebuildSelector0() {
    final Selector oldSelector = selector;
    final SelectorTuple newSelectorTuple;
    // Open a new selector.
    newSelectorTuple = openSelector();
    // Move every channel registered on the old selector over to the new one.
    int nChannels = 0;
    for (SelectionKey key: oldSelector.keys()) {
        Object a = key.attachment();
        if (!key.isValid() || key.channel().keyFor(newSelectorTuple.unwrappedSelector) != null) {
            continue;
        }
        int interestOps = key.interestOps();
        key.cancel();
        SelectionKey newKey = key.channel().register(newSelectorTuple.unwrappedSelector, interestOps, a);
        if (a instanceof AbstractNioChannel) {
            // Update SelectionKey
            ((AbstractNioChannel) a).selectionKey = newKey;
        }
        nChannels ++;
    }
    selector = newSelectorTuple.selector;
    unwrappedSelector = newSelectorTuple.unwrappedSelector;
    // time to close the old selector as everything else is registered to the new one
    // Close the old selector.
    oldSelector.close();
}
```
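The same rebuild step can be tried out with plain JDK NIO, without Netty on the classpath. The sketch below is an illustration under assumptions, not Netty's code: `SelectorRebuilder`, `rebuild` and `demo` are hypothetical names, and error handling is reduced to propagating the exception.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.SelectableChannel;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;

// Hypothetical helper, not part of Netty or of the original note.
class SelectorRebuilder {
    // Moves every valid registration from oldSelector to a fresh Selector,
    // then closes oldSelector and returns the replacement.
    static Selector rebuild(Selector oldSelector) throws IOException {
        Selector newSelector = Selector.open();
        for (SelectionKey key : oldSelector.keys()) {
            if (!key.isValid()) {
                continue;
            }
            int interestOps = key.interestOps();
            Object attachment = key.attachment();
            SelectableChannel channel = key.channel();
            key.cancel(); // drop the registration on the old selector
            channel.register(newSelector, interestOps, attachment);
        }
        oldSelector.close();
        return newSelector;
    }

    // Registers one channel, rebuilds, and reports whether the registration survived.
    static boolean demo() {
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.configureBlocking(false);
            Selector oldSelector = Selector.open();
            server.register(oldSelector, SelectionKey.OP_ACCEPT, "server");
            Selector newSelector = rebuild(oldSelector);
            try {
                SelectionKey newKey = server.keyFor(newSelector);
                return !oldSelector.isOpen()
                        && newKey != null
                        && newKey.interestOps() == SelectionKey.OP_ACCEPT
                        && "server".equals(newKey.attachment());
            } finally {
                newSelector.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("registration moved to new selector: " + demo());
    }
}
```

Interest ops and attachments are carried over explicitly because a new `SelectionKey` is created for each channel. That is also why the Netty code above stores `newKey` back into the `AbstractNioChannel`.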
[References]
- Li Linfeng, 《Netty权威指南》 (The Definitive Guide to Netty)