Does the hedged read feature apply to HBase HFile seeks?

A question: during an HBase scan the scanner seeks to block positions, and from the HDFS code it looks like only DFSInputStream's read method uses hedged reads.

The reason I'm asking is that a production RegionServer with hedged reads configured still hits HDFS access timeouts, and the timeouts are far longer than the hedged read threshold. The Hadoop version in this environment definitely contains the hedged read feature; the hedged read thread pool size is 64 and the threshold is 200 ms.
The HDFS read timeout itself is configured at 5 s, and every timeout takes the full 5 s instead of retrying another DataNode after 200 ms.
From the code, reading a block eventually calls DFSInputStream's read method, so hedged reads should take effect. Could the seek be the cause?
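For reference, the client-side settings described above would look roughly like this in code (the hedged read property names are the standard HDFS client keys; mapping the 5 s read timeout to the client socket timeout is an assumption):

// A sketch of the hedged-read client settings described in the question,
// not the actual production configuration.
import org.apache.hadoop.conf.Configuration;

public class HedgedReadConfig {
    public static Configuration create() {
        Configuration conf = new Configuration();
        // A thread pool size > 0 enables hedged reads in the DFS client.
        conf.setInt("dfs.client.hedged.read.threadpool.size", 64);
        // Start a hedged request if the first read has not returned within 200 ms.
        conf.setLong("dfs.client.hedged.read.threshold.millis", 200);
        // Assumption: the 5 s read timeout mentioned above is the client socket timeout.
        conf.setInt("dfs.client.socket-timeout", 5000);
        return conf;
    }
}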
 
 

hmaster

Upvoted by: libis

The problem lies in this call path:
 
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:847)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:897)
HFileBlock$AbstractFSReader.readAtOffset
StoreFileScanner.seek
 

HFileBlock$AbstractFSReader.readAtOffset calls one of DFSInputStream's read methods. Note that DFSInputStream has several read overloads, and only the one with the following signature supports hedged reads:
public int read(long position, byte[] buffer, int offset, int length)
 
The read method actually invoked on the StoreFileScanner.seek path above, however, is
read(final byte buf[], int off, int len)
which does not support hedged reads.
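To make the difference concrete, here is a rough client-side sketch (the NameNode address and file path are made up) showing the two overloads: only the positional read can trigger a hedged read, while the seek-then-read path used by StoreFileScanner goes through the stateful read:

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HedgedReadPaths {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hypothetical NameNode and file path, for illustration only.
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf);
        try (FSDataInputStream in = fs.open(new Path("/hbase/data/example/hfile"))) {
            byte[] buf = new byte[4096];

            // Positional read (pread): delegates to
            // DFSInputStream#read(long position, byte[] buffer, int offset, int length),
            // the only overload that can trigger a hedged read.
            in.read(0L, buf, 0, buf.length);

            // Stateful read after seek: delegates through DataInputStream to
            // DFSInputStream#read(byte[] buf, int off, int len), i.e. the path
            // in the stack trace below. This path is not hedged.
            in.seek(0L);
            in.read(buf, 0, buf.length);
        }
    }
}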

hmaster

Upvoted by: igloo1986

https://issues.apache.org/jira/browse/HBASE-12411 
https://issues.apache.org/jira/browse/HDFS-6735 
 
These two JIRAs should resolve this: set
 
hbase.storescanner.use.pread=true
and then apply the patch from HDFS-6735.
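A rough sketch of the HBase-side setting (normally this would go in hbase-site.xml; the class name here is just for illustration):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class PreadScannerConfig {
    public static Configuration create() {
        Configuration conf = HBaseConfiguration.create();
        // Property quoted in the answer above: make StoreScanner use positional
        // reads (pread) so block reads go through the hedged-read-capable overload.
        conf.setBoolean("hbase.storescanner.use.pread", true);
        return conf;
    }
}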

igloo1986


What is the data locality ratio?

hmaster


The stack trace is as follows:

hdfs.DFSClient: Failed to connect to /yyyyy:50010 for block, add to deadNodes and continue. java.net.SocketTimeoutException: 3000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/xxxx:39788 remote=/yyyy:50010]
java.net.SocketTimeoutException: 3000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/xxxx:39788 remote=/yyyy:50010]
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:118)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2230)
at org.apache.hadoop.hdfs.RemoteBlockReader2.newBlockReader(RemoteBlockReader2.java:408)
at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReader(BlockReaderFactory.java:796)
at org.apache.hadoop.hdfs.BlockReaderFactory.getRemoteBlockReaderFromTcp(BlockReaderFactory.java:674)
at org.apache.hadoop.hdfs.BlockReaderFactory.build(BlockReaderFactory.java:337)
at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:621)
at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:847)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:897)
at java.io.DataInputStream.read(DataInputStream.java:149)
at org.apache.hadoop.hbase.io.hfile.HFileBlock.readWithExtra(HFileBlock.java:679)
at org.apache.hadoop.hbase.io.hfile.HFileBlock$AbstractFSReader.readAtOffset(HFileBlock.java:1412)
at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockDataInternal(HFileBlock.java:1625)
at org.apache.hadoop.hbase.io.hfile.HFileBlock$FSReaderImpl.readBlockData(HFileBlock.java:1504)
at org.apache.hadoop.hbase.io.hfile.HFileReaderV2.readBlock(HFileReaderV2.java:437)
at org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.loadDataBlockWithScanInfo(HFileBlockIndex.java:259)
at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:634)
at org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:584)
at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seekAtOrAfter(StoreFileScanner.java:247)
at org.apache.hadoop.hbase.regionserver.StoreFileScanner.seek(StoreFileScanner.java:156)
at org.apache.hadoop.hbase.regionserver.StoreScanner.seekScanners(StoreScanner.java:363)
at org.apache.hadoop.hbase.regionserver.StoreScanner.<init>(StoreScanner.java:217)
at org.apache.hadoop.hbase.regionserver.HStore.getScanner(HStore.java:2069)
