Should you close a Hadoop FileSystem?

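The conclusion first: it is best not to call close yourself and to let Hadoop close the FileSystem. Otherwise you can easily close the instance that other code in the same process has also obtained, because FileSystem.get(Configuration) caches the instances it returns. You can, of course, set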

conf.setBoolean("fs.hdfs.impl.disable.cache", true);

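With this setting the FileSystem is no longer cached, but that can cause performance problems, because every call to get then has to establish its own connection to the NameNode. Does leaving the FileSystem unclosed cause a leak? It does not: HDFS does the closing itself. It registers a shutdown hook that closes the cached FileSystem instances when the JVM exits.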
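The pitfall described above can be illustrated with a small toy model. This is only a sketch using the JDK: ToyFs, ToyFsCache and SharedCloseDemo are hypothetical names invented for illustration, not Hadoop classes, and the cache is keyed by a plain URI string rather than by Hadoop's real key.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a FileSystem handle (not the Hadoop class).
class ToyFs {
    private boolean open = true;

    boolean isOpen() { return open; }

    void close() { open = false; }

    String read(String path) {
        if (!open) {
            throw new IllegalStateException("Filesystem closed");
        }
        return "data from " + path;
    }
}

// Hypothetical cache: one shared instance per URI string.
class ToyFsCache {
    private static final Map<String, ToyFs> CACHE = new HashMap<>();

    static synchronized ToyFs get(String uri) {
        // Return the cached instance if one exists, otherwise create and cache it.
        return CACHE.computeIfAbsent(uri, u -> new ToyFs());
    }

    // Plays the role of the shutdown-hook cleanup: close everything still cached.
    static synchronized void closeAll() {
        for (ToyFs fs : CACHE.values()) {
            fs.close();
        }
        CACHE.clear();
    }
}

class SharedCloseDemo {
    public static void main(String[] args) {
        ToyFs a = ToyFsCache.get("hdfs://nn1:8020");
        ToyFs b = ToyFsCache.get("hdfs://nn1:8020");
        System.out.println("same instance: " + (a == b));

        a.close(); // one caller "cleans up" after itself
        try {
            b.read("/tmp/x"); // another caller is still using the handle
        } catch (IllegalStateException e) {
            System.out.println("second holder failed: " + e.getMessage());
        }
    }
}
```

Both callers receive the same object, so a close by one of them breaks the other. That is the reason for leaving the close to the shutdown hook, which the toy closeAll method stands in for.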

The caching code is as follows:

  public static FileSystem get(Configuration conf) throws IOException {
    return get(getDefaultUri(conf), conf);
  }

  /** Returns the FileSystem for this URI's scheme and authority.  The scheme
   * of the URI determines a configuration property name,
   * <tt>fs.<i>scheme</i>.class</tt> whose value names the FileSystem class.
   * The entire URI is passed to the FileSystem instance's initialize method.
   */
  public static FileSystem get(URI uri, Configuration conf) throws IOException {
    String scheme = uri.getScheme();
    String authority = uri.getAuthority();

    if (scheme == null && authority == null) {     // use default FS
      return get(conf);
    }

    if (scheme != null && authority == null) {     // no authority
      URI defaultUri = getDefaultUri(conf);
      if (scheme.equals(defaultUri.getScheme())    // if scheme matches default
          && defaultUri.getAuthority() != null) {  // & default has authority
        return get(defaultUri, conf);              // return default
      }
    }
    
    String disableCacheName = String.format("fs.%s.impl.disable.cache", scheme);
    if (conf.getBoolean(disableCacheName, false)) {
      return createFileSystem(uri, conf);
    }

    return CACHE.get(uri, conf);
  }

  // The get method of the cache (CACHE above): build a Key and look the instance up by it
  FileSystem get(URI uri, Configuration conf) throws IOException {
    Key key = new Key(uri, conf);
    return getInternal(uri, conf, key);
  }
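The property name used to disable the cache is built per scheme, which is why the setting shown earlier is named fs.hdfs.impl.disable.cache. A minimal sketch of that formatting step, using the same format string as the code above (the class name DisableCacheName is hypothetical):

```java
class DisableCacheName {
    // Same format string as in FileSystem.get(URI, Configuration) above.
    static String disableCacheName(String scheme) {
        return String.format("fs.%s.impl.disable.cache", scheme);
    }

    public static void main(String[] args) {
        System.out.println(disableCacheName("hdfs")); // prints fs.hdfs.impl.disable.cache
        System.out.println(disableCacheName("file")); // prints fs.file.impl.disable.cache
    }
}
```

Because the name depends on the scheme, disabling the cache for hdfs does not affect other schemes.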

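In other words, the cache lookup is decided by the key. A key has four fields that determine whether two keys are equal: scheme and authority are taken from the URI, and when we do not pass a URI they come from the configured value of fs.defaultFS. ugi is the current user, and unique is simply given a default value of 0. So whenever these four fields match, you get the same fs.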

The Key structure is as follows:
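The original listing is not reproduced here. As a rough sketch of the idea only, the four fields and their equality could look like the following. ToyKey is a hypothetical class, and a plain String user stands in for the UGI of the current user, so this is not the Hadoop source.

```java
import java.net.URI;
import java.util.Objects;

// Hypothetical model of a cache key with the four fields described above.
class ToyKey {
    final String scheme;
    final String authority;
    final String user;  // assumption: a String stands in for the ugi (current user)
    final long unique;

    // Mirrors the description above: unique defaults to 0.
    ToyKey(URI uri, String user) {
        this(uri, user, 0);
    }

    ToyKey(URI uri, String user, long unique) {
        this.scheme = uri.getScheme() == null ? "" : uri.getScheme();
        this.authority = uri.getAuthority() == null ? "" : uri.getAuthority();
        this.user = user;
        this.unique = unique;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof ToyKey)) {
            return false;
        }
        ToyKey that = (ToyKey) obj;
        // Two keys are equal only when all four fields match; the path is ignored.
        return scheme.equals(that.scheme)
            && authority.equals(that.authority)
            && Objects.equals(user, that.user)
            && unique == that.unique;
    }

    @Override
    public int hashCode() {
        return Objects.hash(scheme, authority, user, unique);
    }
}
```

With this model, hdfs://nn1:8020/a and hdfs://nn1:8020/b map to equal keys for the same user, while a different user or a different authority yields a different key and therefore a different cached instance.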

The closing code: