Scenario
- When developing Java programs, you often use collections to store data: deduplicating elements, adding large numbers of elements, iterating over a collection, and so on. These common operations have specific, more efficient idioms that save both time and memory.
Explanation
- If you know in advance that the number of elements will not exceed a certain count, pass that count to the constructor to set the collection's initial capacity. This avoids the time and memory cost of repeated resizing as elements are added.
```java
var first = new ArrayList<String>(10000);
```
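If the list is created elsewhere and you cannot pass a constructor argument, ArrayList.ensureCapacity gives the same one-shot resize. A minimal sketch, with placeholder element values:

```java
import java.util.ArrayList;

public class EnsureCapacityDemo {
    public static void main(String[] args) {
        ArrayList<String> items = new ArrayList<>();
        // Grow the backing array once up front instead of letting it
        // resize repeatedly while 10000 elements are added.
        items.ensureCapacity(10000);
        for (int i = 0; i < 10000; i++) {
            items.add("hello");
        }
        System.out.println(items.size()); // 10000
    }
}
```

Note that ensureCapacity is declared on ArrayList itself, not on the List interface, so it only helps when you hold a concrete ArrayList reference.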
- When iterating over a Map and you need both keys and values, call Map.entrySet() to get the key/value pairs as a Set directly. This takes less time, and is more efficient, than calling keySet() first and then looking up each value with get(key).
```java
Set<Map.Entry<String, String>> entries = hm.entrySet();
```
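For reference, a minimal sketch of both iteration styles over a small placeholder map (the timed comparison is in the example below):

```java
import java.util.HashMap;
import java.util.Map;

public class MapIterationDemo {
    public static void main(String[] args) {
        Map<String, String> hm = new HashMap<>();
        hm.put("site", "https://blog.csdn.net/infoworld");

        // entrySet: key and value come from the same entry, no extra lookup.
        for (Map.Entry<String, String> e : hm.entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }

        // keySet: each value needs an additional get(key) lookup.
        for (String key : hm.keySet()) {
            System.out.println(key + " -> " + hm.get(key));
        }
    }
}
```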
- To deduplicate an ArrayList, the time taken by the common approaches, from least to most, is HashMap < HashSet < TreeMap (see the sketch after this list):
  - HashMap: check containsKey to skip duplicates, then add the keys back into the ArrayList.
  - HashSet: pass the ArrayList to the HashSet constructor, which drops the duplicates.
  - TreeMap: check containsKey to skip duplicates, then add the keys back into the ArrayList.
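A minimal, untimed sketch of the three approaches on a small placeholder list (the timed comparison is in the example below):

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.TreeMap;

public class DeduplicationDemo {
    public static void main(String[] args) {
        List<Integer> input = List.of(3, 1, 2, 3, 1);

        // 1. HashMap: containsKey check, then collect the keys.
        var map = new HashMap<Integer, Integer>();
        for (var one : input) {
            if (!map.containsKey(one)) {
                map.put(one, one);
            }
        }
        List<Integer> byHashMap = new ArrayList<>(map.keySet());

        // 2. HashSet: the constructor drops duplicates directly.
        List<Integer> byHashSet = new ArrayList<>(new HashSet<>(input));

        // 3. TreeMap: same containsKey pattern; the keys come back sorted.
        var tree = new TreeMap<Integer, Integer>();
        for (var one : input) {
            if (!tree.containsKey(one)) {
                tree.put(one, one);
            }
        }
        List<Integer> byTreeMap = new ArrayList<>(tree.keySet());

        System.out.println(byHashMap + " " + byHashSet + " " + byTreeMap);
    }
}
```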
Example
```java
package test.example;
import org.apache.log4j.Logger;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.junit.runners.JUnit4;
import java.lang.reflect.Field;
import java.util.*;
import java.util.function.Consumer;
@RunWith(JUnit4.class)
public class TestCollection extends TestBase{ // TestBase (not shown) provides setUp/tearDown and the startRecord/endRecord/pDuration timing helpers
private static Logger logger = Logger.getLogger(TestCollection.class);
private final int kIterateCount = 100000;
@Before
public void setUp() {
super.setUp(logger);
}
@After
public void tearDown() {
super.tearDown(logger);
}
protected int getListCapacity(List<?> list){
try {
// Grab ArrayList's Class object
Class<?> arrayListClass = ArrayList.class;
// Look up the private "elementData" field that backs the list
Field elementDataField = arrayListClass.getDeclaredField("elementData");
// Make the private field accessible
// (on newer JDKs this reflective access may need --add-opens java.base/java.util=ALL-UNNAMED)
elementDataField.setAccessible(true);
// Read the field's value, i.e. the backing array
Object[] elementData = (Object[]) elementDataField.get(list);
// The array length is the current capacity
return elementData.length;
} catch (NoSuchFieldException | IllegalAccessException e) {
e.printStackTrace();
}
return 0;
}
@Test
public void testInitialCapacity(){
// Specify the initial capacity up front to improve performance
var first = new ArrayList<String>(10000);
long start = System.nanoTime();
for(int i = 0; i < 10000; ++i){
first.add("hello");
}
long end = System.nanoTime();
logger.info("first duration: " + (end-start));
logger.info("first items size is : " + first.size());
logger.info("first items capacity is : " + getListCapacity(first));
var second = new ArrayList<String>();
start = System.nanoTime();
for(int i = 0; i < 10000; ++i){
second.add("hello");
}
end = System.nanoTime();
logger.info("second duration: " + (end-start));
logger.info("second items2 size is :" + second.size());
logger.info("second items2 capacity is :" + getListCapacity(second));
}
@Test
public void testIteratorHashMap(){
// HashMap hashes its keys, so get(key) is O(1). Iterating with keySet is therefore
// not much slower than entrySet, and can sometimes even be faster.
HashMap<String,String> hm = new HashMap<>();
iteratorMap(hm);
}
@Test
public void testIteratorTreeMap(){
// TreeMap is a red-black tree, so lookups are O(log n): the more entries, the longer get(key) takes.
// Iterating with keySet is therefore much slower than entrySet.
TreeMap<String,String> hm = new TreeMap<>();
iteratorMap(hm);
}
@Test
public void testIteratorMap(){
logger.info("===================== testIteratorHashMap =========================");
testIteratorHashMap();
logger.info("===================== testIteratorTreeMap =========================");
testIteratorTreeMap();
}
public void iteratorMap(Map<String,String> hm){
for(int i = 0; i< 1000000L; ++i){
hm.put("website-"+i,"https://blog.csdn.net/infoworld");
}
Set<Map.Entry<String, String>> entries = hm.entrySet();
StringBuilder sb = new StringBuilder();
var start = startRecord();
for(var item: entries){
sb.append(item.getKey());
sb.append(item.getValue());
}
var end = endRecord();
pDuration(logger,start,end,"EntrySet");
// logger.info(sb.toString());
var keys = hm.keySet();
sb = new StringBuilder();
start = startRecord();
for(var key: keys){
sb.append(key);
sb.append(hm.get(key));
}
end = endRecord();
pDuration(logger,start,end,"KeySet");
// logger.info(sb.toString());
}
@Test
public void testRemoveDuplicationTreeMap(){
Consumer<List<Integer>> func = lists->{
for(int i = 0; i< kIterateCount; ++i){
var value = (int)(Math.random()*100);
lists.add(value);
}
};
List<Integer> lists = new ArrayList<>();
long count = 0;
// Deduplicate using TreeMap
logger.info("===================== TreeMap =========================");
for(int i = 0; i< 1000; ++i){
lists.clear();
func.accept(lists);
var map = new TreeMap<Integer,Integer>();
var first = startRecord();
for(var one: lists){
if(!map.containsKey(one))
map.put(one,one);
}
lists.clear();
var keys = map.keySet();
for(var key : keys)
lists.add(key);
var second = endRecord();
// Keep a running average of the elapsed time (each new sample is averaged with the running value).
count = (count > 0)?(count + second - first)/2:(second - first);
}
logger.info("hashmap duration: "+ count);
// for(var one : lists)
// logger.info("number is: "+one);
}
@Test
public void testRemoveDuplicationHashMap(){
Consumer<List<Integer>> func = lists->{
for(int i = 0; i< kIterateCount; ++i){
var value = (int)(Math.random()*100);
lists.add(value);
}
};
List<Integer> lists = new ArrayList<>();
long count = 0;
// Deduplicate using HashMap
logger.info("===================== HashMap =========================");
for(int i = 0; i< 1000; ++i){
lists.clear();
func.accept(lists);
var map = new HashMap<Integer,Integer>();
var first = startRecord();
for(var one: lists){
if(!map.containsKey(one))
map.put(one,one);
}
lists.clear();
var keys = map.keySet();
for(var key : keys)
lists.add(key);
var second = endRecord();
count = (count > 0)?(count + second - first)/2:(second - first);
}
logger.info("hashmap duration: "+ count);
// for(var one : lists)
// logger.info("number is: "+one);
}
@Test
public void testRemoveDuplicationSet(){
Consumer<List<Integer>> func = lists->{
for(int i = 0; i< kIterateCount; ++i){
var value = (int)(Math.random()*100);
lists.add(value);
}
};
List<Integer> lists = new ArrayList<>();
long count = 0;
// Deduplicate using HashSet
logger.info("=================== Set ===========================");
for(int i = 0; i< 1000; ++i){
lists.clear();
func.accept(lists);
var sets = new HashSet<Integer>(lists);
var first1 = startRecord();
lists.clear();
lists.addAll(sets);
var second1 = endRecord();
count = (count > 0)?(count + second1 - first1)/2:(second1 - first1);
}
logger.info("set duration: "+ count);
// for(var one : lists)
// logger.info("number is: "+one);
}
@Test
public void testRemoveDuplication(){
// 1. Deduplicating with HashSet is not faster than HashMap, but it is much faster than TreeMap.
logger.info("Time unit is nanosecond!");
testRemoveDuplicationHashMap();
testRemoveDuplicationTreeMap();
testRemoveDuplicationSet();
}
}
```
Output
- testRemoveDuplication: deduplicates a list of 100000 elements, loops 1000 times, and averages the elapsed time, comparing HashMap, TreeMap, and HashSet:

```
0 [main] INFO test.example.TestCollection - Time unit is nanosecond!
0 [main] INFO test.example.TestCollection - ===================== HashMap =========================
2101 [main] INFO test.example.TestCollection - hashmap duration: 278803
2101 [main] INFO test.example.TestCollection - ===================== TreeMap =========================
7361 [main] INFO test.example.TestCollection - treemap duration: 3430520
7361 [main] INFO test.example.TestCollection - =================== Set ===========================
10182 [main] INFO test.example.TestCollection - set duration: 356707
```

- testInitialCapacity: adds 10000 elements to an ArrayList created with an initial capacity of 10000 and to one created with the default capacity, and compares the elapsed time:

```
0 [main] INFO test.example.TestCollection - first duration: 171200
6 [main] INFO test.example.TestCollection - first items size is : 10000
6 [main] INFO test.example.TestCollection - first items capacity is : 10000
6 [main] INFO test.example.TestCollection - second duration: 214400
6 [main] INFO test.example.TestCollection - second items2 size is :10000
6 [main] INFO test.example.TestCollection - second items2 capacity is :14053
```

- testIteratorMap: iterates over 1000000 map entries, comparing entrySet (key and value together) against keySet followed by get(key):

```
0 [main] INFO test.example.TestCollection - ===================== testIteratorHashMap =========================
191 [main] INFO test.example.TestCollection - EntrySet duration: 58433200
256 [main] INFO test.example.TestCollection - KeySet duration: 61964700
256 [main] INFO test.example.TestCollection - ===================== testIteratorTreeMap =========================
635 [main] INFO test.example.TestCollection - EntrySet duration: 85129200
837 [main] INFO test.example.TestCollection - KeySet duration: 200515700
```