1. Build the list:

        List<HyperLink> list = new ArrayList<>();
        for (int i = 0; i < 1600000; i++) {
            HyperLink hyperLink = new HyperLink();
            hyperLink.setName("name" + i);
            hyperLink.setUrl("url" + i);
            list.add(hyperLink);
        }
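The post does not show the HyperLink class itself. To make the snippets compilable, a minimal sketch could look like the following. It assumes a plain bean with only the two String fields that the code here touches; the real class may well have more.

```java
// Hypothetical minimal HyperLink bean (an assumption, not the original class).
// It carries only the two fields the snippets in this post read and write.
class HyperLink {
    private String name;
    private String url;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getUrl() {
        return url;
    }

    public void setUrl(String url) {
        this.url = url;
    }
}
```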

2. Timing comparison of the different approaches:

Approach 1:

        // Approach 1: Java 8 stream() + distinct()
        long time1 = System.currentTimeMillis();
        List<String> dorgCodes = list.stream().map(HyperLink::getName).distinct().collect(Collectors.toList());
        List<String> dgoodsCodes = list.stream().map(HyperLink::getUrl).distinct().collect(Collectors.toList());
        long time2 = System.currentTimeMillis();
        System.out.println("time2 - time1:" + (time2 - time1));

Approach 2:

        // Approach 2: deduplicate with a Set
        long time3 = System.currentTimeMillis();
        Set<String> orgSets = new HashSet<>();
        Set<String> goodsSets = new HashSet<>();
        list.forEach(o -> {
            // Set.add() silently ignores values that are already present
            orgSets.add(o.getName());
            goodsSets.add(o.getUrl());
        });
        List<String> orgCodes = new ArrayList<>(orgSets);
        List<String> goodsCodes = new ArrayList<>(goodsSets);
        System.out.println("time3:" + (System.currentTimeMillis() - time3));

Approach 3:

        // Approach 3: deduplicate with List.contains()
        // (variables renamed so this snippet can sit in the same method as approach 2)
        long time4 = System.currentTimeMillis();
        List<String> orgCodeList = new ArrayList<>();
        List<String> goodsCodeList = new ArrayList<>();
        list.forEach(o -> {
            // contains() scans the whole list on every call
            if (!orgCodeList.contains(o.getName())) {
                orgCodeList.add(o.getName());
            }
            if (!goodsCodeList.contains(o.getUrl())) {
                goodsCodeList.add(o.getUrl());
            }
        });
        System.out.println("time4:" + (System.currentTimeMillis() - time4));

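Conclusion: when the list holds fewer than about a thousand elements, approaches 2 and 3 are the faster ones;

    at a hundred thousand elements and above, the List.contains() calls in approach 3 slow down sharply, because every call scans the list linearly and the loop as a whole becomes quadratic;

    at a million elements and above, approaches 1 and 2 take about the same time;

    so approach 2 is currently the best option overall.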
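One caveat about the Set-based approach: HashSet does not guarantee iteration order, so the resulting lists may come out in a different order than the input, whereas stream().distinct() and the List.contains() loop keep first-occurrence order. If order matters, a LinkedHashSet is a possible drop-in replacement. The sketch below is an illustration only; the class OrderedDedup and the method distinctInOrder are hypothetical names, not part of the original code, and this variant was not included in the timings above.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper (an assumption for illustration, not from the original post).
final class OrderedDedup {
    private OrderedDedup() {
    }

    // Returns the distinct elements of values, in order of first occurrence.
    static <T> List<T> distinctInOrder(List<T> values) {
        // LinkedHashSet drops duplicates like HashSet does,
        // but iterates in insertion order.
        Set<T> seen = new LinkedHashSet<>(values);
        return new ArrayList<>(seen);
    }
}
```

For example, OrderedDedup.distinctInOrder(Arrays.asList("b", "a", "b", "c", "a")) returns [b, a, c].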

02-12 06:44