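This started as a Stack Overflow discussion from many years ago, whose answers covered several ways of counting. With a key-value map, we often end up with a key that is some object and a value that is an Integer or Long acting as a counter, so that we can tally how often each key occurs.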

There are many ways to meet this basic requirement. The most basic is to use the JDK Map directly, with an Integer or Long as the value. The code looks like this:

    final Map<String, Integer> freq = new HashMap<String, Integer>();
    int count = freq.containsKey(word) ? freq.get(word) : 0;
    freq.put(word, count + 1);

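The logic is simple: check whether the key exists, get its value if so and use 0 otherwise, then put back the value plus 1. For a key that is already present, that is three method calls: containsKey, get, and put.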

We can go one step further and drop the containsKey check:

    final Map<String, Integer> freq = new HashMap<String, Integer>();
    Integer count = freq.get(word);
    if (count == null) {
        freq.put(word, 1);
    } else {
        freq.put(word, count + 1);
    }

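At this point most people are satisfied: the logic is simple and the performance is acceptable. Think about it, isn't that so? One get plus one put, and the problem is solved.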
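On Java 8 and later (which postdates the Stack Overflow answers discussed here), the same get-then-put pattern can be written as a single `Map.merge` call. The following is a minimal sketch and not one of the variants benchmarked in this post; the class name `MergeCounter` is made up for illustration. The values are still boxed `Integer` objects, so there is no reason to assume it is faster than the get/put version.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper class for illustration; not part of the original post.
class MergeCounter {

    // Counts how often each word occurs, using one Map.merge call per word (Java 8+).
    static Map<String, Integer> count(String[] words) {
        Map<String, Integer> freq = new HashMap<>();
        for (String word : words) {
            // Stores 1 if the key is absent, otherwise stores oldValue + 1.
            freq.merge(word, 1, Integer::sum);
        }
        return freq;
    }

    public static void main(String[] args) {
        Map<String, Integer> freq = count(new String[] {"a", "b", "a"});
        System.out.println("a=" + freq.get("a") + ", b=" + freq.get("b"));
    }
}
```

This mainly buys brevity: the null check moves into the library call, while each update still creates or looks up a boxed `Integer`.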

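Still, this implementation is not as efficient as it could be, so we start trying to write or find something faster, beginning with a look at whether open-source collection libraries have what we need: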

There is Trove, which is worth a look:

    final TObjectIntHashMap<String> freq = new TObjectIntHashMap<String>();
    freq.adjustOrPutValue(word, 1, 1);

That is very elegant. How does it perform? Hard to say without reading the source for the details. What about the well-known Guava?

    AtomicLongMap<String> map = AtomicLongMap.create();
    map.getAndIncrement(word);

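Still elegant. But look at the name, then at the source: it is thread-safe and built for concurrency, which complicates things. Does our scenario need that? If not, intuition says this is bound to be "slow". Let's keep looking: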

    Multiset<String> bag = HashMultiset.create();
    bag.add(word);

This one looks like a good fit. The bag version reads much better, and semantically an interface like this is easier to understand.

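So how do these approaches perform? I ran a simple comparison: the 26 English letters serve as keys and are cycled through evenly a given number of times, comparing each approach purely on elapsed time, with construction cost excluded from the timing. I also added a thread-safe version built on ConcurrentMap; Guava's AtomicLongMap is in fact just a wrapper around the java.util.concurrent ConcurrentMap. The code also contains the ultimate approach, MutableInt. Look for it below: it is the one with the best performance.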

    import gnu.trove.map.hash.TObjectIntHashMap;

    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.ConcurrentMap;
    import java.util.concurrent.atomic.AtomicLong;

    import com.google.common.collect.HashMultiset;
    import com.google.common.collect.Multiset;
    import com.google.common.util.concurrent.AtomicLongMap;

    public class IntMapTest {

        public static void main(String[] args) {
            int[] cycles = { 100, 1000, 10000, 100000 };
            Tester baseLine = new BaseLine();
            Tester testForNull = new UseNullTest();
            Tester useAtomicLong = new UseAtomicLong();
            Tester useTrove = new UseTrove();
            Tester useMutableInt = new UseMutableInt();
            Tester useGuava = new UseGuava();
            Tester useGuava2 = new UseGuava2();

            // Run every variant once per cycle count, in the same JVM.
            for (int i = 0; i < cycles.length; i++) {
                System.out.println("-----With " + cycles[i] + " cycles-----");
                baseLine.test(cycles[i]);
                testForNull.test(cycles[i]);
                useAtomicLong.test(cycles[i]);
                useTrove.test(cycles[i]);
                useMutableInt.test(cycles[i]);
                useGuava.test(cycles[i]);
                useGuava2.test(cycles[i]);
                System.out.println("------------------------");
            }
        }
    }

    // Template for one benchmark case: pre() starts the clock, post() reports elapsed time.
    abstract class Tester {
        long ms;
        // On Java 8+ this yields the 26 single-letter strings;
        // earlier versions also produce a leading empty string.
        static String[] strs = "abcdefghijklmnopqrstuvwxyz".split("");

        void pre() {
            System.out.println("====" + this.getName() + "Test Case ");
            ms = System.currentTimeMillis();
            System.out.println("start at " + ms);
        }

        void post() {
            ms = System.currentTimeMillis() - ms;
            System.out.println("Time used: " + ms + " ms");
        }

        abstract void doAction(int cycles);

        public void test(int cycles) {
            pre();
            doAction(cycles);
            post();
        }

        abstract String getName();
    }

    // containsKey + get + put
    class BaseLine extends Tester {
        final Map<String, Integer> freq = new HashMap<String, Integer>();

        @Override
        void doAction(int cycles) {
            for (int i = 0; i < cycles; i++) {
                for (String word : strs) {
                    int count = freq.containsKey(word) ? freq.get(word) : 0;
                    freq.put(word, count + 1);
                }
            }
        }

        @Override
        String getName() {
            return "BaseLine";
        }
    }

    // get + null check + put
    class UseNullTest extends Tester {
        final Map<String, Integer> freq = new HashMap<String, Integer>();

        @Override
        void doAction(int cycles) {
            for (int i = 0; i < cycles; i++) {
                for (String word : strs) {
                    Integer count = freq.get(word);
                    if (count == null) {
                        freq.put(word, 1);
                    } else {
                        freq.put(word, count + 1);
                    }
                }
            }
        }

        @Override
        String getName() {
            return "TestForNull";
        }
    }

    // Thread-safe variant: ConcurrentMap holding AtomicLong counters
    class UseAtomicLong extends Tester {
        final ConcurrentMap<String, AtomicLong> map = new ConcurrentHashMap<String, AtomicLong>();

        @Override
        void doAction(int cycles) {
            for (int i = 0; i < cycles; i++) {
                for (String word : strs) {
                    // Note: allocates a new AtomicLong on every call, even when the key exists.
                    map.putIfAbsent(word, new AtomicLong(0));
                    map.get(word).incrementAndGet();
                }
            }
        }

        @Override
        String getName() {
            return "AtomicLong";
        }
    }

    // Trove primitive-valued map: one call per increment
    class UseTrove extends Tester {
        final TObjectIntHashMap<String> freq = new TObjectIntHashMap<String>();

        @Override
        void doAction(int cycles) {
            for (int i = 0; i < cycles; i++) {
                for (String word : strs) {
                    freq.adjustOrPutValue(word, 1, 1);
                }
            }
        }

        @Override
        String getName() {
            return "Trove";
        }
    }

    // Mutable counter object stored as the map value
    class MutableInt {
        int value = 1; // note that we start at 1 since we're counting

        public void increment() {
            ++value;
        }

        public int get() {
            return value;
        }
    }

    // get + in-place increment; put only on the first occurrence of a key
    class UseMutableInt extends Tester {
        Map<String, MutableInt> freq = new HashMap<String, MutableInt>();

        @Override
        void doAction(int cycles) {
            for (int i = 0; i < cycles; i++) {
                for (String word : strs) {
                    MutableInt count = freq.get(word);
                    if (count == null) {
                        freq.put(word, new MutableInt());
                    } else {
                        count.increment();
                    }
                }
            }
        }

        @Override
        String getName() {
            return "MutableInt";
        }
    }

    // Guava AtomicLongMap (thread-safe)
    class UseGuava extends Tester {
        AtomicLongMap<String> map = AtomicLongMap.create();

        @Override
        void doAction(int cycles) {
            for (int i = 0; i < cycles; i++) {
                for (String word : strs) {
                    map.getAndIncrement(word);
                }
            }
        }

        @Override
        String getName() {
            return "Guava AtomicLongMap";
        }
    }

    // Guava HashMultiset (bag semantics)
    class UseGuava2 extends Tester {
        Multiset<String> bag = HashMultiset.create();

        @Override
        void doAction(int cycles) {
            for (int i = 0; i < cycles; i++) {
                for (String word : strs) {
                    bag.add(word);
                }
            }
        }

        @Override
        String getName() {
            return "Guava HashMultiSet";
        }
    }

The output is as follows:

    -----With 100 cycles-----
    ====BaseLineTest Case
    start at 1358655702729
    Time used: 7 ms
    ====TestForNullTest Case
    start at 1358655702736
    Time used: 3 ms
    ====AtomicLongTest Case
    start at 1358655702739
    Time used: 14 ms
    ====TroveTest Case
    start at 1358655702753
    Time used: 2 ms
    ====MutableIntTest Case
    start at 1358655702755
    Time used: 2 ms
    ====Guava AtomicLongMapTest Case
    start at 1358655702757
    Time used: 4 ms
    ====Guava HashMultiSetTest Case
    start at 1358655702761
    Time used: 7 ms
    ------------------------
    -----With 1000 cycles-----
    ====BaseLineTest Case
    start at 1358655702768
    Time used: 17 ms
    ====TestForNullTest Case
    start at 1358655702785
    Time used: 7 ms
    ====AtomicLongTest Case
    start at 1358655702792
    Time used: 44 ms
    ====TroveTest Case
    start at 1358655702836
    Time used: 17 ms
    ====MutableIntTest Case
    start at 1358655702853
    Time used: 5 ms
    ====Guava AtomicLongMapTest Case
    start at 1358655702858
    Time used: 9 ms
    ====Guava HashMultiSetTest Case
    start at 1358655702868
    Time used: 50 ms
    ------------------------
    -----With 10000 cycles-----
    ====BaseLineTest Case
    start at 1358655702918
    Time used: 16 ms
    ====TestForNullTest Case
    start at 1358655702934
    Time used: 14 ms
    ====AtomicLongTest Case
    start at 1358655702948
    Time used: 29 ms
    ====TroveTest Case
    start at 1358655702977
    Time used: 10 ms
    ====MutableIntTest Case
    start at 1358655702988
    Time used: 5 ms
    ====Guava AtomicLongMapTest Case
    start at 1358655702993
    Time used: 15 ms
    ====Guava HashMultiSetTest Case
    start at 1358655703009
    Time used: 77 ms
    ------------------------
    -----With 100000 cycles-----
    ====BaseLineTest Case
    start at 1358655703086
    Time used: 124 ms
    ====TestForNullTest Case
    start at 1358655703210
    Time used: 118 ms
    ====AtomicLongTest Case
    start at 1358655703329
    Time used: 240 ms
    ====TroveTest Case
    start at 1358655703569
    Time used: 102 ms
    ====MutableIntTest Case
    start at 1358655703671
    Time used: 45 ms
    ====Guava AtomicLongMapTest Case
    start at 1358655703716
    Time used: 126 ms
    ====Guava HashMultiSetTest Case
    start at 1358655703842
    Time used: 98 ms
    ------------------------
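These figures come from single runs timed with `System.currentTimeMillis()`, with every variant executed one after another in the same JVM and no warm-up, so the small numbers are noisy (Trove, for example, shows 17 ms at 1000 cycles but 10 ms at 10000). A slightly more careful measurement could look like the sketch below, which runs a few untimed warm-up passes and then times one pass with `System.nanoTime()`. The class and method names are hypothetical, only the get-then-put variant is shown, and a dedicated harness such as JMH would be more rigorous still.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical timing sketch; names are illustrative, not from the original post.
class TimingSketch {

    // Builds the 26 single-letter keys explicitly, avoiding String.split("") quirks.
    static String[] keys() {
        String[] keys = new String[26];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = String.valueOf((char) ('a' + i));
        }
        return keys;
    }

    // The get-then-put counter, run for the given number of passes over all keys.
    static Map<String, Integer> countPasses(int cycles) {
        String[] keys = keys();
        Map<String, Integer> freq = new HashMap<>();
        for (int i = 0; i < cycles; i++) {
            for (String key : keys) {
                Integer count = freq.get(key);
                freq.put(key, count == null ? 1 : count + 1);
            }
        }
        return freq;
    }

    // Runs untimed warm-up passes, then returns the elapsed nanoseconds of one timed pass.
    static long timeNanos(int warmupRuns, int cycles) {
        for (int i = 0; i < warmupRuns; i++) {
            countPasses(cycles); // gives the JIT a chance to compile the hot loop
        }
        long start = System.nanoTime();
        countPasses(cycles);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        long nanos = timeNanos(5, 100000);
        System.out.println("Time used: " + (nanos / 1_000_000.0) + " ms");
    }
}
```

The design choice here is only to separate warm-up from measurement; it does not change which approach does less work per increment.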

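General conclusion: use MutableInt in single-threaded code and Guava's AtomicLongMap in multi-threaded code. It is also worth reading how Guava implements addAndGet: it uses a loop, which is quite interesting.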
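To give a feel for what such a loop looks like, here is a simplified sketch of a compare-and-set retry loop over a `ConcurrentMap` of `AtomicLong` counters, using only JDK classes. It is not Guava's actual code (which handles more cases), and the class name `CasCounterSketch` and its methods are made up for illustration. Unlike the `UseAtomicLong` test case above, it allocates a new `AtomicLong` only when the key is missing.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified sketch; not Guava's implementation.
class CasCounterSketch {
    private final ConcurrentMap<String, AtomicLong> map = new ConcurrentHashMap<>();

    // Adds delta to the counter for key and returns the new value.
    long addAndGet(String key, long delta) {
        AtomicLong counter = map.get(key);
        if (counter == null) {
            // Allocate only on a miss; if another thread wins the race, use its counter.
            AtomicLong fresh = new AtomicLong(0);
            AtomicLong existing = map.putIfAbsent(key, fresh);
            counter = (existing != null) ? existing : fresh;
        }
        while (true) {
            long current = counter.get();
            long updated = current + delta;
            // Succeeds only if no other thread changed the value in between; otherwise retry.
            if (counter.compareAndSet(current, updated)) {
                return updated;
            }
        }
    }

    // Returns the current count, or 0 for a key that was never counted.
    long get(String key) {
        AtomicLong counter = map.get(key);
        return counter == null ? 0L : counter.get();
    }

    // Increments one key from several threads and returns the final count.
    static long countFromThreads(int threads, int incrementsPerThread) {
        CasCounterSketch counter = new CasCounterSketch();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.addAndGet("word", 1);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.get("word");
    }

    public static void main(String[] args) {
        System.out.println(countFromThreads(4, 1000));
    }
}
```

The retry loop is what makes the update safe without a lock: a failed `compareAndSet` means another thread got there first, so the thread simply rereads the value and tries again.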

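To sum up: when optimizing this problem, the obvious idea is to reduce the number of method calls. MutableInt is the most efficient precisely because it cuts the map calls to the minimum: one get per increment, after which the counter is updated in place through the reference. That is the power of pointers at work. Of course, real business code has to weigh other factors as well, such as readability and how well the code fits the business logic. We do not always need to chase this level of efficiency, but we should also avoid thoughtlessly writing the baseline code, because it is so obviously improvable. Why not?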
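The "one get" point can be seen in isolation in the sketch below. It uses a one-element `int[]` as the mutable holder instead of the `MutableInt` class from the test code; that substitution, and the class name `HolderCounter`, are assumptions made here to keep the example short.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical variation on the MutableInt idea, using int[] as the holder.
class HolderCounter {

    // Counts words with one map lookup per occurrence of an already-seen key.
    static Map<String, int[]> count(String[] words) {
        Map<String, int[]> freq = new HashMap<>();
        for (String word : words) {
            int[] holder = freq.get(word); // the only map call when the key exists
            if (holder == null) {
                freq.put(word, new int[] {1}); // first occurrence: one extra put
            } else {
                holder[0]++; // updated through the reference; the map is not touched
            }
        }
        return freq;
    }

    public static void main(String[] args) {
        Map<String, int[]> freq = count(new String[] {"x", "y", "x", "x"});
        System.out.println("x=" + freq.get("x")[0] + ", y=" + freq.get("y")[0]);
    }
}
```

Because the map stores a reference to the holder, incrementing the holder changes what later lookups see, with no second hash computation and no new boxed `Integer` per update.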

Note: the individual implementations in this post come from the various Stack Overflow answers; the test code was written by me.

Posted on 2013-01-20 12:40 by changedi. Category: Java Technology
