HashMap#hash(int) 方法说明

本文介绍了HashMap#hash(int) 方法说明的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

谁能给我解释一下静态 HashMap#hash(int) 方法?

Can someone please explain to me the static HashMap#hash(int) method?

生成均匀分布的哈希背后的理由是什么?

What's the justification behind it to generate uniformly distributed hashes?

/**
 * Applies a supplemental hash function to a given hashCode, which
 * defends against poor quality hash functions.  This is critical
 * because HashMap uses power-of-two length hash tables, that
 * otherwise encounter collisions for hashCodes that do not differ
 * in lower bits. Note: Null keys always map to hash 0, thus index 0.
 */
static int hash(int h) {
    // This function ensures that hashCodes that differ only by
    // constant multiples at each bit position have a bounded
    // number of collisions (approximately 8 at default load factor).
    h ^= (h >>> 20) ^ (h >>> 12);
    return h ^ (h >>> 7) ^ (h >>> 4);
}

举个例子会更容易理解.

An example would make it easier to digest.

澄清我知道运算符、真值表和按位运算.我真的无法真正解码实现或评论.甚至背后的原因.

ClarificationI'm aware of the operators, truth tables and bitwise operations. I just can't really decode the implementation nor the comment really. Or even the reasoning behind it.

推荐答案

>>> 是逻辑右移(无符号扩展)(JLS 15.19 Shift Operators)，^ 是按位异或 (JLS 15.22.1 整数按位运算符).

>>> is the logical right shift (no sign-extension) (JLS 15.19 Shift Operators), and ^ is the bitwise exclusive-or (JLS 15.22.1 Integer Bitwise Operators).

至于这样做的原因，文档提供了一个提示:HashMap 使用 2 的幂长度表，并通过屏蔽掉较高位并仅取其较低位来散列键哈希码.

As to why this is done, the documentation offers a hint: HashMap uses power-of-two length tables, and hashes keys by masking away the higher bits and taking only the lower bits of their hash code.

// HashMap.java -- edited for conciseness
static int indexFor(int h, int length) {
    return h & (length-1);
}

public V put(K key, V value) {
    int hash = hash(key.hashCode());
    int index = indexFor(hash, table.length);
    // ...
}

所以 hash() 试图将相关性带到更高的位，否则会被屏蔽掉(indexFor 基本上丢弃了 hcode> 并且只采用较低的 k 位，其中 length == (1 << k)).

So hash() attempts to bring relevancy to the higher bits, which otherwise would get masked away (indexFor basically discards the higher bits of h and takes only the lower k bits where length == (1 << k)).

将此与 Hashtable(它不应该是二次幂长度表)使用键的哈希码的方式进行对比.

Contrast this with the way Hashtable (which should have NOT a power-of-two length table) uses a key's hash code.

// Hashtable.java -- edited for conciseness
public synchronized V get(Object key) {
    int hash = key.hashCode();
    int index = (hash & 0x7FFFFFFF) % table.length;
    // ...
}

通过执行更昂贵的 % 操作(而不是简单的位掩码)，Hashtable 的性能对低位分布较差的哈希码不太敏感(特别是如果 table.length 是素数).

By doing the more expensive % operation (instead of simple bit masking), the performance of Hashtable is less sensitive to hash codes with poor distribution in the lower bits (especially if table.length is a prime number).

这篇关于HashMap#hash(int) 方法说明的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持！