Skip to content

Latest commit

ย 

History

History
336 lines (210 loc) ยท 14 KB

File metadata and controls

336 lines (210 loc) ยท 14 KB

๋ชฉ์ฐจ

Hash Table
Hash Function
Hash Collision
Dynamic Resizing
Java์˜ HashTable๊ณผ HashMap

Hash Table

key์— ๋Œ€ํ•œ hash ๊ฐ’์„ ์ธ๋ฑ์Šค๋กœ ์‚ฌ์šฉํ•˜์—ฌ key-value์Œ์˜ ๋ฐ์ดํ„ฐ๋ฅผ ์ €์žฅํ•˜๊ณ  ์กฐํšŒํ•˜๋ฉฐ, key-value์Œ์˜ ๋ฐ์ดํ„ฐ์˜ ๊ฐœ์ˆ˜์— ๋”ฐ๋ผ ๋™์ ์œผ๋กœ ํฌ๊ธฐ๊ฐ€ ์ฆ๊ฐ€ํ•˜๋Š” ์ž๋ฃŒ๊ตฌ์กฐ์ด๋‹ค.

bucket๊ณผ hash function

hash table์€ array๋กœ ์ด๋ฃจ์–ด์ ธ์žˆ๊ณ  array ๊ฐ๊ฐ์˜ ์ฃผ์†Œ๋ฅผ bucket ์ด๋ผ ๋ถ€๋ฅธ๋‹ค. hash function ์€ key-value์˜ key๊ฐ’์„ array์˜ index๋กœ ๋งคํ•‘์‹œํ‚ค๋Š” ํ•จ์ˆ˜์ด๋‹ค.

Untitled


Hash Function

ํ•ด์‹œ ํ•จ์ˆ˜๋Š” ์ž„์˜ ํฌ๊ธฐ์˜ ๋ฐ์ดํ„ฐ๋ฅผ ๊ณ ์ • ํฌ๊ธฐ์˜ ๊ฐ’์œผ๋กœ ๋งคํ•‘ํ•˜๋Š”๋ฐ ์‚ฌ์šฉ๋˜๋Š” ํ•จ์ˆ˜์ด๋‹ค.

  • hash function๋กœ hash table์ด๋ผ๊ณ  ๋ถˆ๋ฆฌ๋Š” array์— ์ €์žฅ๋  ๋ฐ์ดํ„ฐ์˜ ์œ„์น˜(index)๋ฅผ ๊ตฌํ•  ์ˆ˜ ์žˆ๋‹ค.
    • index๋ฅผ ๊ตฌํ•  ๋•Œ, ๋ณดํ†ต mod ์—ฐ์‚ฐ์„ ์‚ฌ์šฉํ•œ๋‹ค.
    • hash function์„ ํ†ตํ•ด index๋ฅผ ๊ตฌํ•˜๋Š” ๊ณผ์ •์„ hashing ์ด๋ผ๊ณ  ํ•œ๋‹ค.

Java์˜ Hash Function์˜ ํ•œ๊ณ„

์™„์ „ ํ•ด์‹œ ํ•จ์ˆ˜

๋ชจ๋“  key๋“ค์ด ์„œ๋กœ ๋‹ค๋ฅธ hash๊ฐ’์„ ๊ฐ€์ง€๊ณ  ์žˆ์–ด์„œ ํ•ด์‹œ ์ถฉ๋Œ์ด ์ผ์–ด๋‚˜์ง€ ์•Š๋Š” hash function์ด๋‹ค. key์˜ ์ „์ฒด์ง‘ํ•ฉ์„ ๋ฏธ๋ฆฌ ์•Œ๊ณ  ์žˆ๋Š” ๊ฒฝ์šฐ์— ์™„์ „ ํ•ด์‹œ ํ•จ์ˆ˜๋ฅผ ๋งŒ๋“ค ์ˆ˜ ์žˆ๋‹ค.

๊ฐ์ฒด๊ฐ€ ๋‚˜ํƒ€๋‚ด๋Š” ๊ฐ’์˜ ์ข…๋ฅ˜ โ‰ค 2^32

Boolean, Number(Integer, Long, Double) ์ฒ˜๋Ÿผ ๊ฐ์ฒด๊ฐ€ ๋‚˜ํƒ€๋‚ด๋Š” ๊ฐ’์˜ ๊ฐœ์ˆ˜(key์˜ ์ „์ฒด์ง‘ํ•ฉ)๊ฐ€ ์ ๋‹ค๋ฉด ํ•ด๋‹น ๊ฐ์ฒด๋ฅผ ์™„์ „ ํ•ด์‹œ ํ•จ์ˆ˜์˜ ๋Œ€์ƒ์œผ๋กœ ์‚ผ์„ ์ˆ˜ ์žˆ๋‹ค.

๊ฐ์ฒด๊ฐ€ ๋‚˜ํƒ€๋‚ด๋Š” ๊ฐ’์˜ ์ข…๋ฅ˜ > 2^32

String, POJO์™€ ๊ฐ™์ด ๊ฐ์ฒด๊ฐ€ ๋‚˜ํƒ€๋‚ด๋Š” ๊ฐ’์˜ ๊ฐœ์ˆ˜(key์˜ ์ „์ฒด์ง‘ํ•ฉ)๊ฐ€ int๋กœ ํ‘œํ˜„ํ•  ์ˆ˜ ์žˆ๋Š” ๊ฐœ์ˆ˜(2^32)๋ณด๋‹ค ๋งŽ๋‹ค๋ฉด ํ•ด๋‹น ๊ฐ์ฒด๋ฅผ ์™„์ „ ํ•ด์‹œ ํ•จ์ˆ˜์˜ ๋Œ€์ƒ์œผ๋กœ ์‚ผ์„ ์ˆ˜ ์—†๋‹ค.( hashCode() ๋ฉ”์†Œ๋“œ์˜ ๋ฐ˜ํ™˜๊ฐ’์ด int ํ˜•์ด๊ธฐ ๋•Œ๋ฌธ์ด๋‹ค.)

Java์˜ HashMap์—์„œ ์™„์ „ ํ•ด์‹œ ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•  ์ˆ˜ ์—†๋‹ค

HashMap์ด ๋‹ด์„ ์ˆ˜ ์žˆ๋Š” ๊ฐ์ฒด๊ฐ€ ๋‚˜ํƒ€๋‚ด๋Š” ๊ฐ’์˜ ๊ฐœ์ˆ˜๋Š” int ์˜ ํ‘œํ˜„๋ฒ”์œ„์— ๋ฒ—์–ด๋‚˜๊ธฐ ๋•Œ๋ฌธ์ด๋‹ค. ๋งŒ์•ฝ int ํ‘œํ˜„๋ฒ”์œ„๋ฅผ ๋ฒ—์–ด๋‚˜์ง€ ์•Š๋”๋ผ๋„ ๋žœ๋ค ์ ‘๊ทผ์‹œ ์‹œ๊ฐ„๋ณต์žก๋„ O(1)์„ ๋ณด์žฅํ•˜๊ธฐ ์œ„ํ•ด์„œ 2^32 ๊ธธ์ด์˜ ๋ฐฐ์—ด์„ ๊ฐ€์ง€๊ณ  ์žˆ์–ด์•ผ ํ•œ๋‹ค. ์ด๋Š” ๋ฉ”๋ชจ๋ฆฌ ๋‚ญ๋น„๋ฅผ ์•ผ๊ธฐํ•œ๋‹ค.


Hash Collision

hash bucket ์ธ๋ฑ์Šค

int m = hashTableSize; // m : ํ•ด์‹œ ํ…Œ์ด๋ธ”์˜ ํฌ๊ธฐ
int hash = object.hashCode(); // object: ์ž๋ฐ” ๊ฐ์ฒด, hashCode() : ์ž๋ฐ” ๊ฐ์ฒด์˜ hash function 
int index = hash % m; // hash bucket ์ธ๋ฑ์Šค
Java์˜ hashCode ๋ฉ”์†Œ๋“œ

/**
* Returns a hash code value for the object. This method is
* supported for the benefit of hash tables such as those provided by
* {@link java.util.HashMap}.
* @return  a hash code value for this object.
* @see     java.lang.Object#equals(java.lang.Object)
* @see     java.lang.System#identityHashCode
*/
@HotSpotIntrinsicCandidate
public native int hashCode();

/**
* Returns the same hash code for the given object as
* would be returned by the default method hashCode(),
* whether or not the given object's class overrides
* hashCode().
* The hash code for the null reference is zero.
*
* @param x object for which the hashCode is to be calculated
* @return  the hashCode
* @since   1.1
* @see Object#hashCode
* @see java.util.Objects#hashCode(Object)
*/
@HotSpotIntrinsicCandidate
public static native int identityHashCode(Object x);
  • ์ž๋ฐ” ๊ฐ์ฒด์˜ hashCode() ๋ฉ”์†Œ๋“œ๋Š” intํ˜•์˜ ํ•ด์‹œ๊ฐ’์„ ๋ฐ˜ํ™˜ํ•œ๋‹ค. ๊ธฐ๋ณธ์ ์œผ๋กœ System.identityHashCode ๋ฐ˜ํ™˜๊ฐ’์„ ๋ฐ˜ํ™˜ํ•˜๋ฉฐ, ์‚ฌ์šฉ์ž๊ฐ€ ์žฌ์ •์˜ํ•  ์ˆ˜ ์žˆ๋‹ค.


1/m ํ™•๋ฅ ์˜ ํ•ด์‹œ ์ถฉ๋Œ

hash function์˜ ํ‘œํ˜„๋ฒ”์œ„๋ฅผ m์œผ๋กœ ์ขํž˜์œผ๋กœ์จ ์„œ๋กœ ๋‹ค๋ฅธ hashCode๊ฐ’์„ ๊ฐ€์ง„ ๊ฐ์ฒด๊ฐ€ 1/m ์˜ ํ™•๋ฅ ๋กœ ๋™์ผํ•œ ํ•ด์‹œ ๋ฒ„ํ‚ท ์ธ๋ฑ์Šค๋ฅผ ๊ฐ€์งˆ ์ˆ˜ ์žˆ๋Š” ํ•ด์‹œ ์ถฉ๋Œ(hash collision) ์ด ๋ฐœ์ƒํ•œ๋‹ค. ํ•ด์‹œ ์ถฉ๋Œ์€ hash function์„ ์•„๋ฌด๋ฆฌ ์ž˜ ๊ตฌํ˜„ํ•˜์—ฌ๋„ ์ƒ๊ด€์—†์ด ๋ฐœ์ƒํ•œ๋‹ค.

Untitled

ํ•ด์‹œ ์ถฉ๋Œ ํ•ด๊ฒฐ๋ฐฉ๋ฒ•

ํ•ด์‹œ ์ถฉ๋Œ์ด ๋ฐœ์ƒํ•ด๋„ Open Addressing, Seperate Chaining ๋ฐฉ์‹์„ ํ†ตํ•ด key-value ๊ตฌ์กฐ์˜ ๋ฐ์ดํ„ฐ๋ฅผ ์ €์žฅ, ์กฐํšŒํ•  ์ˆ˜ ์žˆ๋‹ค.


#๏ธโƒฃOpen Addressing (๊ฐœ๋ฐฉ ์ฃผ์†Œ๋ฒ•)

๋ฐ์ดํ„ฐ๋ฅผ ์‚ฝ์ž…ํ•˜๋ ค๋Š” hash bucket์ด ์ด๋ฏธ ์‚ฌ์šฉ์ค‘์ด๋ผ๋ฉด ๋น„์–ด์žˆ๋Š” bucket์„ ์ฐพ์•„ ๋ฐ์ดํ„ฐ๋ฅผ ์‚ฝ์ž…ํ•˜๋Š” ๋ฐฉ์‹์ด๋‹ค.

  • Worst Case์˜ ๊ฒฝ์šฐ ๋น„์–ด์žˆ๋Š” bucket์„ ์ฐพ์ง€ ๋ชปํ•˜๊ณ  ํƒ์ƒ‰์„ ์‹œ์ž‘ํ•œ ์œ„์น˜๋กœ ๋˜๋Œ์•„์˜ฌ ์ˆ˜ ์žˆ๋‹ค.
    • ๋ฐ์ดํ„ฐ๊ฐ€ ์กด์žฌํ•˜๋Š” bucket์ด ๋ชจ์—ฌ์žˆ์œผ๋ฉด Worse Case ๋ฐœ์ƒ ๋นˆ๋„๊ฐ€ ๋†’์•„์ง„๋‹ค.
  • ๋น„์–ด์žˆ๋Š” bucket์„ ํƒ์ƒ‰ํ•˜๋Š” ๋ฐฉ์‹์—๋Š” Linear Probing, Quadratic Probing, Double Hashing ์ด ์žˆ๋‹ค.

1. Linear Probing (์„ ํ˜• ํƒ์ƒ‰)

ํ•ด์‹œ ์ถฉ๋Œ์ด ๋ฐœ์ƒํ•œ ํ˜„์žฌ bucket index๋ถ€ํ„ฐ ๊ณ ์ •ํญ n ๋งŒํผ ์ด๋™ํ•˜๋ฉด์„œ ๋น„์–ด์žˆ๋Š” bucket์„ ์ฐพ์•„ ๋ฐ์ดํ„ฐ๋ฅผ ์ €์žฅํ•˜๋Š” ๋ฐฉ์‹์ด๋‹ค.

  • hi(x) = (Hash(x) + i) % arraySize
    • h(k) โ†’ h(k)+n โ†’ h(k)+2n โ†’ h(k)+3n ...

Linear Probing ์˜์‚ฌ์ฝ”๋“œ

while(Node != null){  // ํƒ์ƒ‰ ๋…ธ๋“œ๊ฐ€ ๋น„์–ด์žˆ๋‹ค๋ฉด searchKey๊ฐ€ ์•„์ง ์ €์žฅ์ด ์•ˆ๋œ ๊ฒƒ

  if(Node.key == searchKey) return Node.value;  
  
  Node = Node.next;   // ๊ทœ์น™์— ๋งž๋Š” ๋‹ค์Œ ๋…ธ๋“œ
}

Linear Probing ์˜ˆ์‹œ

Untitled

Linear Probing ๋‹จ์  : Primary Clustering ๋ฌธ์ œ

๋ฐ์ดํ„ฐ๊ฐ€ ์กด์žฌํ•˜๋Š” filled bucket๋“ค์ด ๋ชจ์—ฌ์žˆ๋‹ค๋ฉด ๋น„์–ด์žˆ๋Š” bucket์„ ์ฐพ๊ธฐ ๊นŒ์ง€์˜ ํƒ์ƒ‰ ์‹œ๊ฐ„์ด ๋Š˜์–ด๋‚œ๋‹ค. Primary Cluster๊ฐ€ ํ˜•์„ฑ๋˜๋ฉด ๋น ๋ฅด๊ฒŒ ์ปค์ง€๊ณ  ๊ฒฐ๊ตญ ์„ฑ๋Šฅ ์ €ํ•˜๋ฅผ ์•ผ๊ธฐํ•œ๋‹ค.

Untitled


2. Quadratic Probing (์ œ๊ณฑ ํƒ์ƒ‰)

ํ•ด์‹œ ์ถฉ๋Œ์ด ๋ฐœ์ƒํ•œ ํ˜„์žฌ bucket index๋ถ€ํ„ฐ n^2 ๋งŒํผ ์ด๋™ํ•˜๋ฉด์„œ ๋น„์–ด์žˆ๋Š” bucket์„ ์ฐพ์•„ ๋ฐ์ดํ„ฐ๋ฅผ ์ €์žฅํ•˜๋Š” ๋ฐฉ์‹์œผ๋กœ Linear Probing ๋ฐฉ์‹์˜ Primary Clustering ๋ฐœ์ƒ ๊ฐ€๋Šฅ์„ฑ์„ ์ค„์ผ ์ˆ˜ ์žˆ๋‹ค.

  • hi(x) = (Hash(x) + i^2) % arraySize
    • h(k) โ†’ h(k)+1^2 โ†’ h(k)+2^2 โ†’ h(k)+3^2 ...

Quadratic Probing ์˜ˆ์‹œ

Untitled

Quadratic Probing ๋‹จ์  : Secondary Clustering ๋ฌธ์ œ

n^2 ๊ฐ„๊ฒฉ์œผ๋กœ filled bucket์ด ์กด์žฌํ•˜์—ฌ key์˜ hash๊ฐ’(index) ๋ณด๋‹ค ํ›จ์”ฌ ๋–จ์–ด์ง„ ๊ณณ์— ๋ฐ์ดํ„ฐ๊ฐ€ ์‚ฝ์ž…๋˜๋Š” ํ˜„์ƒ์ด๋‹ค. ์ด๋Š” filled bucket ๊ตฐ์ง‘์„ ํฌ๊ฒŒ๋งŒ๋“ค์ง€ ์•Š๊ธฐ ๋•Œ๋ฌธ์— Primary Clustering ๋ณด๋‹ค ๋œ ์‹ฌ๊ฐํ•œ ๋ฌธ์ œ์ด๋‹ค.

Untitled


3. Double Hashing (์ด์ค‘ ํ•ด์‹ฑ)

hash๊ฐ’์„ ๋‹ค๋ฅธ hash function์œผ๋กœ ํ•œ๋ฒˆ ๋” ํ•ด์‹ฑํ•˜์—ฌ hash์˜ ๊ทœ์น™์„ฑ์„ ์—†์• ๋Š” ๋ฐฉ์‹์œผ๋กœ Secondary Clustering ๋ฐœ์ƒ ๊ฐ€๋Šฅ์„ฑ์„ ์ค„์ผ ์ˆ˜ ์žˆ๋‹ค.

  • hi(x) = (Hash(x) + i * Hash2(X)) % arraySize

Double Hashing Probing ์˜ˆ์‹œ

Untitled


Open Addressing์˜ ๋ฐ์ดํ„ฐ ํƒ์ƒ‰ ๋ฐ ์‚ญ์ œ

๋ฐ์ดํ„ฐ ํƒ์ƒ‰(search) ์•Œ๊ณ ๋ฆฌ์ฆ˜

target ๋ฐ์ดํ„ฐ๋ฅผ ์ฐพ๊ฑฐ๋‚˜ empty bucket์— ๋„๋‹ฌํ•˜๊ธฐ ์ „๊นŒ์ง€ ํƒ์ƒ‰(probing)์„ ์ง„ํ–‰ํ•œ๋‹ค. target ๋ฐ์ดํ„ฐ๋ฅผ ์ฐพ๋Š” ๊ณผ์ •์—์„œ empty bucket์— ๋„๋‹ฌํ•˜๋ฉด ํƒ์ƒ‰(probing)์„ ์ข…๋ฃŒํ•˜๋Š” ๋ฌธ์ œ์ ์ด ์žˆ๋‹ค.

๋ฐ์ดํ„ฐ ์‚ญ์ œ์˜ ๋ฌธ์ œ์ 

Untitled

34 ๋ฅผ insertํ•˜๊ณ  35 ๋ฅผ deleteํ•œ ๋’ค 34 ๋ฅผ searchํ•˜๋Š” ์ƒํ™ฉ์„ ๊ฐ€์ •ํ•ด๋ณด์ž.

12 ๋ถ€ํ„ฐ ๋‚ด๋ ค๊ฐ€๋ฉด์„œ ํƒ์ƒ‰์„ ์ง„ํ–‰ํ•˜๋‹ค๊ฐ€ empty bucket(5๋ฒˆ)์„ ๋งŒ๋‚˜๋ฉด ํƒ์ƒ‰์„ ์ข…๋ฃŒํ•˜๊ฒŒ ๋˜์–ด 34 ๊นŒ์ง€ ๊ฒ€์ƒ‰์ด ์ง„ํ–‰๋˜์ง€ ์•Š๋Š”๋‹ค. ์ด๋Š” Open Addressing ํƒ์ƒ‰ ์•Œ๊ณ ๋ฆฌ์ฆ˜(empty bucket์„ ๋งŒ๋‚˜๋ฉด probing ์ข…๋ฃŒ)์œผ๋กœ ์ธํ•œ ๋ฌธ์ œ์ด๋‹ค.

๋ฐ์ดํ„ฐ ์‚ญ์ œ์˜ ๋ฌธ์ œ ํ•ด๊ฒฐ๋ฐฉ๋ฒ•

์‚ญ์ œํ•œ ๋ฐ์ดํ„ฐ์˜ bucket์— dummy node ๋ฅผ ๋„ฃ๊ฑฐ๋‚˜ flag (Occupied, Empty, Deleted) ๋ฅผ ํ™œ์šฉํ•˜์—ฌ ํƒ์ƒ‰์ด ์˜ฌ๋ฐ”๋ฅด๊ฒŒ ์ง„ํ–‰๋˜๋„๋ก ํ•  ์ˆ˜ ์žˆ๋‹ค.


#๏ธโƒฃSeperate Chaining (๋ถ„๋ฆฌ ์—ฐ๊ฒฐ๋ฒ•)

๊ฐ๊ฐ์˜ hash bucket์„ LinkedList๋ฅผ ๊ฐ€๋ฆฌํ‚ค๋Š” ํฌ์ธํ„ฐ๋กœ ๊ตฌ์„ฑํ•˜์—ฌ ํ•ด์‹œ ์ถฉ๋Œ์‹œ ํ•ด๋‹น bucket์˜ LinkedList์— ์ถ”๊ฐ€ํ•˜๋Š” ๋ฐฉ์‹์ด๋‹ค.

  • ์ผ๋ฐ˜์ ์œผ๋กœ Seperate Chaining ๋ฐฉ์‹์€ Open Addressing ๋ฐฉ์‹๋ณด๋‹ค ๋น ๋ฅด๋‹ค.
    • Open Addressing์€ ๋ฐ์ดํ„ฐ๊ฐ€ ์กด์žฌํ•˜๋Š” bucket์ด ๋ชจ์—ฌ์žˆ์œผ๋ฉด Worst Case ๋ฐœ์ƒ ๋นˆ๋„๊ฐ€ ๋†’์•„์ง„๋‹ค
    • Seperate Chaining์€ ํ•ด์‹œ ์ถœ๋Œ์„ ํ”ผํ•˜๋„๋ก ๋ณด์กฐ ํ•ด์‹œ ํ•จ์ˆ˜๋ฅผ ์กฐ์ •ํ•˜๋ฉด Worse Case์— ๊ฐ€๊นŒ์›Œ์ง€๋Š” ๋นˆ๋„๋ฅผ ์ค„์ผ ์ˆ˜ ์žˆ๋‹ค.
  • Java 7์—์„œ HashMap์€ Seperate Chaining ๋ฐฉ์‹์œผ๋กœ ๊ตฌํ˜„๋˜์–ด์žˆ๋‹ค.
  • Seperate Chaining ๋ฐฉ์‹์€ ๋ณด์กฐ ํ•ด์‹œ ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ํ•ด์‹œ ์ถฉ๋Œ ๊ฐ€๋Šฅ์„ฑ์„ ์ค„์ธ๋‹ค.

Untitled

Red-Black Tree๋กœ ๊ตฌํ˜„ํ•œ Seperate Chaining

ํ•˜๋‚˜์˜ hash bucket์— ํ•ด๋‹นํ•˜๋Š” LinkedList์˜ ๋ฐ์ดํ„ฐ ๊ฐœ์ˆ˜๊ฐ€ 8๊ฐœ๊ฐ€ ๋˜๋ฉด ๊ตฌ์กฐ๋ฅผ LinkedList -> Red-Black Tree ๋กœ ๋ณ€ํ™˜ํ•œ๋‹ค. LinkedList ํƒ์ƒ‰ ์„ฑ๋Šฅ์˜ Worst Case๋Š” O(N) ์ด์ง€๋งŒ Red-Black Tree์˜ ๊ฒฝ์šฐ O(logN)์ด๊ธฐ ๋•Œ๋ฌธ์ด๋‹ค.

ํ•˜๋‚˜์˜ hash bucket์— ํ• ๋‹น๋œ ๋ฐ์ดํ„ฐ ๊ฐœ์ˆ˜๊ฐ€ 6๊ฐœ์ด๋ฉด ๊ตฌ์กฐ๋ฅผ Red-Black Tree -> LinkedList๋กœ ๋ณ€ํ™˜ํ•œ๋‹ค. ๋ฐ์ดํ„ฐ ๊ฐœ์ˆ˜๊ฐ€ ์ ์œผ๋ฉด Red-Black Tree์™€ LinkedList๊ฐ„์˜ ํƒ์ƒ‰ ์„ฑ๋Šฅ ์ฐจ์ด๊ฐ€ ๊ฑฐ์˜ ์—†๊ณ , Red-Black Tree๋Š” LinkedList๋ณด๋‹ค ๋ฉ”๋ชจ๋ฆฌ ์‚ฌ์šฉ๋Ÿ‰์ด ๋งŽ๊ธฐ ๋•Œ๋ฌธ์ด๋‹ค.

๋ณ€ํ™˜ ๊ธฐ์ค€์ด 6๊ฐœ, 8๊ฐœ ์ธ ์ด์œ 

๋ณ€ํ™˜ ๊ธฐ์ค€์„ 6๊ฐœ, 8๊ฐœ ์ฒ˜๋Ÿผ 2๊ฐœ๋ผ๋Š” ์—ฌ์œ ๋ฅผ ๋‘์–ด ๊ณผ๋„ํ•œ ๊ตฌ์กฐ ๋ณ€ํ™˜์„ ๋ง‰์„ ์ˆ˜ ์žˆ๋‹ค. ๋งŒ์•ฝ ๋ณ€ํ™˜ ๊ธฐ์ค€์ด 6๊ฐœ , 7๊ฐœ ์ด๋ผ๋ฉด ๋ฐ์ดํ„ฐ๊ฐ€ ๋ฐ˜๋ณต์ ์œผ๋กœ ์‚ฝ์ž…, ์‚ญ์ œ๋˜๋Š” ๊ฒฝ์šฐ ๋ถˆํ•„์š”ํ•œ LinkedList โ†” Tree ๋ณ€ํ™˜์ด ์ผ์–ด๋‚˜ ์„ฑ๋Šฅ ์ €ํ•˜๊ฐ€ ๋ฐœ์ƒํ•  ์ˆ˜ ์žˆ๋‹ค.

LinkedList -> Red-Black Tree๋กœ ๋ณ€ํ™˜๊ฐ€๋Šฅํ•œ ์ด์œ 

HashMap์˜ ๋ฐ์ดํ„ฐ Node ๋Š” TreeNode ์˜ ๋ถ€๋ชจ ํด๋ž˜์Šค์ด๊ธฐ ๋•Œ๋ฌธ์— replacementTreeNode ๋ฉ”์†Œ๋“œ๋ฅผ ํ†ตํ•ด Node -> TreeNode ๋กœ ๋ณ€ํ™˜ํ•  ์ˆ˜ ์žˆ๋‹ค.

// replacementTreeNode : HashMap์˜ ๋‚ด๋ถ€ ํ•จ์ˆ˜
TreeNode<K,V> replacementTreeNode(Node<K,V> p, Node <K,V> next) {
    return new TreeNode<>(p.hash, p.key, p.value, next);
}

// ์ƒ์† ๊ด€๊ณ„ : TreeNode <- LinkedHashMap.Entry <- HashMap.Node
static final class TreeNode<K,V> extends LinkedHashMap.Entry<K,V> { 
		...
}
public class LinkedHashMap<K,V> extends HashMap<K,V> implements Map<K,V> {
		static class Entry<K,V> extends HashMap.Node<K,V> { ... }
}

Open Addressing vs Separate Chaining

Open Addressing Seperate Chaining
Worst Case O(M) O(M)
์บ์‹œ ํšจ์œจ ์ข‹๋‹ค (์—ฐ์†๋œ ๊ณต๊ฐ„์— ๋ฐ์ดํ„ฐ๋ฅผ ์ €์žฅํ•˜๊ธฐ ๋•Œ๋ฌธ์ด๋‹ค) Open Addressing ๋ณด๋‹ค ์ข‹์ง€ ์•Š๋‹ค (ํ•ด์‹œ ์ถฉ๋Œ์‹œ LinkedList์— ๋ฐ์ดํ„ฐ๋ฅผ ์ €์žฅํ•˜๊ธฐ ๋•Œ๋ฌธ์ด๋‹ค)
๊ณต๊ฐ„ ํšจ์œจ ์ข‹๋‹ค (ํ•ด์‹œ ์ถฉ๋Œ์ด ๋ฐœ์ƒํ•œ ๊ฒฝ์šฐ์—๋„ probing์„ ํ†ตํ•ด ๋นˆ bucket์— ์ €์žฅ๋˜๊ธฐ ๋•Œ๋ฌธ์ด๋‹ค) Open Addressing ๋ณด๋‹ค ์ข‹์ง€ ์•Š๋‹ค (ํ•ด์‹œ ์ถฉ๋Œ์ด ๋ฐœ์ƒํ•˜๋ฉด LinkedList์— ์ถ”๊ฐ€๋˜๊ธฐ ๋•Œ๋ฌธ์— ์‚ฌ์šฉ๋˜์ง€ ์•Š๋Š” bucket์ด ์กด์žฌํ•œ๋‹ค)
Resizing ๋นˆ๋„ ๋†’๋‹ค (bucket ์‚ฌ์šฉ๋ฅ ์ด ๋†’์•„ load factor์˜ ์ž„๊ณ„์ ์— ์‰ฝ๊ฒŒ ๋„๋‹ฌํ•˜๊ธฐ ๋•Œ๋ฌธ์ด๋‹ค) Open Addressing ๋ณด๋‹ค ๋‚ฎ๋‹ค (bucket ์‚ฌ์šฉ๋ฅ ์ด ๋‚ฎ๊ธฐ ๋•Œ๋ฌธ์ด๋‹ค)

Dynamic Resizing

hash bucket์˜ ๊ฐœ์ˆ˜๊ฐ€ ์ ๋‹ค๋ฉด ๋ฉ”๋ชจ๋ฆฌ ์‚ฌ์šฉ์„ ์•„๋‚„ ์ˆ˜ ์žˆ์ง€๋งŒ ํ•ด์‹œ ์ถฉ๋Œ๋กœ ์ธํ•ด ์„ฑ๋Šฅ ์ €ํ•˜๊ฐ€ ๋ฐœ์ƒํ•œ๋‹ค. ๊ทธ๋Ÿฌ๋ฏ€๋กœ ์ž๋ฐ”์˜ HashMap์€ load factor๊ฐ€ ์ž„๊ณ„์ (0.75)์„ ๋„˜์–ด๊ฐˆ ๊ฒฝ์šฐ์— hash bucket ๊ฐœ์ˆ˜๋ฅผ 2๋ฐฐ๋กœ ๋Š˜๋ฆฐ๋‹ค.

load factor (์ ์žฌ์œจ)

load factor (ฮฑ) = n / k
	- n : ๋ฐ์ดํ„ฐ๊ฐ€ ์ €์žฅ๋œ bucket ์ˆ˜
	- k : ์ „์ฒด bucket ์ˆ˜

load factor๋Š” hash table์— ๋ฐ์ดํ„ฐ๊ฐ€ ์ฐจ์žˆ๋Š” ๋น„์œจ์„ ๋‚˜ํƒ€๋‚ธ๋‹ค.

  • load factor๊ฐ€ 1์— ๊ฐ€๊นŒ์šฐ๋ฉด hash table์˜ ์„ฑ๋Šฅ์ด ์ €ํ•˜๋˜๊ณ , ๋„ˆ๋ฌด ์ž‘์€ ๊ฐ’์„ ๊ฐ€์ง€๋ฉด ๋น„ํšจ์œจ์ ์ด๋‹ค. ๊ทธ๋Ÿฌ๋ฏ€๋กœ hash table resizing์„ ํ†ตํ•ด load factor ๋ฅผ ์กฐ์ ˆํ•ด์•ผํ•œ๋‹ค.
  • 0.6 โ‰ค load factor โ‰ค 0.75 ๊ฐ€ ๋ฐ”๋žŒ์งํ•˜๋‹ค.

Java์˜ HashTable๊ณผ HashMap

HashTable HashMap
JDK 1.0๋ถ€ํ„ฐ ์กด์žฌํ•œ Java API Java 2์— ์„ ๋ณด์ธ Java Collections Framework API
ํ˜„์žฌ๊นŒ์ง€ HashTable์˜ ๊ตฌํ˜„์— ๋ณ€ํ™”๊ฐ€ ๊ฑฐ์˜ ์—†๋‹ค. (ํ•˜์œ„ ํ˜ธํ™˜์„ฑ์„ ์ œ๊ณตํ•˜๊ธฐ ์œ„ํ•ด ๋‚จ์•„์žˆ๋‹ค) ์ง€์†์ ์œผ๋กœ ๊ฐœ์„ ์ด ์ด๋ค„์ง€๊ณ  ์žˆ๋‹ค.
๋ณด์กฐ ํ•ด์‹œ ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•˜์ง€ ์•Š๋Š”๋‹ค. ๋ณด์กฐ ํ•ด์‹œ ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•˜๊ธฐ ๋•Œ๋ฌธ์— ํ•ด์‹œ ์ถฉ๋Œ์ด HashTable๋ณด๋‹ค ๋œ ๋ฐœ์ƒํ•  ์ˆ˜ ์žˆ๋‹ค.
๋‹ค์ค‘ ์Šค๋ ˆ๋“œ ํ™˜๊ฒฝ์—์„œ Thread safe ํ•˜๋‹ค ๋‹ค์ค‘ ์Šค๋ ˆ๋“œ ํ™˜๊ฒฝ์—์„œ Thread safe ํ•˜์ง€ ์•Š๋‹ค
key๊ฐ’์— null์„ ์ €์žฅํ•  ์ˆ˜ ์—†๋‹ค key๊ฐ’์— null์„ ์ €์žฅํ•  ์ˆ˜ ์žˆ๋‹ค

์ฐธ๊ณ 

wikipedia

Hash Table ์›๋ฆฌ์™€ ๊ตฌํ˜„

Hashtable์˜ ์ดํ•ด์™€ ๊ตฌํ˜„ #1

Java HashMap์€ ์–ด๋–ป๊ฒŒ ๋™์ž‘ํ•˜๋Š”๊ฐ€?

Open Addressing & its Classification to eliminate Collisions

Hashing Initially prepared by Dr lyas iekli improved

What is primary and secondary clustering in hash?

Secondary Clustering

HashMap ํŒŒํ—ค์น˜๊ธฐ 1 (Linked List + Red Black Tree)

Open Addressing vs Seperate Chaining


๋ฉด์ ‘ ์˜ˆ์ƒ ์งˆ๋ฌธ

ํ•ด์‹œ ํ…Œ์ด๋ธ”์— ๋ฐ์ดํ„ฐ๋ฅผ ์ €์žฅํ•˜๋Š” ๊ณผ์ •์„ ์„ค๋ช…ํ•ด๋ณด์„ธ์š”
ํ•ด์‹œ ์ถฉ๋Œ๊ณผ ์ด๋ฅผ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•œ ๋ฐฉ์•ˆ์— ๋Œ€ํ•ด ์„ค๋ช…ํ•ด๋ณด์„ธ์š”
๋ฐ์ดํ„ฐ ์‚ญ์ œ์‹œ ๋ฐœ์ƒํ•  ์ˆ˜ ์žˆ๋Š” Open Addressing์˜ ๋ฌธ์ œ์ ์— ๋Œ€ํ•ด ์„ค๋ช…ํ•ด๋ณด์„ธ์š”
Seperate Chaining์— ๋Œ€ํ•ด ์„ค๋ช…ํ•˜์„ธ์š”
LinkedList์™€ Red-Black Tree๋กœ Seperate Chaining์„ ๊ตฌํ˜„ํ•  ๋•Œ ์–ป์„ ์ˆ˜ ์žˆ๋Š” ์ด์ ์— ๋Œ€ํ•ด ๊ฐ๊ฐ ์„ค๋ช…ํ•ด๋ณด์„ธ์š”