Sunday, November 19, 2017

You are not logged in.

## #1 2015-04-25 03:08:01 pm

McUsrII
Member
Registered: 2012-11-21
Posts: 3046
Website

### A simple Alternative to a small hash table

Hello.

This is something I came up with in order to assign a number to names, like an index if you like for lookup purposes, so I can get back the name, by the index number I retrieved when I stored the name into the table.  -I also needed the id number to be static, and independent of any deleted/inserted items, which is whay I use a table by design, and not a regular AppleScript list, where the item numbers will vary with insertions and deletions.

It is  a replacement for a hash-table, and is is calibrated to hold less than 100 items, although it will search for a free spot (deleted item), should it be full, before it errors. It has a much better performance than a hash table of that size, the downside is that the keys aren't encrypted in anyway, but that is outside of my requirements.

I use this for translating names into indicies, since some algorithms are better at working with indicies than names, (like matrices of various kinds, adjacency matrices in my case).

Why not use a hash table?
Because, I calculated the probabilities for a collision on such a small table, and I found the probability for a collison to be almost 100%, with a filling of 50 elements, for a table size of 100.

The formulae for the probability of a hash collision is 1 - e^(-k(k-1)/2N) where k is the number of elements, and N is the size of the hash key. For 15 elements, this yields a probability of a hash collison of 0.56 ~Â  56%, for 50 elements, the probability is almost 100%.

The hash key I used was really much larger than 7bits, but if you are setting 100 elements as the max number of elements, then 100 is the  7 bit number we have to relate to, since we have to use the hash key modulus 101, to get elements between 1 and 100, which is what the hash keys will be "concentrated" down to (a span of values over 7 bits, since we need 7-bits to represent the decimal number 100 as a  binary number).

Reference:
Hash Collision Probabilities

#### Applescript:

on makeidTable()
script idTable
-- 100 empty strings
property table : {"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", ""}
property elmPtr : 1
on assignID(aName)
if elmPtr < (length of my table) then
set item elmPtr of my table to aName
set assignedId to elmPtr
set elmPtr to elmPtr + 1
return assignedId
else
set emptyItem to false
repeat with i from 1 to (length of my table)
if item i of my table is "" then
set emptyItem to i
exit repeat
end if
end repeat
if emptyItem is not false then
set item emptyItem of my table to aName
return emptyItem
else
error "assignID: the table was full, holding " & my elmCount & " elements."
end if
end if
end assignID
on remvName for anId
set item anId of my table to ""
end remvName

on aName for anId
return item anId of my table
end aName
end script
end makeidTable

set ids to makeidTable()
set idlist to {}
set end of idlist to ids's assignID("London")
set end of idlist to ids's assignID("Oslo")
set end of idlist to ids's assignID("Sidney")
set end of idlist to ids's assignID("Zurich")
set end of idlist to ids's assignID("Geneva")
set end of idlist to ids's assignID("Paris")
set end of idlist to ids's assignID("Rome")
set end of idlist to ids's assignID("Torino")
set end of idlist to ids's assignID("Vienna")

set aCity to aName of ids for 4
--> "Zurich"

Edit
2015-04-25 22:45 CET: I have edited the documentation so that it should be a bit clearer, what it is, the purpose, and why the design.

Last edited by McUsrII (2015-04-25 06:37:41 pm)

Offline

## #2 2015-04-25 08:21:08 pm

DJ Bazzie Wazzie
Member
From:: the Netherlands
Registered: 2004-10-20
Posts: 2724
Website

### Re: A simple Alternative to a small hash table

I like the code but I think it differs too much from an hash table to consider it as an alternative.

p.s. Since you mentioned collision issues, here is how I solved it in my hash-like-table code exchange.

Offline

## #3 2015-04-26 03:15:21 am

McUsrII
Member
Registered: 2012-11-21
Posts: 3046
Website

### Re: A simple Alternative to a small hash table

Hello.

Thanks.

I made it as an alternative to using a hash-table, to cover my needs for a datastructure to let me easily relate a string value to an integer, with the lowest possible overhead.

It is a constant time lookup table, with a very low probability of collision detection/handling. So it is an alternative, in the sense that one of the purposes of a hash table, is to provide constant time lookup of values. There are no generation of encrypted keys here, which may also be viewed as a purpose of  having a hash table, but that I didn't need, I just needed a way to relate a key made up of a string into an integer, with the smallest possible probability of a collision.

It is also much simpler than a hash table, in that there really isn't any way to regenerete the hash value for a key. Here you'll have to keep the "id" you get for the value, also since inserting the same value twice, will generate two different "ids", so the idea here is from whence you have your "id", you'll use it, until you need the name of the object, which you retrieve via the "id".

Last edited by McUsrII (2015-04-26 03:28:36 am)

Offline

## #4 2015-04-26 05:44:40 am

McUsrII
Member
Registered: 2012-11-21
Posts: 3046
Website

### Re: A simple Alternative to a small hash table

Hello.

I added a layer of datavalidation, to my original values, and then a hash table didn't seem to be such a bad idea after all, because I wanted to ensure that a key existed in my table. The above is more of a one way street, as for now I need to verify that a name is in my table, before I start to using any keys. (The former approach, just assumed a user was happy with getting a key returned for a name.)

Now it is a small, yet fully fledged hash table with collision detection.

I did receive 19 collisions, during entry of the  58 city names, in the driver (math in post #1). I think it still will perform good enough for my usage.

#### Applescript:

on newHashTable()
-- [url]http://macscripter.net/viewtopic.php?pid=180425#p180425[/url]
-- Removed bug, computes hashkey correctly when key consists of a single character.
script hashTable
-- 100 empty strings
property table : {"", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", "", ""}
-- table should be made to be an even number, so that table size + 1 is a prime
property tableSize : (count table)

on makeInitalHashKey for aValue
tell aValue
set clist to its characters
set keysize to its length
end tell

set h to id of item 1 of clist
if keysize > 1 then
repeat with i from 2 to keysize
set h to (h * 32 + (id of item i of clist)) mod ((my tableSize) + 1)
-- horners method
end repeat
else
set h to h mod ((my tableSize) + 1)
end if

if h = 0 then set h to 1
return h

end makeInitalHashKey

on genKey for aValue
-- Inserts a value into the hash-table, and returns
-- the hash-key that can be used to retrieve it.

-- You gotta test this, or test the output
-- to ensure you got just unique values

-- This is based on Sedgewick 2nd ed. p. 233
set h to makeInitalHashKey for aValue

if item h of my table is "" then
set item h of my table to aValue
return h
else if item h of my table is aValue then
return h
else
-- **            log "Collided name = " & aValue & " h = " & h
-- collision detection/handling
set foundSlot to false
set h to h + 1
repeat ((my tableSize) - 1) times
set h to h mod ((my tableSize) + 1)
if h is 0 then set h to 1
if item h of my table is "" then
set foundSlot to true
set item h of my table to aValue
exit repeat
else
set h to h + 1
end if
end repeat
error "genKey: Table is full"
else
-- **                    log h
return h
end if
end if
end genKey

on keyExists for aValue
return aValue is in my table
end keyExists

on aKey for aValue
-- returns an already generated key for a value.
-- key-exists should be run first
set h to makeInitalHashKey for aValue
if item h of my table is aValue then
return h
else
-- collision detection/handling
set foundSlot to false
set h to h + 1
repeat ((my tableSize) - 1) times
set h to h mod ((my tableSize) + 1)
if h is 0 then set h to 1
if item h of my table is aValue then
set foundSlot to true
exit repeat
else
set h to h + 1
end if
end repeat
error "aKey: Table doesn't contain that key."
else
return h
end if
end if
end aKey

on rmValue for aKey
-- This handler assumes that the hashkey exists!
set item aKey of my table to ""
end rmValue

on aValue for aKey
-- This handler assumes that the hashkey exists!
return item aKey of my table
end aValue
end script
end newHashTable

on driver()
-- be sure to remove the comments in front of the log statements!

set htbl to makeHashTable()
set klist to {}

set end of klist to genKey of htbl for "Trondheim"
set end of klist to genKey of htbl for "London"
set end of klist to genKey of htbl for "Oslo"
set end of klist to genKey of htbl for "Sidney"
set end of klist to genKey of htbl for "Zurich"
set end of klist to genKey of htbl for "Geneva"
set end of klist to genKey of htbl for "Paris"
set end of klist to genKey of htbl for "Rome"
set end of klist to genKey of htbl for "Torino"
set end of klist to genKey of htbl for "Vienna"
set end of klist to genKey of htbl for "Haag"
set end of klist to genKey of htbl for "Bern"
set end of klist to genKey of htbl for "Amsterdam"
set end of klist to genKey of htbl for "Copenhagen"
set end of klist to genKey of htbl for "Bristol"
set end of klist to genKey of htbl for "Las Vegas"
set end of klist to genKey of htbl for "Brighton"
set end of klist to genKey of htbl for "Arendal"
set end of klist to genKey of htbl for "Divonne les Bains"
set end of klist to genKey of htbl for "Aarhus"
set end of klist to genKey of htbl for "Munich"
set end of klist to genKey of htbl for "Strasbourg"
set end of klist to genKey of htbl for "New Jersey"
set end of klist to genKey of htbl for "Wellington"
set end of klist to genKey of htbl for "Canberra"
set end of klist to genKey of htbl for "Stockholm"
set end of klist to genKey of htbl for "Gothenburg"
set end of klist to genKey of htbl for "Houston"
set end of klist to genKey of htbl for "Galveston"
set end of klist to genKey of htbl for "Miami"
set end of klist to genKey of htbl for "Vera Cruz"
set end of klist to genKey of htbl for "Sacramento"
set end of klist to genKey of htbl for "Los Angeles"
set end of klist to genKey of htbl for "San Fransisco"
set end of klist to genKey of htbl for "Tempe"
set end of klist to genKey of htbl for "Bogota"
set end of klist to genKey of htbl for "Buenos Aires"
set end of klist to genKey of htbl for "Brugge"
set end of klist to genKey of htbl for "Florence"
set end of klist to genKey of htbl for "Vienna"
set end of klist to genKey of htbl for "Rotterdam"
set end of klist to genKey of htbl for "Leiden"
set end of klist to genKey of htbl for "Glasgow"
set end of klist to genKey of htbl for "Aberdeen"
set end of klist to genKey of htbl for "Liverpool"
set end of klist to genKey of htbl for "Portsmouth"
set end of klist to genKey of htbl for "Cardiff"
set end of klist to genKey of htbl for "Karlstad"
set end of klist to genKey of htbl for "HjÃ¸rring"
set end of klist to genKey of htbl for "Dresden"
set end of klist to genKey of htbl for "Hamburg"
set end of klist to genKey of htbl for "Kiel"
set end of klist to genKey of htbl for "Frankfurt"
set end of klist to genKey of htbl for "Toulouse"
set end of klist to genKey of htbl for "Madrid"
set end of klist to genKey of htbl for "Barcelona"
set end of klist to genKey of htbl for "Tampa"
set end of klist to genKey of htbl for "Seattle"
set end of klist to genKey of htbl for "Carson"
-- 19 collisions during generation of 58 keys

set aName to aValue of htbl for 32
return klist
end driver

Last edited by McUsrII (2015-04-28 03:41:00 pm)

Offline

## #5 2015-04-27 07:16:01 pm

McUsrII
Member
Registered: 2012-11-21
Posts: 3046
Website

### Re: A simple Alternative to a small hash table

Hello.

Just for the hell of it, I checked how many equal random numbers between 1 and 100 I got, when I generated 58 random numbers at a time, I got on the average 13 collisions. This means that my hash function doesn't perform that bad at all, even though 19 collisions may sound much, it is totally in line with the math, described in the post #1.

Offline

## #6 2015-04-28 03:43:20 pm

McUsrII
Member
Registered: 2012-11-21
Posts: 3046
Website

### Re: A simple Alternative to a small hash table

Hello.

I removed a bug, that took effect in the makeHashTable script in post #4, when the key just consisted of one character. I also refactored out a makeInitalHashKey for aValue handler, and changed the script name inside the newHashTable to be more descriptive: "hashTable".

I should also mention, just in case, that the keys are case sensitive: Toy and ToY generates two different keys, if you don't take any precations. That is the user of the script's  responsibility, to handle.

Last edited by McUsrII (2015-04-28 04:10:57 pm)

Offline

## #7 2015-06-12 01:51:23 pm

McUsrII
Member
Registered: 2012-11-21
Posts: 3046
Website

### Re: A simple Alternative to a small hash table

Hello.

For those interested in creating a hashtable large enough to avoid any collisions, before the number of stored keys reaches some treshold, here is the formula:

It turns out that by calculus, a function is generated for the number of elements a hashtable can contain before the probability of a collision reaches 1/2.

The formula is:

N=1.117*m^0.5

m is the number of keys that can be stored in the hashtable. -The hashtable size.
N that is returned, yields the number of keys that can be inserted into the hashtable of size m, before a probability of 1/2, for a collision, is reached.

Last edited by McUsrII (2015-06-12 01:53:15 pm)

Filed under: hashtable

Offline