Lists; a fast way to store objects.

Tjakka5
Party member
Posts: 240
Joined: Thu Dec 26, 2013 12:17 pm

Lists; a fast way to store objects.

We all like speed, speed is nice.
But all those tables that you add objects to, remove them, loop over them? Those are slow.

Lists is my attempt at an alternative which gets rid of those nasty slow functions.
It uses some logic and magic to quickly add and remove objects.

Get it here:
https://github.com/tjakka5/Lists/tree/master

Check out my portfolio: http://tjakka5.sorunome.de/

airstruck
Party member
Posts: 650
Joined: Thu Jun 04, 2015 7:11 pm
Location: Not being time thief.

Re: Lists; a fast way to store objects.

Looks interesting! Do you have any benchmarks vs. the usual tables-as-lists?

I guess you only allow tables as list items because each item needs to be unique, but what happens if you add the same table twice? What if you want to put userdata or functions in the list? Maybe instead of the type(object) == "table" check, you could check if it's a unique value by checking that self.items[object] is nil. You'd still need to either disallow numbers or put the index in a separate table, though.

I'm not sure the type(func) == "function" in forEach is really necessary either; what if you want to use a callable table instead of a function? Even without the check, you'd still get a reasonable error message if you pass in something that's not callable.

Edit: I wrote a quick benchmark that seems to expose some issues with this. Take a look:

Code: Select all

local TJList = require 'list'

local ITEMS = 10000
local ITER = 100

do
local tab = {}

local t = {}
for n = 1, ITEMS do
t[n] = {}
end

local time = os.clock()
for n = 1, ITEMS do
table.insert(tab, t[n])
end
print(('table.insert: %.4f'):format(os.clock() - time))

local time = os.clock()
for n = ITEMS, 1, -2 do
table.remove(tab, n)
end
print(('table.remove: %.4f'):format(os.clock() - time))

local time = os.clock()
for m = 1, ITER do
for n = 1, #tab do
assert(tab[n])
end
end
print(('table iterate: %.4f'):format(os.clock() - time))
end

do
local tjlist = TJList()

local t = {}
for n = 1, ITEMS do
t[n] = {}
end

local time = os.clock()
for n = 1, ITEMS do
end
print(('tj list insert: %.4f'):format(os.clock() - time))

local time = os.clock()
for n = ITEMS, 1, -2 do
tjlist:remove(t[n])
end
print(('tj list remove: %.4f'):format(os.clock() - time))

local time = os.clock()
for m = 1, ITER do
tjlist:forEach(assert)
end
print(('tj list iterate: %.4f'):format(os.clock() - time))
end
Under vanilla Lua, compared with the normal table operations, you get slightly higher times for inserting items and iterating the list, but considerably better removal times. It looks like it's doing its job there.

Under LuaJit, something weird happens. You get much higher insertion times than you do in PUC Lua for some reason, and they seem to grow exponentially with the size of the list. Removal and iteration times are also significantly worse than normal table operations. This benchmark seems to suggest that under LuaJit, these lists could actually hurt performance considerably. Of course, it's just one benchmark and it might perform better on others.

Since the goal seems to be fast removal, you might consider using something like a double linked list. Here's something I threw together as a test:

Code: Select all

-- mylist.lua

local function rebuild (list)
list.length = 0
list.items = list.realItems
local item = list.firstItem
while item do
list.length = list.length + 1
list.items[list.length] = item
item = list.itemAfter[item]
end
end

local meta = {
__index = function (list, key)
if key == 'length' or key == 'items' then
rebuild(list)
return rawget(list, key)
end
end,
}

local function insert (list, item)
list.length = list.length + 1
list.items[list.length] = item
if not list.firstItem then
list.firstItem = item
end
if list.lastItem then
list.itemAfter[list.lastItem] = item
list.itemBefore[item] = list.lastItem
end
list.lastItem = item
end

local function remove (list, item)
local prev, next = list.itemBefore[item], list.itemAfter[item]

if prev then
list.itemAfter[prev] = next
else
list.firstItem = list.itemAfter[item]
end

if next then
list.itemBefore[next] = prev
list.length = nil
list.items = nil
else
list.lastItem = list.itemBefore[item]
list.length = list.length - 1
end

list.itemAfter[item] = nil
list.itemBefore[item] = nil
end

return function (t)
local list = setmetatable({}, meta)

list.insert = insert
list.remove = remove

list.items = t or {}
list.realItems = list.items
list.itemAfter = {}
list.itemBefore = {}
list.length = 0

for i = 1, #list.items do
insert(list, list.items[i])
end

return list
end
Here's the benchmark again with my test lists included:

Code: Select all

local TJList = require 'list'
local MyList = require 'mylist'

local ITEMS = 10000
local ITER = 100

do
local tab = {}

local t = {}
for n = 1, ITEMS do
t[n] = {}
end

local time = os.clock()
for n = 1, ITEMS do
table.insert(tab, t[n])
end
print(('table.insert: %.4f'):format(os.clock() - time))

local time = os.clock()
for n = ITEMS, 1, -2 do
table.remove(tab, n)
end
print(('table.remove: %.4f'):format(os.clock() - time))

local time = os.clock()
for m = 1, ITER do
for n = 1, #tab do
assert(tab[n])
end
end
print(('table iterate: %.4f'):format(os.clock() - time))
end

do
local list = MyList()

local t = {}
for n = 1, ITEMS do
t[n] = {}
end

local time = os.clock()
for n = 1, ITEMS do
list:insert(t[n])
end
print(('my list insert: %.4f'):format(os.clock() - time))

local time = os.clock()
for n = ITEMS, 1, -2 do
list:remove(t[n])
end
print(('my list remove: %.4f'):format(os.clock() - time))

local time = os.clock()
for m = 1, ITER do
for n = 1, list.length do
assert(list.items[n])
end
end
print(('my list iterate: %.4f'):format(os.clock() - time))
end

do
local tjlist = TJList()

local t = {}
for n = 1, ITEMS do
t[n] = {}
end

local time = os.clock()
for n = 1, ITEMS do
end
print(('tj list insert: %.4f'):format(os.clock() - time))

local time = os.clock()
for n = ITEMS, 1, -2 do
tjlist:remove(t[n])
end
print(('tj list remove: %.4f'):format(os.clock() - time))

local time = os.clock()
for m = 1, ITER do
tjlist:forEach(assert)
end
print(('tj list iterate: %.4f'):format(os.clock() - time))
end
Again, it's just one benchmark, but the difference in performance is noticeable.

ivan
Party member
Posts: 1565
Joined: Fri Mar 07, 2008 1:39 pm
Contact:

Re: Lists; a fast way to store objects.

Tjakka5 wrote:But all those tables that you add objects to, remove them, loop over them? Those are slow.
What is this assumption based on?
Tjakka5 wrote:Lists is my attempt at an alternative which gets rid of those nasty slow functions.
It uses some logic and magic to quickly add and remove objects.
Since you're adding logic on top of Lua tables, the solution is going to be slower than using tables.
You have to run less code to make things faster, not more!
And the line "self.items[object] = nil" means that adding the same element twice won't work.
Generally speaking, storing each element as a table is less than ideal.

As far as I know, the idea of a "list" data structure is that you can add and remove elements at ANY OFFSET in constant time.
Adding and removing elements from the end of the table is not how a "list" works.
I hate to be the grumpy old dude here, but your code looks more like a "stack" that had a few extra drinks on the weekend and made an illegitimate baby with a "set".

airstruck
Party member
Posts: 650
Joined: Thu Jun 04, 2015 7:11 pm
Location: Not being time thief.

Re: Lists; a fast way to store objects.

ivan wrote: Since you're adding logic on top of Lua tables, the solution is going to be slower than using tables.
You have to run less code to make things faster, not more!
Not necessarily. Insertion and iteration might always be slower, but table.remove can be replaced with something much faster if you don't mind sacrificing other things. Suppose inserting things takes only slightly longer, but removing things is orders of magnitude faster, and your use case involves inserting things about as often as you remove them (and you don't care that you can only insert things at the end, and can't have duplicates). Everything else aside, this should be a net gain. A list of game entities might benefit from this.
As far as I know, the idea of a "list" data structure is that you can add and remove elements at ANY OFFSET in constant time.
Adding and removing elements from the end of the table is not how a "list" works.
That would be another benefit of something like double-linked lists.

Here's what I've got now; it's not well tested but removal is very fast compared to table.remove, probably a net gain for many use cases. Items can be inserted at or removed from any position. Items must be unique; inserting an item that's already there will move it from its old position to the new one (yes, it's really a set, not a list).

Code: Select all

local function check (list)
if not list.isDirty then
return list
end
list.length = 0
local item = list.firstItem
while item do
list.length = list.length + 1
list.items[list.length] = item
item = list.itemAfter[item]
end
list.isDirty = false
return list
end

local function getItems (list)
return check(list).items
end

local function getLength (list)
return check(list).length
end

local function nextItem (items, index)
local item = items[index]

if item then
return index + 1, item
end
end

local function each (list, func, ...)
if not func then
return nextItem, list:getItems(), 1
end
local items = list:getItems()
for i = 1, list.length do
func(items[i], ...)
end
end

local function push (list, item)
list.length = list.length + 1
list.items[list.length] = item
if not list.firstItem then
list.firstItem = item
end
if list.lastItem then
list.itemAfter[list.lastItem] = item
list.itemBefore[item] = list.lastItem
end
list.lastItem = item
end

local function unshift (list, item)
list.isDirty = true
local next = list.firstItem
if next then
list.itemAfter[item] = next
list.itemBefore[next] = item
end
list.firstItem = item
if not list.lastItem then
list.lastItem = item
end
end

local function pop (list)
local item = list.lastItem
list:remove(item)
return item
end

local function shift (list)
local item = list.firstItem
list:remove(item)
return item
end

local function insert (list, item, neighbor, before)
if list.has[item] then
list:remove(item)
end

if not (neighbor and list.has[neighbor]) then
if before then
unshift(list, item)
else
push(list, item)
end
else
if before and list.firstItem == neighbor then
unshift(list, item)
elseif (not before) and list.lastItem == neighbor then
push(list, item)
elseif before then
list.isDirty = true
local prev = list.itemBefore[neighbor]
list.itemAfter[prev] = item
list.itemBefore[item] = prev
list.itemAfter[item] = neighbor
list.itemBefore[neighbor] = item
else
list.isDirty = true
local next = list.itemAfter[neighbor]
list.itemAfter[item] = next
list.itemBefore[next] = item
list.itemAfter[neighbor] = item
list.itemBefore[item] = neighbor
end
end

return list
end

local function remove (list, item)
if not (item and list.has[item]) then
return list
end

local prev, next = list.itemBefore[item], list.itemAfter[item]

if prev then
list.itemAfter[prev] = next
else
list.firstItem = list.itemAfter[item]
end

if next then
list.itemBefore[next] = prev
list.isDirty = true
else
list.lastItem = list.itemBefore[item]
list.length = list.length - 1
end

list.itemAfter[item] = nil
list.itemBefore[item] = nil
list.has[item] = nil

return list
end

return function (items)
local list = {
insert = insert,
remove = remove,
getItems = getItems,
getLength = getLength,
each = each,
push = push,
unshift = unshift,
pop = pop,
shift = shift,

items = items or {},
itemAfter = {},
itemBefore = {},
has = {},
length = 0,
isDirty = false,
}

for i = 1, #list.items do
insert(list, list.items[i])
end

return list
end


raidho36
Party member
Posts: 2063
Joined: Mon Jun 17, 2013 12:00 pm

Re: Lists; a fast way to store objects.

If it's not a list, then there's no advantage over plain Lua table manipulated directly.

airstruck
Party member
Posts: 650
Joined: Thu Jun 04, 2015 7:11 pm
Location: Not being time thief.

Re: Lists; a fast way to store objects.

raidho36 wrote:If it's not a list, then there's no advantage over plain Lua table manipulated directly.
Again, the advantage is in removal time. Consider the benchmark I posted, where ten thousand items are reverse iterated and every other item is removed. On my laptop, the solution I posted completes that section of the benchmark about a thousand times faster than table.remove.
Last edited by airstruck on Sat Dec 17, 2016 10:02 am, edited 1 time in total.

ivan
Party member
Posts: 1565
Joined: Fri Mar 07, 2008 1:39 pm
Contact:

Re: Lists; a fast way to store objects.

airstruck wrote:but table.remove can be replaced with something much faster
The point of table.remove(t, i) is that it can remove elements from any numeric index.
Numeric indexes greater than i are updated automatically hence tables can already work as a "list".
If you're concerned about removal time, you can use:
table[lastindex] = nil
or alternatively you can use non-numeric indexes which is insignificantly faster since the table length is not updated.

In short, you can't make tables perform faster using Lua code, it just doesn't work that way.

airstruck
Party member
Posts: 650
Joined: Thu Jun 04, 2015 7:11 pm
Location: Not being time thief.

Re: Lists; a fast way to store objects.

The point of table.remove(t, i) is that it can remove elements from any numeric index.
Both Tjakka's solution and my own can also remove elements from anywhere in the set.
Numeric indexes greater than i are updated automatically hence it works as a "list".
The same thing applies to both Tjakka's solution and my own. The only reason these solutions should be called "set" instead of "list" is because each element must be unique; it has nothing to do with random access in this case.
If you're concerned about removal time, you can use:
table[lastindex] = nil
Sure, but unless you're only nilling the last index, you'll leave holes in the table. Setting values equal to nil is not equivalent to using table.remove. Other solutions like moving the last element into the hole also work, but if things need to stay in the same order that's no good either.
In short, you can't make tables perform faster using Lua code, it just doesn't work that way.
The goal is not to make "tables perform faster." As I understand it, the goal here is to provide an alternative data structure with fast removal times that maintains the integrity of the data (like table.remove, but faster, and unlike t[n]=nil).

Edit: There was a bug in the solution I posted earlier; I'll keep an updated copy here. The removal section of the benchmark actually completes about 100 times faster than the table.remove equivalent (not 1000). Still, it more than makes up for the slightly slower insertion and iteration time on this particular benchmark (there is a significant net gain).

The benchmark was supposed to represent a scenario similar to a game entity list, where you iterate the list about 100 times more often than you insert things (for example you iterate the entity list every frame, but only create or destroy entities every few seconds). Realistically, you'd probably be removing entities about as often as you insert them, but this benchmark only removes half as many as it inserts. Even so, the time gained when removing items far outweighs the time lost when inserting and iterating them (compared to standard table operations).

It should be pretty clear that there are at least some situations where you can get a significant net speed gain out of specialized data structures that you're not going to get from standard table operations.

Tjakka5
Party member
Posts: 240
Joined: Thu Dec 26, 2013 12:17 pm

Re: Lists; a fast way to store objects.

This is.. weird, to say the least.
I've been holding off on posting because of the, uh, unnatural kind of feedback (I'm not saying it's bad.)

I'll try to address my points as well I can.

So first, to get it out of the way.
Yes, "lists" (or "sets", or whatever you would call them) do have a place in Lua.
Specific solutions for specific situations will always be faster than a generic "jack of all trades" solution.
I think Airstruck did a good job at demonstrating this. So I won't talk about it further.

Next,
While Ivan's solution to one problem works, they make the other solution(s) impossible to work.
Using 'table = nil' makes you unable to use '#table', 'table.remove' is slow, using non-numerical indexes will force you to use 'pairs', which is even slower.

Last, the biggest problem.
The code doesn't work. It even slows down exponentially.
I do not know why this is; I could be content with accepting it was slow.
But, as I said, it slows down exponentially, and I just can't understand why.

My first attempt at lists were posted in this topic, along with benchmarks. (Which were done sloppy, but should still give a broad good result)
Check out my portfolio: http://tjakka5.sorunome.de/

airstruck
Party member
Posts: 650
Joined: Thu Jun 04, 2015 7:11 pm
Location: Not being time thief.

Re: Lists; a fast way to store objects.

Tjakka5 wrote:Last, the biggest problem.
The code doesn't work. It even slows down exponentially.
I do not know why this is; I could be content with accepting it was slow.
But, as I said, it slows down exponentially, and I just can't understand why.
The strangest part is that it works fairly well under PUC Lua, but is much, much slower under LuaJit (the exponential slowdown only seems to happen there). I don't understand that at all.

The solution I posted seems like it should work pretty well. I think it will be faster, and it has the advantage of not needing to be cleaned periodically. I'll probably clean it up a bit later, write some tests and put it on GitHub. Feel free to use it instead if you want.

Who is online

Users browsing this forum: No registered users and 16 guests