Lists; a fast way to store objects.

Showcase your libraries, tools and other projects that help your fellow love users.
User avatar
Tjakka5
Party member
Posts: 240
Joined: Thu Dec 26, 2013 12:17 pm

Lists; a fast way to store objects.

Post by Tjakka5 » Fri Dec 16, 2016 7:55 pm

We all like speed, speed is nice.
But all those tables that you add objects to, remove them, loop over them? Those are slow.

Lists is my attempt at an alternative which gets rid of those nasty slow functions.
It uses some logic and magic to quickly add and remove objects.

Get it here:
https://github.com/tjakka5/Lists/tree/master

Documentation is in the Readme.
Check out my portfolio: http://tjakka5.sorunome.de/

User avatar
airstruck
Party member
Posts: 650
Joined: Thu Jun 04, 2015 7:11 pm
Location: Not being time thief.

Re: Lists; a fast way to store objects.

Post by airstruck » Fri Dec 16, 2016 9:44 pm

Looks interesting! Do you have any benchmarks vs. the usual tables-as-lists?

I guess you only allow tables as list items because each item needs to be unique, but what happens if you add the same table twice? What if you want to put userdata or functions in the list? Maybe instead of the type(object) == "table" check, you could check if it's a unique value by checking that self.items[object] is nil. You'd still need to either disallow numbers or put the index in a separate table, though.

I'm not sure the type(func) == "function" in `forEach` is really necessary either; what if you want to use a callable table instead of a function? Even without the check, you'd still get a reasonable error message if you pass in something that's not callable.

Edit: I wrote a quick benchmark that seems to expose some issues with this. Take a look:

Code: Select all

local TJList = require 'list'

local ITEMS = 10000
local ITER = 100

do
    local tab = {}

    local t = {}
    for n = 1, ITEMS do
        t[n] = {}
    end
    
    local time = os.clock()
    for n = 1, ITEMS do
        table.insert(tab, t[n])
    end
    print(('table.insert: %.4f'):format(os.clock() - time))
    
    local time = os.clock()
    for n = ITEMS, 1, -2 do
        table.remove(tab, n)
    end
    print(('table.remove: %.4f'):format(os.clock() - time))
    
    local time = os.clock()
    for m = 1, ITER do
        for n = 1, #tab do
            assert(tab[n])
        end
    end
    print(('table iterate: %.4f'):format(os.clock() - time))
end

do
    local tjlist = TJList()

    local t = {}
    for n = 1, ITEMS do
        t[n] = {}
    end
    
    local time = os.clock()
    for n = 1, ITEMS do
        tjlist:add(t[n])
    end
    print(('tj list insert: %.4f'):format(os.clock() - time))
    
    local time = os.clock()
    for n = ITEMS, 1, -2 do
        tjlist:remove(t[n])
    end
    print(('tj list remove: %.4f'):format(os.clock() - time))
    
    local time = os.clock()
    for m = 1, ITER do
        tjlist:forEach(assert)
    end
    print(('tj list iterate: %.4f'):format(os.clock() - time))
end
Under vanilla Lua, compared with the normal table operations, you get slightly higher times for inserting items and iterating the list, but considerably better removal times. It looks like it's doing its job there.

Under LuaJit, something weird happens. You get much higher insertion times than you do in PUC Lua for some reason, and they seem to grow exponentially with the size of the list. Removal and iteration times are also significantly worse than normal table operations. This benchmark seems to suggest that under LuaJit, these lists could actually hurt performance considerably. Of course, it's just one benchmark and it might perform better on others.

Since the goal seems to be fast removal, you might consider using something like a double linked list. Here's something I threw together as a test:

Code: Select all

-- mylist.lua

local function rebuild (list)
    list.length = 0
    list.items = list.realItems
    local item = list.firstItem
    while item do
        list.length = list.length + 1
        list.items[list.length] = item
        item = list.itemAfter[item]
    end
end

local meta = {
    __index = function (list, key)
        if key == 'length' or key == 'items' then
            rebuild(list)
            return rawget(list, key)
        end
    end,
}

local function insert (list, item)
    list.length = list.length + 1
    list.items[list.length] = item
    if not list.firstItem then
        list.firstItem = item
    end
    if list.lastItem then
        list.itemAfter[list.lastItem] = item
        list.itemBefore[item] = list.lastItem
    end
    list.lastItem = item
end

local function remove (list, item)
    local prev, next = list.itemBefore[item], list.itemAfter[item] 
    
    if prev then
        list.itemAfter[prev] = next
    else
        list.firstItem = list.itemAfter[item]
    end
    
    if next then
        list.itemBefore[next] = prev
        list.length = nil
        list.items = nil
    else
        list.lastItem = list.itemBefore[item]
        list.length = list.length - 1
    end
    
    list.itemAfter[item] = nil
    list.itemBefore[item] = nil
end

return function (t)
    local list = setmetatable({}, meta)
    
    list.insert = insert
    list.remove = remove
    
    list.items = t or {}
    list.realItems = list.items
    list.itemAfter = {}
    list.itemBefore = {}
    list.length = 0
    
    for i = 1, #list.items do
        insert(list, list.items[i])
    end
    
    return list
end
Here's the benchmark again with my test lists included:

Code: Select all

local TJList = require 'list'
local MyList = require 'mylist'

local ITEMS = 10000
local ITER = 100

do
    local tab = {}

    local t = {}
    for n = 1, ITEMS do
        t[n] = {}
    end
    
    local time = os.clock()
    for n = 1, ITEMS do
        table.insert(tab, t[n])
    end
    print(('table.insert: %.4f'):format(os.clock() - time))
    
    local time = os.clock()
    for n = ITEMS, 1, -2 do
        table.remove(tab, n)
    end
    print(('table.remove: %.4f'):format(os.clock() - time))
    
    local time = os.clock()
    for m = 1, ITER do
        for n = 1, #tab do
            assert(tab[n])
        end
    end
    print(('table iterate: %.4f'):format(os.clock() - time))
end

do
    local list = MyList()

    local t = {}
    for n = 1, ITEMS do
        t[n] = {}
    end
    
    local time = os.clock()
    for n = 1, ITEMS do
        list:insert(t[n])
    end
    print(('my list insert: %.4f'):format(os.clock() - time))
    
    local time = os.clock()
    for n = ITEMS, 1, -2 do
        list:remove(t[n])
    end
    print(('my list remove: %.4f'):format(os.clock() - time))
    
    local time = os.clock()
    for m = 1, ITER do
        for n = 1, list.length do
            assert(list.items[n])
        end
    end
    print(('my list iterate: %.4f'):format(os.clock() - time))
end

do
    local tjlist = TJList()

    local t = {}
    for n = 1, ITEMS do
        t[n] = {}
    end
    
    local time = os.clock()
    for n = 1, ITEMS do
        tjlist:add(t[n])
    end
    print(('tj list insert: %.4f'):format(os.clock() - time))
    
    local time = os.clock()
    for n = ITEMS, 1, -2 do
        tjlist:remove(t[n])
    end
    print(('tj list remove: %.4f'):format(os.clock() - time))
    
    local time = os.clock()
    for m = 1, ITER do
        tjlist:forEach(assert)
    end
    print(('tj list iterate: %.4f'):format(os.clock() - time))
end
Again, it's just one benchmark, but the difference in performance is noticeable.

User avatar
ivan
Party member
Posts: 1565
Joined: Fri Mar 07, 2008 1:39 pm
Contact:

Re: Lists; a fast way to store objects.

Post by ivan » Sat Dec 17, 2016 6:15 am

Tjakka5 wrote:But all those tables that you add objects to, remove them, loop over them? Those are slow.
What is this assumption based on?
Tjakka5 wrote:Lists is my attempt at an alternative which gets rid of those nasty slow functions.
It uses some logic and magic to quickly add and remove objects.
Since you're adding logic on top of Lua tables, the solution is going to be slower than using tables.
You have to run less code to make things faster, not more!
For example, using metatables adds additional overhead.
And the line "self.items[object] = nil" means that adding the same element twice won't work.
Generally speaking, storing each element as a table is less than ideal.

As far as I know, the idea of a "list" data structure is that you can add and remove elements at ANY OFFSET in constant time.
Adding and removing elements from the end of the table is not how a "list" works.
I hate to be the grumpy old dude here, but your code looks more like a "stack" that had a few extra drinks on the weekend and made an illegitimate baby with a "set".

User avatar
airstruck
Party member
Posts: 650
Joined: Thu Jun 04, 2015 7:11 pm
Location: Not being time thief.

Re: Lists; a fast way to store objects.

Post by airstruck » Sat Dec 17, 2016 8:15 am

ivan wrote: Since you're adding logic on top of Lua tables, the solution is going to be slower than using tables.
You have to run less code to make things faster, not more!
Not necessarily. Insertion and iteration might always be slower, but table.remove can be replaced with something much faster if you don't mind sacrificing other things. Suppose inserting things takes only slightly longer, but removing things is orders of magnitude faster, and your use case involves inserting things about as often as you remove them (and you don't care that you can only insert things at the end, and can't have duplicates). Everything else aside, this should be a net gain. A list of game entities might benefit from this.
As far as I know, the idea of a "list" data structure is that you can add and remove elements at ANY OFFSET in constant time.
Adding and removing elements from the end of the table is not how a "list" works.
That would be another benefit of something like double-linked lists.

Here's what I've got now; it's not well tested but removal is very fast compared to table.remove, probably a net gain for many use cases. Items can be inserted at or removed from any position. Items must be unique; inserting an item that's already there will move it from its old position to the new one (yes, it's really a set, not a list).

Code: Select all

local function check (list)
    if not list.isDirty then
        return list
    end
    list.length = 0
    local item = list.firstItem
    while item do
        list.length = list.length + 1
        list.items[list.length] = item
        item = list.itemAfter[item]
    end
    list.isDirty = false
    return list
end

local function getItems (list)
    return check(list).items
end

local function getLength (list)
    return check(list).length
end

local function nextItem (items, index)
    local item = items[index]
    
    if item then
        return index + 1, item
    end
end

local function each (list, func, ...)
    if not func then
        return nextItem, list:getItems(), 1
    end
    local items = list:getItems()
    for i = 1, list.length do
        func(items[i], ...)
    end
end

local function push (list, item)
    list.length = list.length + 1
    list.items[list.length] = item
    if not list.firstItem then
        list.firstItem = item
    end
    if list.lastItem then
        list.itemAfter[list.lastItem] = item
        list.itemBefore[item] = list.lastItem
    end
    list.lastItem = item
end

local function unshift (list, item)
    list.isDirty = true
    local next = list.firstItem
    if next then
        list.itemAfter[item] = next
        list.itemBefore[next] = item
    end
    list.firstItem = item
    if not list.lastItem then
        list.lastItem = item
    end
end

local function pop (list)
    local item = list.lastItem
    list:remove(item)
    return item
end

local function shift (list)
    local item = list.firstItem
    list:remove(item)
    return item
end

local function insert (list, item, neighbor, before)
    if list.has[item] then
        list:remove(item)
    end
    
    if not (neighbor and list.has[neighbor]) then
        if before then
            unshift(list, item)
        else
            push(list, item)
        end
    else
        if before and list.firstItem == neighbor then
            unshift(list, item)
        elseif (not before) and list.lastItem == neighbor then
            push(list, item)
        elseif before then
            list.isDirty = true
            local prev = list.itemBefore[neighbor]
            list.itemAfter[prev] = item
            list.itemBefore[item] = prev
            list.itemAfter[item] = neighbor
            list.itemBefore[neighbor] = item
        else
            list.isDirty = true
            local next = list.itemAfter[neighbor]
            list.itemAfter[item] = next
            list.itemBefore[next] = item
            list.itemAfter[neighbor] = item
            list.itemBefore[item] = neighbor
        end
    end
    
    return list
end

local function remove (list, item)
    if not (item and list.has[item]) then
        return list
    end
    
    local prev, next = list.itemBefore[item], list.itemAfter[item] 
    
    if prev then
        list.itemAfter[prev] = next
    else
        list.firstItem = list.itemAfter[item]
    end
    
    if next then
        list.itemBefore[next] = prev
        list.isDirty = true
    else
        list.lastItem = list.itemBefore[item]
        list.length = list.length - 1
    end
    
    list.itemAfter[item] = nil
    list.itemBefore[item] = nil
    list.has[item] = nil
    
    return list
end

return function (items)
    local list = {
        insert = insert,
        remove = remove,
        getItems = getItems,
        getLength = getLength,
        each = each,
        push = push,
        unshift = unshift,
        pop = pop,
        shift = shift,
        
        items = items or {},
        itemAfter = {},
        itemBefore = {},
        has = {},
        length = 0,
        isDirty = false,
    }
    
    for i = 1, #list.items do
        insert(list, list.items[i])
    end
    
    return list
end

User avatar
raidho36
Party member
Posts: 2063
Joined: Mon Jun 17, 2013 12:00 pm

Re: Lists; a fast way to store objects.

Post by raidho36 » Sat Dec 17, 2016 8:49 am

If it's not a list, then there's no advantage over plain Lua table manipulated directly.

User avatar
airstruck
Party member
Posts: 650
Joined: Thu Jun 04, 2015 7:11 pm
Location: Not being time thief.

Re: Lists; a fast way to store objects.

Post by airstruck » Sat Dec 17, 2016 9:07 am

raidho36 wrote:If it's not a list, then there's no advantage over plain Lua table manipulated directly.
Again, the advantage is in removal time. Consider the benchmark I posted, where ten thousand items are reverse iterated and every other item is removed. On my laptop, the solution I posted completes that section of the benchmark about a thousand times faster than table.remove.
Last edited by airstruck on Sat Dec 17, 2016 10:02 am, edited 1 time in total.

User avatar
ivan
Party member
Posts: 1565
Joined: Fri Mar 07, 2008 1:39 pm
Contact:

Re: Lists; a fast way to store objects.

Post by ivan » Sat Dec 17, 2016 9:16 am

airstruck wrote:but table.remove can be replaced with something much faster
The point of table.remove(t, i) is that it can remove elements from any numeric index.
Numeric indexes greater than i are updated automatically hence tables can already work as a "list".
If you're concerned about removal time, you can use:
table[lastindex] = nil
or alternatively you can use non-numeric indexes which is insignificantly faster since the table length is not updated.

In short, you can't make tables perform faster using Lua code, it just doesn't work that way.

User avatar
airstruck
Party member
Posts: 650
Joined: Thu Jun 04, 2015 7:11 pm
Location: Not being time thief.

Re: Lists; a fast way to store objects.

Post by airstruck » Sat Dec 17, 2016 9:26 am

The point of table.remove(t, i) is that it can remove elements from any numeric index.
Both Tjakka's solution and my own can also remove elements from anywhere in the set.
Numeric indexes greater than i are updated automatically hence it works as a "list".
The same thing applies to both Tjakka's solution and my own. The only reason these solutions should be called "set" instead of "list" is because each element must be unique; it has nothing to do with random access in this case.
If you're concerned about removal time, you can use:
table[lastindex] = nil
Sure, but unless you're only nilling the last index, you'll leave holes in the table. Setting values equal to nil is not equivalent to using table.remove. Other solutions like moving the last element into the hole also work, but if things need to stay in the same order that's no good either.
In short, you can't make tables perform faster using Lua code, it just doesn't work that way.
The goal is not to make "tables perform faster." As I understand it, the goal here is to provide an alternative data structure with fast removal times that maintains the integrity of the data (like table.remove, but faster, and unlike t[n]=nil).

Edit: There was a bug in the solution I posted earlier; I'll keep an updated copy here. The removal section of the benchmark actually completes about 100 times faster than the table.remove equivalent (not 1000). Still, it more than makes up for the slightly slower insertion and iteration time on this particular benchmark (there is a significant net gain).

The benchmark was supposed to represent a scenario similar to a game entity list, where you iterate the list about 100 times more often than you insert things (for example you iterate the entity list every frame, but only create or destroy entities every few seconds). Realistically, you'd probably be removing entities about as often as you insert them, but this benchmark only removes half as many as it inserts. Even so, the time gained when removing items far outweighs the time lost when inserting and iterating them (compared to standard table operations).

It should be pretty clear that there are at least some situations where you can get a significant net speed gain out of specialized data structures that you're not going to get from standard table operations.

User avatar
Tjakka5
Party member
Posts: 240
Joined: Thu Dec 26, 2013 12:17 pm

Re: Lists; a fast way to store objects.

Post by Tjakka5 » Sat Dec 17, 2016 6:10 pm

This is.. weird, to say the least.
I've been holding off on posting because of the, uh, unnatural kind of feedback (I'm not saying it's bad.)

I'll try to address my points as well I can.

So first, to get it out of the way.
Yes, "lists" (or "sets", or whatever you would call them) do have a place in Lua.
Specific solutions for specific situations will always be faster than a generic "jack of all trades" solution.
I think Airstruck did a good job at demonstrating this. So I won't talk about it further.

Next,
While Ivan's solution to one problem works, they make the other solution(s) impossible to work.
Using 'table = nil' makes you unable to use '#table', 'table.remove' is slow, using non-numerical indexes will force you to use 'pairs', which is even slower.

Last, the biggest problem.
The code doesn't work. It even slows down exponentially.
I do not know why this is; I could be content with accepting it was slow.
But, as I said, it slows down exponentially, and I just can't understand why.

My first attempt at lists were posted in this topic, along with benchmarks. (Which were done sloppy, but should still give a broad good result)
Check out my portfolio: http://tjakka5.sorunome.de/

User avatar
airstruck
Party member
Posts: 650
Joined: Thu Jun 04, 2015 7:11 pm
Location: Not being time thief.

Re: Lists; a fast way to store objects.

Post by airstruck » Sat Dec 17, 2016 8:32 pm

Tjakka5 wrote:Last, the biggest problem.
The code doesn't work. It even slows down exponentially.
I do not know why this is; I could be content with accepting it was slow.
But, as I said, it slows down exponentially, and I just can't understand why.
The strangest part is that it works fairly well under PUC Lua, but is much, much slower under LuaJit (the exponential slowdown only seems to happen there). I don't understand that at all.

The solution I posted seems like it should work pretty well. I think it will be faster, and it has the advantage of not needing to be cleaned periodically. I'll probably clean it up a bit later, write some tests and put it on GitHub. Feel free to use it instead if you want.

Post Reply

Who is online

Users browsing this forum: No registered users and 16 guests