Page 2 of 4

Re: Lists; a fast way to store objects.

Posted: Sat Dec 17, 2016 9:55 pm
by pgimeno
Could I see an example of that slowing down? I only see one that appears linear-ish, by modifying main.lua so it holds ~5000 elements.

Re: Lists; a fast way to store objects.

Posted: Sat Dec 17, 2016 10:10 pm
by airstruck
pgimeno wrote:Could I see an example of that slowing down? I only see one that appears linear-ish, by modifying main.lua so it holds ~5000 elements.
The slowdown happens when inserting elements. On my laptop, the insertion section of benchmark I posted earlier takes about 3.5 seconds for 10,000 elements. With 5000 elements, it takes about 0.7 seconds (under LuaJit, testing Tjakka's solution).

Under Lua 5.1, 5000 elements takes more like 0.03 seconds, and 10,000 takes slightly more than twice that.

Re: Lists; a fast way to store objects.

Posted: Sat Dec 17, 2016 10:36 pm
by pgimeno
Thanks. Found the cause.

It turns out to be the caveat I alerted about here: http://lua.space/general/assert-usage-caveat

Commenting out the assert at the top of add() solves the issue (for me).

Re: Lists; a fast way to store objects.

Posted: Sun Dec 18, 2016 1:07 am
by airstruck
pgimeno wrote:Thanks. Found the cause.

It turns out to be the caveat I alerted about here: http://lua.space/general/assert-usage-caveat

Commenting out the assert at the top of add() solves the issue (for me).
Nice, it's much faster without those asserts. It's strange that LuaJit can't manage to do that as efficiently as PUC Lua.

Re: Lists; a fast way to store objects.

Posted: Sun Dec 18, 2016 12:59 pm
by pgimeno
I think it may be a problem with the management of garbage. The time growth (which might be quadratic, rather than exponential) might have to do with not being able to get rid of all these strings, and having to look for room to allocate more. Adding collectgarbage("step") after the assertion makes it work faster than without it (up to 3 collectgarbage() calls improve the timing in my system; 4 or more makes it slower).

Re: Lists; a fast way to store objects.

Posted: Tue Dec 20, 2016 2:22 am
by WetDesertRock
Not sure I understand the point of this. I could understand something like numpy's arrays in Lua, but Lua already has highly optimized tables as arrays. In fact, if you only use a table as an array, it will only ever be an array. Lua stores tables as a two part object, one part array one part hashmap. It then chooses which element to use in an highly optimized algorithm.

Were you finding slow downs with tables as arrays? Or is this a pre-optimization?

Re: Lists; a fast way to store objects.

Posted: Tue Dec 20, 2016 3:39 am
by Positive07
Well removal and insertion at random indexes is the slower portion of the table as array. So I guess that is the actual point in this post right?

Re: Lists; a fast way to store objects.

Posted: Tue Dec 20, 2016 4:53 am
by pgimeno
WetDesertRock wrote:Were you finding slow downs with tables as arrays? Or is this a pre-optimization?
Don't fall for the fallacy of premature optimization. It makes sense when it makes sense.

There are a number of use cases for a library like this. I'd say it's all about avoiding table.remove, because it takes time proportional to the number of elements in the sequence. See Airstruck's benchmark.

Re: Lists; a fast way to store objects.

Posted: Tue Dec 20, 2016 1:16 pm
by kikito
I'm by no means an expert in performance (If anything, I am the contrary), but I have some remarks.

I never use table.remove or table.insert. If I am going to be doing frequent insertions and deletions, what I try to do first is using the items as keys, like this:

Code: Select all

t[new_item]=1 -- insertion
t[old_item]=nil -- deletion
if t[some_item] then ... -- checking for existence
for v in pairs(t) do ... -- parsing
Insertion and deletion is super fast. Checking for existence is super fast too. The only problem is parsing: pairs is slower than a numerical loop, and the items will be parsed in random order. In a scenario where addition/deletion happens more often than parsing, this can be quite fast.

If I need to keep the order for some reason, I rely heavily on the fact that just popping the last element is really fast, and so is adding an element at the end:

Code: Select all

t[#t]=nil -- remove
t[#t + 1]=new_item -- add new
(I also cache the "len" of the array in a variable and increment it before each insert, not re-calculating #t each time).

When I need to remove some non-last value, I can replace that value by the last one, and then remove the last one. Like this:

Code: Select all

t[i]=t[#t] -- replace by last element
t[#t]=nil -- remove last element
(Again, assume that #t is cached in a variable). I think that approach is still faster than invoking table.remove, but I confess that I have not tested this.

Inserting things not-at-the-end is the only place where I would need table.insert. But in reality I never do that.

My impression is that table.remove & insert are only useful when you have an array-like table with random deletes & inserts, and you need to keep it in certain order, and you can't use table.sort. I have personally never needed such table - I guess someone might do. But I think that if you want to compare the speed of a data structure with Lua's tables, you should compare it with the insertion-at-end and replacing-by-last-and-deleting-last approaches I have just discussed. These are what someone concerned about speed will use.

Re: Lists; a fast way to store objects.

Posted: Tue Dec 20, 2016 4:29 pm
by zorg
kikito wrote:Inserting things not-at-the-end is the only place where I would need table.insert. But in reality I never do that.
kikito wrote:When I need to remove some non-last value, I can replace that value by the last one, and then remove the last one. Like this:

Code: Select all

t[i]=t[#t] -- replace by last element
t[#t]=nil -- remove last element
Not that this would be that common of a scenario either, but:

Code: Select all

local yesThisIsLuaCode = nil -- (ignore this)
t[#t] = t[i] -- move old to end
t[i]  = e -- new element inserted wherever wanted
-- or in a more concise form:
t[#t], t[i] = t[i], e
is completely possible, if you both want to insert stuff at a specific location, and don't care that whatever was there is now not. Granted, the first one doesn't make that much sense (caring about insertion position) when we don't care about the second one (order) either, but who knows; maybe someone would find a use-case i didn't think of. :)