[opensource-dev] Mesh viewers and tcmalloc issues

Henri Beauchamp sldev at free.fr
Sun Oct 2 01:20:47 PDT 2011


On Sun, 2 Oct 2011 01:12:40 -0400, Zabb65 wrote:

> It is not desirable to let the default allocation engine run under
> Linux, and possibly Mac. glibc has an incredibly slow allocator for
> small objects, and tcmalloc was implemented to remedy these
> situations(This is what I heard, but have not confirmed.) Inventory
> and a few other items create massive numbers of very small heap
> objects.

This would be true only if the viewer were performing tens of
thousands of allocations *every* second, which is *not* the case.
Besides, the speed difference is not that large (we are speaking of
microseconds per allocation here), especially when tcmalloc finds
itself with a fragmented pool of allocated virtual memory and must
perform garbage collection as a result: the overhead is then huge,
whereas the system malloc() simply and *transparently* benefits from
the OS's ability to allocate contiguous blocks of virtual memory out
of fragmented physical memory via the page table, and doesn't need
any garbage collection.

It becomes even worse with the private memory pools of v3, which add
yet another layer of code around memory allocations, slowing
things down further.

Finally, when tcmalloc doesn't release enough memory, you find
yourself with a huge process in memory, which may result in your
system starting to swap... And instead of microsecond penalties,
we are now speaking of seconds!

> Setting a much lower value(1000) that only returns memory
> occasionally is much more desirable on this platform.

The problem is then that not *all* memory gets released (the
tcmalloc algorithm is crippled), and tcmalloc will still get
its non-released pool fragmented over time.

> You can do this
> without recompiling tcmalloc as well by passing an environment
> variable, which is the desired method.

I know, and I said so: please re-read my message!
However, it doesn't work for Windows (unless you create a complicated
shortcut with cmd.exe and accept having a command window staying open
alongside the viewer after launching it with that shortcut), which is
why hardcoding the value is best (it's a private library for use by
the viewer only anyway: it's not as if it were to be installed
system-wide).
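
For illustration only, here is a minimal sketch of the environment
variable route, written as a helper for a small launcher program that
would start the viewer afterwards. The helper name ensure_release_rate()
and the launcher itself are hypothetical and not part of the viewer or
of tcmalloc; TCMALLOC_RELEASE_RATE is the variable described in the
tcmalloc documentation. It assumes the variable has to be in the
environment before the viewer process starts, since the allocator
presumably reads it when it initialises, so calling this from inside the
viewer itself would likely come too late.

```cpp
#include <cassert>
#include <stdlib.h>
#include <string>

// Hypothetical launcher helper: puts TCMALLOC_RELEASE_RATE into the
// environment unless the user already set it, so that a child process
// (the viewer) inherits it. Returns true when the variable is present
// in the environment afterwards.
inline bool ensure_release_rate(const std::string& rate)
{
    assert(!rate.empty());
    const char* name = "TCMALLOC_RELEASE_RATE";
    if (getenv(name) != nullptr)
        return true;  // respect an existing user setting
#ifdef _WIN32
    return _putenv_s(name, rate.c_str()) == 0;
#else
    return setenv(name, rate.c_str(), 0) == 0;
#endif
}
```

A launcher would call ensure_release_rate() with the chosen rate and
then spawn the viewer executable. Hardcoding the rate in the private
library avoids the extra program altogether.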

> Please note that it is listed
> as a caveat on the tcmalloc page that it very clearly does not return
> memory to the system(I assume this is outdated, or reflects the
> default value)

No, you probably saw an outdated doc. See:
http://google-perftools.googlecode.com/svn/trunk/doc/tcmalloc.html
and scroll down to the "Releasing Memory Back to the System" section.

> Windows doesn't really have a need for tcmalloc from what I can see.

I do think it does for mesh viewers, for aligned allocations.
MacOS-X doesn't need it.

> If LL compiles using visual studio 2010, the C runtime uses the low
> fragmentation heap allocator. The low fragmentation allocator is fast
> enough to satisfy even large numbers of small objects, and keep heap
> fragmentation to very small percentages. Working "against" the heap
> manager on windows by rolling your own is generally not advised unless
> there are extraordinary needs or requirements, and even then, it is
> far easier to cause more problems then you fix.

Really, the only true motivation behind the use of tcmalloc in mesh
viewers (it was not used for non-mesh viewers) is to provide aligned
allocations. Without it, and given the way the mesh viewer code is
written, the viewer simply crashes as soon as it tries to perform an
SSE2 operation on an unaligned structure.
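
To make the alignment requirement concrete: aligned SSE2 loads and
stores fault when the address is not a multiple of 16 bytes. The sketch
below shows how such a boundary can be checked, together with one
classic way of obtaining aligned memory from plain malloc(). The helper
names (is_aligned, aligned_malloc, aligned_free) are hypothetical; this
is neither how tcmalloc does it internally nor the viewer's own code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// True when p sits on an `alignment`-byte boundary (power of two).
inline bool is_aligned(const void* p, std::size_t alignment)
{
    return (reinterpret_cast<std::uintptr_t>(p) & (alignment - 1)) == 0;
}

// Hypothetical helper: over-allocate with malloc(), round the address
// up to the next boundary, and stash the pointer malloc() returned just
// before the aligned block so that it can be freed later.
inline void* aligned_malloc(std::size_t size, std::size_t alignment = 16)
{
    // Power of two, and large enough to keep the stashed pointer aligned.
    assert(alignment >= sizeof(void*) && (alignment & (alignment - 1)) == 0);
    void* raw = std::malloc(size + alignment + sizeof(void*));
    if (raw == nullptr)
        return nullptr;
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    std::uintptr_t aligned = (start + mask) & ~mask;
    reinterpret_cast<void**>(aligned)[-1] = raw;
    return reinterpret_cast<void*>(aligned);
}

// Releases a block obtained from aligned_malloc().
inline void aligned_free(void* p)
{
    if (p != nullptr)
        std::free(static_cast<void**>(p)[-1]);
}
```

A structure handed to SSE2 code should pass is_aligned(ptr, 16). An
allocator that already guarantees this makes the wrapper unnecessary.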

I tried with and without tcmalloc in the non-mesh branch of the
Cool VL Viewer (v1.26.0): there is no speed difference at all, but
the viewer does use more memory with tcmalloc (even with the
force-release trick). I got rid of it in the newest v1.26.0 versions
since it's not worth bothering with for non-SSE2 llmath viewers.

> Is tcmalloc really providing aligned allocations? I only found
> documentation that it would enforced specific amounts of space between
> items, not that they were guaranteed to be aligned to an X byte
> boundary. (Maybe this is what the spacing guarantees, but I am
> unsure.)

Yes, it does (see above).

Henri.

