fork(), exec(), and calls to shared memory and semaphore routines. This is because creating a thread is supposed to take fewer resources than creating a process, and switching between threads is supposed to be cheaper. Threads (pthreads) may be implemented at the OS level or supported by appropriate hardware. Although pthreads appear to take up far fewer resources, the way an OS is configured can drastically alter the resource requirements of creating them. This was exactly the situation I landed in when I was using pthreads via Python on a large SGI Altix system (google for: ANUSF).
The stack size on this Altix system was set to a default of 1GB, which resulted in a stack allocation of 1GB per thread even if no work was done (or memory allocated) within these threads. Initially I thought that this was a problem with the Python interpreter on IA64, so I coded up a small skeleton program in C using pthreads; to my surprise this also resulted in an allocation of 1GB per thread. Next, I tried using OpenMP "threads" and was pleased to see that the memory of this process didn't shoot up like that of its pthreads counterpart.
After some consultations with my instructor, I discovered that you can set the stack size of pthreads using the pthread_attr_setstacksize() function (check google codesearch for examples). But all this meant either rewriting all my Python code in C or writing a full thread wrapper for Python in C. Determined not to do that, I set out to find ways to handle this in Python itself. I discovered that you can actually set the stack size in Python, but to my dismay this had only been introduced in the latest 2.5 release, and there was no way that the 2.3 version of Python on the Altix machines was going to be updated.
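For reference, here is a rough sketch of what the Python-level setting looks like. It uses threading.stack_size(), the call that arrived with 2.5, so it would not have helped me on 2.3. The 256 KB figure and the names worker and run_threads are only illustrative choices of mine, and the sketch is written for a current Python rather than for 2.5 itself.

```python
import threading

# Assumption: 256 KiB is plenty for these trivial workers; real code may need more.
STACK_BYTES = 256 * 1024


def worker(index, results):
    # Placeholder work so the example stays self-contained.
    results[index] = index * index


def run_threads(count, stack_bytes=STACK_BYTES):
    # stack_size() only affects threads created after the call and
    # returns the previous setting (0 means "use the platform default").
    previous = threading.stack_size(stack_bytes)
    try:
        results = [None] * count
        threads = [
            threading.Thread(target=worker, args=(i, results))
            for i in range(count)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        return results
    finally:
        # Restore the earlier setting so other code is not affected.
        threading.stack_size(previous)


if __name__ == "__main__":
    print(run_threads(4))
```

The call can raise ValueError if the platform rejects the requested size, so a real program would probably want to catch that and fall back to the default.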
After googling around a bit I discovered what is called Stackless Python. This essentially reduces the use of Python stacks by maintaining a common stack. But again this had several problems. First and foremost, it is not standard Python and has to be installed separately. Secondly, there is a lot of debate about the merits of Stackless Python, and disagreement with the main Python development community.
Having ruled this out, by sheer chance I googled for "microthreads" and came across an interesting article by David Mertz. The article suggested using generators in Python to achieve user-level cooperative multithreading. It was really interesting for me, as it was my first introduction to the wonderful generators in Python. I began toying with this idea, but finally discovered that I would still require preemptive multithreading for my particular application.
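To give a flavour of the generator idea, here is a minimal sketch of a round-robin scheduler. It is my own illustration, not the code from the article: the names counter and run_round_robin are made up, and it uses the modern next() builtin rather than the .next() method of Python 2.3.

```python
from collections import deque


def counter(name, steps, log):
    # A "microthread": each bare yield hands control back to the scheduler.
    for i in range(steps):
        log.append((name, i))
        yield


def run_round_robin(tasks):
    # Resume each generator in turn until all of them have finished.
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # this task is done, do not reschedule it
        queue.append(task)


def demo():
    log = []
    run_round_robin([counter("a", 2, log), counter("b", 3, log)])
    return log


if __name__ == "__main__":
    print(demo())
```

Nothing here ever interrupts a task: a generator that never yields would starve all the others, which is precisely why this approach is cooperative and not preemptive.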
I had known about "ulimit" in bash (or "limit" in csh), and had frequently used it to query the system limits, but had never intentionally used it to change those limits. As soon as I remembered this command, it was obvious what I would do: wrap the Python execution in a shell script, and issue a ulimit with 8MB (or so) as the maximum stack size. This way only the Python process is affected by the change, and the rest of the processes on the system remain untouched.
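What I actually used was a plain shell wrapper. Purely as an illustration, the same idea can be sketched from inside Python with the standard resource and subprocess modules (Unix only): lower the stack limit in the child just before it starts, and leave the parent alone. The function names below are hypothetical, and this sketch targets a current Python, not the 2.3 I had on the Altix.

```python
import resource
import subprocess
import sys

STACK_LIMIT_BYTES = 8 * 1024 * 1024  # 8 MB, the figure mentioned above


def pick_stack_limit(requested=STACK_LIMIT_BYTES):
    # Never ask for more than the hard limit allows.
    _, hard = resource.getrlimit(resource.RLIMIT_STACK)
    if hard != resource.RLIM_INFINITY:
        requested = min(requested, hard)
    return requested, hard


def run_with_small_stack(argv, requested=STACK_LIMIT_BYTES):
    new_soft, hard = pick_stack_limit(requested)

    def lower_limit():
        # Runs in the child between fork and exec, so the parent's
        # own limits are left exactly as they were.
        resource.setrlimit(resource.RLIMIT_STACK, (new_soft, hard))

    return subprocess.run(
        argv, preexec_fn=lower_limit, capture_output=True, text=True
    )


if __name__ == "__main__":
    # The child simply reports the soft stack limit it sees.
    report = "import resource; print(resource.getrlimit(resource.RLIMIT_STACK)[0])"
    result = run_with_small_stack([sys.executable, "-c", report])
    print(result.stdout.strip())
```

Note that resource works in bytes, whereas bash's "ulimit -s" takes kilobytes, so the shell equivalent of 8MB is 8192.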
In the end the solution to the problem turned out to be simple, but I learned a lot in the process.
Now I am writing a small MicroThreads interface using generators, which I will soon be posting here. (Note: there are many other implementations of this idea, but I just want to have some fun with generators.)