header image
Probabilistic processor and the end of locks
August 22nd, 2010 under Algorithms, Computers, rengolin. [ Comments: none ]

Every now and then, when I’m bored with normal work, I tend to my personal projects. For some reason, that tends to be mid-year, every year. So, here we go…

Old ideas, new style

This time was an old time project, to break the parallelization barrier without the use of locks. I knew simple boolean logic on regular Turing machines wouldn’t do, as they rely on the fact that the tape is only written by one head at a time. If you have more than one head writing to the same place, the behaviour is undefined. I’ve looked into several other technologies…

Some were just variations of Turing machines, like molecular, ternary and atomtronic computers. Others were simply interesting toys, like the programmable water, but there were two promising classes: quantum computers (taking a bit too long to impress) and non-deterministic Turing machine (NTM), that could get around some concurrency problems quite well.

Quantum computers are nice, but they’re more like vector computing on steroids. You can operate on several qbits at once, maybe even use non-locality to speed up communications, but until all that reaches the size of a small building, we need something different than yet-another n-core frying-pan Intel design. That may be non-deterministic machines…

Non-deterministic Turing machines are exactly like their deterministic counterparts, but the decision process can take different paths given the same input. So, instead of

if (a == 10) do_something;

you have

if (probability(a) > threshold) do_something;.

The problem with that is that you still can’t ignore different processes writing to the same data, since if a process happens to read it (even if by chance), the data you get from it is still being concurrently changed, therefore, still undefined. In other words, even if the state change is probabilistic, the data is still deterministic.

So, how can you make sure that, no matter how many processes are writing to a piece of data at random times, you still have meaningful data?

Well, if the reading part is not interested into one particular value of that data, but say, only in the probability of that data be a certain value, then it doesn’t matter in which order the writes are performed, as long as they’re all writing about the same thing. That can be viewed as a non-deterministic register, not a non-deterministic Turing machine. You can still have a deterministic machine reading and writing on non-deterministic registers, that you still won’t need locks for those values.

Probabilistic registers

Suppose you want to calculate an integral. You can use several very good deterministic algorithms (like the Romberg’s method), but if you create producers to get the values and consumers to reduce it to a sum, you’ll need a synchronized queue (ie. locks) to make sure the consumer is getting ALL the values. If you loose some values, you might get a completely wrong result.

If instead, you create random generators, getting the value of any (multi-dimensional) function from random values, and accumulate them into a non-deterministic register (say by updating the average of the last values, like recharging a capacitor that keeps discharging), when the consumer process read the value from it, it’ll get an average of the previously written values. Because the writes are all random, and if you keep only the average of the last N writes, you know how much data is being filled in and what’s the average value for that block.

In essence, you’re not storing any particular value, but information enough to re-create the original data. Since specific values are not important in this case, the order in what they appear (or if they appear at all) is completely irrelevant.

In a way, it’s similar to quantum mechanics, where you have the distribution of probabilities but not the values themselves. With a few tricks you can extract every information from those distributions. This is also similar to what Monte Carlo methods are. Taking the probabilistic side of computations in order to achieve a rough image of the problem quicker than by brute-force, when the complexity of the problem would be non-polynomial and the problem size would be big enough to need a few billion years to calculate.

Here are some examples of a non-det register.

A new logic

Obviously, current deterministic algorithms cannot run on such machine. All of them are based on the fact that specific data is stored in a specific place. If you compress a file and uncompress it back again, it’s just impossible to work with probable values, your original file will be completely undecipherable. Also, boolean logic cannot operate on such registers, Operating an AND between probabilities or a XOR of two averages doesn’t make any sense. You need a different logic, something that can make sense of probable inputs and analog streams. Fortunately, Bayesian logic is just the thing for such machines.

But the success of Bayesian logic in computing is not enormous. First, it’s not simple to create algorithms using a network of probabilities, and very few mundane algorithms can benefit from it to be worth the trouble. (Optimor labs is a good example). But with the advent of this new machine, a different logic has to be applied and new algorithms must be developed to make use of it.

Probably by mixing deterministic with non-deterministic hardware and creating a few instructions that would use the few non-deterministic registers (or accumulators) available. Most likely they would be used for inter-process communication in the beginning, but as you increase the number of those registers you can start building complex Bayesian networks or also neural networks in the registers themselves. That would open the door to much more complex algorithms, probably most of them unpredictable or found by trial and error (genetic approach?), but with time, people would begin to think that way and that would be the dawn of the non-deterministic computing.

Eureka moment

The probabilistic register was an Eureka moment, when I finally implemented it and saw that my simple monte-carlo or my neuron model was really a three-line code, but that didn’t last long…

A few days ago I read the news that DARPA has developed a fully featured probabilistic processor, taking random computation to the masses, even using Bayesian logic in place of boolean logic! The obvious place for a “public trial” was spam filtering (though I think the US military is not that interested into filtering spam…), but according to them, the sky is the limit. I do have to agree…

Anyway, I wouldn’t have 5 spare years nor $18mi in the bank to implement that project like they did, so in a way I’m glad that it turned out to be a good idea after all. I just hope that they make good use of it (ie. not use it for Carnivore 2.0), and they realize that not only the military have uses for it, or anti-spam filters, for that matter. We (them?) should develop a whole new set of algorithms to work on that logic and make good use of a hardware that is not a simple progression of current technology (to keep Moore’s law going), but could be a game changing in the next decades.

I could re-focus on defining the basic blocks for that processor’s new logic, but I’m sure DARPA has much better personnel to do that. So, what’s next, then? I’ll tell you next year…


Barrelfish
March 18th, 2010 under Algorithms, Devel, Distributed, OSS, Software, rengolin. [ Comments: none ]

Minix seems to be inspiring more operating systems nowadays. Microsoft Research is investing on a micro-kernel (they call it multi-kernel, as there are slight differences) called Barrelfish.

Despite being Microsoft, it’s BSD licensed. The mailing list looks pretty empty, the last snapshot is half a year ago and I couldn’t find an svn repository, but still more than I would expect from Microsoft anyway.

Multi-kernel

The basic concept is actually very interesting. The idea is to be able to have multi-core hybrid machines to the extreme, and still be able to run a single OS on it. Pretty much the same way some cluster solutions do (OpenMPI, for instance), but on a single machine. The idea is far from revolutionary. It’s a natural evolution of the multi-core trend with the current cluster solutions (available for years) and a fancy OS design (micro-kernel) that everyone learns in CS degrees.

What’s the difference, then? For one thing, the idea is to abstract everything away. CPUs will be just another piece of hardware, like the network or graphic cards. The OS will have the freedom to ask the GPU to do MP floating-point calculations, for instance, if it feels it’s going to benefit the total execution time. It’ll also be able to accept different CPUs in the same machine, Intel and ARM for instance (like the Dell Latitude z600), or have different GPUs, nVidia and ATI, and still use all the hardware.

With Windows, Linux and Mac today, you either use the nVidia driver or the ATI one. You also normally don’t have hybrid-core machines and absolutely can’t recover if one of the cores fail. This is not the same with cluster solutions, and Barrelfish’s idea is to simulate precisely that. In theory, you could do energy control (enabling and disabling cores), crash-recovery when one of the cores fail but not the other, or plug and play graphic or network cards and even different CPUs.

Imagine you have an ARM netbook that is great for browsing, but you want to play a game on it. You get your nVidia and a coreOcta 10Ghz USB4 and plug in. The OS recognizes the new hardware, loads the drivers and let you play your game. Battery life goes down, so once you’re back from the game, you just unplug the cards and continue browsing.

Scalability

So, how is it possible that Barrelfish can be that malleable? The key is communication. Shared memory is great for single-processed threaded code and acceptable for multi-processed OSs with little number of concurrent process accessing the same region in memory. Most modern OSs can handle many concurrent processes, but they rarely access the same data at the same time.

Normally, processes are single threaded or have a very small number of threads (dozens) running. More than that is so difficult to control that people usually fall back to other means, such as client/server or they just go out and buy more hardware. In clusters, there is no way to use shared memory. For one, accessing memory in another computer via network is just plain stupid, but even if you use shared memory in each node and client/server among different nodes, you’re bound to have trouble. This is why MPI solutions are so popular.

In Barrelfish there’s no shared memory at all. Every process communicate with each other via messages and duplicate content (rather than share). There is an obvious associated cost (memory and bus), but the lock-free semantics is worth it. It also gives Barrelfish another freedom: to choose the communication protocol generic enough so that each piece of hardware is completely independent of all others, and plug’n'play become seamless.

Challenges

It all seem fantastic, but there’s a long road ahead. First, message passing scales much better than shared memory, but nowadays there isn’t enough cores in most machines to make it worth it. Message passing also introduces some other problems that are not easily solvable: bus traffic and storage requirements increase considerably, and messages are not that generic in nature.

Some companies are famous for not adhering to standards (Apple comes to mind), and a standard hardware IPC framework would be quite hard to achieve. Also, even if using pure software IPC APIs, different companies will still use slightly modified APIs to suit their specific needs and complexity will rise, exponentially.

Another problem is where the hypervisor will live. Having a distributed control centre is cool and scales amazingly well, but its complexity also scales. In a hybrid-core machine, you have to run different instructions, in different orders, with different optimizations and communication. Choosing one core to deal with the scheduling and administration of the system is much easier, but leaves the single-point-of-failure.

Finally, going the multi-hybrid-independent style is way too complex. Even for a several-year’s project with lots of fund (M$) and clever people working on it. After all, if micro-kernel was really that useful, Tanembaum would have won the discussion with Linus. But, the future holds what the future holds, and reality (as well as hardware and some selfish vendors) can change. Multi-kernel might be possible and even easier to implement in the future.

This seems to be what the Barrelfish’s team is betting on, and I’m with them on that bet. Even if it fails miserably (as did Minix), some concepts could still be used in real-world operating systems (like Minix), whatever that’ll mean in 10 years. Being serious about parallelism is the only way forward, sticking with 40 years old concepts is definitely not.

I’m still aspiring for non-deterministic computing, though, but that’s an even longer shot…


SLC 0.2.0
September 12th, 2009 under Algorithms, Devel, OSS, rengolin. [ Comments: none ]

My pet compiler is now sufficiently stable for me to advertise it as a product. It should deal well with the most common cases if you follow the syntax, as there are some tests to assure minimum functionality.

The language is very simple, called “State Language“. This language has only global variables and static states. The first state is executed first, all the rest only if you call goto state;. If you exit a state without branching, the program quits. This behaviour is consistent with the State Pattern and that’s why implemented this way. You can solve any computer problem using state machines, therefore this new language should be able to tackle them all.

The expressions are very simple, only binary operations, no precedence. Grouping is done with brackets and only the four basic arithmetic operations can be used. This is intentional, as I don’t want the expression evaluator to be so complex that the code will be difficult to understand.

As every code I write on my spare time, this one has an educational purpose. I learn by writing and hopefully teach by providing the source, comments and posts, and by letting it available on the internet so people can find it through search engines.

It should work on any platform you can compile to (currently only Linux and Mac binaries provided), but the binary is still huge (several megabytes) because they contain all LLVM libraries statically linked to it.

I’m still working on it and will update the status here at every .0 release. I hope to have binary operations for if statements, print string and all PHI calculated for the next release.


The LLVM compilation infrastructure
August 25th, 2009 under Algorithms, Devel, Software, rengolin. [ Comments: none ]

I’ve been playing with LLVM (Low-Level Virtual Machine) lately and have produced a simple compiler for a simple language.

The LLVM compilation infrastructure (much more than a simple compiler or virtual machine), is a collection of libraries, methods and programs that allows one to create a simple, robust and very powerful compilers, virtual machines and run-time optimizations.

As GCC, it’s roughly separated into three layers: the front-end, which parses the files and produce intermediate representation (IR), the independent optimization layer, which acts on the language-independent IR and the back-end, which turns the IR into something executable.

The main difference is that, unlike GCC, LLVM is extremely generic. While GCC is struggling to fit broader languages inside the strongly C-oriented IR, LLVM was created with a very extensible IR, with a lot of information on how to represent a plethora of languages (procedural, object-oriented, functional etcetera). This IR also carries information about possible optimizations, like GCC’s but to a deeper level.

Another very important difference is that, in the back-end, not only code generators to many platforms are available, but Just-In-Time compilers (somewhat like JavaScript), so you can run, change, re-compile and run again, without even quitting your program.

The middle-layer is where the generic optimizations are done on the IR, so it’s language-independent (as all languages wil convert to IR). But that doesn’t mean that optimizations are done only on that step. All first-class compilers have strong optimizations from the time it opens the file until it finishes writing the binary.

Parser optimizations normally include useless code removal, constant expression folding, among others, while the most important optimizations on the back-end involve instruction replacement, aggressive register allocation and abuse of hardware features (such as special registers and caches).

But the LLVM goes beyond that, it optimizes during run-time, even after the program is installed on the user machine. LLVM holds information (and the IR) together with the binary. When the program is executed, it profiles automatically and, when the computer is idle, it optimizes the code and re-compile it. This optimization is per-user and means that two copies of the same software will be quite different from each other, depending on the user’s use of it. Chris Lattner‘s paper about it is very enlightening.

There are quite a few very important people and projects already using LLVM, and although there is still a lot of work to do, the project is mature enough to be used in production environments or even completely replace other solutions.

If you are interested in compilers, I suggest you take a look on their website… It’s, at least, mind opening.


Ad infinitum
February 12th, 2009 under Algorithms, Devel, Life, OSS, Physics, World, rengolin. [ Comments: none ]

Quality is fundamental in any job, and software is no exception. Although fairly good software is relatively easy to do, really good software is an art that few can truly reach.

While in some places you see a complete lack of understanding about the minimal standards of software development, in others you see it in excess. It’s no good either. In the end, as we all know, the only thing that prevails is common sense. Quality management, all sorts of tests and refactoring is fundamental to the agile development, but being agile doesn’t mean being time-proof.

One might argue that, if you keep on refactoring your code, one day it’ll be perfect. That if you have unit tests, regression tests, usability test (and they’re also being constantly refactored), you won’t be able to revive old bugs. That if you have a team always testing your programs, building a huge infrastructure to assure everything is user proof, users will never get a product they can’t handle. It won’t happen.

It’s like general relativity, the more speed you get, the heavier you become and it gets more difficult to get more speed. Unlike physics, though, there is a clear ceiling to your growth curve, from where you fall rather than stabilize. It’s the moment when you have to let go, take out what you’ve learned and start all over again, probably making the same mistakes and certainly making new ones.

Cost

It’s all about cost analysis. It’s not just money, it’s also about time, passion, hobbies. It’s about what you’re going to show your children when they grow up. You don’t have much time (they grow pretty fast!), so you need to be efficient.

Being efficient is quite different on achieving the best quality possible, and being efficient locally can also be very deceiving. Hacking your way through every problem, unworried about the near future is one way of screwing up things pretty badly, but being agile can lead you to the same places, just over prettier roads.

When the team is bigger than one person, you can’t possibly know everything that is going on. You trust other peoples judgements, you understand things differently and you end up assuming too much about some of the things. Those little things add up to the amount of tests and refactoring you have to run for each and every little change and your system will indubitably cripple up to a halt.

Time

For some, time is money. For me, it’s much more than that. I won’t have time to do everything I want, so I better choose wisely putting all correct weights on the things I love or must do. We’re not alone, nor is all we do for ourselves, so it’s pretty clear that we all want our things to last.

Time, for software, is not a trivial concept. Some good software don’t even get the chance while some really bad things are still being massively used. Take the OS/2 vs. Windows case. But also some good software (or algorithms or protocols) have proven to be much more valuable and stable than anyone ever predicted. Take the IPv4 networking and the Unix operating system (with new clothes nowadays) as examples.

We desperately need to move to IPv6 but there’s a big fear. Some people are advocating for decades now that Unix is already decades deprecated and still it’s by far the best operating system we have available today. Is it really necessary to deprecate Unix? Is hardware really ready to take the best out of micro-kernel written in functional programming languages?

For how long does a software lives, then?

It depends on so many things that it’s impossible to answer that question, but there are some general rules:

  • Is it well written enough to be easy to enhance to users’ request? AND
  • Is it stable enough that won’t drive people away due to constant errors? AND
  • Does it really makes the difference to people’s lives? AND
  • Are people constantly being reminded that your software exists (both intentionally and unintentionally)? AND
  • Isn’t there something else much better? AND
  • Is the community big enough to make migration difficult?

If you answered no to two or more questions, be sure to review your strategy, you might already be loosing users.

There is another path you might find your answers:

  • Is the code so bad that no one (not even its creator) understand it anymore? OR
  • The dependency chain is so unbearably complicated, recursive and fails (or works) sporadically? OR
  • The creator left the company/group and won’t give a blimp to answer your emails? OR
  • You’re relying on closed-source/proprietary libraries/programs/operating systems, or they have no support anymore? OR
  • Your library or operating system has no support anymore?

If you answered yes to two or more questions, be sure to review your strategy, you might already be on a one-way dead-end.

Ad infinitum

One thing is for sure, the only thing that is really unlimited is stupidity. There are some things that are infinite, but limited. Take a sphere, you can walk on a great circle until the end of all universes and you won’t reach the end, but the sphere is limited in radius, thus, size. Things are, ultimately, limited in the number of dimensions they’re unlimited.

Stupidity in unlimitedly unlimited. If the universe really has 10 dimensions, stupidity has 11. Or more. The only thing that will endure, when the last creature of the last planet of the last galaxy is alive is his/her own stupidity. It’ll probably have the chance to propagate itself and the universe for another age, but it won’t.

In software, then, bugs are bound to happen. Bad design has to take part and there will be a time when you have to leave your software rest in peace. Use your time in a more creative way because for you, there is no infinite time or resources. Your children (and other people’s children too) will grow quick and deprecate you.


Calliper, chalks and the axe!
September 10th, 2008 under Algorithms, Devel, Physics, rengolin. [ Comments: none ]

Years ago, when I was still doing physics university in São Paulo, a friend biochemist stated one of the biggest truths about physics: Physicist is the one that measures with a calliper, marks with chalk and cuts with an axe!.

I didn’t get it until I got through some courses that teaches how to use the mathematical tools available, extrapolate to the most infamous case, than expand in a series, take the first argument and prove the theorem. If you get the second argument, you’re doing fine physics (but floating point precision will screw up anyway).

Only recently I’ve learnt that some scientists are really doing a lot by following in the opposite direction. While most molecular dynamics simulation are going to the quantum level, taking ages to get to an averagely reasonable result (by quantum standards), some labs are actually beating them in speed and quality of results by focusing on software optimizations rather than going berzerk on the physical model.

It’s not like the infamous Russian pen (which is a hoax, by the way), it’s only the normal over-engineering that we normally see when people are trying to impress the rest of the world. The Russians themselves can do some pretty dumb simplifications like the cucumber picker or over-engineering like the Screw Drive that, in the end, created more problems than solved.

Very clear, in software development, the situation can be as bad as that. The complexity of over-designed interfaces or over-engineered libraries can render a project useless in a matter of months. Working around would increase the big ball of mud and re-writing from scratch would take a long time, not to mention include more bugs than it solves.

Things that I’ve recently seen as over-engineering were:

  • Immutable objects (as arguments or on polymorphic lists): When you build some objects and feed them to polymorphic immutable lists (when creating a bigger object, for instance) and then need to change that afterwards you have to copy, change and then write back.
    This is not only annoying but is utterly ineffective when the list is big (and thousands of objects need to be copied back and forth). The way out of it is to use the bridge pattern and create several (RW) implementations of your objects and lists and whatever you have but that also increases a lot on code complexity and maintenance.
    My view of the matter is: protect your stuff from other people, not from yourself. As in “Library Consistent” or “Package-wise Consistent”.
  • Abuse of “Standard algorithms“: Ok, one of the important concepts in software quality is the use of standards. I’ve written it myself, over and over. But, like water, using no standards will kill your project the same way as abusing of them.
    So, if you create a std::set that gives you the power of log(N) searches, why the heck you’d use std::find_if ( begin(), end(), MyComparator() );, that gives you linear searches? Worse, that find was actually before each and every insert! std::set guarantees at least N.log(N) speed on insertion, but the “standard fail-safe assurance” was giving it N².log(N). For what? To assure no duplicated entries were ever inserted in the set, what was yet another thing guaranteed by the default container in question.
    All in all, the programmer was only trying to follow the same pattern over the entire code. A noble cause, indeed.

Now, I’m still defining what’s worse: over-engineering or under-engineering… Funny, though, both have very similar effects on our lives…


Object Orientation in C: Structure polymorphism
August 28th, 2008 under Algorithms, Devel, Unix/Linux, rengolin. [ Comments: 1 ]

Listening to a podcast about the internals of GCC I’ve learnt that, in order to support object oriented languages in a common AST (abstract syntax tree), the GCC does polymorphism in a quite exquisite way.

There is a page that describes how to do function polymorphism in C but not structure polymorphism as it happens on GCC, by means of a union, so I decided that was a good post to write…

Unions

Like structs, you can create a list of things together in an union but, unlike structs, the things share the same space. So, if you create a struct of an int and a double, the size of the structure is the sum of both sizes. In an union, its size is the size of the biggest element and all elements are in the same area of memory, accessed from the first byte of the union.

Its usage is somewhat limited and can be quite dangerous, so you won’t find many C programs and rarely find any C++ programs using it. One of the uses is to (unsafely) convert numbers (double, long, int) to their byte representation by accessing as an array of chars. But the use we’ll see now is how to entangle several structures together to achieve real polymorphism.

Polymorphism

In object oriented polymorphism, you can have a list of different objects sharing the same common interface being accessed by their interface’s structure. But in C you don’t have classes and you can’t build structure inheritance, so to achieve the same effect you need to put them all in the same box, but at the same time defining a generic interface to access their members.

So, if you define your structures like:

struct Interface {
    int foo;
    void (*bar)();
};
struct OneClass {
    int foo;
    void (*bar)();
};
struct TwoClass {
    int foo;
    void (*bar)();
};

and implement the methods (here represented by function pointers) like:

void one_class_bar () {
    printf("OneClass.Bar()\n");
}
void two_class_bar () {
    printf("TwoClass.Bar()\n");
}

and associate the functions created to the objects (you could use a Factory for that), you have three different classes, still not connected. The next step is to connect them via the union:

typedef union {
    struct Interface i;
    struct OneClass o;
    struct TwoClass t;
} Object;

and you have just created the generic object that could hold both OneClass and TwoClass and be accessed via the Interface. Later on, when reading from a list of Objects, you can access through the interface (if you build your classes with parameters in the same order) and it’ll call the correct method (or use the correct variable):

Object list[2];
/* Setting */
list[0] = (Object) one;
list[1] = (Object) two;
/* Using */
list[0].i.bar(list[0]);
list[1].i.bar(list[1]);

Note that when iterating the list, it access the Object via the Interface (list[0].i) and not via OneClass or TwoClass. Although the result would be the same (as they share the same portion in memory, thus would execute the same method), it’s conceptually correct and compatible with object oriented polymorphisms.

The code above produces the following output:

$ ./a.out
OneClass.Bar()
TwoClass.Bar()

You can get the whole example here. I haven’t checked the GCC code but I believe that they’ve done it in a much better and more stable way, of course, but the idea is probably be the same.

Disclaimer: This is just a proof-of-concept. It’s not nice, they (GCC programmers) were not proud (at least in the podcast) of using it and I’d not recommend anyone to use that in production.


Silly prototypes of the week
August 7th, 2008 under Algorithms, Devel, rengolin. [ Comments: 1 ]

This time I don’t have a project, nor probably will have time to implement them (I’m on real holidays!), so I’ll just post the ideas and welcome any comments on the feasibility or usefulness of them. Feel free to implement them and let me know how it went.

Independent Bayesian Agents

An Artificial Neural Network is a collection of neurons connected in meaningful ways that, when the information, coming from an input layer, passes through them, a resulting signal is produced on the output layer. The learning is done via feedback (good or bad), by changing the internal functions of each neuron to accommodate them for the next “pulse”.

The problem with this is that, after the learning phase, the resulting network can only classify a strict class of problems. Networks created to recognise alphabetical letter in an image has a different format and components than those to recognise other patterns. There are some implementations where the network itself can change its connections in order to learn something new, but that’s not commonly seen, not even on many scientific works on the subject.

You can have the same network structure using Bayesian nodes. The class of problems to be solved are not the same but the overall logic is similar.

Bayesian Agents are different. They are independent, learning programs that might be able to communicate with other agents upon necessity and also update their internal states from feedback and previous probabilities. It also has the advantage to return not only true/false values but probabilities or even probability distributions to the user.

But this way, your agents will be much more complex than the neurons on ANN, and that’s a path that is rarely worth taking if you want to build emergent behaviour as it becomes more and more difficult to combine them to produce unexpected behaviours. So, the idea is to break all the agents into very small pieces that can interact with each other, with just a few rules.

This way, the agents themselves would be pretty much like objects in an OO design, following the implementation of a class (agent definition) and inheriting or using methods from other Agents (interaction) you can, at the same time, build a network with any format you want (using graphs or even network connections via name service or so for the interactions) and keep it simple, by following the agent definition and not trying to put any complex logic on each agent.

Same as OO design, you don’t want to create one single class that does everything. What you do is a relationship between classes that express the problem set you’re solving and, if well designed, you can reuse some of the components to solve other problems as well. The power of the OO design is the relationship between the classes and not what the classes are actually doing, right?

So, my hunch is that, if we do very simple agents and let the network itself be the relationship between them, we acquire a new power of extensibility.

Also, by changing the connections, the learning happens at the network level instead of the agent level. Further on, by inflicting an “explorer behaviour” on each agent (like extending from a common class Explorer for all agents), they can search for new answers once in a while, in background, to increase their confidence levels.

In the same line, by extending from an agent Feedback you can make all agents be able to process feedback from the users, from its derived classes such as BooleanFeedback returning good or bad, or MultipleChoiceFeedback, ContinuousFeedback and so on.

Genetic programming using template Policies

Learning by structure given above (network structure) is similar to genetic algorithms, but in a different level. Genetic programming allows you to learn by the structure of the program itself. A standard way to solve that is using a rulebase approach. Create a big set of simple rules and write programs to use them in a mixed fashion.

Given the problem you want to solve, run all programs against it and try to define the quality of each solution (if they’re at least able to get somewhere) and give those programs closer to the solution a higher chance of survival, but don’t kill the unfit, and that’s an important step.

If there are good, but dormant, rules (genes) on the unfit and you kill them, you’re removing from genetic pool (rulebase) some solutions that could be the best when joint with others that you didn’t have the chance to join yet. Anyway, crossing those programs (by exchanging rules and creating new combinations) and running the next breed on the same problem you can, iteratively, lead the evolution towards the best solution by chance.

So, what about template policies?

C++ templates are very powerful. One of the great powers (which comes with great responsibility) is to be able to inherit functionality from the template argument.

template <class EatPolicy, MovePolicy, ReproducePolicy>
class LifeForm : public EatPolicy, MovePolicy, ReproducePolicy {
};

With this, you can create a life form following one of the policies provided. They will inherit the methods and properties of the policies as if it was a normal OO inheritance. So, to create a bacterium, just implement some policies inheriting from EatPolicy, MovePolicy and ReproducePolicy and instantiate like this:

LifeForm<Fermentation, Osmosis, Mitosis> bacteria;

You can also create several policies inheriting from Fermentation (from different materials, etc) and create several types of bacteria, but the problem is how to cross them later. Because templates are evaluated during compile time (and types are created), you can’t create new types during run time and the crossing over should be done off-line (and recompile). A few pre-processor commands and a Perl script could do it, though, and isolate that nasty part from the rest of the code.

I’m not sure of the implementation details of the cross-over but the basic skeleton did work: Skeleton of genetic policies. Would give it a try later on.


Silly project of the week: molecule dynamics
July 9th, 2008 under Algorithms, Devel, Physics, rengolin. [ Comments: 1 ]

This week’s project is a molecular dynamics simulation. Don’t get too excited, it’s not using any of the state-of-art algorithms nor is assembling 3-dimensional structures of complex proteins. I began with a simple carbon chain using only coulomb’s law in a spring-mass system.

The molecule I’m using is this:

Molecular Dynamics

The drawing program is quite simple and wont work for most molecules, but for the 2-dimensional simple molecules (max. of 3 connections per atom) it kinda works.

Later on, putting the program to run, each atom “pushes” all others electrically and the spring “pulls” them back. A good way to solve that is to say that q1 . q2 / x² = – k . x = m . d²x/dx² (where x is a vector) and integrate numerically using Runge-Kutta.

But that’s my first openGL program, so I decided to go easy on the model and actually see it pseudo-working with an iterative-based simulation following the same equations above. This picture is a frame after a few iterations.

Quoting its page: “As this simulation is not using any differential solution, the forces grow and grow until the atom becomes unstable and break apart. Some Runge-Kutta is required to push the realism further.

UPDATE:

The webpage of the fully-functional prototype is HERE.


Markov chain available for NumCalc
June 13th, 2008 under Algorithms, Devel, rengolin. [ Comments: none ]

NumCalc is my personal numerical methods program where I’ve implemented some nice algorithms for numerical computation. The new in the list is Markov Chain.

The Wikipedia article (link above) is far too complex… I’ll try to give a simplified explanation:

A travelling salesman goes back and forth in a set of cities and, given the city he is currently in, you want to know what’s the next city he’ll travel. Of course, he won’t show you his travel itinerary.

The simplest way of doing it is to record all travels he does within time. For each city, you have a counter of how many times he went from each city to all other. If you think these numbers as a portion of all the travels from each city you have a probability of going to any other city in the list.

Example: When he was on Paris, he went 3 times to London, 2 times to Amsterdam and only 1 time to Milan. It means that, 3 out of 6 times (50%) he went to London, so the probability of going again is 50%.

For such small quantities it’s weird to assume that the behaviour will be always the same (he can go to new cities as well) but when the amount of statistics you have is big, the behaviour become very repetitive and thus, predictable.

Real Cases:

  • MegaHAL uses an advanced Markov model to create chat bots by replying what people said before based primarily on the sole probability of one word coming after the other.
  • HMMER is hidden Markov model (a Markov model to predict another Markov model to predict something else) that can do powerful searches within long and scrambled sequences of proteins and genes. The IntrePro group use it to find their protein matches against UniProt.

Of course my super-simplified model is far from being that efficient and useful, but it’s a good start to understand how simple and how powerful they are. You can download it from its webpage.


« Previous entries