I am starting my doctoral studies, and my adviser keeps telling me to decide on my research topic. I have already tried to tell her and other people what my main concerns are about contemporary research in my field, and which problem I would like to contribute to, but I haven’t succeeded in convincing anyone yet. I am writing this here to see if anyone can comment on my ideas and help me explain what I believe in.
My field is computers, signal processing, AI and mobile robotics. On one hand, I like neural networks and other so-called “connectionist” structures of computation and methods of learning. But I’m not an NN guy: I like complex computers, and complex programs and mechanisms. I like the shape of the computational structures that the so-called “symbolic paradigm” uses. I don’t see a wall between the two areas, and I want to do a study that helps bring the two together. The way to do this is mainly to study the performance of different systems and analyze them “from the outside”, peeking at the computational structures a bit like a biologist would, using tools like dynamical systems analysis and information theory.
What I don’t like in neural network research is that the final structures obtained are never very complex. More specifically, they are often memoryless systems. I don’t think we should keep researching memoryless feed-forward networks so intensely. Including memory in neural network studies is the most important next step to take, and we are taking too long to do it.
What I don’t like in “logical/symbolic” studies is that everything has got to have a damn meaning. People build those very interesting cognitive architectures, and go on tagging each processor and memory block with a name, trying to explain everything. It’s OK to have an inspiration, but why can’t we be a bit more uncertain and obscure, as happens in neural networks? There we often end up with a bunch of weights that don’t make sense to us. That is not pleasant, of course, but first of all, it’s not very different from the brain itself, so it might be a requirement for achieving that strong-AI system of our dreams. And second, people are too hedonistic, wanting pleasant, sure and certain names for everything. What about “the beautiful theories slain by ugly facts” (T. Huxley) and “the pursuit of truth instead of avoiding suffering” (Poincaré)?
So, what I want to build is a system that looks just like all those cognitive architectures, with different layers of memory and processing. But I don’t want to give names to things. I have no idea what each memory block will “mean”, or where the “reasoning” or the “analogy” or whatever other block of the system will be.
What I want is to somehow create those complex computer systems, look at the shape of their structures, and compare it with the performance obtained when I use them for something, say, controlling a mobile robot performing a certain task. (The most difficult part will probably be the “somehow” above.)
...The most important thing of all is how we look at the shape of this structure. We will look at what might at first seem a very intricate recurrent neural network, but we will identify basic blocks in this structure. We will somehow (by applying information-theoretic analysis) identify distinct memory and processing blocks.
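One possible way to picture this kind of analysis is the toy sketch below. It is only an illustration under strong assumptions, not a proposed method: the two-unit system (`relay` and `latch`) and the helper names are all hypothetical, and the measure is a simple plug-in estimate of the mutual information between a unit’s activity and the input one step earlier. A unit that carries information about past inputs behaves like memory; a unit that only reflects the current input behaves like plain processing.

```python
import random
from collections import Counter
from math import log2


def entropy(samples):
    """Plug-in (empirical) entropy in bits of a list of hashable symbols."""
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in counts.values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from paired samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def run_toy_system(inputs):
    """Hypothetical two-unit system: 'relay' copies the current input,
    'latch' holds the input from the previous time step."""
    relay, latch = [], []
    previous = 0
    for x in inputs:
        relay.append(x)
        latch.append(previous)
        previous = x
    return relay, latch


def lagged_information(unit_trace, inputs, lag):
    """Information a unit carries about the input `lag` steps in the past."""
    if lag == 0:
        return mutual_information(unit_trace, inputs)
    return mutual_information(unit_trace[lag:], inputs[:-lag])


rng = random.Random(0)
inputs = [rng.randint(0, 1) for _ in range(2000)]
relay, latch = run_toy_system(inputs)

# Without looking inside the units, the lagged information separates them:
# the latch knows about the previous input, the relay does not.
relay_memory = lagged_information(relay, inputs, lag=1)
latch_memory = lagged_information(latch, inputs, lag=1)
print(f"relay: {relay_memory:.3f} bits about the previous input")
print(f"latch: {latch_memory:.3f} bits about the previous input")
```

In a real recurrent network the units would be continuous and far more entangled, so the estimator and the grouping of units into blocks would need much more care than this.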
What we will find out is, for example, that some tasks might be impossible with a feed-forward system. Sooner or later we always need memory.
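A small example of such a task, as a sketch only: a “delayed copy” task where the correct output at each step is the input from the previous step. The task and all function names here are hypothetical and chosen for illustration. A memoryless controller can only map the current input to an output, so the code simply tries every such mapping for one binary input and compares the best one with a controller that has a single bit of state.

```python
import random


def delayed_copy_task(n_steps, seed=0):
    """Random binary inputs; the target at step t is the input at step t-1."""
    rng = random.Random(seed)
    inputs = [rng.randint(0, 1) for _ in range(n_steps)]
    targets = [0] + inputs[:-1]
    return inputs, targets


def accuracy(outputs, targets):
    return sum(o == t for o, t in zip(outputs, targets)) / len(targets)


def best_memoryless_accuracy(inputs, targets):
    """Try all four functions {0,1} -> {0,1} of the current input alone."""
    best = 0.0
    for out_for_0 in (0, 1):
        for out_for_1 in (0, 1):
            table = (out_for_0, out_for_1)
            outputs = [table[x] for x in inputs]
            best = max(best, accuracy(outputs, targets))
    return best


def one_bit_memory_controller(inputs):
    """Outputs its stored bit, then overwrites it with the current input."""
    state = 0
    outputs = []
    for x in inputs:
        outputs.append(state)
        state = x
    return outputs


inputs, targets = delayed_copy_task(1000)
print("best memoryless:", best_memoryless_accuracy(inputs, targets))
print("one bit of memory:", accuracy(one_bit_memory_controller(inputs), targets))
```

On random inputs no memoryless mapping can do much better than chance here, because the current input says nothing about the previous one, while a single bit of state solves the task exactly.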
But we can’t just go on increasing the memory and complexity. Sooner or later, again, we will need to stop increasing things linearly and start creating hierarchical structures, like those I described above. Somehow we must gain something from breaking up the structure. It’s a bit like how NN people talk about adding layers to MLPs. That is not often advantageous with feed-forward networks, but it makes sense here because of the memory at each layer. It also has to do with pruning and mixing specialists… But it’s different because it’s not just a feed-forward network anymore.
I want to show that the shape of the computational system of those cognitive architectures has inherent benefits: that it is better than a feed-forward network, better than a simple network of these with a single memory block, and better than a very large network with a very large block. I believe the hierarchical structure has some kind of intrinsic benefit, and I want to make it explicit somehow.
...And the most interesting part of all will be this identification of the blocks in the structure, based on the way memory is used in the system and on “information bottlenecks”: parts of the network where many inputs become few outputs, performing some sort of encoding…
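To give a rough feel for what a bottleneck looks like in numbers, here is a minimal sketch. Everything in it is assumed for illustration: the “unit” is a hypothetical majority vote over three binary inputs, and the entropies are plug-in estimates over all eight input patterns taken as equally likely. The inputs carry 3 bits, but the single binary output can carry at most 1 bit about them, so the unit necessarily throws information away and keeps only a compressed code.

```python
from collections import Counter
from itertools import product
from math import log2


def entropy(samples):
    """Plug-in (empirical) entropy in bits of a list of hashable symbols."""
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in counts.values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y), estimated from paired samples."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


# All 3-bit input patterns, each appearing once (a uniform distribution).
patterns = list(product((0, 1), repeat=3))

# Hypothetical bottleneck unit: outputs 1 when at least two inputs are 1.
codes = [int(sum(p) >= 2) for p in patterns]

input_bits = entropy(patterns)                     # information entering
code_bits = entropy(codes)                         # capacity actually used
kept_bits = mutual_information(patterns, codes)    # information passed on

print(f"inputs carry {input_bits:.2f} bits")
print(f"the code carries {code_bits:.2f} bits")
print(f"the code keeps {kept_bits:.2f} bits about the inputs")
```

Looking for places in a network where the information kept drops sharply compared with the information entering would be one way to search for block boundaries, although estimating these quantities in a large network with continuous values is a much harder problem than this toy case suggests.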
I’ll stop here because it is starting to sound more like sci-fi than sci. :)
Any comments, ideas, suggestions, criticism?...
(By the way, I am a huge fan of Marvin Minsky and of William Gibson.)