Sunday, June 30, 2013

The Biology of Computer Programming


Computer Programming (CP) is a human activity. 

Humans are organisms. So CP can be investigated biologically.

Scientific investigation makes use of any method that produces results, meaning a further understanding of the natural world. So, let's take two complementary, established approaches to investigating human activity within the biological sciences. 

Approach 1: CP is a human experience that is partly accessible to consciousness. That is, there is some conscious experience, likely a very small part of the overall activity, which people who do CP are aware of. This is a repeatable awareness, common to all people who do CP, and although there isn't much outward indication, the moments of the experiences are mostly identifiable by informants. At the very least, I know when I perceive that I'm doing the various tasks that constitute CP. 

Approach 2: at the moments when informants are aware of these experiences, we look for externalized indicators (perhaps with instrumentation).

Principles of investigation

In a nod to Franz Boas, I need to say: I couldn't investigate this question if I weren't a computer programmer. 

In a nod to 17th & 18th century philosophers of the mind, for example David Hume or Dugald Stewart, I assume that these internal experiences are small indicators of large, complex, inherited mechanisms that shape our mental activity and our actions in the world. 

In a nod to ethologists such as Niko Tinbergen and Konrad Lorenz, I can assume that these are significantly instinctive abilities, and so can be investigated for any member of our species. 

In a nod to geneticists, I can assume that these abilities are highly enabled and shaped by our genetic endowment, with phenotypic modifications that take place during development and life.

In a nod to Noam Chomsky, I can use myself as an experimental subject, an informant in linguistic terms, and begin to ask increasingly sharp questions, and perform experiments, regarding aspects of these human experiences that we all take for granted. 

In a nod to Christopher Alexander, I can use the informant method to examine in detail extremely obscure, complex, hard to find, yet universal, human experiences. 

In a nod to recent use of fMRI techniques by cognitive neurologists, I can turn these experiments into external readings, which help us to begin to sketch out the neurological structures that are part of the activity, and the relation of these activities to other complex human experiences.

In another nod to Chomsky, we assume that any human experience, or any biological feature, is composed of some combination of these three factors (A, B, C): 

A) natural law: for example, there is no gene that says a single cell must be spheroid rather than some other shape -- this is natural law. Genetics operates within these biophysical constraints, which are still poorly understood.

B) biological inheritance: genetics, epigenetics, and various known biological mechanisms -- and poorly understood ones, for example, structure-preservation during morphogenesis, and whatever makes multicellular organisms cohere so robustly.

C) external stimulus: from the environment, during the development and life of the organism.

Approach 1

Let's begin with a general interview of our informant. How does CP feel?

CP feels like any craft, in some way, with some planning, some building of a strategy, some carrying it out, some storytelling (to oneself and others, parts of which become part of the product), a great deal of trial and error, and the use of a great many unknown mental capacities. There is also a sense that one must endeavor to keep the structure of the program coherent, and the user interface must be effective. There are little things that we do, which are quite important, like trying to keep the code well-organized.

That's a very complex human activity. So let's tease out one tiny possible piece of CP for testing by our informants.

A tiny theory

Here's a small, testable theory. A programmer considers one aspect of "good coding" to be the representation of the same code by the same name. I posit that the "moment of recognition" (noticing that a repeated piece of code doesn't use its name) and the "resolution" (replacing that code with the previously defined name for it) are carried out by people in the same way, activating the same pattern within the brain, as when putting identical objects into the same pile, or putting books back into sets on a shuffled bookshelf.

I may be wrong here. It's just a theory. It also may be far too complex a comparison, requiring simplification. 

Also, there are many questions one could ask about CP mentations that only require comparisons between very similar coding activities (see the conclusion for an example). But, for broader interest, I thought I'd include a test of a broader comparison between mental activities.

The programming side of the experiment is pretty straightforward: in front of the subject, we place a few lines of code. There are a few variable definitions at the top. One complicated value sits on the right-hand side of one variable definition. The same complicated value has been placed elsewhere within the few lines of code. We ask the informant to "clean up this code":


    var x = a.b.c.f(15);
    var y = 0;
    print x, y;
    y = z( a.b.c.f(15) );
    print x, y;
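
For readers who want to see the expected "cleanup" spelled out, here is a minimal sketch in Python rather than the pseudocode above. The bodies of f and z are hypothetical stand-ins, since the snippet never defines them, and the rewrite only preserves behavior on the assumption that a.b.c.f returns the same value each time and has no side effects.

```python
from types import SimpleNamespace


# Hypothetical stand-ins: the original snippet never defines a.b.c.f or z.
def f(n):
    return n * 2


def z(value):
    return value + 1


# Build a nested object so that a.b.c.f(15) reads like the original snippet.
a = SimpleNamespace(b=SimpleNamespace(c=SimpleNamespace(f=f)))


def before():
    """The stimulus as given: the same expression appears twice."""
    x = a.b.c.f(15)
    y = 0
    first_print = (x, y)
    y = z(a.b.c.f(15))  # repeats the expression that already has the name x
    return first_print, (x, y)


def after():
    """One possible cleanup: the repeated expression is replaced by its name."""
    x = a.b.c.f(15)
    y = 0
    first_print = (x, y)
    y = z(x)
    return first_print, (x, y)
```

Under those assumptions, before() and after() return the same pair of printed values, which is the sense in which the informant has "cleaned up" the code without changing it.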


We then create an equally simple version of the comparison activity, and have our informant perform the two, and ask if it feels like the same kind of mental activity. For this methodological discussion, let's assume that we had it right, and most of them said "yes".

Approach 2

At this point, we can bring in an fMRI scanner, and more informant-subjects. We scan for normal brain activity during the actions of typing, cutting and pasting, looking at a screen, etc. For our comparison activity, we scan for normal identifying and shifting of objects, books, etc. We then subtract these from the scans taken during our two experimental situations, look at the remainders, and see if they are the same.
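
The subtract-and-compare step can be sketched as a toy calculation. This is only an illustration of the logic, not an fMRI analysis method: real scans need far more careful statistics, the four-number "scans" below are made up, and using a Pearson correlation as the measure of "the same" is my assumption, not something the experiment prescribes.

```python
import math


def subtract(task_scan, baseline_scan):
    """Remove baseline activity from a task scan, region by region."""
    return [t - b for t, b in zip(task_scan, baseline_scan)]


def pearson(u, v):
    """Pearson correlation of two equal-length lists (undefined if either is constant)."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((p - mean_u) * (q - mean_v) for p, q in zip(u, v))
    spread_u = math.sqrt(sum((p - mean_u) ** 2 for p in u))
    spread_v = math.sqrt(sum((q - mean_v) ** 2 for q in v))
    return cov / (spread_u * spread_v)


# Made-up activation levels for four hypothetical brain regions.
coding_task = [1.0, 2.0, 5.0, 1.0]       # cleaning up the code
coding_baseline = [1.0, 2.0, 1.0, 1.0]   # typing, cutting and pasting, looking
sorting_task = [2.0, 1.0, 6.0, 3.0]      # putting identical objects into piles
sorting_baseline = [2.0, 1.0, 2.0, 3.0]  # ordinary identifying and shifting

coding_remainder = subtract(coding_task, coding_baseline)
sorting_remainder = subtract(sorting_task, sorting_baseline)

# A value near 1 would mean the two remainders have the same pattern.
similarity = pearson(coding_remainder, sorting_remainder)
```

In this invented data the two remainders have the same shape, so the similarity comes out close to 1. With real subjects, the interesting cases are the ones where it doesn't.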

Clearly, such an experiment could prove inconclusive. At that point, the theories, questions and experiments would need to be sharpened up.

The limits of comparison studies

It's quite likely that not every mental aspect of programming can be revealed through comparisons with other kinds of activities. There are many times when programming feels like nothing else, although it is possible that there is some mental activity that is simply exaggerated by programming.

In these cases we will continue to iterate between the approaches until we can more clearly identify such unique mental activities, and then probe them by constructing more subtle experiments.

The Scale

It is likely that there will be hundreds of mental theories of CP that could be identified, and confirmed or rejected, in this way. It is likely that unifying these theories, simplifying them as much as possible, will be difficult work, but hopefully revealing.

Engineering

At that point, yes, we could, perhaps, begin to use this information to model human CP activity, creating a computer simulation that, with some stimulus and direction, might do something we could metaphorically call 'programming'. 

Of course, we could create such "human simulations" without doing any biological research on the actual human mind -- this has been done in computer science and in the computer industry for at least 50 years. That's the field of Artificial Intelligence, which, in its rush to a "solution", has invented hundreds of techniques that have almost nothing to do with the actual science of biology. Artificial Intelligence, and engineering in general, is not science. Science is our attempt to discover what is actually happening in the natural world. Science is not the drive to imitate nature. It's the drive to understand it.

Conclusion

I believe that moving the study of computing into biology can be motivated by the construction of many testable theories, something that any programmer can begin to do. The reduction and teasing out of the features of CP as a human activity, and the identification of fundamental actions, are commonplace in the engineering world as a way of passing along principles of good engineering practice, patterns being a recent example. Repositioning and refining these practical insights into biological theories is very difficult, and leads one to very different questions than engineers are used to asking. 

Using approach 1 alone, coding activities can be reduced to increasingly interesting questions, such as, "is grouping code into a procedure a completely different activity from dividing one procedure into two, and if not, what are the overlapping mental activities?" I think if we are going to have a true science of computing in the future, it will be essential to examine these kinds of questions.
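
To make the two activities in that question concrete, here is a small sketch of what an informant might be handed. All the function names and the pricing example are hypothetical, invented only for illustration. Both edits leave the program's behavior unchanged, which is what makes the difference between them a question about the programmer rather than about the program.

```python
# Starting point: one procedure that both sums and formats.
def report(prices):
    total = 0
    for price in prices:
        total += price
    return "Total: " + str(total)


# Activity A: grouping some lines of code into a new procedure.
def total_of(prices):
    total = 0
    for price in prices:
        total += price
    return total


def report_grouped(prices):
    return "Total: " + str(total_of(prices))


# Activity B: dividing the one procedure into two, one per job.
def format_total(total):
    return "Total: " + str(total)


def report_divided(prices):
    return format_total(total_of(prices))
```

An informant could be asked to perform each edit starting from report, and then to say whether the two felt like the same kind of mental act.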
