Monday, February 18, 2008

SofSW: The Basics (2/2)

The bearded man returns to his seat. Five minutes ago he left it and went to the restaurant car. I can deduce that easily from the 15-centimeter-long sandwich he is holding. He sits about a meter and a half from me and I can smell his fresh bread. Suddenly I’m hungry. I fight the urge to go to the restaurant car; I don’t have the time anymore. I planned to eat at the airport and according to the schedule it can’t be more than seven minutes away now. But we were seven minutes late at the last station, so if the train hasn’t managed to catch up at all, it might still take fourteen minutes or more. Nobody has ever starved in less than half an hour.

The airport is located 45 kilometers from the previous station and the travel time according to the timetable is 27 minutes. With a piece of paper and a pen I work out the average speed of the train to be exactly 100 km/h. The train has 15 carriages, each about 20 meters long, and there’s the engine. Assuming it is as long as each carriage, the whole train from the front of the engine to the end of the last carriage is 320 meters long. Assuming that there’s a meter of empty space between neighbouring carriages and between the engine and the first carriage, we’ll need to add fifteen meters, so the total length of the train is 335 m. Let’s say that you’re standing on a platform and looking at a beautiful girl on the next platform across the railway. She’s absolutely fine looking in her short skirt and tight shirt. She has long hair that shines in the sunshine and you’re sure that if you could see her eyes, they would turn out to be perfect in a perfect face. If she saw you staring just as the train went past on the track between the platforms, without slowing down from its average speed of 100 km/h, she would have only twelve seconds to run away and disappear. She should really hope for a train that was stopping.
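The back-of-the-envelope numbers above can be checked with a few lines of Python. This is just a sketch: the variable names are my own, and it assumes the train consists of the 15 carriages plus one engine, each 20 m long, with a one-meter gap between neighbouring vehicles.

```python
# Average speed from the timetable: 45 km in 27 minutes.
distance_km = 45
travel_min = 27
avg_speed_kmh = distance_km * 60 / travel_min

# Train length: 15 carriages plus one engine (assumed as long as a carriage),
# with a one-meter gap between each pair of neighbouring vehicles.
vehicles = 15 + 1
vehicle_length_m = 20
gap_m = 1
train_length_m = vehicles * vehicle_length_m + (vehicles - 1) * gap_m

# Time for the whole train to pass a fixed point on the platform.
avg_speed_ms = avg_speed_kmh * 1000 / 3600
passing_time_s = train_length_m / avg_speed_ms

print(f"average speed: {avg_speed_kmh:.0f} km/h")
print(f"train length:  {train_length_m} m")
print(f"passing time:  {passing_time_s:.1f} s")
```

Run as is, it prints the average speed, the train length and the passing time, which comes out at about twelve seconds.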

The basic measurements of the physical universe are simple to use and understand. They help us grasp nature and the phenomena around us, but what about the immaterial universe? Now I don’t mean the spiritual world, which only exists in the imaginations of people, I mean the world of software. It is a strange universe of ideas and reality, of formlessness and function, of thought and implementation. In the world of software, immaterial ideas are turned into programs, which can produce extremely tangible results in the real world. What do we have to help us grasp this mostly mental meta-world of mind-artifacts? Not much.

Software as an industry has been around for less than a century and is one of the youngest areas of industry. It has evolved a lot since the first computers, and every now and then there have been great revolutions, so that a new generation of programmers has started its journey on fresh ground with new thinking and considerably more power than the previous one. The situation now is that the reigning programming languages have been around roughly since the nineteen seventies. Clearly the most popular programming language is called ‘C’, and the object-oriented languages of today - C++, Java and C# - are really offshoots of C. But I don’t want to think about programming languages. Not at the moment at least. I want to think about the basics of software. What would be the equivalent of the real world’s measurement of size? Software doesn’t have a mass, but is there something equivalent to the concept? What about time?

A program is usually written as text stored in a file on a medium readable by a computer. The text has a specific form that it needs to have in order to work as expected, and all that is defined in the programming language definition. That doesn’t matter at the moment. The text files written in the programming language form the source code of the program. There has to be at least one file for a program; usually there are several. The biggest programs consist of thousands or tens of thousands of files sprinkled in the branches and leaves of a complicated folder structure. Just finding the right source code file is sometimes challenging, but the challenges truly start when we start looking into the source code files themselves. Each file consists of lines of text, which many tools like compilers and preprocessors need to read and process. The tools aren’t important yet.

If we compare two programs, and especially their source code, there are obvious differences. The lines in the files are different, probably with a few exceptions, and some files may even be duplicated between the programs, but mostly the source code files are unique. We could perhaps use the size of the source code files as a basic measurement in the software world. Let’s try that. Let’s define that a length ‘l’ refers to the size of a source code file in characters. Characters are the atoms of text; they are indivisible, the smallest ingredients of text. In that sense this definition would actually be closer to the definition of mass ‘m’ than length ‘l’. Mass could be defined as the sum of the atomic masses of each atom present in the object. Length, on the other hand, refers to an object’s extent in one dimension. So the definition doesn’t work that well. Good. Failed ideas can work as seeds for new ones. Once you realize an idea won’t work, you can discard it and move on to looking for the next idea. It’s bad to try and work with an idea without realizing that it won’t work. The longer you keep banging your head against the wall, the more likely it is that your head won’t work properly later.

Maybe the level of abstraction is wrong. Maybe the physical dimensions of a source code file have no meaning. But that can’t be. Two files, one twice as long as the other, both implementing the same functionality, are different. Yet they are the same. We need to have a unit that shows that. The simple idea is to use one unit for the size of the file and another for the intellectual content of the source code file. These can either be two different units or two different dimensions. Is code one-dimensional, two-dimensional or three-dimensional, like The Matrix wanted us to view it?

For now, I’d like to separate the concepts of material size and intellectual size. Let’s use length ‘l’ to refer to the material size, or file size, i.e. the number of characters in the file. The intellectual size would be equivalent to mass ‘m’, so let’s call it ‘mental-mass’. It refers to the number and size of ideas implemented in the file. We’ll have to define it more carefully later, but for now let’s imagine that each statement in the source code file is part of an idea and has some mental-mass. A whole idea will have a mental-mass that is the sum of the mental-masses of all the statements that comprise it. Now we can say that two programs can be equal in mental-mass but differ greatly in size. This is a start.
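To make the two measurements a little more concrete, here is a rough sketch in Python. It is not a definition, only an illustration under a loud assumption: since mental-mass hasn’t been defined properly yet, every statement is simply given a mental-mass of one. The function names `length` and `mental_mass` and the two tiny sample programs are made up for this example, and the statements are counted with Python’s standard `ast` module, so the sketch only works on Python source.

```python
import ast


def length(source: str) -> int:
    """Material size 'l': the number of characters in the source text."""
    return len(source)


def mental_mass(source: str) -> int:
    """Mental-mass 'm', ASSUMING every statement weighs exactly one unit."""
    tree = ast.parse(source)
    return sum(1 for node in ast.walk(tree) if isinstance(node, ast.stmt))


# Two hypothetical programs implementing the same idea.
terse = "def area(w, h):\n    return w * h\n"

verbose = (
    "# Compute the area of a rectangle.\n"
    "def rectangle_area(width_in_meters, height_in_meters):\n"
    "    return width_in_meters * height_in_meters\n"
)

for name, source in (("terse", terse), ("verbose", verbose)):
    print(name, "l =", length(source), "m =", mental_mass(source))
```

Both samples contain the same two statements, a function definition and a return, so under this crude weighting they have the same mental-mass, while the verbose one is several times longer in characters.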

There’s no more time to think further, the train is arriving at the airport station. I pack my stuff, put on my jacket and leave the train. I have to squint a little as the platform is bright in the sunshine and my eyes are used to the dim indoor lighting of the train.

2 Comments:

At 11:37, Blogger Virpi said...

It's slowly getting interesting...

"Girl in a short skirt and a tight shirt." I do hope she can run away in only twelve seconds. Or that you can run!

You ARE getting too deep, my darling. And don't come telling me it's all just physics.

HAHHAH! Just joking. Really enjoying your journey. Most of the times I even understand what you're talking about. How about that!

Keep it up!

 
At 21:32, Blogger Miska said...

Thanks for the kind(?) words of encouragement(?)

 
