February 2010 (Ph.D. student)
This article elaborates on my four-word summary of the essence of Computer Science as a field of study: Efficiently Implementing Automated Abstractions. I also provide one-, two-, and three-word summaries for those who want to be more concise.
Many people who study Computer Science (myself included) have trouble explaining what exactly it is to people who are not in the field. This article is my attempt at defining what Computer Science is as a field of study, in four words or fewer.
Apologies in advance if this article sounds overly fluffy or bullshitty; I don’t often get to pontificate about such lofty abstract ideas. Adults who have real jobs probably think that since I’m working towards my Ph.D. in Computer Science, I must spend all day sipping lattes and engaging in philosophical discussions with my colleagues in some Ivory Tower; in reality, I spend most of my time ‘in the trenches’ hacking on research prototypes and debugging memory errors in C code. So here comes my rare attempt at being ‘scholarly’ … enjoy!
Abstraction (one-word summary)
It’s nearly impossible to summarize an entire field in one word, but if I had to choose just one for Computer Science, it would be abstraction. At its core, Computer Science is about building clean abstract models (abstractions) of messy, noisy, real-world objects or phenomena. As Computer Scientists, we must choose what to include in our models and what to discard, and determine the minimum we need to model in order to solve our given problem to the required degree of accuracy.
Computer Science is a means of solving real-world problems, and it all starts with abstraction. For example, the first step in building an automated movie recommendation system (like what Netflix does) is to choose what features of movie watcher behavior to model; that is, to abstract a movie watcher into a set of relevant metrics (e.g., number of times he’s rented Arnold Schwarzenegger films). Ideally, we would like to fully model the human brain so that we can make near-perfect recommendations, but that’s obviously intractable (for now, at least).
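To make this concrete, here is a minimal sketch in Python of what such an abstraction might look like. The class name `WatcherModel` and its single metric are my own assumptions for illustration, not how any real recommendation system models its users; the point is only that a whole person gets reduced to a handful of numbers.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class WatcherModel:
    """Hypothetical abstraction of a movie watcher: we keep only a
    count of rentals per lead actor and discard everything else."""
    rentals_by_actor: Counter = field(default_factory=Counter)

    def record_rental(self, lead_actor: str) -> None:
        # Each rental bumps the count for that film's lead actor.
        self.rentals_by_actor[lead_actor] += 1


watcher = WatcherModel()
for actor in ["Arnold Schwarzenegger", "Arnold Schwarzenegger", "Meg Ryan"]:
    watcher.record_rental(actor)

print(watcher.rentals_by_actor["Arnold Schwarzenegger"])  # prints 2
```

Everything this model leaves out (mood, age, who else is on the couch) is a choice about what to discard.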
Automating Abstractions (two-word summary)
The word I would immediately add to my summary is automating. Related fields like mathematics and the natural sciences also involve building abstractions (e.g., of geometric shapes or atomic structures), but their models only serve to describe and explain, not to evoke actions. What makes Computer Science different is that it deals with putting the models into action to solve problems. This involves creating algorithms, which are step-by-step instructions for performing actions on and with the data that we have modeled.
In our movie recommendation example, once we have formed the proper abstractions (models), then we need to figure out how to act on them in order to make recommendations. One very simple algorithm would be to simply recommend more Arnold movies to people who have rented Arnold movies in the past, or perhaps to throw some Stallone movies into the mix.
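That simple algorithm might be sketched as follows. The tiny `CATALOG`, the `SIMILAR_ACTOR` table, and the `recommend` function are all hypothetical names I made up for this example; a real system would be far more elaborate.

```python
from collections import Counter

# Hypothetical catalog mapping each lead actor to some of his films.
CATALOG = {
    "Arnold Schwarzenegger": ["The Terminator", "Predator", "Total Recall"],
    "Sylvester Stallone": ["Rocky", "First Blood"],
}

# Hypothetical hand-written table of "if you like X, try Y".
SIMILAR_ACTOR = {"Arnold Schwarzenegger": "Sylvester Stallone"}


def recommend(rental_history, limit=3):
    """Recommend unseen films by the watcher's most-rented actor,
    then mix in films by a similar actor.

    rental_history is a list of (title, lead_actor) pairs.
    """
    if not rental_history:
        return []
    seen = {title for title, _ in rental_history}
    counts = Counter(actor for _, actor in rental_history)
    favorite, _ = counts.most_common(1)[0]

    picks = [t for t in CATALOG.get(favorite, []) if t not in seen]
    similar = SIMILAR_ACTOR.get(favorite)
    if similar is not None:
        picks += [t for t in CATALOG.get(similar, []) if t not in seen]
    return picks[:limit]


history = [
    ("The Terminator", "Arnold Schwarzenegger"),
    ("Predator", "Arnold Schwarzenegger"),
]
print(recommend(history))  # prints ['Total Recall', 'Rocky', 'First Blood']
```

Notice that the algorithm only ever touches the abstraction (titles and actor counts), never the actual person.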
I’ve purposely not chosen automating as my first word, since automation existed long before the invention of Computer Science. People have thought about automation for thousands of years, starting with building tools to automate aspects of farming and culminating with the assembly lines of the Industrial Revolution. Instructions for assembling cotton gins or guns or automobiles are definitely algorithms for automating tasks, but they deal with the world of concrete, real-world objects; no abstraction is needed.
In contrast, abstraction is essential for solving problems like effectively finding information online, predicting stock prices, optimizing flight plans to save gasoline, or automatically detecting credit card fraud; that’s part of the reason why these are Computer Science problems (rather than, say, industrial or process engineering problems).
Implementing Automated Abstractions (three-word summary)
Notice that I haven’t mentioned computers at all so far; that’s because, at its core, Computer Science doesn’t involve computers at all. You could imagine creating the requisite models and algorithms, then handing them off to a team of people to execute in an assembly line filled with pencil and paper. That would still be doing Computer Science per se, but it would be too slow to solve any practical problems.
Computers are electronic devices that are engineered to be amazingly fast at executing algorithms on data, so to solve any real Computer Science problems, we need to implement our models and algorithms (the ‘automated abstractions’) in the form of code (instructions) that the computer can understand.
This is where the rubber hits the road. So far we’ve been dealing in some intangible fairy world of abstract models and algorithms. We might jot our notes down on paper, formalize them and do mathematical proofs about their properties, or write scholarly papers trying to persuade others as to why our chosen models and algorithms ought to work well. However, the best proof that our proposed solution truly works comes from actually implementing it in the form of a computer program, executing it on a computer, and using the output to affect the real world. No matter whether the output is shown to a person (e.g., Google search results) or fed into a mechanical device (e.g., airplane autopilot system), it has a direct, tangible effect on the world. If an implementation actually works, then nobody can argue that it doesn’t work (this statement sounds silly, but you really can’t achieve that high degree of certainty that you’re undeniably correct in most other life endeavours).
The beauty of having a properly-functioning (correct) implementation is that it cuts away all the subjective bullshit. The only way that Netflix can convince you, as a user, that its recommendation system works is if it actually provides good recommendations for you; without a working implementation, no amount of persuasive rhetoric from the CEO or even mathematical proofs from resident theorists (which might contain logical flaws or unrealistic assumptions) can convince you.
Efficiently Implementing Automated Abstractions (four-word summary)
The final word I’m piling onto the summary is efficiently. First and foremost, we must ensure correctness in our implementation; it doesn’t matter how fast your code runs if it doesn’t properly solve the problem. Next, we can think about efficiency, designing our data models and algorithms to run quickly while taking up the minimal amount of resources (e.g., memory, hard disk space, electricity). Computer Science researchers have developed many theoretical and empirical techniques to make implementations of algorithms more efficient.
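As a small illustration of "correct first, then efficient", here are two ways to count rentals per actor in Python. Both function names are hypothetical and invented for this sketch. The first rescans the whole list once for every distinct actor; the second makes a single pass. They produce the same answer, which is the thing to check before worrying about speed.

```python
from collections import Counter


def count_by_actor_slow(rented_actors):
    """Correct but wasteful: list.count() rescans the entire list
    once for each distinct actor."""
    return {actor: rented_actors.count(actor) for actor in set(rented_actors)}


def count_by_actor_fast(rented_actors):
    """Same result in a single pass over the list."""
    return dict(Counter(rented_actors))


rentals = ["Arnold Schwarzenegger", "Sylvester Stallone", "Arnold Schwarzenegger"]

# Correctness comes first: the two versions must agree.
assert count_by_actor_slow(rentals) == count_by_actor_fast(rentals)
print(count_by_actor_fast(rentals)["Arnold Schwarzenegger"])  # prints 2
```

On three rentals the difference is invisible; with many rentals spread over many distinct actors, the repeated rescanning in the first version is where the time would go.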
So how efficient is ‘efficient enough’? Well, it depends on your particular application. If your movie recommendation algorithm must run for 1 year before giving results to the user, then it’s useless. (In fact, if you ran today’s Netflix or Google algorithms on computers built 20 years ago, they probably would take a year to run!) But the difference between it taking 0.5 seconds to run and 0.005 seconds (a 100X factor) might not matter to most users.
Again, here are the 4 words I would use when describing the essence of Computer Science, in order of significance:
- Abstraction
- Automating Abstractions
- Implementing Automated Abstractions
- Efficiently Implementing Automated Abstractions
Note that I purposely didn’t use either the word ‘computer’ or ‘science’ in my summaries, since I don’t think they properly embody the essence of the field. I’ll finish with my thoughts on those two words, though:
Computers are merely a tool for implementing Computer Science ideas. Edsger Dijkstra, a pioneer of the field, once said, "Computer Science is no more about computers than astronomy is about telescopes." That said, though, since computers are the link between Computer Science and the real world, lots of Computer Science research is actually geared towards improving the capabilities of computers! Researchers and engineers write computer programs to help them build more powerful next-generation computers (with faster processors and more storage); this creates a wonderful positive feedback loop where the next generation has more powerful computers and so is able to implement even more ambitious Computer Science ideas, including ideas to improve future generations of computers. Also, lots of Computer Science research is geared towards helping people interact more productively with computers; if computers didn’t exist, then there would be no research in sub-fields like programming languages, operating systems, or human-computer interaction.
And as for science, I believe that one must constantly use the scientific method of hypothesis creation, testing, and refinement when designing, implementing, debugging, and tweaking implementations of Computer Science ideas. Nobody designs or implements an idea correctly on the first attempt, which is why a methodical, empirically-driven scientific mindset is required to create correct and efficient implementations of Computer Science ideas.
Created: 2010-02-04
Last modified: 2010-03-06