Why Aren’t There Simple Programming Languages?

Most programming languages are extremely complex and hard to learn. They confront learners with a steep learning curve. You need to understand a lot before you can write even a tiny but useful program. Which means that you need a lot of motivation to keep going. Many give up, and those who succeed also face frustration at some point or other in the process.

There’s no need for programming to be so complex. Other tools come in varying degrees of ease of use. A knife is easier to use than a grinder. A motorbike is easier to operate than a plane, and a bicycle is perhaps even easier.

When a tool is easy to use, an order of magnitude more people use it to solve their problems, and benefit from it. Tools that can be used by everyone are democratising. I find it much more satisfying to build a tool used by ten million people than one used by a million, even if the former can do far less.

In addition to the majority of users sticking with a simple tool, even people who move on to a complex tool often start with a simple one. Even an advanced programmer like myself would prefer a simple tool for a simple task, if it reduces friction. The importance of simple tools can’t be overstated.

The top 10 languages in the TIOBE index all have a lot of complexity like classes, access control, modules, exceptions and so on that you don’t need for simple tasks.

The languages that are widely used today are all very complex ones that repel most learners. It’s as if we had only SLRs and no smartphones. Where is the smartphone of programming languages?

By definition, it would do less than Java or Ruby. What are some sacred cows that we can kill?

Scripting Languages

Scripting languages were supposed to fill this niche. But they are extremely sophisticated and powerful in their own right, in many ways more so than a statically typed language like Java. Ruby, for example, has many of the features of Java, like classes, methods, encapsulation, overriding, and so on. And much more, like mixins, metaclasses, tainting and so on. We need a language simpler than Java, not more complex.

So, if scripting languages are not the answer, how would one go about designing a simple language?

Goals

A simple language, by definition, can do less than Java or Ruby. To design a simple language, you need to recognise that, and embrace that. At each decision, prioritise simplicity over other qualities like power, performance, reliability, programming in the large, and so on. It would not be as amenable to large teams working together to build non-trivial software like a photo editor, or to codebases that are maintained over years.

A simple language would avoid offering two ways to do the same thing, like functions vs methods. It would offer only one. Which would be the simpler one, if possible.

A simple language would also eliminate features that improve the reliability of the software, but at a price in complexity. Static typing is the obvious example, but another example is exception handling. Exception handling adds complexity to do what you can do by returning an error code. It doesn’t matter whether exception handling is better or worse; it’s more complex, and that’s sufficient reason to do away with it.
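As a rough illustration of the error-code style, here is a minimal sketch in Python (standing in for the proposed language, which doesn’t exist yet). The function name `parse_age` and the shape of the returned map are assumptions made up for this example.

```python
def parse_age(argument):
    """Return a map holding either a value or an error message; never raise."""
    text = argument.strip()
    if not text.isdecimal():
        # Report the problem as data instead of throwing an exception.
        return {"ok": False, "error": "not a whole number: " + text}
    return {"ok": True, "value": int(text)}


result = parse_age("42")
if result["ok"]:
    print("age is", result["value"])
else:
    print("error:", result["error"])
```

The caller has only one mechanism to learn, an ordinary if on an ordinary return value.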

So, to build a simple language, whenever you make a decision, you prioritise simplicity over power, reliability, performance, early error detection, etc.

The Language

What features might one remove from Java or Ruby, say, to make it simple?

A simple language would not be object-oriented. That, at one stroke, gets rid of a huge amount of complexity: classes, objects, constructors, access control, subclassing, interfaces, dynamic dispatch, class cast exceptions, functions vs methods, finalisers, and so on.

As I mentioned above, you’d get rid of static typing. You’d have a simpler type system that enables type inference to work, and to detect the vast majority of errors as the user is typing in the code. Unless you do something like conditionally assigning either a number or a string to a variable, the type inferencer should be able to detect all type errors.

Dynamic typing, of course, lets you have functions that handle different types without needing templates. You can have a function that works on lists of different types, like a list of numbers or a list of strings. Or different collections like lists and maps.
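To make that concrete, here is a small Python sketch of one function serving several types. The name `largest` is an assumption for illustration; when given a map, it relies on Python’s behaviour of iterating over the keys.

```python
def largest(argument):
    """Return the largest item of any collection whose items can be compared."""
    best = None
    for item in argument:
        if best is None or item > best:
            best = item
    return best


print(largest([3, 10, 7]))                 # 10
print(largest(["pear", "apple", "fig"]))   # pear
print(largest({"b": 1, "a": 2}))           # b (a map is iterated by key)
```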

The language would be garbage-collected. Everything would be handled by reference, like Ruby, as opposed to having some value types and some reference types, like Java or C++. Why have both, since you need references, anyway?

Variables would automatically get created when you first assign to them. Accessing a variable before that returns null. You wouldn’t have to worry about the difference between declaring a variable and assigning to it, and the difference between a variable not existing and it existing but being null. Which is like drinking from an empty glass.
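One way to picture these semantics is a variable store that hands back null for any name that hasn’t been assigned yet. The sketch below is in Python, with `None` playing the role of null; the class name `Variables` is an assumption, not part of the proposal.

```python
class Variables(dict):
    """A variable store where reading an unassigned name yields None."""

    def __missing__(self, name):
        # Called by dict lookup when the name has never been assigned.
        return None


env = Variables()
print(env["count"])   # None: no error, no declaration needed
env["count"] = 1      # the first assignment creates the variable
print(env["count"])   # 1
```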

One drawback of doing away with variable declarations is that typos don’t generate an error or warning. But, in practice, the IDE could use heuristics to detect most typos as you type in your code. If you have a variable named inputFile, inuptFile is probably a typo, while outputFile isn’t.
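Here is a sketch of one heuristic an IDE might use, written in Python: flag a new name if it is within one edit, or one swap of adjacent letters, of an existing name. The threshold of one edit and the function names are assumptions; a real IDE would likely combine several signals.

```python
def osa_distance(a, b):
    """Edit distance that also counts a swap of two adjacent letters as one edit."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i
    for j in range(cols):
        d[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)  # adjacent swap
    return d[rows - 1][cols - 1]


def likely_typos(new_name, known_names, max_distance=1):
    """Return the known names that new_name is suspiciously close to."""
    return [k for k in known_names
            if k != new_name and osa_distance(new_name, k) <= max_distance]


print(likely_typos("inuptFile", ["inputFile"]))    # ['inputFile']
print(likely_typos("outputFile", ["inputFile"]))   # []
```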

Data Types

There would be just one number type, as in JavaScript. It would have unlimited range, so you wouldn’t have to worry about overflow. In addition, you’d have a string, a boolean, a list and a map.

You wouldn’t have fixed-size arrays like in Java, because having a fixed-size and a variable-sized type is unnecessary complexity. Just have a list like Ruby or Python.

You’d also have a map, like a hash in Ruby or a dictionary in Python. Like in JavaScript, you would have syntactic sugar of the form map.key as a shortcut for map['key'].
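The map.key shortcut can be mimicked in Python to show that both spellings reach the same entry. This is only a sketch; the class name `Map` is an assumption, and the proposed language would build this into its syntax rather than use a class.

```python
class Map(dict):
    """A map whose entries can be read and written as map.key or map['key']."""

    def __getattr__(self, key):
        # Only called when normal attribute lookup fails.
        try:
            return self[key]
        except KeyError:
            raise AttributeError(key) from None

    def __setattr__(self, key, value):
        self[key] = value


f = Map(name="input.txt")
f.mode = "w"
print(f.name, f["name"])   # input.txt input.txt
print(f["mode"])           # w
```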

There would be no new operator. You implicitly create a list or a map as a literal. You could also have a clone function, which is polymorphic: clone a list, and you end up with another list. Clone a map, and you end up with another map.
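A polymorphic clone is easy to sketch in Python using the standard `copy` module. Whether clone should copy nested lists and maps too is not specified above; the sketch assumes a deep copy.

```python
import copy


def clone(argument):
    """Return an independent copy of a list or a map (deep copy is an assumption)."""
    return copy.deepcopy(argument)


numbers = [1, [2, 3]]
numbers_copy = clone(numbers)
numbers_copy[1].append(4)
print(numbers)        # [1, [2, 3]]: the original is untouched
print(numbers_copy)   # [1, [2, 3, 4]]

settings_copy = clone({"mode": "w"})
print(settings_copy)  # {'mode': 'w'}
```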

There would be no user-defined types in the form of classes, structs, enums, unions or typedefs.

There would be no instanceof operator, since you don’t have classes. And no casts. Conversion of basic types would be done using functions, like ToString(5) or ToNumber("5").

A simple language also wouldn’t have const or final. Const-ness raises a lot of questions: is it the reference or the object that’s const? If I pass an object to a function, does it modify it? If so, does the function need to be declared as const? That means all functions it invokes need to be declared const, recursively. Const causes a huge amount of complexity and duplicated functions in C++. Final in Java causes a lot less complexity, but doesn’t do much as a result. Just get rid of const and final.

Operators

Current languages have a lot of cruft here, too, that can be swept away. First, get rid of all the bitwise operators: logical and shift. They are not used often enough, unless you’re doing low-level programming. They may be appropriate in C, but not in high-level, managed languages.

Besides, & can be confused with &&. Even if you don’t care about bitwise operators, you should know there’s & and that you should not use it. That’s extra complexity.

Getting rid of shift also eliminates fine distinctions about what happens with negative numbers, logical (>>>) vs arithmetic (>>) shift, and so on.

So, get rid of all the bitwise operators.

On the topic of getting rid of operators, dump the comma and unary plus operators.

And the increment and decrement operators. To begin with, they are not needed: you can instead do a += 1. There’s no reason to have special syntax for incrementing a variable by 1 when there isn’t one for 2 or 3. Getting rid of ++ also eliminates the confusion between ++a and a++. You will do just fine without this, anyway. And the code will probably be clearer.

Assignment operators won’t return a value to prevent questions like: will

a = b = c

assign the old or the new value of b to a?

Some operators are less confusing as named functions. For example, spell remainder out, rather than using a misleading % sign, which suggests we’re calculating a percentage. The remainder function can take a list of two numbers.

That also solves precedence problems:

8 % 3 * x

can mean two different things, leading to confusion and bugs, while

remainder [8, 3 * x]

is unambiguous.
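The difference is easy to see in Python, which happens to group 8 % 3 * x from left to right. The `remainder` function below is a sketch of the named form, taking a list of two numbers as described above.

```python
def remainder(argument):
    """Return the remainder of dividing the first number by the second."""
    dividend, divisor = argument
    return dividend % divisor


x = 2
print(8 % 3 * x)               # 4: Python reads this as (8 % 3) * x
print(remainder([8, 3 * x]))   # 2: the remainder of 8 divided by 6
print(remainder([8, 3]) * x)   # 4: the other reading, spelled out
```

With the named function, the reader never has to recall a precedence table to know which of the two was meant.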

For that matter, even the logical operators are more understandable if they are named, like and, or, xor and not, rather than as cryptic symbols like || or !. & is reasonable but, outside of computers, | and ! aren’t used to mean or or not, so get rid of them. A programming language is a UI for programmers to tell the computer what to do, and a good UI is obvious at first glance even if you haven’t used it before. &&, ||, ^ and ! operators are poor UI.

In summary, get rid of operators that are used rarely. Convert operators that have an unintuitive symbol to a function with an understandable name. Simplify the precedence table by having fewer operators and fewer groups.

Scope

To keep things simple, the language would disallow collisions between names of variables in different scopes. You won’t be able to have a local and a global variable with the same name, for example.

A function has a single scope, so a variable initialised in the body of an if statement would continue to be accessible for the rest of the function. Remember that there’s no such thing as a variable declaration. You create a variable by assigning to it. So, you wouldn’t end up with two variables with the same name in the same function, which is confusing to novices.
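Python already behaves this way inside a function, so it can serve as an illustration. The function name `describe` is made up for the example.

```python
def describe(argument):
    if argument > 0:
        label = "positive"       # created here, inside the if body
    else:
        label = "not positive"
    # label is still accessible: the whole function is one scope.
    return label


print(describe(5))    # positive
print(describe(-1))   # not positive
```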

You also won’t be able to have symbols whose names differ only in case or underscores. That is either a programmer error or confusing naming.
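A check like this could be as simple as reducing every name to a canonical form and looking for clashes. The Python sketch below assumes the canonical form is “lowercase with underscores removed”; the function names are hypothetical.

```python
def canonical(name):
    """Reduce a name to a form that ignores case and underscores."""
    return name.replace("_", "").lower()


def confusable_names(names):
    """Group the names that differ only in case or underscores."""
    groups = {}
    for name in names:
        groups.setdefault(canonical(name), []).append(name)
    return [group for group in groups.values() if len(group) > 1]


print(confusable_names(["inputFile", "input_file", "InputFile", "total"]))
# [['inputFile', 'input_file', 'InputFile']]
```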

Control Flow

There would be no difference between a statement and an expression, so you’d be able to do things like:

a = if b then c else d

The language wouldn’t have a switch statement, since if works fine for that. Switch is also counter-intuitive since it doesn’t work for all data types, like strings or objects, which it logically should. Switch brings in additional complexity like fall through and a default case. All that can be eliminated by getting rid of switch.

You wouldn’t have a continue statement, since you can always convert a continue statement into an else statement that executes the rest of the body of the loop.
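The conversion is mechanical, as this Python sketch shows. Both functions add up the even numbers in a list; the names are made up for the example.

```python
def sum_of_evens_with_continue(argument):
    total = 0
    for n in argument:
        if n % 2 != 0:
            continue          # skip the rest of the body
        total += n
    return total


def sum_of_evens_with_else(argument):
    total = 0
    for n in argument:
        if n % 2 != 0:
            pass              # nothing to do for odd numbers
        else:
            total += n        # the rest of the body moves into the else
    return total


print(sum_of_evens_with_continue([1, 2, 3, 4]))   # 6
print(sum_of_evens_with_else([1, 2, 3, 4]))       # 6
```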

Talking of loops, we can get rid of all three types of loops — while, do and for — in favor of a single, simple loop, via a ‘repeat’ keyword:

repeat {
    // Loop body
}

The loop body would have a break statement somewhere in it. That way, we can eliminate all three kinds of loops. And eliminate the need for a while(true) loop, which is actually a hack to fix the problem that you need a condition but don’t want to give one. Such hacks are counter-intuitive to beginners. Why do things in a roundabout manner if you can do them in a straightforward manner?
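To show that loop-plus-break covers the usual loop shapes, here is a sketch in Python. Python has no repeat keyword, so `while True:` stands in for `repeat {` here; the function names are assumptions, and both functions assume a non-negative whole number.

```python
def countdown(argument):
    """while-loop shape: the exit test comes first."""
    n = argument
    seen = []
    while True:               # stands in for: repeat {
        if n <= 0:
            break
        seen.append(n)
        n -= 1
    return seen


def digits(argument):
    """do-while shape: the body runs once before the exit test."""
    n = argument
    result = []
    while True:               # stands in for: repeat {
        result.append(n % 10)
        n //= 10
        if n == 0:
            break
    return result


print(countdown(3))   # [3, 2, 1]
print(digits(123))    # [3, 2, 1]
print(digits(0))      # [0]
```

The only thing that changes between the shapes is where the break sits in the body.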

You wouldn’t have a semicolon terminating statements, either. The line break will do the job. If you have a long expression, just have a long line, with word wrap in the IDE, rather than a hard return. We don’t need to have multiple ways of doing the same thing. All it does is add unnecessary complexity and useless debates over style.

Functions

To begin with, you don’t need to put all code within a function. You can have code at the top-level. There’s no need for a main function, which is a hack to begin with. You need a function only to abstract out a body of code you are going to call elsewhere. Since you’re not going to call a main function, you shouldn’t logically need a function. Just put it inline at the top level, as with Python or Javascript.

Functions are also drastically simpler. A function would take in only one argument. If you want more, put them in a map or list. With map and list literals, all it means in practice is using braces instead of parentheses at the call site:

f = file_open {name: ‘input.txt’, mode: ‘w’}

Or:

total = sum [1, 2, 3, 4]

As you can see, restricting functions to taking only one argument doesn’t make your code any more verbose or cluttered in practice.

Why not support multiple arguments? Because the language already has two ways to represent multiple pieces of data — a map and a list. Why have a third option, that too to be used in only one place in the language (when invoking functions)? Getting rid of multiple arguments also gets rid of a lot of complexity: positional vs keyword arguments, varargs, optional arguments, default values, and so on. The question of overloading wouldn’t arise, since functions don’t take different sets of arguments.

So, there’s only one argument, but of whatever type you want.

With this, passing data into and out of a function are symmetrical — gone is the odd limitation in most languages of having multiple arguments but only one return value. If it’s reasonable to pass in multiple pieces of data into a function, why can’t you return multiple from the function? That’s an odd limitation. Sure, you can return a map or a list, but in that case, why not use that for the argument as well? That makes it consistent.
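Here is how the single-argument style might look, sketched in Python. Each function takes exactly one value, named `argument` as the proposal suggests, and the last one returns several pieces of data the same way it receives them. The function names are made up for the example.

```python
def describe_file(argument):
    """Takes a map with the keys 'name' and 'mode'."""
    return argument["name"] + " opened in mode " + argument["mode"]


def total(argument):
    """Takes a list of numbers."""
    result = 0
    for n in argument:
        result += n
    return result


def min_and_max(argument):
    """Takes a list and returns a map: many values in, many values out."""
    return {"min": min(argument), "max": max(argument)}


print(describe_file({"name": "input.txt", "mode": "w"}))   # input.txt opened in mode w
print(total([1, 2, 3, 4]))                                 # 10
print(min_and_max([3, 1, 2]))                              # {'min': 1, 'max': 3}
```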

A lot of complexity goes away. Parentheses are no longer required at call sites. With multiple arguments, you need parentheses because f a b can mean f(a, b) or f(a(b)). When functions have only one argument, the former interpretation no longer applies. So you no longer need parentheses to disambiguate. Or commas to separate multiple arguments.

Since functions don’t take multiple arguments, there’s no need to name or declare them. The argument is always named ‘argument’.

Perhaps you won’t be allowed to assign to the argument. When a novice writes, in Java:

void capitalise(String s) {
    s = s.toUpperCase();
}

it won’t work. To figure out why, she needs to understand the concept of call by value. The object isn’t being copied, but the reference is. This is a subtle distinction for novices to understand, so it’s simpler to disallow a function that assigns to ‘argument’.
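The same trap exists in Python, which makes for a short demonstration. Assigning to the parameter only rebinds the local name; returning the new value is the version that works. The second function’s name is an assumption for the example.

```python
def capitalise(s):
    s = s.upper()   # rebinds the local name only; the caller never sees it


def capitalised(argument):
    return argument.upper()   # hands the new string back instead


name = "ada"
capitalise(name)
print(name)                # ada
print(capitalised(name))   # ADA
```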

You can have a function that doesn’t expect an argument, requires an argument, or can handle both cases. Keeping in mind the theme of simplicity, there’s no special syntax to tell them apart. If you invoke a function that requires an argument without giving one, you’ll get a NullException when the null argument is dereferenced. Conversely, if you supply an argument to a function that doesn’t use one, it will be ignored.

The IDE will be able to detect a lot of cases and warn the programmer. For example, passing a parameter to a function that doesn’t access its argument should trigger a warning. Similarly, if you have a function that accesses only keys a and b in its argument map, passing an argument c will trigger a warning. As does sending the wrong type of argument. Or using the return value when the function doesn’t have a return statement with a value.
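The unused-key warning can be sketched with Python’s standard `ast` module: collect the keys a function reads from `argument`, then compare them with the keys a caller passes. This is a rough sketch that assumes Python 3.9 or later and only recognises literal string keys; the function names are hypothetical.

```python
import ast


def accessed_keys(function_source):
    """Return the literal string keys read from the name 'argument'."""
    keys = set()
    for node in ast.walk(ast.parse(function_source)):
        if (isinstance(node, ast.Subscript)
                and isinstance(node.value, ast.Name)
                and node.value.id == "argument"
                and isinstance(node.slice, ast.Constant)
                and isinstance(node.slice.value, str)):
            keys.add(node.slice.value)
    return keys


def unused_keys(function_source, call_argument):
    """Return the keys the caller passes that the function never reads."""
    return sorted(set(call_argument) - accessed_keys(function_source))


SOURCE = '''
def area(argument):
    return argument["a"] * argument["b"]
'''

print(unused_keys(SOURCE, {"a": 2, "b": 3, "c": 4}))   # ['c']
```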

The IDE will detect and warn about as many cases as it can. That way, you’ll get most of the benefit of a more bureaucratic type system, but without the cost in learning curve.

To keep things simple, there would be no nested functions. And no function literal syntax, like Javascript’s f = function() {…}. And no closures or first-class functions. A function can’t take in another function as an argument, or return one.

Moving on from functions, you wouldn’t have packages or modules, and no need to import or export anything. You would be able to just call a function from another file without any ceremony. You wouldn’t worry about file names and the directory structure. The way this would work is that you would put all your code in a folder. To run your program, you give the runtime your folder and ask it to run it, as opposed to giving it the path to your main file. You can have any filenames or sub-folders you want; it wouldn’t matter. The runtime effectively concats all your files and then runs it. You wouldn’t need to write and maintain makefiles or other build rules.
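Loading a program this way could be sketched as follows in Python. The `.simple` file extension and the choice to join files in sorted path order are assumptions; the text above only says that the files are effectively concatenated.

```python
from pathlib import Path


def load_program(folder, suffix=".simple"):
    """Concatenate every source file under folder, including sub-folders."""
    files = sorted(Path(folder).rglob("*" + suffix))
    return "\n".join(f.read_text(encoding="utf-8") for f in files)
```

The runtime would then execute the combined source as one program, so a function defined in any file is callable from any other.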

Needless to say, you wouldn’t have any support for threads or asynchronous execution. No assertions.

An Escape Hatch

One downside of writing a program in a simple language is that if you outgrow it, you need to manually port the program to another language. This often dissuades people from using a simple language. This problem becomes more severe the simpler the language is. It’s especially severe for an ultra-simple language like this one.

To fix this, the language should come with a converter that converts the program to, say, JavaScript. That way, you can start programming in the simple language knowing that it will add the least friction, but if you outgrow it, you can convert your program into JavaScript in a few seconds, rather than porting it manually and painfully. This conversion will be straightforward and safe since our language is effectively a subset of JavaScript [1].

Conclusion

This language is far simpler than most languages, even ones generally considered simple, like Javascript, Ruby or Visual Basic.NET. Most languages have been designed to be so complex that we forget what a simple language would be like. It’s as if we have been designing jet planes for so long we forgot how to design a cycle.

This language will hopefully attract far more people to programming, as opposed to current languages, which repel would-be users by their complexity and steep learning curve. Programming language designers should learn from UX designers: how do you design a tool usable by the largest number of people, and for the most tasks, not just the complex ones?

A simple language is suitable for many tasks for which the complexity of existing languages makes it too much trouble to be worth it. Even as an advanced programmer, I would want to use a simpler language if it lets me get my task done with less overhead.

Finally, a simple language will also be useful as a teaching language. Even advanced programmers often start with a simple language.

The importance of simple languages can’t be overstated.

[1] One can imagine extending this language by adding features that prioritise power over simplicity. For example, closures, first-class functions, reflection, modules, and so on. Think of it as extending C to create C++ or Objective C. Even if you’re aiming for a more powerful language than the one I described here, you’ll get a better result by starting with this minimalistic core and judiciously adding features that give you the most benefit in expressiveness. Make a list of features, rank them by how much expressiveness they add to the language, and implement the ones at the top of the list. Having three kinds of loops probably doesn’t add much in expressiveness or power. Neither does supporting multiple arguments to functions. Adding closures or reflection instead adds much more bang for the buck. You end up with a far simpler language than Ruby or Java, but almost as powerful.

Today’s languages are not a good starting point since they have too much complexity that doesn’t translate into power. They have accumulated so much cruft over the decades that even attempts at starting from scratch like Ruby or Python end up with a lot of cruft.

It’s time to wipe away all this cruft, start with a clean slate, and carefully and judiciously add back features that provide the most bang for the buck.
