Migrating a 10,000-line legacy JavaScript codebase to TypeScript

June 2016 (assistant professor)


I recount my experiences migrating the 6.5-year-old, 10,000-line codebase of Python Tutor from “old-school” HTML/CSS/JavaScript to a modern 2016-era development workflow using Webpack and TypeScript.

I started working on Python Tutor in January 2010, about 6.5 years ago. Since then, its Web-based frontend has grown to around 10,000 lines of JavaScript code (the vast majority written by me) spread throughout multiple files, interfacing with several versions of third-party libraries, and imported in idiosyncratic orders by different HTML files with a tangle of script tags.

Even though the system works well and is now supporting thousands of daily active users from around the world, its codebase has gotten to a point where its complexity is too much for me to stand. Technical debt has been piling up for over half a decade as I layered hacks on top of more hacks.

Since I have some downtime this summer while preparing for my California move, I can finally take a step back from the grind and figure out how to refactor the codebase so that it’s easier for me (and future collaborators) to extend. This isn’t just yak shaving, though. It’s very important for me to be able to iterate quickly on the Python Tutor codebase so that I can easily prototype, deploy, and evaluate new research ideas.

I’ve been following the Web development tools ecosystem over the past year and have realized that new fads come and go every month. After assessing my priorities for Python Tutor and cutting through the hype of new shiny frameworks, I decided to refactor my codebase using two mature yet still-modern technologies: Webpack and TypeScript.

This article describes how I migrated the Python Tutor codebase over to using Webpack and TypeScript. It will mostly be conceptual and light on technical details. To dive deeper, you can check out the new Version 5 codebase to learn from the code itself and compare it with the legacy Version 3 codebase.

Migrating to Webpack

Although Webpack has a gazillion complex use cases, I use it to:

  • have each JavaScript file be a self-contained module like I’m used to doing in other languages such as Python,
  • have each HTML file include only one JavaScript file in a <script> tag instead of a bunch of files and worrying about import order and global name clashes,
  • serve as a Make-like build system for TypeScript (see below).

I won’t go into technical details here since those tend to get outdated quickly, but here’s my webpack.config.js and npm package.json config files for reference.

Webpack makes JavaScript feel more like a traditional desktop programming language like Python, Java, or C++ where each file is a module that encompasses its own isolated namespace. Each module can import other modules, and Webpack takes care of resolving all dependencies. (Think import in Python or Java, or #include in C++.) Webpack also bundles all of my code and third-party libraries together into a single JavaScript file that can be included in HTML using a single <script> tag. It can do all sorts of other nice optimizations to JavaScript code such as minification and clever splitting into multiple bundles, but bringing a module system to JavaScript is the main selling point for me.

The main challenge in porting my legacy 10,000-line Python Tutor codebase over to using Webpack was getting rid of all the nasty ways that I abused global variables. Since globals defined in different JavaScript files share a common namespace, I often found it convenient to reference globals from other files or, even worse, to override variables and functions in one file with identically-named ones in another file. I knew all along that these were bad habits, but JavaScript made it all so easy :) With Webpack, I was forced to define tighter interfaces between files and to explicitly expose selected variables to the outside world using module.exports. No more global namespace pollution!

One subtle class of bugs I had to fix after the Webpack migration was those caused by my bad bad bad habit of overriding variables and functions across files. For instance, if file X defines a function foo, but file Y defines its own foo to provide overriding behavior, then when my code in X calls foo(), it will actually call Y’s foo, which is exactly what I want. But when X and Y are “modularized” using Webpack, all calls to foo() from file X will call X’s foo, not Y’s foo, since foo is no longer global; each version of foo is visible only within its own file. To enable X to call Y’s foo, Y must first export foo, then my code in file X must import Y and call Y.foo().
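The scoping change can be sketched in a single file. This is only an illustration, not code from Python Tutor: immediately-invoked functions stand in for two Webpack modules, and the names moduleX, moduleY, run, and runWithOverride are hypothetical.

```typescript
// Each immediately-invoked function stands in for one file bundled by Webpack:
// names declared inside are private unless returned (i.e., "exported").

// "File Y": defines its own foo and explicitly exports it.
const moduleY = (() => {
  function foo(): string {
    return "Y's foo";
  }
  return { foo };
})();

// "File X": defines foo too. An unqualified call resolves to X's own foo,
// no matter what Y defines, because foo is no longer a shared global.
const moduleX = (() => {
  function foo(): string {
    return "X's foo";
  }
  function run(): string {
    return foo(); // always X's foo once the files are modularized
  }
  function runWithOverride(): string {
    return moduleY.foo(); // the override must now be requested explicitly
  }
  return { run, runWithOverride };
})();

console.log(moduleX.run());             // X's foo
console.log(moduleX.runWithOverride()); // Y's foo
```

In a real Webpack project the two closures would be separate files, with Y exporting foo and X importing Y, but the name-resolution behavior is the same.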

Another annoying issue was getting third-party libraries bundled up inside of a Webpack project. After twiddling with various webpack.config.js options (especially for jQuery plug-ins) and using features like script-loader, I was able to get every required library imported into my project except for TogetherJS. I ended up including TogetherJS in my HTML with its own separate <script> tag (not using Webpack), and that sort of works for now.

Migrating to TypeScript

For the past 6.5 years, I’ve been coding the Python Tutor web application in “old-school” HTML/CSS/JavaScript of the sort that I’ve been writing since I started making goofy websites as a kid. While that’s fine for small projects, a 10,000-line JavaScript codebase (the current size of Python Tutor’s code) feels like the breaking point above which it’s really hard to make progress without a more “structured” language.

In the coming years, I want to code in a more modern and structured dialect of JavaScript to get the productivity benefits that many traditional desktop languages have always had. But my main constraint is that I already have 10,000 lines of legacy JavaScript code that I don’t want to throw away and rewrite from scratch. I need a solution that allows me to gradually modernize that code piece by piece, so that’s why I picked TypeScript.

TypeScript is a superset of JavaScript that is completely backwards-compatible with legacy JavaScript (exactly what I needed!) and adds useful new features such as:

  • the ability to write next-generation ES6 JavaScript code and have it run on current and older Web browsers,
  • optional type declarations in a powerful type system, which enables compile-time type checking,
  • other compile-time lint-like checks for code quality.

These features are available separately in a mix of other open-source JavaScript tools, but I picked TypeScript because it’s super convenient to have everything together in one tool.
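As a rough illustration of the first two features together, here is a small sketch that mixes ES6 syntax with optional type annotations. The ExecutionStep shape and the describeStep function are made up for this example and are not taken from the Python Tutor code.

```typescript
// Hypothetical data shape. The annotation is optional, but it lets the
// compiler catch misuse before the code ever runs in a browser.
interface ExecutionStep {
  line: number;
  event: string;
}

// ES6 features (const, arrow functions, template strings) are compiled
// down to older JavaScript so that they also run in older browsers.
const describeStep = (step: ExecutionStep): string =>
  `line ${step.line}: ${step.event}`;

const steps: ExecutionStep[] = [
  { line: 1, event: "call" },
  { line: 2, event: "return" },
];

const summary = steps.map(describeStep).join("; ");
console.log(summary); // line 1: call; line 2: return

// Uncommenting the next line produces a compile-time error, because a
// string is not an ExecutionStep:
// describeStep("line 3");
```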


I first set up ts-loader to allow Webpack to compile and package my TypeScript code together with third-party libraries. If configured properly, my whole project gets built when I run a single webpack command (sort of like make).
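For readers who have not seen this kind of setup, the following is a minimal sketch of a configuration object that routes .ts files through ts-loader. It is not the actual Python Tutor config: it assumes the newer Webpack "rules" syntax (Webpack 2 and later), and the entry path and bundle name are hypothetical.

```typescript
// Sketch of a Webpack configuration object that sends .ts files through
// ts-loader. Assumes the Webpack 2+ "rules" style; file names are made up.
const config = {
  entry: "./src/main.ts",             // hypothetical entry module
  output: { filename: "bundle.js" },  // one bundle for a single <script> tag
  resolve: {
    // Let imports omit the extension: import "./widgets" finds widgets.ts
    extensions: [".ts", ".js"],
  },
  module: {
    rules: [
      // Hand every .ts file to the TypeScript compiler via ts-loader
      { test: /\.ts$/, use: "ts-loader", exclude: /node_modules/ },
    ],
  },
};

// In a real webpack.config.js this object would be assigned to module.exports.
console.log(config.module.rules[0].test.test("main.ts")); // true
```

With a config along these lines, running webpack walks the import graph from the entry file, compiles each .ts file it encounters, and writes the bundle.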

The first real code migration step was simply changing all of my existing .js files into .ts files. Then when I run Webpack again, it invokes the TypeScript compiler to compile all the .ts files into plain (ES5) JavaScript. This all works out of the box since TypeScript is a superset of JavaScript. Thus, all of my 10,000 lines of JavaScript code are also legal TypeScript, so in theory I’ve just ported my entire codebase with no work. That was easy!

Not so fast. Although my code works as before, the TypeScript compiler shows hundreds of daunting errors since many expressions don’t type check. My goal is to now reduce the error count to zero. That’s when I’ll declare victory for TypeScript migration and be ready to start coding again.

Types for third-party libraries

The first order of business was to fetch type declarations for all of the third-party libraries that I used, such as jQuery, jQuery plug-ins, Ace, d3, etc. Doing so eliminated a large fraction of type errors for code that interfaces with those libraries. The DefinitelyTyped repository and tsd tools were helpful here (although I’ve heard that Microsoft wants to directly integrate with npm in the near future). Check out my tsconfig.json and tsd.json config files for details.

Even after importing type declarations for third-party libraries, I still saw annoying type errors because some were either outdated or incorrect. The “proper” thing to do would be to submit a patch, but I didn’t have time for that. So instead I just patched up the declarations locally in my own .ts files. It’s easy to do: Mimic the format of the declaration files, and then add your own. The compiler will conveniently join the declarations together. For instance, here are some declarations I added at the top of one of my .ts files to fill in missing parts from library definitions:

interface JQuery {
  // attr can also take a boolean as a second argument
  attr(attributeName: string, b: boolean): JQuery;
}

interface JQueryStatic {
  doTimeout: any;
}

declare namespace AceAjax {
  interface IEditSession {
    setFoldStyle: any;
    setOption: any;
    gutterRenderer: any;
  }

  interface Editor {
    setHighlightGutterLine: any;
    setDisplayIndentGuides: any;
    on: any;
  }
}

(I was super-lazy and just used the catch-all any type. If I feel like it, I can declare more precise types later.)

The key point is that I didn’t modify the original declaration files installed by tsd; I augmented those definitions in my own source files. Doing so keeps a clean separation so that I can use tsd to update the declarations later without clobbering my local changes.
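The mechanism that makes this work is interface declaration merging. The sketch below shows it in isolation; ChartWidget is a hypothetical stand-in for a type that would normally come from a third-party declaration file, not a real library type.

```typescript
// Stand-in for a declaration that would normally live in a third-party
// typings file (hypothetical type, not from a real library).
interface ChartWidget {
  title: string;
}

// A second declaration with the same name, written in your own source file.
// The compiler merges it with the one above instead of reporting a conflict.
interface ChartWidget {
  redraw: any; // lazy catch-all, to be tightened later
}

// The merged interface requires the members from both declarations.
const widget: ChartWidget = {
  title: "heap usage",
  redraw: () => "redrawn",
};

console.log(widget.title);    // heap usage
console.log(widget.redraw()); // redrawn
```

Because the augmentation lives in a separate file from the installed declarations, updating those declarations later leaves it untouched.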

Use the catch-all “any” type to silence remaining errors

To silence most remaining errors, I either added any declarations for missing variables:

declare var initCodeopticon: any;

or as any type casts, like:

(ret as any).survey = surveyObj;

or by declaring the variables with any types:

var s: any = { mode: 'display' };

Giant warning: These are hacks to silence the compiler, but they eliminate the benefits of type checking for those objects. However, they’re no worse than writing in plain-old JavaScript, so that’s why I don’t feel bad recommending them. If you go down this path, though, consider returning later to put in the proper types instead of leaving those ugly any declarations and casts in the codebase. I did this to quickly get my compiler error count down to zero so that I could start making progress on coding again without worrying about types for now. But I also recognize that the more disciplined I am at writing good types, the more benefits I can get from them in the future.
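To make the trade-off concrete, here is a sketch of what returning later to put in proper types might look like. The VisualizerOptions interface and its fields are assumptions made up for this example; only the { mode: 'display' } object comes from the snippet above.

```typescript
// Quick fix: `any` silences the compiler but checks nothing.
const loose: any = { mode: "display" };
loose.mdoe = "edit"; // typo goes unnoticed and silently adds a new property

// Later cleanup: a hypothetical interface describing the same object.
interface VisualizerOptions {
  mode: "display" | "edit";
  startingInstruction?: number; // optional, so existing objects still type-check
}

const strict: VisualizerOptions = { mode: "display" };
strict.mode = "edit"; // fine: "edit" is one of the allowed values

// Both of these would be rejected at compile time:
// strict.mdoe = "edit";     // property 'mdoe' does not exist
// strict.mode = "dispaly";  // not an allowed value for mode

console.log(loose.mode);  // display
console.log(strict.mode); // edit
```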

Final touches

Finally, the TypeScript compiler provides some lint-like warnings that are useful to fix, such as calling functions with incorrect numbers of parameters (you can fix that by, say, declaring optional arguments), or forgetting to write var to declare local variables so that they inadvertently become global. Also, since TypeScript supports ES6, you can simply prepend export to declarations to export them instead of using the module.exports object, which again cleans up your code a bit.
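The optional-argument fix can be sketched as follows. The function formatLine is hypothetical and only meant to show the pattern of a legacy function that some callers invoke with fewer arguments than it declares.

```typescript
// Legacy-style function that some callers invoke with one argument and
// others with two. Marking the second parameter optional with `?`
// makes both call styles type-check.
function formatLine(lineNumber: number, highlight?: boolean): string {
  var prefix = highlight ? "> " : "  "; // `var` keeps prefix local, not global
  return prefix + "line " + lineNumber;
}

console.log(formatLine(3));       // "  line 3"
console.log(formatLine(3, true)); // "> line 3"

// Both of these would be rejected at compile time:
// formatLine();            // too few arguments
// formatLine(3, true, 1);  // too many arguments
```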

It took about two full days of effort to migrate my 10,000-line codebase over to Webpack and TypeScript. Of course, I haven’t manually written any types yet; I’m taking advantage of the default type inference and checking that TypeScript provides out of the box. That’s the great thing about a gradual type system like TypeScript’s; I get immediate benefits out of the box, and if I want better type checking, then I can write more precise types later.

Now that this is all set up, I can code in modern ES6 JavaScript with optional types and then run a single Webpack build command to compile everything into an optimized bundle to deploy online. This felt like the right level of tooling for my current needs; not too much, and not too little. Hopefully this tool chain will allow me to iterate faster and write more reliable code in the coming years … until it, too, starts to feel outdated :)

Created: 2016-06-16
Last modified: 2016-06-16
