“Science is what we understand well enough to explain to a computer; art is everything else.” – Donald E. Knuth
About the Lectures
Overview
Programming, mathematics and statistics are powerful tools for analyzing the functioning of economies
This website provides a hands-on instruction manual, with all code written in modern, open source programming languages
Topics include
- algorithms and numerical methods for studying economic problems
- related mathematical and statistical concepts
- basics of coding skills and software engineering
The intended audience is
- upper level undergraduate students and
- graduate students and researchers in economics, finance and related fields
There are two versions of the website, a Python version and a Julia version
Both languages are modern, open source, high productivity languages
If you’re not sure which one to pick, then you should probably choose Python
Python is a general purpose language featuring a massive user community in the sciences and an outstanding scientific ecosystem
Python is our go-to language in almost everything we do, from day to day scripting to high performance computing
Julia is more recent and still relatively unstable, although it has many exciting features
If you want more detail on the choice of language, then please read on
Python or Julia
In many cases the syntax and design choices of Python and Julia are similar (e.g., Julia borrows many nice features from Python)
Both can be interfaced with low level languages either directly or using existing tools
Both offer convenient interfaces to parallel programming
Both offer high quality just-in-time compilation
That said, there are some differences that might help shape your choice
Julia’s Advantages
Third party libraries are often written entirely in Julia, making them
- Easy to install
- Easy to dive into and read / change / edit
The language is focused on scientific applications, which means that
- Syntax for common scientific operations can be more straightforward
- Many standard scientific functions are part of the core language
Julia’s Disadvantages
It’s still early days for Julia, which means that
- The language itself and the libraries have not fully stabilized and are likely to break backwards compatibility
- The set of existing scientific tools is still only a fraction of what’s available in Python
Knowledge of Julia is still a niche skill
Python’s Advantages
Python is supported by a vast collection of standard and external software libraries
Knowledge of Python is a highly marketable skill
The fact that Python is a general purpose programming language means that knowledge of Python can be useful for all manner of problems
- For example, these lectures are themselves compiled from templates using a variety of tools written in Python
Over the last decade, Python has become one of the core languages of scientific computing
Python is popular because it is simple to pick up and yet powerful enough for very sophisticated applications
Python’s Disadvantages
- For scientific operations, the standard implementation of Python is typically slower than Julia, C or Fortran, so additional steps are required to obtain fast execution speeds
(The steps are mostly straightforward and detailed in the lectures)
- Many scientific Python libraries include compiled code or code in other languages that needs to be compiled, making them less accessible
Open Source
As with researchers in many other scientific fields, we are drawn to Python and Julia mainly by their quality
The second best thing about Python and Julia is that both are free and open source
When you start out, the “free” component of this pair will probably be the most appealing
It means that you, your coauthors and your students can install them and their libraries on all of your computers without cost or concern about licenses
Over time, however, you will most likely come to value the “open source” property as much, if not more
The first advantage of open source libraries is that you can read them and learn how they work
For example, let’s say you want to know exactly how statsmodels computes Newey-West covariance matrices
No problem: You can go ahead and read the code
While dipping into external code libraries takes a bit of coding maturity, it’s very useful for
- Helping you understand the details of a particular implementation
- Building your programming skills by showing you code written by first rate programmers
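As a small illustration of reading library code, the sketch below uses Python’s standard `inspect` module to fetch the source of a function
It uses `statistics.mean` from the standard library purely as a stand-in, an assumption made so that the example runs without installing anything; the same calls work on functions from third party packages that are written in Python

```python
import inspect
import statistics

# Fetch the source code of a library function as a string.
# statistics.mean is only a stand-in; it is implemented in pure Python.
source = inspect.getsource(statistics.mean)

# Find out which file the function is defined in
source_file = inspect.getsourcefile(statistics.mean)

print(f"Defined in: {source_file}")
print(source)
```

For functions implemented in compiled code, such as the built-in `len`, `inspect.getsource` raises a `TypeError`, and you would need to read the project’s source repository instead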
Even better, if the functionality provided by a given library is not exactly what you want, you can modify it to suit your needs
Another, more philosophical advantage of open source software is that it conforms to the scientific ideal of reproducibility
Research you produce using Julia or Python can be fully open, transparent and reproducible
How about Other Languages?
But why didn’t you include language XYZ?
MATLAB
MATLAB is a high productivity scripting language with fast vectorized operations and a large user base
While MATLAB has many useful routines and libraries, it’s starting to show its age
It can no longer match Python or Julia in terms of design, or in terms of performance when compared with Julia or Python + Numba
MATLAB is also proprietary, which comes with its own set of disadvantages
Given what’s available now, it’s hard to find any good reasons to invest in MATLAB
Incidentally, if you decide to jump from MATLAB to Python, this cheat-sheet will be useful
R
R is a very useful open source statistical environment and programming language
Its primary strength is its vast collection of extension packages
Julia and Python are more general purpose than R and hence a better fit for this course
Moreover, if there are R libraries you find you want to use, you can now call them from within Python or Julia
C / C++ / Fortran?
Isn’t Fortran / C / C++ faster than Julia / Python? In which case it must be better, right?
Actually this is an outdated view
For a start, you can now achieve speeds close to those of compiled languages in Julia or Python (Python + Numba) through just-in-time compilation
But more importantly, remember that the correct objective function to minimize is
total time = development time + execution time
In assessing this trade-off, it’s necessary to bear in mind that
- Your time is a far more valuable resource than the computer’s time
- Languages like Python or Julia are much faster to write and debug in
- In any one program, the vast majority of CPU time will be spent iterating over just a few lines of your code
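One way to check the last point on your own code is to profile it
The following is a minimal sketch using Python’s standard `cProfile` and `pstats` modules; the functions `cheap_setup`, `slow_inner_loop` and `run_model` are hypothetical names invented for this illustration

```python
import cProfile
import pstats


def cheap_setup(n):
    """Build the inputs: runs once and does very little work."""
    return list(range(n))


def slow_inner_loop(values):
    """The 'hot' part of the program: a pure Python loop over all inputs."""
    total = 0.0
    for v in values:
        total += (v % 7) ** 0.5
    return total


def run_model(n=200_000, repetitions=5):
    """Hypothetical program: one setup step, then repeated heavy computation."""
    values = cheap_setup(n)
    return sum(slow_inner_loop(values) for _ in range(repetitions))


profiler = cProfile.Profile()
profiler.enable()
result = run_model()
profiler.disable()

# Sort by time spent inside each function itself and show the top entries
stats = pstats.Stats(profiler)
stats.sort_stats("tottime").print_stats(5)
```

In a report like this, the function at the top of the `tottime` column is the natural candidate for optimization, while the rest of the program can usually be left as it is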
What this means is that there still might be a role for C / C++ / Fortran in your program but it will at most be
- Speeding up just a few lines of code and then calling this compiled code from Python or Julia
- Taking a legacy library written in C / C++ / Fortran and calling it from Python or Julia
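To make the second case concrete, here is a hedged sketch that calls the `cos` function from the system’s C math library using Python’s standard `ctypes` module
It assumes a Unix-like system on which `ctypes.util.find_library("m")` can locate that library; the helper `c_cos` is a hypothetical name, and it returns `None` rather than failing when the library is not found

```python
import ctypes
import ctypes.util
import math


def c_cos(x):
    """Return cos(x) computed by the C math library, or None if unavailable."""
    path = ctypes.util.find_library("m")  # the C math library, if one is found
    if path is None:
        return None  # assumption violated: no C math library was located
    libm = ctypes.CDLL(path)
    # Declare the C signature: double cos(double)
    libm.cos.argtypes = [ctypes.c_double]
    libm.cos.restype = ctypes.c_double
    return libm.cos(x)


value = c_cos(1.0)
print("C library:", value)
print("Python   :", math.cos(1.0))
```

Declaring `argtypes` and `restype` matters here, because without them `ctypes` assumes integer arguments and return values and the result would be wrong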
Last Word
Writing your entire program in Fortran / C / C++ is best thought of as “premature optimization”
On this topic we quote the godfather:
We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. – Donald Knuth
Credits
These lectures have benefited greatly from comments and suggestions from our colleagues, students and friends. Special thanks are due to our sponsoring organization, the Alfred P. Sloan Foundation, and to our research assistants Chase Coleman, Spencer Lyon and Matthew McKay for innumerable contributions to the code library and functioning of the website.
We also thank Andrij Stachurski for his great web skills, and the many others who have contributed suggestions, bug fixes or improvements. They include but are not limited to Anmol Bhandari, Long Bui, Jeong-Hun Choi, David Evans, Shunsuke Hori, Chenghan Hou, Doc-Jin Jang, Qingyin Ma, Akira Matsushita, Tomohito Okabe, Daisuke Oyama, David Pugh, Alex Olssen, Nathan Palmer, Bill Tubbs, Natasha Watkins, Pablo Winant and Yixiao Zhou.