It is important to start at the beginning, which means setting up a development environment for the coding exercises that are part of the Data Engineering Bootcamp. A development environment is a collection of programs and settings that lets you write code on your computer without friction. Writing code involves several tasks: writing the actual code, finding and fixing errors, compiling the code, and finally running the program. All of these steps need to be done every single time, even for small programs. Luckily, there are programs that make our lives easier: IDEs (Integrated Development Environments).
I will start from scratch again, so this tutorial will hopefully demonstrate how you can use Visual Studio Code (the IDE I will be using) to write and run Python code. This post will not dive deep into what an IDE is or how to use VS Code.
First, we need to install Visual Studio Code*, which can be downloaded from its official website. We also need to install Python (I will be using Python 3.9). Since I am working on a MacBook, I use Homebrew to install Python.
- If you do not have Homebrew on your Mac yet, you can install it with the following command:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
- Install Python with Homebrew:
brew install python
Note that when you want to use Python, you should use the python3 command, because on older versions of macOS the plain python command points to the preinstalled Python 2 (recent macOS releases no longer ship it). To check whether the installation was successful, run the following command: python3 --version. If the output is Python 3.9.X, you have done it correctly.
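The version check can also be done from inside Python itself. The following is a minimal sketch, assuming you save it under a hypothetical filename such as check_version.py and run it with python3 check_version.py. The helper names is_python3 and format_version are made up for this example.

```python
import sys


def is_python3(version_info=sys.version_info):
    """Return True when the given version tuple belongs to Python 3."""
    return version_info[0] == 3


def format_version(version_info=sys.version_info):
    """Format a version tuple such as (3, 9, 7) as the string '3.9.7'."""
    return ".".join(str(part) for part in version_info[:3])


if __name__ == "__main__":
    # Report which interpreter is actually running this script
    print("Running Python " + format_version())
    if not is_python3():
        print("This is not Python 3; try the python3 command instead.")
```

If the script reports a version that does not start with 3, the wrong interpreter was used to run it.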
Python
Python is one of the most popular programming languages. It was created by Guido van Rossum and named after Monty Python. Some of the reasons why Python is currently so popular:
- Python can be used for different use cases, from building websites to machine learning models.
- Python is considered easier to learn for a beginner than other languages.
- Python is one of the most popular programming languages used in many companies, training programs and universities (I learned it for the first time during my Bachelor’s).
Python is an open-source, interpreted, high-level language with good support for object-oriented programming. It is one of the languages most used by data scientists for data science projects and applications. Python provides extensive functionality for mathematics, statistics, and scientific computing, as well as strong libraries for data science applications. (source)
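To give a first impression of the statistics functionality mentioned above, here is a small sketch that uses only the statistics module from the Python standard library. The temperature values and the summarize helper are made up for this example and do not come from the bootcamp material.

```python
import statistics


def summarize(values):
    """Return the mean, median and sample standard deviation of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # needs at least two values
    }


if __name__ == "__main__":
    # Hypothetical measurements, invented for illustration only
    temperatures = [18.5, 21.0, 19.5, 22.0, 20.0]
    for name, value in summarize(temperatures).items():
        print(name, round(value, 2))
```

For larger datasets, data engineers usually reach for third-party libraries, but the standard library is enough to try things out right after installing Python.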
VS Code extra info
Some interesting videos and links about VS Code: