Virtual Environments II: Creating Virtual Environments with Pyenv
In Virtual Environments I, I explained how to install package managers, the latest release of Python, and Pyenv and Virtualenv. Now that those items are installed, we can set up a virtual environment for a Python project.
This post is part of a series of tutorials called Data in Day. Follow these tutorials to create your first end-to-end data science project in just one day. This is a fun, easy project that will teach you the basics of setting up your computer for a data science project and introduce you to some of the most popular tools available. It is a great way to get acquainted with the data science workflow.
I. Create a Unique Virtual Environment for a Data Science Project with Pyenv-Virtualenv
It’s a good idea to create a new virtual environment for each of your projects, so we will start by making a directory for our project.
1. Enter the following in Terminal:
$ mkdir my_project
This creates a new folder for this project inside your home folder. Later, we will activate the virtual environment in this folder.
2. Remain in your home folder (not the project folder) and enter:
$ pyenv install 3.9.0
This command tells Pyenv to download and install Python 3.9.0, the version we will use for this project's virtual environment. If Pyenv has already installed this version, a prompt will say so and ask whether you wish to proceed. Enter Y to continue.
3. It may take a few minutes for Python to install. When it is finished, we will create a new virtual environment based on that version of Python:
$ pyenv virtualenv 3.9.0 my_project_env
The command above tells Pyenv to create a virtual environment using the newly installed Python 3.9.0 and call it “my_project_env”.
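If you want to confirm that both the Python version and the new environment exist, the following is a minimal sketch in Python. It assumes Pyenv keeps its files in the default location (`~/.pyenv`, or wherever the `PYENV_ROOT` environment variable points) and that each installed version or environment appears as a folder under `versions`. The function name `pyenv_versions` is my own, not part of Pyenv.

```python
import os
from pathlib import Path


def pyenv_versions(root=None):
    """List the names found in Pyenv's versions folder.

    Assumption: Pyenv lives in ~/.pyenv unless PYENV_ROOT says otherwise.
    """
    root = Path(root or os.environ.get("PYENV_ROOT", Path.home() / ".pyenv"))
    versions_dir = root / "versions"
    if not versions_dir.is_dir():
        # Pyenv is not installed here, or nothing has been installed yet.
        return []
    # is_dir() follows symlinks, so linked environments are included too.
    return sorted(entry.name for entry in versions_dir.iterdir() if entry.is_dir())


if __name__ == "__main__":
    for name in pyenv_versions():
        print(name)
```

After steps 2 and 3, you would expect both 3.9.0 and my_project_env to show up in the list.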
4. Now that we have a virtual environment for the project and a directory for the project, we need to activate the environment in the directory.
To move into the project directory, enter:
$ cd my_project
Once you are in there, enter:
$ pyenv local my_project_env
If you look to the left in the terminal window, you should see the name of the environment in parentheses before the dollar sign. It will look something like this:
(my_project_env) $
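If the prompt does not change, or you just want a second opinion, here is a small sketch you can run with Python from inside the project folder. It relies on two things: `pyenv local` records the environment name in a file called `.python-version` in the current folder, and Python reports a different `sys.prefix` and `sys.base_prefix` when it runs inside a virtual environment (this assumes the environment was created with venv or a recent version of Virtualenv). The function names are my own.

```python
import sys
from pathlib import Path


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside an environment, sys.prefix points at the environment, while
    # sys.base_prefix points at the Python installation it was built from.
    return sys.prefix != sys.base_prefix


def local_pyenv_version(project_dir="."):
    """Return the name stored in .python-version, or None if the file is missing."""
    version_file = Path(project_dir) / ".python-version"
    if not version_file.is_file():
        return None
    return version_file.read_text().strip()


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Inside a virtual environment:", in_virtual_environment())
    print("Pyenv local setting:", local_pyenv_version())
```

If everything worked, the last line should report my_project_env, and the interpreter path should point somewhere inside Pyenv's folder rather than at your system Python.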
II. Installing Handy Data Science Packages: Jupyter Notebook, Ipykernel, Pandas, and Matplotlib
There are a few packages that you will likely want to have installed right away (Pandas, Matplotlib, Seaborn), and a few that you must have installed to finish your setup (Jupyter Notebook, Ipykernel).
Installing Packages with Pip
5. Enter the following to begin installing these popular data science packages. Pay attention to any prompts.
$ pip3 install pandas matplotlib seaborn ipykernel jupyter notebook
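Once pip finishes, a quick way to confirm that the packages landed in the active environment is to ask Python whether it can find them. This is a rough sketch using only the standard library. The list of import names is my assumption about what the command above provides (I left out `jupyter` itself, since it mostly pulls in other packages), and `missing_packages` is a name I made up.

```python
from importlib.util import find_spec

# Import names assumed to correspond to the packages installed above.
PACKAGES = ["pandas", "matplotlib", "seaborn", "ipykernel", "notebook"]


def missing_packages(names):
    """Return the names that the current interpreter cannot find."""
    # find_spec returns None when a top-level package is not installed.
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(PACKAGES)
    if missing:
        print("Not found in this environment:", ", ".join(missing))
    else:
        print("All packages found.")
```

If anything is reported as not found, check that the environment name still appears in your prompt and run the pip command again.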
Pandas, Matplotlib, and Seaborn
Pandas is a Python package that is great for data science. It allows you to import data and work with it in Python as a table of rows and columns, much as you would in a spreadsheet. It’s a powerful tool and is foundational to many data science projects. Matplotlib and Seaborn are great for visualizations. They make it easy to create simple plots, and there are plenty of options for making more complex ones.
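To give a feel for the spreadsheet idea, here is a tiny sketch that builds a table from made-up values and averages one column. The city names and temperatures are invented purely for illustration, and the function name is hypothetical. The import is wrapped in a try block so the script reports a missing install instead of crashing.

```python
def pandas_smoke_test():
    """Build a tiny table and return the mean of one column.

    Returns None if pandas is not installed in this environment.
    """
    try:
        import pandas as pd
    except ImportError:
        return None
    # Made-up sample data: each dictionary key becomes a column.
    table = pd.DataFrame(
        {"city": ["A", "B", "C"], "temperature": [20.0, 22.0, 24.0]}
    )
    # Select one column by name and average it, like a spreadsheet formula.
    return float(table["temperature"].mean())


if __name__ == "__main__":
    result = pandas_smoke_test()
    if result is None:
        print("pandas is not installed in this environment.")
    else:
        print("Average temperature:", result)
```

Selecting a column by its name and calling a method on it is the pattern you will use constantly in Pandas.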
Jupyter Notebook allows you to create interactive Python notebooks that include both code and markdown, so you can annotate your code and explain your methods and findings. You’ll also need Ipykernel, which provides the Python kernel that runs the code in your notebooks. For now, we are just going to install both. Setting up Jupyter Notebook will be explained in another tutorial.
III. What Did We Do?
1. Created a project folder in the home directory.
2. Created a virtual environment and set it as the local environment for our project folder.
3. Installed some necessary packages into the virtual environment.
IV. What’s Next?
In Jupyter Notebook I, I’ll show you how to begin using Jupyter Notebooks to create your data science project.
If you liked this tutorial, check out more tutorials at Data in Day.