Promptar Lead | Python Trainer

Python Virtual Environments: Fundamentals

In this article I introduce Python Virtual Environments: what they are, why they’re useful and, to a certain extent, how they work. I also link to further reading and related tools.

Both beginners and seasoned Pythonistas should be using Virtual Environments. Read on, and decide for yourself.

Beginning at the beginning

“Begin at the beginning,” the King said, very gravely, “and go on till you come to the end: then stop.”

Inspired by Lewis Carroll — and stating what some would say is obvious, though my experience suggests it is often disregarded and thus probably worth repeating — I’ll begin at the very beginning: Python Virtual Environments, three words. Let’s consider each one individually, skimming through the first — Python — which should be clear and well understood in this context, and paying attention to each of the other two — Virtual and Environment.

Here are a few definitions I hand-picked from both the Oxford and Merriam-Webster online dictionaries1:

  • For Virtual, the first says “Not physically existing as such but made by software to appear to do so”, while the second gives us “being on or simulated on a computer or computer network”.

  • For Environment, the first suggests “The setting or conditions in which a particular activity is carried on” and “The overall structure within which a user, computer, or program operates”, and the second “a computer interface from which various tasks can be performed”.

These definitions seem to make sense, being in line with the general understanding of these individual terms, as far as I can tell (hopefully you’ll agree with that, otherwise, please take a moment to review them and let the definitions sink in). Cutting and pasting a few snippets here and there, we could then claim that:

  • A Python Virtual Environment is a “not existing” Python Environment “made by software to appear to do so”.

  • A Python Environment is the “setting or conditions” “within which a” Python “program operates”.

This calls for an explanation of what a Python Environment actually is, first, helping us then go for the Python Virtual Environment thing. Let’s dive in…

Python Environment

A Python Environment is comprised of a few fundamental components:

  • The Python Executable itself, invoked to run Python programs or to get an interactive prompt.
  • Tools and Scripts, like pip, used to manage third-party packages, normally sourced from PyPI.
  • The Standard Library, containing an extensive collection of very useful, always at hand modules.
  • Site Packages, the location where third-party libraries and packages are kept, often installed with pip.

All of these are just files and directories laid out in a predetermined way that mostly depends on the platform and that, as we will see, is flexible enough to be customized. Keep in mind that even though not all of them are strictly required for a given Python program to run, most Python installations do include them.
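For the curious, the Standard Library’s sysconfig module can report these locations for the running interpreter. A minimal sketch (the exact paths printed will vary with your platform and installation):

```python
import sysconfig

# sysconfig reports where each component lives for the running
# interpreter, an easy way to locate them on your own system.
paths = sysconfig.get_paths()
print(paths["stdlib"])   # Standard Library
print(paths["purelib"])  # Site Packages (for pure-Python packages)
print(paths["scripts"])  # Tools and Scripts
```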

Here’s a quick, real-world overview of where each component is located after installing Python 3.6 on three common platforms:

macOS:
  Python Executable   /Library/Frameworks/Python.framework/Versions/3.6/bin/python3.6
  Tools and Scripts   /Library/Frameworks/Python.framework/Versions/3.6/bin/
  Standard Library    /Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/
  Site Packages       ./site-packages/ sub-directory in Standard Library

Windows:
  Python Executable   {PREFIX}\python.exe
  Tools and Scripts   {PREFIX}\Tools and {PREFIX}\Scripts
  Standard Library    {PREFIX}\Lib
  Site Packages       .\site-packages\ sub-directory in Standard Library

Linux:
  Python Executable   /usr/bin/python3.6
  Tools and Scripts   /usr/bin/
  Standard Library    /usr/lib/python3.6/
  Site Packages       ./site-packages/ sub-directory in Standard Library

If you’d like, now would be a great time to locate these same components on your system. Go ahead, don’t worry, I’ll wait here…

How are these components related?

The relationship between these components has a lot to do with the way imports work. In simple terms, here’s how Python runs a line like:

import module
  • It first tries to locate a module named module in:
    • The current working directory, for interactive sessions.
    • The directory containing the file passed to the Python executable, otherwise.
  • Failing that, the Standard Library is searched.
  • Lastly, if the module wasn’t found, Site Packages is searched.

If everything fails and Python doesn’t find the module, it raises an ImportError exception, which you may have seen before (more recent Python versions, from 3.6 onwards, actually raise the more specific ModuleNotFoundError instead).
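As a quick illustration (the module name below is made up, so the import is guaranteed to fail):

```python
# Importing a module that exists nowhere on sys.path raises
# ModuleNotFoundError; the name here is deliberately fictitious.
try:
    import surely_no_such_module
except ModuleNotFoundError as exc:
    print(exc)  # No module named 'surely_no_such_module'

# ModuleNotFoundError is a subclass of ImportError, so older code
# catching ImportError keeps working on Python 3.6 and later.
print(issubclass(ModuleNotFoundError, ImportError))  # True
```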

This behavior is driven by sys.path, which is a Python list of places — for lack of a better word — where Python goes looking for modules, in order, at import time (such places are often directories, but can also be other things, like ZIP files, and more). For completeness, sys is a module built into the Python executable itself: it can be imported and used like other modules, but you won’t find it in the Standard Library or anywhere else but in the Python source code itself.
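This is easy to verify: modules compiled into the interpreter itself are listed in sys.builtin_module_names, and sys is one of them:

```python
import sys

# sys is built into the Python executable, not loaded from a file
# in the Standard Library or Site Packages.
print("sys" in sys.builtin_module_names)  # True
```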

Here’s a sample interactive session showing the value of sys.path in a Linux installation:

$ python3
Python 3.6.3 (default, Nov 21 2017, 14:55:19) 
[GCC 6.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> import pprint
>>> pprint.pprint(sys.path)

The list has a few entries, which we’ll group this way:

  • The first, an empty string, refers to the current working directory.
  • The next three point to the Standard Library. We won’t delve into more detail, for now.
  • The last entry is the Site Packages directory, as its name indicates.

The Site Packages directory is normally managed by pip, a tool that searches, downloads, installs, updates and uninstalls third-party packages available from PyPI. With the simplified Python import behavior described above, the way everything is tied together is hopefully getting clear: Python imports work in a specific order, looking for modules in the current project directory first, then in the Standard Library and lastly in Site Packages, accounting for third-party packages that may have been installed with tools like pip.
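The search order described above can be sketched in a few lines. Here, mymod is a hypothetical, throwaway module written to a temporary directory; once that directory is added to sys.path, the import succeeds:

```python
import pathlib
import sys
import tempfile

# Write a throwaway module (hypothetical name "mymod") into a
# temporary directory, then make that directory searchable.
tmp = tempfile.mkdtemp()
pathlib.Path(tmp, "mymod.py").write_text("ANSWER = 42\n")
sys.path.insert(0, tmp)

import mymod  # found via the entry we just added to sys.path
print(mymod.ANSWER)  # 42

sys.path.remove(tmp)  # undo our change
```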

The inquiring mind may be left with a few open questions, though:

  • How does the Python executable know the location of the Standard Library?
  • Why is it spread across multiple locations, in the example above?
  • What about Site Packages?
  • How does pip know the location of Site Packages?

We will answer these questions later, when tying everything nicely together. For now, let’s move on to Python Virtual Environments.

Virtual Environments

Python Virtual Environments are isolated, lightweight Python Environments. The main idea here is isolation.

Say you have been collaborating on a project that uses Django to build a web-based application — being an established project, it depends on the stable, long-term-supported Django version 1.11. You might have installed it with pip install django==1.11 while setting up your Python development environment, having it happily sitting in Site Packages with everything working as expected.

What if you want to create a new project based on a more recent, non-backwards-compatible Django version, like 2.0? Python does not support coexisting installations of different versions of the same package within a single Environment. What can you do? Repeatedly uninstalling and reinstalling the required Django version each time you want to work on one project or the other is, to say the least, impractical and, more importantly, very prone to mistakes and subtle (and not so subtle!) errors. Nobody wants that.

Moreover, what if you want to work on two projects with separate, non-conflicting direct dependencies that, in turn, have conflicting dependencies of their own? Maybe project P1 depends on package A, and project P2 depends on package B, but A and B can’t be installed side by side because each one depends on a different, conflicting version of package C. Again, what can you do?

Going further, what if you want to install a third-party package, with pip, but you don’t have the necessary OS level permissions to write to the Site Packages directory? Maybe you’re running Python on a shared system, where you hold no administrative privileges.

Python Virtual Environments — as isolated, lightweight Python Environments — exist to address these particular types of challenges. They are shallow copies of Python Environments, having their own independent on disk location, being comprised of the same fundamental components: the Python executable, Tools and Scripts, the Standard Library and Site Packages:

  • They are lightweight, consuming a relatively low amount of disk space, because:
    • The Python executable is soft-linked2 from the underlying Python Environment.
    • The Standard Library is empty: the Python executable will use the one in the underlying Python Environment.
  • They are isolated:
    • Both Tools and Scripts, and Site Packages are created from scratch and empty for practical purposes3.

To use a Python Virtual Environment, it first needs to be created. While there are other tools and ways of doing it, we will use the venv module, included in the Standard Library since Python 3.3, by bringing up a command line shell and running:

$ python3 -m venv my-venv              # macOS / Linux
C:\Users\...> python -m venv my-venv   # Windows

This creates a new Virtual Environment called my-venv, within a new directory of the same name, where the isolated Python Environment components can be found:

macOS / Linux:
  Python Executable   ./my-venv/bin/python3.6
  Tools and Scripts   ./my-venv/bin/
  Standard Library    ./my-venv/lib/python3.6/ which is empty. The Python executable uses the Standard Library from the Python Environment used to create this Virtual Environment.
  Site Packages       ./my-venv/lib/python3.6/site-packages/

Windows:
  Python Executable   .\my-venv\Scripts\python.exe
  Tools and Scripts   .\my-venv\Scripts\
  Standard Library    .\my-venv\Lib\ which is empty. The Python executable uses the Standard Library from the Python Environment used to create this Virtual Environment.
  Site Packages       .\my-venv\Lib\site-packages\

From this point onward, using the Virtual Environment boils down to using the Python executable (or Tools or Scripts) within the Virtual Environment directory. This should be enough, and it is the way common IDEs and other development tools are configured: you tell them which Python executable to use for a given project and they take it from there.

For those preferring to work in command line shells, myself included, it is impractical to constantly type the full path to the Python executable. For that, the Virtual Environment includes an activate script that, once run, does two things: it prepends the Virtual Environment’s executable directory to the shell’s PATH environment variable, and it updates the command line prompt with a (virtual-environment-name) prefix. How to run it depends on the platform and the type of shell in use:

$ source my-venv/bin/activate               # macOS / Linux: bash, zsh and similar shells
$ source my-venv/bin/activate.csh           # macOS / Linux: csh, tcsh
C:\Users\...> my-venv\Scripts\activate.bat  # Windows command prompt

Once completed, the shell’s PATH and prompt will be updated, and invoking python or pip no longer requires the full path, using, and only affecting, the isolated Virtual Environment. Going back to a non-activated state is a matter of running deactivate, and removing the Virtual Environment directory and its contents completely wipes out any trace of it.
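As an aside, a running program can check whether it is inside a Virtual Environment by comparing two sys attributes (the function name below is my own, not a standard API):

```python
import sys

def in_virtual_environment() -> bool:
    # Inside a Virtual Environment, sys.prefix points at the venv
    # directory, while sys.base_prefix points at the underlying
    # Python Environment; outside one, both are the same.
    return sys.prefix != sys.base_prefix

print(in_virtual_environment())
```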

In Action

Let’s go through an example of creating and using two Python Virtual Environments:

  • One we will call “web-venv”, to work on a Django based web app.
  • Another, called “nums-venv”, where we will install requests, to collect data from the web, and NumPy to process it afterwards.

We start by creating a work directory to hold everything we need, and change the current working directory there:

~ $ mkdir work
~ $ cd work

Setting up the web-venv

Then, within the work directory, we’ll create our web-venv Python Virtual Environment:

~/work $ .../python3 -m venv web-venv           # Windows: ...\python -m venv web-venv

Next, we activate the newly created web-venv Virtual Environment and use pip to list what’s installed in Site Packages:

~/work $ source web-venv/bin/activate       # Windows: web-venv\Scripts\activate
(web-venv) ~/work $ pip list
pip (9.0.1)
setuptools (28.8.0)
(web-venv) ~/work $

Confirming the Virtual Environment is indeed activated — noting the (web-venv) command line shell prompt prefix — we use pip to install Django; wanting version 1.11, we specifically ask for it with django==1.11, otherwise pip would install the latest available version:

(web-venv) ~/work $ pip install django==1.11
Collecting django==1.11
  Downloading Django-1.11-py2.py3-none-any.whl (6.9MB)
    100% |████████████████████████████████| 6.9MB 186kB/s 
Collecting pytz (from django==1.11)
  Downloading pytz-2018.3-py2.py3-none-any.whl (510kB)
    100% |████████████████████████████████| 512kB 2.3MB/s 
Installing collected packages: pytz, django
Successfully installed django-1.11 pytz-2018.3
(web-venv) ~/work $

The pip installation output is verbose enough for the attentive eye to see that pytz was also brought in and installed, as a Django dependency. Going back to pip list we now get:

(web-venv) ~/work $ pip list
Django (1.11)
pip (9.0.1)
pytz (2018.3)
setuptools (28.8.0)
(web-venv) ~/work $

As a simple but effective test we bring up the Python interactive prompt to confirm the django package can be imported:

(web-venv) ~/work $ python
Python 3.6.3 (default, Nov 21 2017, 14:55:19) 
[GCC 6.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import django
>>> django.__version__
>>> exit()
(web-venv) ~/work $

It does, indeed. Let’s proceed to the next step and create the nums-venv Virtual Environment. Before that we deactivate the web-venv one:

(web-venv) ~/work $ deactivate
~/work $

Setting up the nums-venv

Like we did before, we create and activate the nums-venv Virtual Environment, confirming that the Site Packages is, for all practical purposes, empty:

~/work $ .../python3 -m venv nums-venv      # Windows: ...\python -m venv nums-venv
~/work $ source nums-venv/bin/activate      # Windows: nums-venv\Scripts\activate
(nums-venv) ~/work $ pip list
pip (9.0.1)
setuptools (28.8.0)
(nums-venv) ~/work $

Knowing that nums-venv is activated, we proceed to install the latest versions of requests and numpy:

(nums-venv) ~/work $ pip install requests numpy
Collecting requests
  Downloading requests-2.18.4-py2.py3-none-any.whl (88kB)
    100% |████████████████████████████████| 92kB 2.3MB/s 
Collecting numpy
  Downloading (4.9MB)
    100% |████████████████████████████████| 4.9MB 261kB/s 
Collecting urllib3<1.23,>=1.21.1 (from requests)
  Downloading urllib3-1.22-py2.py3-none-any.whl (132kB)
    100% |████████████████████████████████| 133kB 4.5MB/s 
Collecting idna<2.7,>=2.5 (from requests)
  Downloading idna-2.6-py2.py3-none-any.whl (56kB)
    100% |████████████████████████████████| 61kB 6.5MB/s 
Collecting chardet<3.1.0,>=3.0.2 (from requests)
  Downloading chardet-3.0.4-py2.py3-none-any.whl (133kB)
    100% |████████████████████████████████| 143kB 5.7MB/s 
Collecting certifi>=2017.4.17 (from requests)
  Downloading certifi-2017.11.5-py2.py3-none-any.whl (330kB)
    100% |████████████████████████████████| 153kB 4.9MB/s 
Installing collected packages: urllib3, idna, chardet, certifi, requests, numpy
  Running install for numpy ... done
Successfully installed certifi-2017.11.5 chardet-3.0.4 idna-2.6 numpy-1.14.0 requests-2.18.4 urllib3-1.22
(nums-venv) ~/work $

Unsurprisingly, pip fetched and installed not only requests and numpy but also their dependencies — quite a few, in this case. Going for a quick import test at the interactive Python prompt we get:

(nums-venv) ~/work $ python
Python 3.6.3 (default, Nov 21 2017, 14:55:19) 
[GCC 6.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import requests
>>> requests.__version__
>>> import numpy
>>> numpy.__version__

Note, however, that trying to import django, installed in the other Virtual Environment, fails, as expected:

>>> import django
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'django'

Of course, deactivating nums-venv and activating web-venv gives us the opposite results, where django can be successfully imported, while importing requests or numpy fails:

(nums-venv) ~/work $ deactivate
~/work $ source web-venv/bin/activate
(web-venv) ~/work $ python
Python 3.6.3 (default, Nov 21 2017, 14:55:19) 
[GCC 6.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import django
>>> django.__version__
>>> import requests
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'requests'
>>> import numpy
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'numpy'
>>> exit()
(web-venv) ~/work $

Using the web-venv and nums-venv Virtual Environments

Having set up both Virtual Environments, verifying each contains the third-party packages appropriate for the respective project, properly isolated, we’re ready to start coding, either from scratch or picking up on an existing codebase.

Given that individual preferences and tools vary widely, there is no “universally correct” way of moving forward (even the Virtual Environment creation could have been done with different techniques or tools, but let’s keep ourselves focused!). There are, however, a few fundamental ideas and tips I share when guiding someone through their first steps with Python Virtual Environments:

  • Keep the Virtual Environment and the project code directories separate
    Having created the Virtual Environments under the web-venv and nums-venv directories, in the example, it’s reasonable to ask where the actual project code should go. There are several options.

    I tend to use side-by-side Virtual Environment and project code directories — in this case, I would put each project’s code under the web-src and nums-src directories, respectively, such that my work directory would contain two directories per project: one holding the project’s Virtual Environment, and another with the code itself.

    Others opt to include the Virtual Environment directory within the project directory itself — say, having a generically named venv directory, with the Virtual Environment, inside web-src, holding the actual project code, while having a separate Virtual Environment venv directory under nums-src, for example. This is perfectly workable as long as the Virtual Environment directory is excluded from any kind of source code control that tools like git perform.

    Others, still, opt to have a common Virtual Environment base directory — say, work/virtual-environments, holding different project Virtual Environments in separate sub-directories — while keeping project code directories under a separate common base directory.

  • Use the Virtual Environment’s Python executable, Tools and Scripts
    When using a command line shell based workflow, activate the correct Virtual Environment. Not only when running or debugging your code, but also when installing, upgrading or uninstalling third-party packages with pip.

    When using IDEs and other development tools, learn how to configure them to use the correct Virtual Environment for the project you’re working in. Normally, telling them to use the Python executable in the Virtual Environment directory is enough, and they pick up everything, including properly handling third-party package management, from there.

  • Virtual Environment directories are not relocatable
    Keep in mind that once a Virtual Environment is created under a given directory, it will probably fail to work if it is later moved somewhere else or renamed. If you ever need to move or rename a Virtual Environment directory, create a new one from scratch with the new name and/or location, not forgetting to delete the previous incarnation.
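One way to see why: a Virtual Environment’s pyvenv.cfg records the absolute path of the base interpreter, and the activate scripts hardcode absolute paths too. A small sketch, creating a throwaway Virtual Environment and printing its configuration:

```python
import pathlib
import tempfile
import venv

# Create a short-lived Virtual Environment and read its pyvenv.cfg:
# the "home" key stores the absolute path of the base interpreter,
# one of the reasons a moved or renamed venv tends to break.
with tempfile.TemporaryDirectory() as tmp:
    target = pathlib.Path(tmp) / "demo-venv"
    venv.create(target)  # without pip, for speed
    config = (target / "pyvenv.cfg").read_text()
    print(config)
```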

Wrap up

Once you grasp the fundamentals of Python Virtual Environments — as lightweight, isolated Python Environments — using them and, if needed, understanding why something may not be working the way it should can become simple and almost second nature. Recent Python versions include the venv module in the Standard Library, giving us a simple, always at hand tool to create Virtual Environments (that wasn’t always the case, though, and it still isn’t with Python 2 releases, for example).

If asked, I would generally recommend that all Python users use Virtual Environments: they may require an initial adaptation period and effort, but their benefits clearly outweigh any costs I can spot. Software isolation is a good principle we should all be pursuing more and more.

To recap, creating a Virtual Environment can be done with:

$ .../python3 -m venv virtual-env-name      # Windows: ...\python -m venv virtual-env-name

Which can then be activated with:

$ source ./virtual-env-name/bin/activate    # Windows: .\virtual-env-name\Scripts\activate.bat

And later deactivated with:

(virtual-env-name) $ deactivate

Using the Virtual Environment with an IDE is a matter of configuring it to use the Python executable at ./virtual-env-name/bin/python (or .\virtual-env-name\Scripts\python.exe, on Windows).

Related Tools and Further Reading

Over the years, many different tools have been created around the idea of Virtual Environments: some predating the Standard Library’s somewhat recent venv module, some taking different approaches to essentially the same idea. Here are a few you may want to explore:

  • Ian Bicking’s virtualenv
    Going as far back as 2007, it is the first such tool I used. To this day, I still go for it whenever I need to work on a Python 2 codebase.
  • Doug Hellmann’s virtualenvwrapper
    A set of shell scripts around virtualenv, keeping all Virtual Environments under a common place and simplifying various tasks.
  • David Marble’s virtualenvwrapper-win
    A port of virtualenvwrapper to the Windows command shell.
  • Kenneth Reitz’s pipenv
    Having gained recent traction, it combines Virtual Environment management and third-party package management into a single tool.
  • The Anaconda and Miniconda Python distributions’ conda
    A package manager with support for Python Virtual Environments, third-party packages and even other programming languages.

Naturally, pip is worth mentioning, as the nearly universal tool used to manage third-party packages in Python Environments, Virtual or not.

For those interested in reading further on this topic, I would recommend:

Words for the Inquiring Minds

I promised earlier I would answer a few questions inquiring minds might have been left with. That’s what I’ll try to do now, in relatively high-level and somewhat simple terms. The questions we identified were:

  • How does the Python executable know the location of the Standard Library?
  • Why is it spread across multiple locations?
  • What about Site Packages?
  • How does pip know the location of Site Packages?

These will be easier to answer backwards, so let’s start with the last one:

  • How does pip know the location of Site Packages?

    pip is itself written in Python and thus manages the Site Packages directory known to Python.

    The reality is a bit more complex than that, as is often the case: pip is written in Python, depending on the third-party setuptools package which, in turn, builds on top of the distutils module in the Standard Library. It’s this last one that ends up “telling” pip where third-party packages should be installed4.

  • How does the Python executable know the location of Site Packages?

    When starting up, the Python executable automatically imports the site module5. It is imported from the Standard Library and, among other duties, it adds the Site Packages directory to sys.path, being clever enough to detect whether or not the Python executable is under a Virtual Environment, such that the correct Site Packages directory is used.

    For obvious reasons, the code in the site module needs to work in lock-step with the distutils module, which directs where pip installations go. Should one or the other have different ideas about which directory Site Packages is in, installed third-party packages would certainly not be found at import time.

  • How does the Python executable know the location of the Standard Library?

    We could call this the million-dollar question and, in a way, its answer could help us fully wrap our heads around the fundamentals of Python runtime environments and the associated import mechanisms.

    From what we’ve seen till now, the Python import machinery is based on sys.path, a list of places where it goes looking for modules at import time (there is more to it, but let’s keep it simple, here). We also know that Python automatically imports the site module at start up, and that the site module is shipped in the Standard Library, being responsible for adding Site Packages to sys.path.

    Given that imports are based on sys.path and that Python automatically imports the site module on startup, the question could then be rephrased as “How is sys.path initially populated?”

    A short, simplified answer would be something along these lines: the Python executable first determines where it is running from, in the sense of “which file is this Python executable?” With that, it sets sys.executable. From there, it follows a series of steps, including considering whether or not the Python executable is within a Virtual Environment, and looking around in parent and neighbouring directories for what looks like it could be a Standard Library. Throughout that process, it sets sys.prefix and sys.exec_prefix, respectively, as the platform independent and platform dependent prefixes where Python files are installed. Lastly it builds an initial version of sys.path with the current working directory first, followed by a combination of paths built out of sys.prefix and sys.exec_prefix, accounting for platform differences.

    A more detailed answer can be found here, while the full, nitty-gritty details are available in CPython’s source code in getpath.c, handling the initial setting of sys.path for UNIX-like platforms, and getpathp.c, for Windows platforms.

  • Why is the Standard Library spread across multiple locations?

    This question stems from the fact that sys.path often includes more than one directory associated with the Standard Library.

    The short answer boils down to the fact that Python modules can be written in both Python and in C. That is precisely the case with CPython’s Standard Library: some modules like datetime are written in Python, while others like math are written in C.

    Separating platform-independent, Python-written modules and platform-dependent, C-written modules into different on-disk locations is a generally sound systems-management principle. It may simplify multi-/cross-platform packaging and, for example, deployment setups where the platform-independent parts of the Standard Library are shared across a network, while clients only keep local copies of the platform-dependent modules (whether this would be a great idea or not, I’m not sure).
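To close the loop, the startup values discussed above are all inspectable from Python itself. A quick sketch (the output depends entirely on your installation):

```python
import site
import sys

# The values Python derived at startup, per the steps described above:
print(sys.executable)   # which file this interpreter is
print(sys.prefix)       # platform-independent installation prefix
print(sys.exec_prefix)  # platform-dependent installation prefix

# The Site Packages directories the site module registered:
print(site.getsitepackages())
```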

[ 2019-08-30 UPDATE: Fixed grammatical typo in the “Setting up the web-venv” section. Thanks to Sid for pointing it out. ]

  1. Dictionaries are amazing tools. And I don’t mean Python’s dictionaries (which are also amazing), I mean the real world, physical books with word definitions we call dictionaries. Get a good one, if you haven’t already. It will be worth every penny, cent, satoshi or whichever currency you’ll be using. 

  2. On POSIX platforms; on Windows, the Python executable is copied into the Virtual Environment instead, as symbolic links are not always available there. 

  3. They will include pip and its dependency, the setuptools package, to assist in managing third-party packages in the Virtual Environment. Utility scripts to activate the Virtual Environment will also be there. Other than that, they’re like a blank canvas. 

  4. See the get_python_lib function in the distutils sysconfig module. 

  5. Unless it is passed the -S command line argument.