Plotting Graphs with Matplotlib
10. Plotting Graphs with Matplotlib#
Estimated time to complete: one hour.
10.1. Introduction: Matplotlib and Pyplot#
Numerical data is often presented with graphs, and the tools we use for this come from the module matplotlib.pyplot, which is part of the Python package matplotlib. (A Python package is essentially a module that also contains other modules.)
10.2. Sources on Matplotlib#
Matplotlib is a huge collection of graphics tools, of which we see just a few here. For more information, the home site for Matplotlib is http://matplotlib.org and the section on pyplot is at http://matplotlib.org/1.3.1/api/pyplot_api.html
However, another site that I find easier as an introduction is https://scipy-lectures.org/intro/matplotlib/
In fact, that whole site https://scipy-lectures.org/ is quite useful as a reference on Python, NumPy, and so on.
Note: for now, the descriptions here are about working in notebooks; see the aside below on the differences when using Spyder and IPython.
10.3. Choosing where the graphs appear#
In a notebook, we can choose between having the figures produced by Matplotlib appear “inline” (that is, within the notebook window) or in separate windows. For now we will use the inline option, which is the default, but can also be specified explicitly with the command
%matplotlib inline
To activate that, uncomment the line below; that is, remove the leading hash character “#”
#%matplotlib inline
This is an IPython magic command, indicated by starting with the percent character “%” — you can read more about them at https://ipython.org/ipython-doc/dev/interactive/magics.html
Alternatively, one can have figures appear in separate windows, which might be useful when you want to save them to files, or zoom and pan around the image. That can be chosen with the magic command
%matplotlib tk
#%matplotlib tk
As far as I know, this magic works for Windows and Linux as well as Mac OS; let me know if it does not!
We need some NumPy stuff, for example to create arrays of numbers to plot.
Note that this is NumPy only: Python lists and tuples do not support the element-by-element arithmetic used below, and nor do the versions of functions like sin from module math, which accept only a single number at a time!
# Import a few favorites, and let them be known by their first names:
from numpy import linspace, sin, cos, pi
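To see why the NumPy versions matter, here is a small sketch (the names angles, doubled_list and doubled_array are just for illustration, and I use the nickname np here rather than first-name imports). It compares sin from numpy with sin from math on an array, and shows what multiplying a plain list by 2 does.

```python
import math
import numpy as np

angles = np.linspace(0.0, np.pi, 5)

# numpy's sin is applied to every element of the array at once.
print(np.sin(angles))

# math.sin only accepts a single number, so passing the whole array fails.
try:
    math.sin(angles)
    math_sin_worked = True
except TypeError as err:
    math_sin_worked = False
    print("math.sin failed:", err)

# A plain list is not multiplied elementwise: "2 * list" repeats the list.
doubled_list = 2 * [1.0, 2.0, 3.0]
doubled_array = 2 * np.array([1.0, 2.0, 3.0])
print(doubled_list)
print(doubled_array)
```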
And for now, just the one main matplotlib graphics function, plot
from matplotlib.pyplot import plot
To access all of pyplot, add its common nickname plt:
import matplotlib.pyplot as plt
Producing arrays of “x” values with the numpy function linspace#
To plot the graph of a function, we first need a collection of values for the abscissa (horizontal axis).
The function linspace (from numpy) gives an array containing a specified number of equally spaced values over a specified interval, so that
tenvalues = linspace(1., 6., 10)
gives ten equally spaced values ranging from 1 to 6:
print(f"Array 'tenvalues' is:\n{tenvalues}")
Array 'tenvalues' is:
[1. 1.55555556 2.11111111 2.66666667 3.22222222 3.77777778
4.33333333 4.88888889 5.44444444 6. ]
Not quite what you expected? To get values with ten intervals in between them, you need 11 values:
tenintervals = linspace(1., 6., 11)
print(f"Array 'tenintervals' is: \n {tenintervals}")
Array 'tenintervals' is:
[1. 1.5 2. 2.5 3. 3.5 4. 4.5 5. 5.5 6. ]
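One way to check the spacing, rather than reading it off the printed array, is the optional keyword argument retstep of numpy's linspace, which also returns the step between consecutive values. This is just a sketch; the names values, step and gaps are mine.

```python
import numpy as np

# 11 points from 1 to 6 give 10 intervals, each of width (6 - 1) / 10.
values, step = np.linspace(1.0, 6.0, 11, retstep=True)
print(len(values), step)

# The differences between neighboring values should all equal the step.
gaps = np.diff(values)
print(gaps)
```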
Basic graphs with plot#
We could use these 11 values to graph a function, but the result is a bit rough, because the given points are joined with straight line segments:
plot(tenintervals, sin(tenintervals))
[<matplotlib.lines.Line2D at 0x7f8c01c68400>]
Here we see the default behavior of joining the given points with straight lines.
Aside: That text output above the graph is a message returned as the output value of function plot; that is what happens when you execute a function in the last line of a cell but do not “use” its return value by either saving its result into a variable or making it input to another function.
You might want to suppress that, and that can be done by ending the command with a semi-colon.
More generally, if the last line of a cell gives a value, that value is displayed when the cell is run, and appending a semi-colon suppresses that.
Semi-colons at the end of any other line do nothing, but are harmless.
plot(tenintervals, sin(tenintervals));
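The other way to “use” the return value is to save it in a variable, which also keeps the message from being displayed. A minimal sketch, with t and lines as illustrative names:

```python
import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(1.0, 6.0, 11)

# Assigning the result means the cell's last line no longer "gives a value",
# so nothing is echoed; the list of line objects is kept for later use.
lines = plt.plot(t, np.sin(t))
print(type(lines), len(lines))
```

Here lines is a list with one entry per curve drawn, so it has length one for this single curve.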
For discrete data it might be better to mark each point, unconnected. This is done by adding a third argument, a text string specifying a marker, such as a star:
plot(tenvalues, sin(tenvalues), '*');
Or maybe both lines and markers:
plot(tenvalues, sin(tenvalues), '-*');
10.4. Smoother graphs#
It turns out that 50 points is often a good choice for a smooth-looking curve, so the function linspace has this as a default input parameter: you can omit that third input value, and get 50 points.
Let’s use this to plot some trig. functions.
x = linspace(-pi, pi)
print(x)
[-3.14159265 -3.01336438 -2.88513611 -2.75690784 -2.62867957 -2.5004513
-2.37222302 -2.24399475 -2.11576648 -1.98753821 -1.85930994 -1.73108167
-1.60285339 -1.47462512 -1.34639685 -1.21816858 -1.08994031 -0.96171204
-0.83348377 -0.70525549 -0.57702722 -0.44879895 -0.32057068 -0.19234241
-0.06411414 0.06411414 0.19234241 0.32057068 0.44879895 0.57702722
0.70525549 0.83348377 0.96171204 1.08994031 1.21816858 1.34639685
1.47462512 1.60285339 1.73108167 1.85930994 1.98753821 2.11576648
2.24399475 2.37222302 2.5004513 2.62867957 2.75690784 2.88513611
3.01336438 3.14159265]
# With a line through the points
plot(x, sin(x), '-');
10.5. Multiple curves on a single figure#
As we have seen when using plot to produce inline figures in a Jupyter notebook, plot commands in different cells produce separate figures.
To combine curves on a single graph, one way is to use successive plot commands within the same cell:
plot(x, cos(x), '*')
plot(x, sin(x));
On the other hand, when plotting externally, or from a Python script file or the IPython command line, successive plot commands keep adding to the same figure until you explicitly specify otherwise, with the function figure introduced below.
Note: The semi-colon was only needed on the final plot command, because a Jupyter cell only displays the output of the last command in the cell (along with anything explicitly output with a print function, of course).
10.6. Two curves with a single plot command#
Several curves can be specified in a single plot command (which also works with external figure windows, of course).
plot(x, cos(x), '*', x, sin(x));
Note that even with multiple curves in a single plot command, markers can be specified on some, none or all: Matplotlib uses the difference between an array and a text string to recognize which arguments specify markers instead of data.
Here are some other marker and line style options, particularly useful if you need to print in black-and-white.
plot(x, cos(x), '.', x, sin(x), ':');
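To check how such a command was interpreted, one can keep the line objects that plot returns and ask each for its marker and line style. This is only a sketch: the names dots and dotted are mine, and get_marker and get_linestyle are methods of Matplotlib's line objects.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-np.pi, np.pi)

# One plot command, two curves: plot returns one line object per curve.
dots, dotted = plt.plot(x, np.cos(x), '.', x, np.sin(x), ':')

# The first curve got a point marker, the second a dotted line style.
print(repr(dots.get_marker()), repr(dots.get_linestyle()))
print(repr(dotted.get_marker()), repr(dotted.get_linestyle()))
```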
10.7. Multiple curves in one figure#
There can be any number of curves in a single plot command:
x = linspace(-1,1)
plot(x, x+1, x, x+2, x, x+3, x, x+4,
     x, x+5, x, x+6, x, x+7, x, x+8,
     x, x+9, x, x+10, x, x+11, x, x+12);
Note the color sequence: blue, orange, green, red, … blue, orange …
With enough curves (ten here, but it can vary between versions of matplotlib) the color sequence eventually repeats – but you probably don’t want that many curves on one graph.
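If you are curious how long the color sequence is in your version of matplotlib, a sketch like the following reads it from the settings rather than counting curves by eye. It assumes the active style defines its cycle by color, as the default style does; the names cycle_colors, n_colors and lines are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

# The default sequence of colors is stored in the 'axes.prop_cycle' setting.
cycle_colors = plt.rcParams['axes.prop_cycle'].by_key()['color']
n_colors = len(cycle_colors)
print(n_colors, cycle_colors)

# Draw one more curve than there are colors, in a fresh figure,
# so that the last curve wraps around to the first color again.
x = np.linspace(-1, 1)
plt.figure()
lines = [plt.plot(x, x + k)[0] for k in range(1, n_colors + 2)]
print(lines[0].get_color(), lines[-1].get_color())
```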
Aside on long lines of code: The above illustrates a little Python coding hack: one way to have a long command continue over several lines is simply to have parentheses wrapped around the part that spans multiple lines. When a line ends with an opening parenthesis not yet matched, Python knows that something is still to come.
Aside: using IPython magic commands in Spyder and with the IPython command line
If using Spyder and the IPython command line, there is a similar choice of where graphs appear, but with a few differences to note:
With the “inline” option (which is again the default) figures then appear in a pane within the Spyder window.
The “tk” option works exactly as with notebooks, with each figure appearing in its own window.
Note: Any such IPython magic commands must be entered at the IPython interactive command line, not in a Python code file.
10.8. Plotting sequences#
A curve can also be specified by a single array of numbers: these are taken as the values of a sequence, indexed Pythonically from zero, and plotted as the ordinates (vertical values):
plot(tenvalues**2, '.');
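To confirm which horizontal values were used, one can ask the resulting line object for its x data. A small sketch, with points as an illustrative name:

```python
import numpy as np
import matplotlib.pyplot as plt

tenvalues = np.linspace(1.0, 6.0, 10)

# Only the ordinates are given; plot supplies the indices 0, 1, 2, ...
(points,) = plt.plot(tenvalues**2, '.')
print(points.get_xdata())
```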
10.9. Plotting curves in separate figures (from a single cell)#
From within a single Jupyter cell, or when working with Python files or in the IPython command window (as used within Spyder), successive plot commands keep adding to the previous figure.
To instead start the next plot in a separate figure, first create a new “empty” figure, with the function matplotlib.pyplot.figure.
With a full name as long as that, it is worth importing so that it can be used on a first name basis:
from matplotlib.pyplot import figure
x = linspace(0, 2*pi)
plot(x, sin(x))
figure()
plot(x, cos(x), 'o');
The figure command can also do other things, like attach a name or number to a figure when it is displayed externally, and change from the default size.
So even though this is not always needed in a notebook, from now on each new figure will get an explicit figure command. Revisiting the last example:
x = linspace(0, 2*pi)
figure(99)
# What does 99 do?
# See with external "tk" display of figures,
# as with `%matplotlib tk`
plot(x, sin(x))
figure(figsize=(12,8))
plot(x, cos(x), 'o');
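The effect of those arguments can also be inspected without an external window, because figure returns a figure object. The following is a sketch under that assumption; fig_a and fig_b are illustrative names.

```python
import matplotlib.pyplot as plt

fig_a = plt.figure(99)                # a figure with the number 99
fig_b = plt.figure(figsize=(12, 8))   # width and height in inches

print(fig_a.number, fig_b.number)
print(fig_b.get_size_inches())
print(plt.get_fignums())              # numbers of all figures currently open

# Calling figure again with an existing number makes that figure current,
# so this plot goes into figure 99 rather than the most recently created one.
plt.figure(99)
plt.plot([0, 1], [0, 1])
print(plt.gcf().number)
```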
10.10. Decorating the Curves#
Curves can be decorated in different ways. We have already seen some options, and there are many more. One can specify the color, line styles like dashed or dash-dot instead of solid, and many different markers, and one can have both markers and lines. As seen above, this can be controlled by an optional text string argument after the arrays of data for a curve:
figure()
plot(x, sin(x), '*-')
plot(x, cos(x), 'r--');
These three-part curve specifications can be combined: in the following, plot knows that there are two curves each specified by three arguments, not three curves each specified by just an “x-y” pair:
figure()
plot(x, sin(x), 'g-.', x, cos(x), 'm+-.');
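The text string is a shorthand: the same choices can be spelled out with keyword arguments such as color and linestyle, which some find easier to read. A sketch comparing the two forms (the names short and spelled are mine):

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi)
plt.figure()

# Shorthand string: green, dash-dot.
(short,) = plt.plot(x, np.sin(x), 'g-.')

# The same style given by keyword arguments instead.
(spelled,) = plt.plot(x, np.cos(x), color='g', linestyle='-.')

print(short.get_color(), short.get_linestyle())
print(spelled.get_color(), spelled.get_linestyle())
```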
Exercise A: Explore ways to refine your figures#
There are many commands for refining the appearance of a figure after its initial creation with plot.
Experiment yourself with the commands title, xlabel, ylabel, grid, and legend.
Using the functions mentioned above, produce a refined version of the above sine and cosine graph, with:
a title at the top
labels on both axes
a legend identifying each curve
a grid or “graph paper” background, to make it easier to judge details like where a function has zeros.
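As a starting point for experimenting, here is a sketch of those five commands applied to a different pair of curves, so the sine and cosine version is still left for you. The label keyword argument to plot is what legend uses to name each curve; the text strings are of course arbitrary.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-2, 2)
plt.figure()

# Give each curve a label, to be picked up by legend() below.
plt.plot(x, x**2, label='x squared')
plt.plot(x, x**3, '--', label='x cubed')

plt.title('Two powers of x')   # title at the top
plt.xlabel('x')                # label on the horizontal axis
plt.ylabel('y')                # label on the vertical axis
plt.grid(True)                 # "graph paper" background
plt.legend();                  # box identifying each curve
```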
Exercise B: Saving externally displayed figures to files#
Then work out how to save this figure to a file (probably in format PNG), and turn that in, along with the file used to create it.
This is most readily done with externally displayed figures; that is, with %matplotlib tk.
Making that change to tk in a notebook then requires restarting the kernel for it to take effect; use the “fast forward” button, or open the menu Kernel above and select “Restart Kernel and Run All Cells …”.
For your own edification, explore other features of externally displayed figures, like zooming and panning: this cannot be done with inline figures.
Getting help from the documentation#
For some of these, you will probably need to read up. For simple things, there is a function help, which is best used in the IPython interactive input window (within Spyder for example), but I will illustrate it here.
The entry for plot is unusually long! It provides details about all the options mentioned above, like marker styles.
So this might be a good time to learn how to clear the output in a cell, to unclutter the view:
either use the above menu “Edit” or open the menu with Control-click or right-click on the code cell;
then use “Clear Outputs” to remove the output of just the current cell.
help(plot)
Help on function plot in module matplotlib.pyplot:
plot(*args, scalex=True, scaley=True, data=None, **kwargs)
Plot y versus x as lines and/or markers.
Call signatures::
plot([x], y, [fmt], *, data=None, **kwargs)
plot([x], y, [fmt], [x2], y2, [fmt2], ..., **kwargs)
The coordinates of the points or line nodes are given by *x*, *y*.
The optional parameter *fmt* is a convenient way for defining basic
formatting like color, marker and linestyle. It's a shortcut string
notation described in the *Notes* section below.
>>> plot(x, y) # plot x and y using default line style and color
>>> plot(x, y, 'bo') # plot x and y using blue circle markers
>>> plot(y) # plot y using x as index array 0..N-1
>>> plot(y, 'r+') # ditto, but with red plusses
You can use `.Line2D` properties as keyword arguments for more
control on the appearance. Line properties and *fmt* can be mixed.
The following two calls yield identical results:
>>> plot(x, y, 'go--', linewidth=2, markersize=12)
>>> plot(x, y, color='green', marker='o', linestyle='dashed',
... linewidth=2, markersize=12)
When conflicting with *fmt*, keyword arguments take precedence.
**Plotting labelled data**
There's a convenient way for plotting objects with labelled data (i.e.
data that can be accessed by index ``obj['y']``). Instead of giving
the data in *x* and *y*, you can provide the object in the *data*
parameter and just give the labels for *x* and *y*::
>>> plot('xlabel', 'ylabel', data=obj)
All indexable objects are supported. This could e.g. be a `dict`, a
`pandas.DataFrame` or a structured numpy array.
**Plotting multiple sets of data**
There are various ways to plot multiple sets of data.
- The most straight forward way is just to call `plot` multiple times.
Example:
>>> plot(x1, y1, 'bo')
>>> plot(x2, y2, 'go')
- If *x* and/or *y* are 2D arrays a separate data set will be drawn
for every column. If both *x* and *y* are 2D, they must have the
same shape. If only one of them is 2D with shape (N, m) the other
must have length N and will be used for every data set m.
Example:
>>> x = [1, 2, 3]
>>> y = np.array([[1, 2], [3, 4], [5, 6]])
>>> plot(x, y)
is equivalent to:
>>> for col in range(y.shape[1]):
... plot(x, y[:, col])
- The third way is to specify multiple sets of *[x]*, *y*, *[fmt]*
groups::
>>> plot(x1, y1, 'g^', x2, y2, 'g-')
In this case, any additional keyword argument applies to all
datasets. Also this syntax cannot be combined with the *data*
parameter.
By default, each line is assigned a different style specified by a
'style cycle'. The *fmt* and line property parameters are only
necessary if you want explicit deviations from these defaults.
Alternatively, you can also change the style cycle using
:rc:`axes.prop_cycle`.
Parameters
----------
x, y : array-like or scalar
The horizontal / vertical coordinates of the data points.
*x* values are optional and default to ``range(len(y))``.
Commonly, these parameters are 1D arrays.
They can also be scalars, or two-dimensional (in that case, the
columns represent separate data sets).
These arguments cannot be passed as keywords.
fmt : str, optional
A format string, e.g. 'ro' for red circles. See the *Notes*
section for a full description of the format strings.
Format strings are just an abbreviation for quickly setting
basic line properties. All of these and more can also be
controlled by keyword arguments.
This argument cannot be passed as keyword.
data : indexable object, optional
An object with labelled data. If given, provide the label names to
plot in *x* and *y*.
.. note::
Technically there's a slight ambiguity in calls where the
second label is a valid *fmt*. ``plot('n', 'o', data=obj)``
could be ``plt(x, y)`` or ``plt(y, fmt)``. In such cases,
the former interpretation is chosen, but a warning is issued.
You may suppress the warning by adding an empty format string
``plot('n', 'o', '', data=obj)``.
Returns
-------
list of `.Line2D`
A list of lines representing the plotted data.
Other Parameters
----------------
scalex, scaley : bool, default: True
These parameters determine if the view limits are adapted to the
data limits. The values are passed on to `autoscale_view`.
**kwargs : `.Line2D` properties, optional
*kwargs* are used to specify properties like a line label (for
auto legends), linewidth, antialiasing, marker face color.
Example::
>>> plot([1, 2, 3], [1, 2, 3], 'go-', label='line 1', linewidth=2)
>>> plot([1, 2, 3], [1, 4, 9], 'rs', label='line 2')
If you specify multiple lines with one plot call, the kwargs apply
to all those lines. In case the label object is iterable, each
element is used as labels for each set of data.
Here is a list of available `.Line2D` properties:
Properties:
agg_filter: a filter function, which takes a (m, n, 3) float array and a dpi value, and returns a (m, n, 3) array
alpha: scalar or None
animated: bool
antialiased or aa: bool
clip_box: `.Bbox`
clip_on: bool
clip_path: Patch or (Path, Transform) or None
color or c: color
dash_capstyle: `.CapStyle` or {'butt', 'projecting', 'round'}
dash_joinstyle: `.JoinStyle` or {'miter', 'round', 'bevel'}
dashes: sequence of floats (on/off ink in points) or (None, None)
data: (2, N) array or two 1D arrays
drawstyle or ds: {'default', 'steps', 'steps-pre', 'steps-mid', 'steps-post'}, default: 'default'
figure: `.Figure`
fillstyle: {'full', 'left', 'right', 'bottom', 'top', 'none'}
gid: str
in_layout: bool
label: object
linestyle or ls: {'-', '--', '-.', ':', '', (offset, on-off-seq), ...}
linewidth or lw: float
marker: marker style string, `~.path.Path` or `~.markers.MarkerStyle`
markeredgecolor or mec: color
markeredgewidth or mew: float
markerfacecolor or mfc: color
markerfacecoloralt or mfcalt: color
markersize or ms: float
markevery: None or int or (int, int) or slice or list[int] or float or (float, float) or list[bool]
path_effects: `.AbstractPathEffect`
picker: float or callable[[Artist, Event], tuple[bool, dict]]
pickradius: float
rasterized: bool
sketch_params: (scale: float, length: float, randomness: float)
snap: bool or None
solid_capstyle: `.CapStyle` or {'butt', 'projecting', 'round'}
solid_joinstyle: `.JoinStyle` or {'miter', 'round', 'bevel'}
transform: unknown
url: str
visible: bool
xdata: 1D array
ydata: 1D array
zorder: float
See Also
--------
scatter : XY scatter plot with markers of varying size and/or color (
sometimes also called bubble chart).
Notes
-----
**Format Strings**
A format string consists of a part for color, marker and line::
fmt = '[marker][line][color]'
Each of them is optional. If not provided, the value from the style
cycle is used. Exception: If ``line`` is given, but no ``marker``,
the data will be a line without markers.
Other combinations such as ``[color][marker][line]`` are also
supported, but note that their parsing may be ambiguous.
**Markers**
============= ===============================
character description
============= ===============================
``'.'`` point marker
``','`` pixel marker
``'o'`` circle marker
``'v'`` triangle_down marker
``'^'`` triangle_up marker
``'<'`` triangle_left marker
``'>'`` triangle_right marker
``'1'`` tri_down marker
``'2'`` tri_up marker
``'3'`` tri_left marker
``'4'`` tri_right marker
``'8'`` octagon marker
``'s'`` square marker
``'p'`` pentagon marker
``'P'`` plus (filled) marker
``'*'`` star marker
``'h'`` hexagon1 marker
``'H'`` hexagon2 marker
``'+'`` plus marker
``'x'`` x marker
``'X'`` x (filled) marker
``'D'`` diamond marker
``'d'`` thin_diamond marker
``'|'`` vline marker
``'_'`` hline marker
============= ===============================
**Line Styles**
============= ===============================
character description
============= ===============================
``'-'`` solid line style
``'--'`` dashed line style
``'-.'`` dash-dot line style
``':'`` dotted line style
============= ===============================
Example format strings::
'b' # blue markers with default shape
'or' # red circles
'-g' # green solid line
'--' # dashed line with default color
'^k:' # black triangle_up markers connected by a dotted line
**Colors**
The supported color abbreviations are the single letter codes
============= ===============================
character color
============= ===============================
``'b'`` blue
``'g'`` green
``'r'`` red
``'c'`` cyan
``'m'`` magenta
``'y'`` yellow
``'k'`` black
``'w'`` white
============= ===============================
and the ``'CN'`` colors that index into the default property cycle.
If the color is the only part of the format string, you can
additionally use any `matplotlib.colors` spec, e.g. full names
(``'green'``) or hex strings (``'#008000'``).
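When an entry is as long as that one, it can help to look at just its first few lines. One possible way, sketched here with the standard module inspect, is to fetch the documentation as a text string and print only the start of it; the choice of five lines is arbitrary.

```python
import inspect
import matplotlib.pyplot as plt

# getdoc returns the same documentation text that help displays, as a string.
doc = inspect.getdoc(plt.plot)

# Show only the opening lines instead of the whole entry.
summary_lines = doc.splitlines()[:5]
print("\n".join(summary_lines))
```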
The jargon used in help can be confusing at first; fortunately there are other online sources that are more readable and better illustrated, like http://scipy-lectures.github.io/intro/matplotlib/matplotlib.html mentioned above.
However, that does not cover everything; the official pyplot documentation at http://matplotlib.org/1.3.1/api/pyplot_api.html is more complete: explore its search feature.