mwaskom/seaborn: v0.9.0 (July 2018)
Abstract
v0.9.0 (July 2018) Note: a version of these release notes with working links appears in the online documentation. This is a major release with several substantial and long-desired new features. There are also updates/modifications to the themes and color palettes that give better consistency with matplotlib 2.0 and some notable API changes. New relational plots Three completely new plotting functions have been added: catplot, scatterplot, and lineplot. The first is a figure-level interface to the latter two that combines them with a FacetGrid. The functions bring the high-level, dataset-oriented API of the seaborn categorical plotting functions to more general plots (scatter plots and line plots). These functions can visualize a relationship between two numeric variables while mapping up to three additional variables by modifying hue, size, and/or style semantics. The common high-level API is implemented differently in the two functions. For example, the size semantic in scatterplot scales the area of scatter plot points, but in lineplot it scales width of the line plot lines. The API is dataset-oriented, meaning that in both cases you pass the variable in your dataset rather than directly specifying the matplotlib parameters to use for point area or line width. Another way the relational functions differ from existing seaborn functionality is that they have better support for using numeric variables for hue and size semantics. This functionality may be propagated to other functions that can add a hue semantic in future versions; it has not been in this release. The lineplot function also has support for statistical estimation and is replacing the older tsplot function, which still exists but is marked for removal in a future release. lineplot is better aligned with the API of the rest of the library and more flexible in showing relationships across additional variables by modifying the size and style semantics independently. It also has substantially improved support for date and time data, a major pain factor in tsplot. The cost is that some of the more esoteric options in tsplot for representing uncertainty (e.g. a colormapped KDE of the bootstrap distribution) have not been implemented in the new function. There is quite a bit of new documentation that explains these new functions in more detail, including detailed examples of the various options in the API reference and a more verbose tutorial. These functions should be considered in a "stable beta" state. They have been thoroughly tested, but some unknown corner cases may remain to be found. The main features are in place, but not all planned functionality has been implemented. There are planned improvements to some elements, particularly the default legend, that are a little rough around the edges in this release. Finally, some of the default behavior (e.g. the default range of point/line sizes) may change somewhat in future releases. Updates to themes and palettes Several changes have been made to the seaborn style themes, context scaling, and color palettes. In general the aim of these changes was to make the seaborn styles more consistent with the style updates in matplotlib 2.0 and to leverage some of the new style parameters for better implementation of some aspects of the seaborn styles. Here is a list of the changes: Reorganized and updated some axes_style/plotting_context parameters to take advantage of improvements in the matplotlib 2.0 update. The biggest change involves using several new parameterss in the "style" spec while moving parameters that used to implement the corresponding aesthetics to the "context" spec. For example, axes spines and ticks are now off instead of having their width/length zeroed out for the darkgrid style. That means the width/length of these elements can now be scaled in different contexts. The effect is a more cohesive appearance of the plots, especially in larger contexts. These changes include only minimal support for the 1.x matplotlib series. Users who are stuck on matplotlib 1.5 but wish to use seaborn styling may want to use the seaborn parameters that can be accessed through the matplotlib stylesheet interface. Updated the seaborn palettes ("deep", "muted", "colorblind", etc.) to correspond with the new 10-color matplotlib default. The legacy palettes are now available at "deep6", "muted6", "colorblind6", etc. Additionally, a few individual colors were tweaked for better consistency, aesthetics, and accessibility. Calling color_palette (or set_palette) with a named qualitative palettes (i.e. one of the seaborn palettes, the colorbrewer qualitative palettes, or the matplotlib matplotlib tableau-derived palettes) and no specified number of colors will return all of the colors in the palette. This means that for some palettes, the returned list will have a different length than it did in previous versions. Enhanced color_palette to accept a parameterized specification of a cubehelix palette in in a string, prefixed with "ch:" (e.g. "ch:-.1,.2,l=.7"). Note that keyword arguments can be spelled out or referenced using only their first letter. Reversing the palette is accomplished by appending "_r", as with other matplotlib colormaps. This specification will be accepted by any seaborn function with a palette= parameter. Slightly increased the base font sizes in plotting_context and increased the scaling factors for "talk" and "poster" contexts. Calling set will now call set_color_codes to re-assign the single letter color codes by default API changes A few functions have been renamed or have had changes to their default parameters. The factorplot function has been renamed to catplot. The new name ditches the original R-inflected terminology to use a name that is more consistent with terminology in pandas and in seaborn itself. This change should hopefully make catplot easier to discover, and it should make more clear what its role is. factorplot still exists and will pass its arguments through to catplot with a warning. It may be removed eventually, but the transition will be as gradual as possible. The other reason that the factorplot name was changed was to ease another alteration which is that the default kind in catplot is now "strip" (corresponding to stripplot). This plots a categorical scatter plot which is usually a much better place to start and is more consistent with the default in relplot. The old default style in factorplot ("point", corresponding to pointplot) remains available if you want to show a statistical estimation. The lvplot function has been renamed to boxenplot. The "letter-value" terminology that was used to name the original kind of plot is obscure, and the abbreviation to lv did not help anything. The new name should make the plot more discoverable by describing its format (it plots multiple boxes, also known as "boxen"). As with factorplot, the lvplot function still exists to provide a relatively smooth transition. Renamed the size parameter to height in multi-plot grid objects (FacetGrid, PairGrid, and JointGrid) along with functions that use them (factorplot, lmplot, pairplot, and jointplot) to avoid conflicts with the size parameter that is used in scatterplot and lineplot (necessary to make relplot work) and also makes the meaning of the parameter a bit more clear. Changed the default diagonal plots in pairplot to use `func`:kdeplot` when a "hue" dimension is used. Deprecated the statistical annotation component of JointGrid. The method is still available but will be removed in a future version. Two older functions that were deprecated in earlier versions, coefplot and interactplot, have undergone final removal from the code base. Documentation improvements There has been some effort put into improving the documentation. The biggest change is that the introduction to the library has been completely rewritten to provide much more information and, critically, examples. In addition to the high-level motivation, the introduction also covers some important topics that are often sources of confusion, like the distinction between figure-level and axes-level functions, how datasets should be formatted for use in seaborn, and how to customize the appearance of the plots. Other improvements have been made throughout, most notably a thorough re-write of the categorical tutorial categorical_tutorial. Other small enhancements and bug fixes Changed rugplot to plot a matplotlib LineCollection instead of many Line2D objects, providing a big speedup for large arrays. Changed the default off-diagonal plots to use scatterplot. (Note that the "hue" currently draws three separate scatterplots instead of using the hue semantic of the scatterplot function). Changed color handling when using kdeplot with two variables. The default colormap for the 2D density now follows the color cycle, and the function can use color and label kwargs, adding more flexibility and avoiding a warning when using with multi-plot grids. Added the subplot_kws parameter to PairGrid for more flexibility. Removed a special case in PairGrid that defaulted to drawing stacked histograms on the diagonal axes. Fixed jointplot/JointGrid and regplot so that they now accept list inputs. Fixed a bug in FacetGrid when using a single row/column level or using col_wrap=1. Fixed functions that set axis limits so that they preserve auto-scaling state on matplotlib 2.0. Avoided an error when using matplotlib backends that cannot render a canvas (e.g. PDF). Changed the install infrastructure to explicitly declare dependencies in a way that pip is aware of. This means that pip install seaborn will now work in an empty environment. Additionally, the dependencies are specified with strict minimal versions. Updated the testing infrastructure to execute tests with pytest (although many individual tests still use nose assertion).
- Publication:
-
Zenodo
- Pub Date:
- DOI:
- Bibcode:
- 2018zndo...1313201W