{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Lecture 9 - Data Visualization with Seaborn" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "[](https://github.com/avakanski/Fall-2024-Applied-Data-Science-with-Python/blob/main/docs/Lectures/Theme_2-Data_Engineering/Lecture_9-Seaborn/Lecture_9-Seaborn.ipynb)\n", "[](https://colab.research.google.com/github/avakanski/Fall-2024-Applied-Data-Science-with-Python/blob/main/docs/Lectures/Theme_2-Data_Engineering/Lecture_9-Seaborn/Lecture_9-Seaborn.ipynb) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": { "id": "6Kmse2vPRh9q" }, "source": [ "**Seaborn** is a Python library for data visualization that provides an interface for drawing informative statistical graphics and exploring data visually. Internally, Seaborn uses Matplotlib to create the plots, so it can be considered a high-level interface to Matplotlib that lets us create and customize plots quickly and easily.\n", "\n", "For more information, please visit the official [website](https://seaborn.pydata.org). Examples of plots created with Seaborn can be found in the [gallery page](https://seaborn.pydata.org/examples/index.html).\n", "\n", "- [9.1 Relational Plots](#9.1-relational-plots)\n", "- [9.2 Distributional Plots](#9.2-distributional-plots)\n", "- [9.3 Categorical Plots](#9.3-categorical-plots)\n", "- [9.4 Regression Plots](#9.4-regression-plots)\n", "- [9.5 Multiple Plots](#9.5-multiple-plots)\n", "- [9.6 Matrix Plots](#9.6-matrix-plots)\n", "- [9.7 Styles, Themes, and Colors](#9.7-styles,-themes,-and-colors)\n", "- [References](#references)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By convention, Seaborn is imported as `sns`. The name Seaborn was originally based on a character named Samuel Norman Seaborn from a television show, and the alias `sns` is based on the character's initials. 
\n", "\n", "To explain the functionality of Seaborn, in this lecture we will use the following four datasets: `titanic`, `fmri`, `tips`, and `flights`, which can be loaded directly as DataFrames from the Seaborn datasets. \n", "\n", "The first few rows of these datasets are shown below. " ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "id": "nvxlkAmwZKyS", "tags": [] }, "outputs": [], "source": [ "# Import libraries\n", "import seaborn as sns\n", "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt\n", "\n", "# Loading the datasets for this lecture\n", "titanic = sns.load_dataset('titanic')\n", "fmri = sns.load_dataset('fmri')\n", "tips = sns.load_dataset('tips')\n", "flights = sns.load_dataset('flights')" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 204 }, "id": "TX88Kfghavd6", "outputId": "22936892-90e7-41ff-b4d6-6d13331246bf", "tags": [] }, "outputs": [ { "data": { "text/html": [ "
First rows of the `titanic` dataset:

| | survived | pclass | sex | age | sibsp | parch | fare | embarked | class | who | adult_male | deck | embark_town | alive | alone |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 3 | male | 22.0 | 1 | 0 | 7.2500 | S | Third | man | True | NaN | Southampton | no | False |
| 1 | 1 | 1 | female | 38.0 | 1 | 0 | 71.2833 | C | First | woman | False | C | Cherbourg | yes | False |
| 2 | 1 | 3 | female | 26.0 | 0 | 0 | 7.9250 | S | Third | woman | False | NaN | Southampton | yes | True |

First rows of the `fmri` dataset:

| | subject | timepoint | event | region | signal |
|---|---|---|---|---|---|
| 0 | s13 | 18 | stim | parietal | -0.017552 |
| 1 | s5 | 14 | stim | parietal | -0.080883 |
| 2 | s12 | 18 | stim | parietal | -0.081033 |

First rows of the `tips` dataset:

| | total_bill | tip | sex | smoker | day | time | size |
|---|---|---|---|---|---|---|---|
| 0 | 16.99 | 1.01 | Female | No | Sun | Dinner | 2 |
| 1 | 10.34 | 1.66 | Male | No | Sun | Dinner | 3 |
| 2 | 21.01 | 3.50 | Male | No | Sun | Dinner | 3 |

First rows of the `flights` dataset:

| | year | month | passengers |
|---|---|---|---|
| 0 | 1949 | Jan | 112 |
| 1 | 1949 | Feb | 118 |
| 2 | 1949 | Mar | 132 |

Correlation matrix of the numeric and boolean columns of `titanic`:

| | survived | pclass | age | sibsp | parch | fare | adult_male | alone |
|---|---|---|---|---|---|---|---|---|
| survived | 1.000000 | -0.338481 | -0.077221 | -0.035322 | 0.081629 | 0.257307 | -0.557080 | -0.203367 |
| pclass | -0.338481 | 1.000000 | -0.369226 | 0.083081 | 0.018443 | -0.549500 | 0.094035 | 0.135207 |
| age | -0.077221 | -0.369226 | 1.000000 | -0.308247 | -0.189119 | 0.096067 | 0.280328 | 0.198270 |
| sibsp | -0.035322 | 0.083081 | -0.308247 | 1.000000 | 0.414838 | 0.159651 | -0.253586 | -0.584471 |
| parch | 0.081629 | 0.018443 | -0.189119 | 0.414838 | 1.000000 | 0.216225 | -0.349943 | -0.583398 |
| fare | 0.257307 | -0.549500 | 0.096067 | 0.159651 | 0.216225 | 1.000000 | -0.182024 | -0.271832 |
| adult_male | -0.557080 | 0.094035 | 0.280328 | -0.253586 | -0.349943 | -0.182024 | 1.000000 | 0.404744 |
| alone | -0.203367 | 0.135207 | 0.198270 | -0.584471 | -0.583398 | -0.271832 | 0.404744 | 1.000000 |

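
A correlation matrix like the one above is typically computed with pandas' `DataFrame.corr`. The sketch below is only an illustration and may differ from the exact call used in the lecture: the `toy` DataFrame and its values are made up as a stand-in for `titanic`, so the example runs without downloading any data. Boolean columns are cast to float so that they enter the correlation as 0/1, and the text column `sex` is excluded.

```python
import pandas as pd

# Hypothetical miniature stand-in for the titanic data (values are made up)
toy = pd.DataFrame({
    "survived": [0, 1, 1, 0, 1, 0],
    "pclass": [3, 1, 3, 3, 1, 2],
    "fare": [7.25, 71.28, 7.92, 8.05, 53.10, 13.00],
    "alone": [False, False, True, True, False, True],
    "sex": ["male", "female", "female", "male", "female", "male"],
})

# Keep numeric and boolean columns; cast to float so booleans count as 0/1
numeric = toy.select_dtypes(include=["number", "bool"]).astype(float)

# Pairwise Pearson correlations: a symmetric matrix with ones on the diagonal
corr = numeric.corr()
print(corr.round(3))
```

The same pattern should apply to the real dataset by replacing `toy` with the `titanic` DataFrame loaded earlier.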