ireapps · zstumgoren · Mar 9, 2024 · Mar 8, 2024 · Mar 8, 2024 · Mar 8, 2024
diff --git a/completed/filter_csv_notebook_complete.ipynb b/completed/filter_csv_notebook_complete.ipynb
@@ -11,25 +11,21 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "We're going to use built-in Python modules - programs really - to download a csv file from the Internet and save it locally.\n",
+    "So we know how to [download a CSV from the Interent and read the data](read_csv_notebook_complete.ipynb) using the `csv` library.\n",
     "\n",
-    "CSV stands for comma-separated values. It's a common file format a file format that resembles a spreadsheet or database table in a text file.\n",
+    "Now we're ready to do something a bit more useful using these libraries, combined with the basics we learned on Day 1.\n",
     "\n",
-    "So first, let's import two built-in Python modules: urllib and csv. \n",
-    "\n",
-    "* ```urllib``` is a module that allows Python to make http requests to URLs on the web to fetch HTML. It contains a submodule called request. And inside there we want a specific method called urlretrieve\n",
-    "\n",
-    "* ```csv``` is a module that helps Python work with tabular data extracted from spreadsheets and databases"
+    "Once again, we'll start by importing the library code for downloading a file and processing a CSV:"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 87,
+   "execution_count": 22,
    "metadata": {},
    "outputs": [],
    "source": [
-    "from urllib.request import urlretrieve\n",
-    "import csv"
+    "import csv\n",
+    "from urllib.request import urlretrieve"
    ]
   },
   {
@@ -41,7 +37,7 @@
   },
   {
    "cell_type": "code",
-   "execution_count": 88,
+   "execution_count": 23,
    "metadata": {},
    "outputs": [],
    "source": [
@@ -52,31 +48,24 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "Now we need a URL to a CSV file out on the Internet.\n",
-    "\n",
-    "For this project we're going to download a CSV file that the [FDIC](https://www.fdic.gov/bank/individual/failed/banklist.html) compiles of all the banks that have failed since October 1, 2000.\n",
-    "\n",
-    "The file we want is at https://s3.amazonaws.com/datanicar/banklist.csv.\n",
+    "Now we can download the file using `urlretrieve`. Don't forget that it takes two arguments:\n",
     "\n",
-    "If the internet is uncooperative, we can also use the local version of the file in the ```project1/data/``` directory, and structure out code a little differently.\n",
-    "\n",
-    "To do this, we use that program within the ```urllib``` module to download the file and save it to our project folder. It's called ```urlretrieve``` and for our purposes starting out think of it as a way to download a file from the Internet.\n",
-    "\n",
-    "`urlretrieve` takes two arguments to download a file. First specify our target URL, and then we give it a name for the file we want to create."
+    "- the URL\n",
+    "- the name we'll give the file when we save it locally"
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 89,
+   "execution_count": 24,
    "metadata": {},
    "outputs": [
     {
      "data": {
       "text/plain": [
-       "('banklist.csv', <http.client.HTTPMessage at 0x110c740f0>)"
+       "('banklist.csv', <http.client.HTTPMessage at 0x114bb5910>)"
       ]
      },
-     "execution_count": 89,
+     "execution_count": 24,
      "metadata": {},
      "output_type": "execute_result"
     }
@@ -89,216 +78,118 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "The output shows we successfully downloaded the file and saved it\n",
+    "The output shows we successfully downloaded the file and saved it.\n",
+    "\n",
+    "For this exercise, we'll:\n",
     "\n",
-    "Let's open a new file so we can filter just the data we want. We add the newline parameter when we open the file to write so it doesn't add [additional, blank rows on Windows machines](https://stackoverflow.com/questions/3348460/csv-file-written-with-python-has-blank-lines-between-each-row/3348664)."
+    "- read the data\n",
+    "- filter the data for banks in California and save them in a list\n",
+    "- save the California banks list to a new CSV file\n",
+    "\n",
+    "> NOTE: Below, we use the newline argument when we open the file to write so it doesn't add [additional, blank rows on Windows machines](https://stackoverflow.com/questions/3348460/csv-file-written-with-python-has-blank-lines-between-each-row/3348664)."
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 90,
+   "execution_count": 52,
    "metadata": {},
    "outputs": [],
    "source": [
-    "filtered_file = open('california_banks.csv', 'w', newline='')"
+    "ca_banks = []"
    ]
   },
   {
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "We will use the writer method to write data to a file by passing in the name of the new file as the first argument and delimiter as the the second.\n",
+    "We'll use the `csv` library's [writer](https://docs.python.org/3/library/csv.html#csv.writer) functionality to...well...write data to a file by passing in the name of the new file as the first argument and delimiter as the second.\n",
     "\n",
-    "Then we will go ahead and use python's csv reader to open the file and see what is inside.\n",
+    "Then we'll use `reader` to open the file and see what is inside.\n",
     "\n",
     "We specify the name of the file we just created, and we add a setting so we can open and read almost any CSV file."
    ]
   },
   {
    "cell_type": "code",
-   "execution_count": 91,
+   "execution_count": 53,
    "metadata": {},
-   "outputs": [
-    {
-     "name": "stdout",
-     "output_type": "stream",
-     "text": [
-      "['Bank Name', 'City', 'ST', 'CERT', 'Acquiring Institution', 'Closing Date', 'Updated Date']\n",
-      "['Frontier Bank, FSB D/B/A El Paseo Bank', 'Palm Desert', 'CA', '34738', 'Bank of Southern California, N.A.', '7-Nov-14', '10-Nov-16']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['Palm Desert National Bank', 'Palm Desert', 'CA', '23632', 'Pacific Premier Bank', '27-Apr-12', '7-Dec-15']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['Citizens Bank of Northern California', 'Nevada City', 'CA', '33983', 'Tri Counties Bank', '23-Sep-11', '7-Jan-18']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['San Luis Trust Bank, FSB', 'San Luis Obispo', 'CA', '34783', 'First California Bank', '18-Feb-11', '12-Sep-16']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['Charter Oak Bank', 'Napa', 'CA', '57855', 'Bank of Marin', '18-Feb-11', '29-Jan-19']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['Canyon National Bank', 'Palm Springs', 'CA', '34692', 'Pacific Premier Bank', '11-Feb-11', '19-Aug-14']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['First Vietnamese American Bank', 'Westminster', 'CA', '57885', 'Grandpoint Bank', '5-Nov-10', '29-Jan-19']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['Western Commercial Bank', 'Woodland Hills', 'CA', '58087', 'First California Bank', '5-Nov-10', '12-Sep-16']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['Sonoma Valley Bank', 'Sonoma', 'CA', '27259', 'Westamerica Bank', '20-Aug-10', '8-Aug-18']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['Los Padres Bank', 'Solvang', 'CA', '32165', 'Pacific Western Bank', '20-Aug-10', '29-Jan-19']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['Butte Community Bank', 'Chico', 'CA', '33219', 'Rabobank, N.A.', '20-Aug-10', '29-Jan-19']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['Pacific State Bank', 'Stockton', 'CA', '27090', 'Rabobank, N.A.', '20-Aug-10', '29-Jan-19']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['Granite Community Bank, NA', 'Granite Bay', 'CA', '57315', 'Tri Counties Bank', '28-May-10', '7-Sep-17']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['1st Pacific Bank of California', 'San Diego', 'CA', '35517', 'City National Bank', '7-May-10', '31-Jan-19']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['Tamalpais Bank', 'San Rafael', 'CA', '33493', 'Union Bank, N.A.', '16-Apr-10', '31-Jan-19']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['Innovative Bank', 'Oakland', 'CA', '23876', 'Center Bank', '16-Apr-10', '31-Jan-19']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['La Jolla Bank, FSB', 'La Jolla', 'CA', '32423', 'OneWest Bank, FSB', '19-Feb-10', '31-Jan-19']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['First Regional Bank', 'Los Angeles', 'CA', '23011', 'First-Citizens Bank & Trust Company', '29-Jan-10', '31-Jan-19']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['First Federal Bank of California, F.S.B.', 'Santa Monica', 'CA', '28536', 'OneWest Bank, FSB', '18-Dec-09', '1-Feb-19']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['Imperial Capital Bank', 'La Jolla', 'CA', '26348', 'City National Bank', '18-Dec-09', '1-Feb-19']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['Pacific Coast National Bank', 'San Clemente', 'CA', '57914', 'Sunwest Bank', '13-Nov-09', '10-Apr-17']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['United Commercial Bank', 'San Francisco', 'CA', '32469', 'East West Bank', '6-Nov-09', '1-Feb-19']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['Pacific National Bank', 'San Francisco', 'CA', '30006', 'U.S. Bank N.A.', '30-Oct-09', '1-Feb-19']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['California National Bank', 'Los Angeles', 'CA', '34659', 'U.S. Bank N.A.', '30-Oct-09', '1-Feb-19']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['San Diego National Bank', 'San Diego', 'CA', '23594', 'U.S. Bank N.A.', '30-Oct-09', '1-Feb-19']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['San Joaquin Bank', 'Bakersfield', 'CA', '23266', 'Citizens Business Bank', '16-Oct-09', '1-Feb-19']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['Affinity Bank', 'Ventura', 'CA', '27197', 'Pacific Western Bank', '28-Aug-09', '1-Feb-19']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['Temecula Valley Bank', 'Temecula', 'CA', '34341', 'First-Citizens Bank & Trust Company', '17-Jul-09', '20-Oct-16']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['Vineyard Bank', 'Rancho Cucamonga', 'CA', '23556', 'California Bank & Trust', '17-Jul-09', '14-Sep-18']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['Mirae Bank', 'Los Angeles', 'CA', '57332', 'Wilshire State Bank', '26-Jun-09', '21-Feb-18']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['MetroPacific Bank', 'Irvine', 'CA', '57893', 'Sunwest Bank', '26-Jun-09', '5-Feb-15']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['First Bank of Beverly Hills', 'Calabasas', 'CA', '32069', 'No Acquirer', '24-Apr-09', '31-Jan-19']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['County Bank', 'Merced', 'CA', '22574', 'Westamerica Bank', '6-Feb-09', '1-Feb-19']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['Alliance Bank', 'Culver City', 'CA', '23124', 'California Bank & Trust', '6-Feb-09', '8-Aug-18']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['1st Centennial Bank', 'Redlands', 'CA', '33025', 'First California Bank', '23-Jan-09', '1-Feb-19']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['PFF Bank & Trust', 'Pomona', 'CA', '28344', 'U.S. Bank, N.A.', '21-Nov-08', '31-Jan-19']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['Downey Savings & Loan', 'Newport Beach', 'CA', '30968', 'U.S. Bank, N.A.', '21-Nov-08', '1-Feb-19']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['Security Pacific Bank', 'Los Angeles', 'CA', '23595', 'Pacific Western Bank', '7-Nov-08', '1-Feb-19']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['First Heritage Bank, NA', 'Newport Beach', 'CA', '57961', 'Mutual of Omaha Bank', '25-Jul-08', '12-Sep-16']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['IndyMac Bank', 'Pasadena', 'CA', '29730', 'OneWest Bank, FSB', '11-Jul-08', '1-Feb-19']\n",
-      "<class 'list'>\n",
-      "7\n",
-      "['Southern Pacific Bank', 'Torrance', 'CA', '27094', 'Beal Bank', '7-Feb-03', '20-Oct-08']\n",
-      "<class 'list'>\n",
-      "7\n"
-     ]
-    }
-   ],
+   "outputs": [],
    "source": [
-    "# create our output\n",
-    "output = csv.writer(filtered_file, delimiter=',')\n",
-    "\n",
     "# open our downloaded file\n",
     "with open(downloaded_file, 'r') as file:\n",
-    "    \n",
-    "    # use python's csv reader to access the contents\n",
-    "    # and create an object that represents the data\n",
-    "    csv_data = csv.reader(file)\n",
-    "    \n",
-    "    # write our header row to the output csv\n",
-    "    header_row = next(csv_data)\n",
-    "    print(header_row)    \n",
-    "    output.writerow(header_row)\n",
-    "    \n",
-    "    # loop through each row of the csv\n",
-    "    for row in csv_data:\n",
     "\n",
-    "        # now we're going to use an IF statement\n",
-    "        # to find items where the state field\n",
-    "        # is equal to California\n",
-    "        if row[2] == 'CA':\n",
-    "            \n",
-    "            # write the row to the new csv file\n",
-    "            output.writerow(row)          \n",
-    "        \n",
-    "            # and print the row to the terminal\n",
-    "            print(row)\n",
+    "    # Create a counter\n",
+    "    count = 0\n",
     "\n",
-    "            # print the data type to the terminal\n",
-    "            print(type(row))\n",
-    "\n",
-    "            # print the length of the row to the terminal\n",
-    "            print(len(row))            \n",
-    "            \n",
-    "        # otherwise continue on\n",
-    "        else:\n",
-    "            continue\n",
-    "            \n",
-    "# close the output file\n",
-    "filtered_file.close()"
+    "    # Read the data\n",
+    "    reader = csv.reader(file)\n",
+    "    \n",
+    "    import pdb\n",
+    "    # loop through each row of the csv\n",
+    "    for row in reader:\n",
+    "        # Save the header row\n",
+    "        if count == 0:\n",
+    "            ca_banks.append(row)\n",
+    "\n",
+    "        # Save the row if it's a CA bank\n",
+    "        if row[2].strip().upper() == 'CA':\n",
+    "            # Increment the count\n",
+    "            count += 1\n",
+    "            # Save the row to a our list\n",
+    "            ca_banks.append(row)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "How do we find out how many banks failed in California?"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 54,
+   "metadata": {},
+   "outputs": [
+    {
+     "data": {
+      "text/plain": [
+       "65"
+      ]
+     },
+     "execution_count": 54,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "len(ca_banks)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "And we can create a new CSV with just the CA banks."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 55,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "with open('ca_banks.csv', 'w') as output:\n",
+    "    writer = csv.writer(output)\n",
+    "    for row in ca_banks:\n",
+    "        writer.writerow(row)"
    ]
   }
  ],
  "metadata": {
   "anaconda-cloud": {},
   "kernelspec": {
-   "display_name": "Python 3",
+   "display_name": "Python 3 (ipykernel)",
    "language": "python",
    "name": "python3"
   },
@@ -312,9 +203,9 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.7.0"
+   "version": "3.11.8"
   }
  },
  "nbformat": 4,
- "nbformat_minor": 1
+ "nbformat_minor": 4
 }