{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "7mRfQA20DAot" }, "source": [ "## Purpose of this Notebook\n", "\n", "The purpose of this notebook is to provide you with an example solution to the Guided Project for the Natural Language Processing for Deep Learning course. Since the choice of data preparation, preprocessing, and modelling strategy is up to you, your results may be slightly different. Use this solution as a guide on how you could structure your own approach." ] }, { "cell_type": "markdown", "metadata": { "id": "WtJe4zVyDAot" }, "source": [ "## Loading the libraries and the Data" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2023-06-04T14:09:50.107771Z", "iopub.status.busy": "2023-06-04T14:09:50.107316Z", "iopub.status.idle": "2023-06-04T14:09:50.702941Z", "shell.execute_reply": "2023-06-04T14:09:50.702035Z", "shell.execute_reply.started": "2023-06-04T14:09:50.107739Z" }, "id": "wUVP4rMly-L0" }, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "import seaborn as sns\n", "import matplotlib.pyplot as plt\n", "import tensorflow as tf\n", "from tensorflow import keras\n", "from tensorflow.keras import layers\n", "from sklearn.model_selection import train_test_split" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2023-06-04T14:09:50.705857Z", "iopub.status.busy": "2023-06-04T14:09:50.704951Z", "iopub.status.idle": "2023-06-04T14:09:50.770556Z", "shell.execute_reply": "2023-06-04T14:09:50.769526Z", "shell.execute_reply.started": "2023-06-04T14:09:50.705815Z" }, "id": "1NVifoD4xzt5", "outputId": "bbebee14-76f3-473b-e005-bfedb36bd22f" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(7613, 5)\n", "Index(['id', 'keyword', 'location', 'text', 'target'], dtype='object')\n" ] } ], "source": [ "df = pd.read_csv(\"train.csv\")\n", "print(df.shape)\n", "print(df.columns)" ] }, { 
"cell_type": "code", "execution_count": null, "metadata": { "execution": { "iopub.execute_input": "2023-06-04T14:09:50.772234Z", "iopub.status.busy": "2023-06-04T14:09:50.771831Z", "iopub.status.idle": "2023-06-04T14:09:50.791453Z", "shell.execute_reply": "2023-06-04T14:09:50.790593Z", "shell.execute_reply.started": "2023-06-04T14:09:50.772204Z" }, "id": "XJNo5rl4yDjH", "outputId": "2e463a95-bafd-42a4-e6d4-0faa592a7f1a" }, "outputs": [ { "data": { "text/html": [ "
 | id | keyword | location | text | target |
---|---|---|---|---|---|
0 | 1 | NaN | NaN | Our Deeds are the Reason of this #earthquake M... | 1 |
1 | 4 | NaN | NaN | Forest fire near La Ronge Sask. Canada | 1 |
2 | 5 | NaN | NaN | All residents asked to 'shelter in place' are ... | 1 |
3 | 6 | NaN | NaN | 13,000 people receive #wildfires evacuation or... | 1 |
4 | 7 | NaN | NaN | Just got sent this photo from Ruby #Alaska as ... | 1 |
 | id | keyword | location | text | target |
---|---|---|---|---|---|
7608 | 10869 | NaN | NaN | Two giant cranes holding a bridge collapse int... | 1 |
7609 | 10870 | NaN | NaN | @aria_ahrary @TheTawniest The out of control w... | 1 |
7610 | 10871 | NaN | NaN | M1.94 [01:04 UTC]?5km S of Volcano Hawaii. htt... | 1 |
7611 | 10872 | NaN | NaN | Police investigating after an e-bike collided ... | 1 |
7612 | 10873 | NaN | NaN | The Latest: More Homes Razed by Northern Calif... | 1 |
 | text | target | text_tokenized | text_stop | text_lemmatized | final |
---|---|---|---|---|---|---|
0 | our deeds are the reason of this earthquake m... | 1 | [our, deeds, are, the, reason, of, this, earth... | [deeds, reason, earthquake, may, allah, forgiv... | [deeds, reason, earthquake, may, allah, forgiv... | deeds reason earthquake may allah forgive us |
1 | forest fire near la ronge sask canada | 1 | [forest, fire, near, la, ronge, sask, canada] | [forest, fire, near, la, ronge, sask, canada] | [forest, fire, near, la, ronge, sask, canada] | forest fire near la ronge sask canada |
2 | all residents asked to shelter in place are ... | 1 | [all, residents, asked, to, shelter, in, place... | [residents, asked, shelter, place, notified, o... | [residents, ask, shelter, place, notify, offic... | residents ask shelter place notify officer eva... |
3 | people receive wildfires evacuation or... | 1 | [people, receive, wildfires, evacuation, order... | [people, receive, wildfires, evacuation, order... | [people, receive, wildfires, evacuation, order... | people receive wildfires evacuation order cali... |
4 | just got sent this photo from ruby alaska as ... | 1 | [just, got, sent, this, photo, from, ruby, ala... | [got, sent, photo, ruby, alaska, smoke, wildfi... | [get, send, photo, ruby, alaska, smoke, wildfi... | get send photo ruby alaska smoke wildfires pou... |