Introduction to Deep Learning in Python
Run the hidden code cell below to import the data used in this course.
# Import pandas
import pandas as pd
# Import the course datasets
wages = pd.read_csv('datasets/hourly_wages.csv')
mnist = pd.read_csv('datasets/mnist.csv')
titanic = pd.read_csv('datasets/titanic_all_numeric.csv')
Take Notes
model_output = model.predict(inputs)
Back propagation
Calculating the slope associated with any weight: the gradient for a weight is the product of
- the node value feeding into that weight
- the slope of the activation function for the node being fed into (1 in this case)
- the slope of the loss function w.r.t output node
# For comparison: the slope of a sigmoid activation (the ReLU slope used in this example is 1)
d_activation = activation * (1 - activation)
Slope for the weight from the top input node into the top node: input node value = 0, slope of the output node = 6, slope of the ReLU function = 1
0 * 6 * 1 = 0
Slope for the weight from the down input node into the top node: input node value = 1, slope of the output node = 6, slope of the ReLU function = 1
1 * 6 * 1 = 6
Slope for the weight from the top input node into the down node: input node value = 0, slope of the output node = 18, slope of the ReLU function = 1
0 * 18 * 1 = 0
Slope for the weight from the down input node into the down node: input node value = 1, slope of the output node = 18, slope of the ReLU function = 1
1 * 18 * 1 = 18
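The four products above can be checked with a few lines of Python. This is only a minimal sketch: the function name `weight_slope`, the dictionary names, and the "top"/"down" labels are illustrative assumptions, not part of the course code.

```python
def weight_slope(input_value, node_slope, activation_slope=1):
    """Slope for one weight: the product of the three factors listed above."""
    return input_value * node_slope * activation_slope

# Values taken from the worked example: input node values and the slopes
# at the two nodes being fed into. The ReLU slope is 1 in every case.
input_values = {"top": 0, "down": 1}
node_slopes = {"top": 6, "down": 18}

# One slope per (input node, receiving node) pair.
slopes = {
    (src, dst): weight_slope(input_values[src], node_slopes[dst])
    for dst in node_slopes
    for src in input_values
}

for (src, dst), slope in slopes.items():
    print(f"{src} input -> {dst} node: slope = {slope}")
```

Because the top input node has the value 0, both weights leaving it get a slope of 0 and would not change in this update step.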
Model recap
Backpropagation:
- start with random weights
- use forward propagation to make a prediction
- use backpropagation to calculate the slope of the loss function w.r.t. each weight
- multiply the slope by the learning rate, and subtract the result from the current weights
- repeat the cycle until the loss reaches a flat part (the slope is close to zero)
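The cycle above can be sketched for the simplest possible case: a model with a single weight and a mean squared error loss. The toy data, the starting weight, and the learning rate are assumptions chosen for illustration. The slope is worked out by hand here, where a real network would use backpropagation.

```python
# Toy data generated from y = 2 * x, so the ideal weight is 2.0.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

weight = 0.5           # arbitrary starting weight (normally random)
learning_rate = 0.01

def mean_squared_error(w):
    """Loss of the one-weight model y = w * x on the toy data."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

initial_loss = mean_squared_error(weight)

for step in range(200):
    # Forward propagation: predictions with the current weight.
    predictions = [weight * x for x in xs]
    # Slope of the mean squared error with respect to the weight.
    slope = sum(2 * (p - y) * x for p, y, x in zip(predictions, ys, xs)) / len(xs)
    # Update: subtract the learning rate times the slope.
    weight -= learning_rate * slope

final_loss = mean_squared_error(weight)
print(f"weight = {weight:.4f}, loss went from {initial_loss:.4f} to {final_loss:.2e}")
```

With this learning rate the weight moves steadily toward 2.0, and the updates shrink as the slope flattens out.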
Stochastic gradient descent
For computational efficiency, it is recommended to calculate the slope on only a subset of the data, called a batch.
Then use a different batch to calculate the update in the next cycle.
Once all the data has been used, start over from the beginning.
Each time through the training data is called an epoch.
This is called stochastic gradient descent.
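A minimal sketch of batches and epochs, again using a one-weight model so the update is easy to follow. The data, batch size, learning rate, and number of epochs are assumptions made for illustration. Shuffling before each epoch is a common choice, not something stated in the notes above.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Toy data generated from y = 3 * x, so the ideal weight is 3.0.
data = [(x / 10, 3 * x / 10) for x in range(1, 21)]

weight = 0.0
learning_rate = 0.1
batch_size = 5
n_epochs = 50
updates = 0

for epoch in range(n_epochs):
    random.shuffle(data)  # different batch composition in each epoch
    # One epoch: walk through all the data, one batch at a time.
    for start in range(0, len(data), batch_size):
        batch = data[start:start + batch_size]
        # Slope of the mean squared error, computed on this batch only.
        slope = sum(2 * (weight * x - y) * x for x, y in batch) / len(batch)
        weight -= learning_rate * slope
        updates += 1

print(f"weight after {n_epochs} epochs and {updates} updates: {weight:.4f}")
```

Each epoch performs four weight updates here (20 rows in batches of 5), where full-batch gradient descent would perform one.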
Introduction to Keras models
ACFP {Architecture, compile, fit, predict}
Building a Keras model
- Specify the architecture
  - how many layers do you want
  - how many nodes in each layer
  - what activation function do you want to use in each layer
- Compile the model
  - this specifies the loss function
  - and how the optimization works
- Fit the model
  - a cycle of backpropagation and optimization of the model weights with your data
- Predict
# Import the required libraries: NumPy to read the data, Keras to build the layers
import numpy as np
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
# Read the data and store it in the predictors variable
# Note: loadtxt reads every column; if the file also holds the target column, split it off before computing n_cols
predictors = np.loadtxt('datasets/hourly_wages.csv', delimiter=',', skiprows=1)  # skip the header row
# n_cols is the number of nodes in the input layer, i.e. the number of columns in the data set
n_cols = predictors.shape[1]
# Build the model. Sequential is the simplest option: each layer connects only to the layer directly after it
# Dense layers connect every node to every node in the previous layer
model = Sequential()
model.add(Dense(100, activation='relu', input_shape=(n_cols,)))
model.add(Dense(100, activation='relu'))
model.add(Dense(1))  # output layer: a single node for the one predicted value
# Importing the required libraries
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
# Define the model architecture
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# Load and preprocess the data
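# image_dataset_from_directory expects a directory with one subfolder of images per class,
# not a CSV file such as datasets/titanic_all_numeric.csv.
# The two paths below are placeholders; replace them with real image directories.
train_data = tf.keras.preprocessing.image_dataset_from_directory(
    'path/to/train_images',
    labels='inferred',
    label_mode='binary',
    image_size=(64, 64),
    batch_size=32
)
test_data = tf.keras.preprocessing.image_dataset_from_directory(
    'path/to/test_images',
    labels='inferred',
    label_mode='binary',
    image_size=(64, 64),
    batch_size=32
)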
# Train the model
model.fit(train_data, epochs=10)
# Evaluate the model
model.evaluate(test_data)
# Importing the required libraries
import pandas as pd
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
# Load the CSV data using pandas
data = pd.read_csv('path/to/data.csv')
# Split the data into features and target
X = data.drop('target_column', axis=1)
y = data['target_column']
# Define the model architecture
model = Sequential()
model.add(Dense(64, activation='relu', input_shape=(X.shape[1],)))
model.add(Dense(32, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# Train the model
model.fit(X, y, epochs=10)
The Conv2D-based code earlier in these notes defines a convolutional neural network (CNN) using the Keras API in TensorFlow. Line by line:
- model = Sequential(): creates an empty sequential model, to which layers are added one after another.
- model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3)))
  - Conv2D(32, (3, 3)): adds a 2D convolutional layer with 32 filters (kernels) of size 3x3. The number of filters sets the depth (number of channels) of the layer's output, and the filter size sets the spatial extent of each filter.
  - activation='relu': uses the ReLU (Rectified Linear Unit) activation function for this layer, which introduces non-linearity into the model.
  - input_shape=(64, 64, 3): the shape of each input the model expects. It is a 3D tensor of 64x64x3, where 64x64 is the height and width of the input image and 3 is the number of color channels (RGB).
- model.add(MaxPooling2D((2, 2))): adds a max-pooling layer with a pool size of 2x2. Max pooling reduces the spatial dimensions of the feature maps, keeping the strongest responses while reducing computation.
- The following lines repeat the pattern of a convolutional layer followed by a max-pooling layer:
  - model.add(Conv2D(64, (3, 3), activation='relu')): another convolutional layer, with 64 filters and ReLU activation.
  - model.add(MaxPooling2D((2, 2))): another max-pooling layer.
  - The Conv2D(128, (3, 3), activation='relu') layer and the max-pooling layer after it follow the same pattern with 128 filters.
- model.add(Flatten()): adds a flatten layer that converts the stack of 2D feature maps produced by the convolutional layers into a 1D vector. This is necessary before connecting to fully connected (dense) layers.
- model.add(Dense(128, activation='relu')): adds a fully connected (dense) layer with 128 units and ReLU activation. Each unit is connected to every unit in the previous layer.
- model.add(Dense(1, activation='sigmoid')): adds the final dense layer with a single unit and sigmoid activation. This is commonly used in binary classification, where the model outputs a probability between 0 and 1.
In summary, this code defines a CNN architecture for image classification. It starts with convolutional and max-pooling layers to extract features from the input images, followed by flattening and fully connected layers for classification. The final layer uses sigmoid activation to produce binary classification outputs.
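To see how the convolution and pooling layers shrink the feature maps, the output shapes can be traced by hand. The sketch below assumes the Keras defaults that the CNN code relies on: no padding and a stride of 1 for Conv2D, and a stride equal to the pool size for MaxPooling2D. The helper names `conv_output` and `pool_output` are illustrative, not Keras functions.

```python
def conv_output(size, kernel=3):
    """Spatial size after a convolution with no padding and a stride of 1."""
    return size - kernel + 1

def pool_output(size, pool=2):
    """Spatial size after max pooling with a stride equal to the pool size."""
    return size // pool

size = 64    # input images are 64x64
shapes = []  # (layer name, output shape) pairs, in model order

for n_filters in (32, 64, 128):
    size = conv_output(size)
    shapes.append(("Conv2D", (size, size, n_filters)))
    size = pool_output(size)
    shapes.append(("MaxPooling2D", (size, size, n_filters)))

# Flatten turns the last feature maps into one long vector.
flattened = size * size * 128

for name, shape in shapes:
    print(name, shape)
print("Flatten", (flattened,))
```

Under these assumptions the 64x64 input ends up as 6x6 feature maps with 128 channels, so the first dense layer receives a vector of 4608 values. Calling model.summary() on the real model is the direct way to confirm the shapes.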