Introduction to Deep Learning in Python (copy)

    Introduction to Deep Learning in Python

    Run the hidden code cell below to import the data used in this course.

    # Import pandas
    import pandas as pd
    
    # Import the course datasets 
    wages = pd.read_csv('datasets/hourly_wages.csv')
    mnist = pd.read_csv('datasets/mnist.csv')
    titanic = pd.read_csv('datasets/titanic_all_numeric.csv')

    Take Notes

    Add notes about the concepts you've learned and code cells with code you want to keep.


    model_output = model.predict(inputs)

    Back propagation

    Calculating the slope associated with any weight: the gradient for a weight is the product of

    1. the value of the node feeding into that weight
    2. the slope of the activation function at the node being fed into (1 in this case)
    3. the slope of the loss function w.r.t. the output node

    d_activation = activation * (1 - activation)
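The expression above has the form of the sigmoid derivative, while the worked examples in these notes use an activation slope of 1. A minimal sketch, assuming `activation` holds sigmoid outputs (the `sigmoid` and `relu_slope` helpers are illustrative names, not from the course), checks the formula against a finite-difference estimate and shows the ReLU slope for comparison:

```python
import numpy as np

def sigmoid(x):
    """Sigmoid activation (assumed here; the notes do not name the function)."""
    return 1.0 / (1.0 + np.exp(-x))

def relu_slope(x):
    """Slope of ReLU: 1 for positive inputs, 0 otherwise."""
    return (x > 0).astype(float)

node_input = np.array([-2.0, 0.5, 3.0])

# Slope of the sigmoid written in terms of its own output
activation = sigmoid(node_input)
d_activation = activation * (1 - activation)

# Central finite difference as an independent check
eps = 1e-6
numeric = (sigmoid(node_input + eps) - sigmoid(node_input - eps)) / (2 * eps)

print(d_activation)
print(numeric)
print(relu_slope(node_input))
```

For a node with a positive input, the ReLU slope is 1, which is the value used in the calculations in these notes.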

    The slope for the weight into the top node from the top input node: input node value = 0, slope of the output node = 6, slope of the ReLU function = 1

    0 * 6 * 1 = 0

    The slope for the weight into the top node from the bottom input node: input node value = 1, slope of the output node = 6, slope of the ReLU function = 1

    1 * 6 * 1 = 6

    The slope for the weight into the bottom node from the top input node: input node value = 0, slope of the output node = 18, slope of the ReLU function = 1

    0 * 18 * 1 = 0

    The slope for the weight into the bottom node from the bottom input node: input node value = 1, slope of the output node = 18, slope of the ReLU function = 1

    1 * 18 * 1 = 18
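The four products above can be computed in one step. This is a small sketch, assuming the input node values are 0 and 1 and the slopes at the two receiving nodes are 6 and 18, as in the notes; the variable names are illustrative.

```python
import numpy as np

# Values of the two nodes feeding into the weights (top, bottom)
input_values = np.array([0, 1])

# Slope of the loss at each receiving node (top, bottom)
output_slopes = np.array([6, 18])

# Slope of the ReLU activation at the receiving nodes (1 in this case)
relu_slope = 1

# Entry [i, j] is the slope for the weight from input node i to receiving node j
weight_slopes = np.outer(input_values, output_slopes) * relu_slope
print(weight_slopes)
# [[ 0  0]
#  [ 6 18]]
```

Every weight leaving the input node with value 0 gets a slope of 0, so only the weights from the node with value 1 would change in the next update.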


    Model recap

    Backpropagation:

    • start with random weights
    • use forward propagation to make a prediction
    • use backpropagation to calculate the slope of the loss function w.r.t. each weight
    • multiply the slope by the learning rate, and subtract the result from the current weights
    • repeat the cycle until the loss reaches a flat part of the curve
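The cycle above can be sketched for the simplest possible case. This is a minimal illustration, assuming a single linear output node with a squared-error loss; the data values, learning rate, and helper names (`get_slope`, `get_mse`) are made up for the example.

```python
import numpy as np

def get_slope(input_data, target, weights):
    """Slope of the squared error w.r.t. each weight for one linear node."""
    prediction = (input_data * weights).sum()
    error = prediction - target
    return 2 * input_data * error

def get_mse(input_data, target, weights):
    """Squared error of the prediction for a single observation."""
    prediction = (input_data * weights).sum()
    return (prediction - target) ** 2

input_data = np.array([1.0, 2.0, 3.0])
target = 0.0
learning_rate = 0.01

# Start with random weights
rng = np.random.default_rng(0)
weights = rng.normal(size=3)

mse_history = []
for _ in range(20):
    # Forward propagation and slope calculation
    slope = get_slope(input_data, target, weights)
    # Multiply the slope by the learning rate and subtract from the weights
    weights = weights - learning_rate * slope
    mse_history.append(get_mse(input_data, target, weights))

print(mse_history[0], mse_history[-1])
```

With these values the error shrinks on every pass, so the loss history flattens out as the weights approach a minimum.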

    Stochastic gradient descent

    For computational efficiency, the slope is usually calculated on only a subset of the data, called a batch.

    A different batch is then used to calculate the update in the next cycle.

    Once all the data has been used, we start over from the beginning.

    Each pass through the training data is called an epoch.

    This procedure is called stochastic gradient descent.
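A rough sketch of the batch and epoch loop, assuming a linear model with a mean squared error loss on synthetic data. The data, batch size, learning rate, and the choice to reshuffle the rows at the start of each epoch are assumptions for illustration, not part of the course notes.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data: 200 rows, 3 predictors, noise-free linear target
X = rng.normal(size=(200, 3))
true_weights = np.array([1.5, -2.0, 0.5])
y = X @ true_weights

weights = np.zeros(3)
learning_rate = 0.05
batch_size = 20
n_epochs = 30

for epoch in range(n_epochs):
    # One epoch = one pass through all the training data
    order = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        batch = order[start:start + batch_size]
        # Slope of the mean squared error, computed on this batch only
        error = X[batch] @ weights - y[batch]
        slope = 2 * X[batch].T @ error / len(batch)
        # Update the weights, then move on to a different batch
        weights -= learning_rate * slope

print(weights)
```

Each update uses only 20 rows, so one epoch performs 10 weight updates instead of a single update computed on all 200 rows.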


    Introduction to Keras models

    ACFP {Architecture, compile, fit, predict}

    Building a Keras model

    1. Specify the architecture
    • how many layers do you want
    • how many nodes in each layer
    • what activation functions do you want to use in each layer
    2. Compile the model
    • this specifies the loss function
    • and how the optimization works
    3. Fit the model
    • cycle of backpropagation and optimization of the model weights with your data
    4. Predict
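The four steps can be sketched end to end on a tiny example. This is a minimal illustration, assuming TensorFlow is installed and using synthetic data in place of the course datasets; the layer sizes, optimizer, and number of epochs are arbitrary choices for the sketch.

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

# Synthetic regression data (assumption: stands in for a real dataset)
rng = np.random.default_rng(0)
predictors = rng.normal(size=(64, 4)).astype("float32")
target = predictors.sum(axis=1, keepdims=True)

# 1. Architecture: layers, nodes per layer, activation functions
model = Sequential([
    Input(shape=(predictors.shape[1],)),
    Dense(8, activation="relu"),
    Dense(1),
])

# 2. Compile: choose the loss function and the optimizer
model.compile(optimizer="adam", loss="mean_squared_error")

# 3. Fit: backpropagation and weight updates on the data
model.fit(predictors, target, epochs=2, batch_size=16, verbose=0)

# 4. Predict: one output value per input row
predictions = model.predict(predictors[:5], verbose=0)
print(predictions.shape)
```

The output has one row per input row and one column, matching the single node in the last layer.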
    # Code
    # Import the required libraries: numpy to read the data, Keras to build the layers
    
    import numpy as np
    from tensorflow.keras.layers import Dense
    from tensorflow.keras.models import Sequential
    
    # Read the data and store it in the predictors variable
    # Note: this loads every column in the file; drop the target column first if the file contains it
    # n_cols is the number of nodes in the input layer, i.e. the number of columns in the data set
    
    predictors = np.loadtxt('datasets/hourly_wages.csv', delimiter=',', skiprows=1)  # skip the header row
    n_cols = predictors.shape[1]
    
    # Build the model. Sequential is the easiest way: each layer connects only to the layer coming directly after it
    # Dense layers connect every node in one layer to every node in the next layer
    
    model = Sequential()
    model.add(Dense(100, activation='relu', input_shape=(n_cols,)))
    model.add(Dense(100, activation='relu'))
    model.add(Dense(1))  # the output layer has one node because the model predicts a single value
    
    
    
    
    # Importing the required libraries
    import tensorflow as tf
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
    
    # Define the model architecture
    model = Sequential()
    model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3)))
    model.add(MaxPooling2D((2, 2)))
    model.add(Conv2D(64, (3, 3), activation='relu'))
    model.add(MaxPooling2D((2, 2)))
    model.add(Conv2D(128, (3, 3), activation='relu'))
    model.add(MaxPooling2D((2, 2)))
    model.add(Flatten())
    model.add(Dense(128, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    
    # Compile the model
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    
    # Load and preprocess the data
    # image_dataset_from_directory expects a directory of images with one subfolder per class,
    # not a CSV file; the two paths below are placeholders for such directories
    train_data = tf.keras.preprocessing.image_dataset_from_directory(
        'path/to/train_images',
        labels='inferred',
        label_mode='binary',
        image_size=(64, 64),
        batch_size=32
    )
    test_data = tf.keras.preprocessing.image_dataset_from_directory(
        'path/to/test_images',
        labels='inferred',
        label_mode='binary',
        image_size=(64, 64),
        batch_size=32
    )
    
    # Train the model
    model.fit(train_data, epochs=10)
    
    # Evaluate the model
    model.evaluate(test_data)
    # Importing the required libraries
    import pandas as pd
    import tensorflow as tf
    from tensorflow.keras.models import Sequential
    from tensorflow.keras.layers import Dense
    
    # Load the CSV data using pandas
    data = pd.read_csv('path/to/data.csv')
    
    # Split the data into features and target
    X = data.drop('target_column', axis=1)
    y = data['target_column']
    
    # Define the model architecture
    model = Sequential()
    model.add(Dense(64, activation='relu', input_shape=(X.shape[1],)))
    model.add(Dense(32, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    
    # Compile the model
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    
    # Train the model
    model.fit(X, y, epochs=10)

    The explanation below covers the convolutional neural network (CNN) model defined earlier with Conv2D and MaxPooling2D layers, using the Keras API in TensorFlow. Let's break down each line and parameter:

    1. model = Sequential(): This line creates a sequential model, which allows you to add layers to your neural network sequentially. It initializes an empty model.

    2. model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3))):

      • Conv2D(32, (3, 3)): This line adds a 2D convolutional layer with 32 filters/kernels of size 3x3. The number of filters sets the number of output channels (the depth of the layer's output), and the filter size specifies the spatial dimensions of each filter.
      • activation='relu': The ReLU (Rectified Linear Unit) activation function is used for this layer. It introduces non-linearity into the model.
      • input_shape=(64, 64, 3): This specifies the shape of the input data that the model expects. It's a 3D tensor with dimensions 64x64x3, where 64x64 represents the height and width of the input image, and 3 represents the number of color channels (RGB).
    3. model.add(MaxPooling2D((2, 2))):

      • MaxPooling2D((2, 2)): This line adds a max-pooling layer with a pool size of 2x2. Max-pooling reduces the spatial dimensions of the feature maps, helping to capture the most important information while reducing computation.
    4. The following lines repeat the pattern of adding convolutional and max-pooling layers to the model:

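      • model.add(Conv2D(64, (3, 3), activation='relu')): Another convolutional layer with 64 filters and ReLU activation.
      • model.add(MaxPooling2D((2, 2))): Another max-pooling layer.
      • model.add(Conv2D(128, (3, 3), activation='relu')) followed by model.add(MaxPooling2D((2, 2))): A third convolutional layer with 128 filters and ReLU activation, and a third max-pooling layer.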
    5. model.add(Flatten()): This line adds a flatten layer that converts the 2D feature maps produced by the convolutional layers into a 1D vector. This is necessary before connecting to fully connected (dense) layers.

    6. model.add(Dense(128, activation='relu')):

      • Dense(128, activation='relu'): This line adds a fully connected (dense) layer with 128 units and ReLU activation. It's a densely connected layer, meaning each unit is connected to every unit in the previous layer.
    7. model.add(Dense(1, activation='sigmoid')):

      • Dense(1, activation='sigmoid'): This adds the final dense layer with a single unit and sigmoid activation. This is commonly used in binary classification problems where the model outputs a probability value between 0 and 1.
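To make the max-pooling and flatten steps concrete, here is a small NumPy sketch on a single 4x4 feature map. It is an illustration of the idea only, not how Keras implements these layers, and the input values are made up.

```python
import numpy as np

# A single 4x4 feature map with values 0..15
feature_map = np.arange(16).reshape(4, 4)

# 2x2 max pooling: split into 2x2 blocks and keep the largest value in each
pooled = feature_map.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)
# [[ 5  7]
#  [13 15]]

# Flatten: turn the pooled map into a 1D vector for the dense layers
flattened = pooled.reshape(-1)
print(flattened)
# [ 5  7 13 15]
```

Pooling halves each spatial dimension here (4x4 becomes 2x2), and flattening keeps all remaining values but drops the spatial layout.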

    In summary, this code defines a CNN architecture for image classification. It starts with convolutional and max-pooling layers to extract features from the input images, followed by flattening and fully connected layers for classification. The final layer uses sigmoid activation to produce binary classification outputs.
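The feature-map sizes through the network can be traced with simple arithmetic. This sketch assumes the Keras defaults for the layers as written in the code (no padding and a stride of 1 for Conv2D, and a pooling stride equal to the pool size); the helper names are illustrative.

```python
def conv_out(size, kernel=3):
    """Output height/width of a convolution with no padding and stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output height/width of max pooling with stride equal to the pool size."""
    return size // pool

size = 64  # input images are 64x64
trace = []
for filters in (32, 64, 128):
    size = conv_out(size)
    trace.append(("conv", size, filters))
    size = pool_out(size)
    trace.append(("pool", size, filters))

# Number of values handed to the first Dense layer after Flatten
flattened = size * size * 128

for layer, side, channels in trace:
    print(f"{layer}: {side}x{side}x{channels}")
print("flatten:", flattened)
```

Under these assumptions the spatial size goes 64, 62, 31, 29, 14, 12, 6, so the flatten layer produces a vector of 6 * 6 * 128 = 4608 values.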