Who's That Abalone?

    Is it possible to estimate the age of an abalone?

    📖 Background

    Japan has a well-developed seafood market, and abalone farming is a significant part of it. For operational and environmental reasons, estimating the age of abalones when they go to market is an important consideration.

    Determining an abalone's age involves counting the number of rings in a cross-section of the shell through a microscope. Since this method is somewhat cumbersome and complex, you are interested in helping the farmers estimate the age of the abalone using its physical characteristics.

    A crucial design decision is whether to predict the age of a live abalone or to use all of its recorded characteristics, treating it as a unit prepared for the seafood market. This analysis focuses on predictions based on all of the data; predicting age only from measurements that can be taken on a live abalone would be a promising future study.
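The two designs differ only in which columns are offered to a model. A minimal sketch of that split follows, using the column names from the data dictionary in this notebook. Which measurements can actually be taken on a live animal is an assumption made here for illustration (external dimensions and whole weight), and `select_features` is a hypothetical helper, not part of the original analysis.

```python
import pandas as pd

# Column names follow the data dictionary in this notebook.
ALL_FEATURES = ['sex', 'length', 'diameter', 'height',
                'whole_wt', 'shucked_wt', 'viscera_wt', 'shell_wt']

# Assumption (not stated in the source): these can be measured on a live animal.
LIVE_FEATURES = ['length', 'diameter', 'height', 'whole_wt']


def select_features(df: pd.DataFrame, live_only: bool = False) -> pd.DataFrame:
    """Return the predictor columns for the chosen analysis design."""
    columns = LIVE_FEATURES if live_only else ALL_FEATURES
    return df[columns]


# One hypothetical row, for illustration only (not taken from the dataset).
demo = pd.DataFrame([{'sex': 'M', 'length': 0.45, 'diameter': 0.36, 'height': 0.09,
                      'whole_wt': 0.51, 'shucked_wt': 0.22, 'viscera_wt': 0.10,
                      'shell_wt': 0.15, 'rings': 15, 'age': 16.5}])

print(select_features(demo).columns.tolist())
print(select_features(demo, live_only=True).columns.tolist())
```

Neither selection includes `rings` or `age`, since those are what the model is meant to estimate.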

    💾 The data

    The dataset was made from the following historical data (source):

    Abalone characteristics

    Variable      Explanation
    sex           M, F, and I (infant)
    length        longest shell measurement
    diameter      perpendicular to the length
    height        measured with meat in the shell
    whole_wt      whole abalone weight
    shucked_wt    the weight of abalone meat
    viscera_wt    gut-weight
    shell_wt      the weight of the dried shell
    rings         number of rings in a shell cross-section
    age           the age of the abalone: the number of rings + 1.5
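The relationship between `rings` and `age` is a fixed offset, so one column fully determines the other. A small sketch with hypothetical ring counts (not taken from the dataset) shows the conversion:

```python
import pandas as pd

# Hypothetical ring counts, for illustration only.
demo = pd.DataFrame({'rings': [7, 10, 15]})

# Age as defined in the data dictionary: number of rings + 1.5.
demo['age'] = demo['rings'] + 1.5

print(demo)
```

Because `age` is a deterministic function of `rings`, a model that targets `age` should presumably not be given `rings` as a predictor; otherwise it would simply recover the offset.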

    Acknowledgments: Warwick J Nash, Tracy L Sellers, Simon R Talbot, Andrew J Cawthorn, and Wes B Ford (1994) "The Population Biology of Abalone (Haliotis species) in Tasmania. I. Blacklip Abalone (H. rubra) from the North Coast and Islands of Bass Strait", Sea Fisheries Division, Technical Report No. 48 (ISSN 1034-3288).

    Imports and settings
    %%capture
    
    %pip install synthia pyvinecopulib tensorflow seaborn
    import pandas as pd
    import seaborn as sns
    import seaborn.objects as so
    from seaborn import axes_style
    import numpy as np
    import matplotlib as mpl
    import matplotlib.pyplot as plt
    import synthia as syn
    import pyvinecopulib as pv
    
    print(sns.__version__)
    import os
    os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
    
    # Render plots inline and apply a whitegrid-based Matplotlib style
    %matplotlib inline
    rc_params = {**axes_style('whitegrid'),
                 'legend.markerscale': 3,
                 'grid.linestyle': ':',
                 'axes.spines.top': False,
                 'axes.spines.right': False}
    mpl.rcParams.update(rc_params)
    # mpl.cm.get_cmap was deprecated in Matplotlib 3.7 and removed in 3.9
    cmap = mpl.colormaps['plasma']

    # Make NumPy and pandas printouts easier to read
    np.set_printoptions(precision=2, suppress=True)
    pd.set_option('display.precision', 2)
    pd.set_option('display.float_format', lambda x: '%.2f' % x)
    from tensorflow.keras.models import Model
    from tensorflow.keras.layers import (Input, Dense, Concatenate, 
                                         Embedding, Flatten, Normalization)
    from tensorflow.keras.utils import to_categorical
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import Lasso
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler
    from sklearn.metrics import mean_squared_error, r2_score
    Read and process the dataset
    abalone = pd.read_csv('./data/abalone.csv', 
                          dtype={'sex': 'category'})
    abalone.info()
    abalone.sample(n=10)
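The `dtype={'sex': 'category'}` argument passed to `pd.read_csv` stores `sex` as a pandas categorical rather than as plain strings. The sketch below illustrates the effect on a tiny in-memory CSV; the rows are hypothetical and stand in for `./data/abalone.csv`, which is not reproduced here.

```python
import io

import pandas as pd

# Hypothetical rows, for illustration only (not taken from the dataset).
csv_text = "sex,length\nM,0.45\nF,0.53\nI,0.33\nM,0.44\n"

demo = pd.read_csv(io.StringIO(csv_text), dtype={'sex': 'category'})

# The column is categorical; its categories are the distinct labels observed.
print(demo['sex'].dtype)
print(sorted(demo['sex'].cat.categories))

# Integer codes, one per row, indexing into the categories.
print(demo['sex'].cat.codes.tolist())
```

The integer codes are one convenient form for later encoding steps, although how the original analysis encodes `sex` is not shown in this section.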

    Data exploration

    abalone.describe().T