Bruno Gonçalves/

Markowitz & Tech Stocks


Markowitz & Tech Stocks

This notebook simulates markowitz portfolio for every semester using the stocks prices from the last semester. It includes 3 parts:

  • Data Preparation
  • Portfolio Simulation
  • Data Visualization

Part of the code is from:

Feedbacks are appreciated.

Data Preparation

Importing libraries and Datasets

# libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import math

# Datasets
df1 = pd.read_csv("data/AAPL.csv")
df2 = pd.read_csv("data/AMZN.csv")
df3 = pd.read_csv("data/BABA.csv")
df4 = pd.read_csv("data/CRM.csv")
df5 = pd.read_csv("data/FB.csv")
df6 = pd.read_csv("data/GOOG.csv")
df7 = pd.read_csv("data/INTC.csv")
df8 = pd.read_csv("data/MSFT.csv")
df9 = pd.read_csv("data/NVDA.csv")
df10 = pd.read_csv("data/TSLA.csv")

Checking datasets


For our simulation we'll use only Close prices. We can see that's two columns for this price. The "nominal" one and other that is adjusted for events such as split and/or reverse split. Another observation is the period of time for each stock.

To keep it simple, I'll select stocks that have no splits and have the same period of time.

To check if there was splits, I'll compare close and adjusted close prices from the earliest date possible. If its different, than there was a split.

let's build a table with the following columns:

  • Earliest date;
  • Total return for each stock;
  • Binary column telling if there was any split (1 = yes, 0 = no).
frames = [df1, df2, df3, df4, df5, df6, df7, df8, df9, df10]
stocks =["AAPL", "AMZN", "BABA", "CRM", "FB", "GOOG", "INTC", "MSFT", "NVDA","TSLA"]

dates = []
returns = []
splits = []

for i in range(len(frames)):
    parcial = frames[i]    
    date = parcial.iloc[0][0]
    return_ = round(100*((parcial.iloc[-1][4]/parcial.iloc[0][4])-1),2)
    return_ = str(return_) + str('%')
    parcial.sort_values(by=['Date'], inplace=True, ascending=False)
    close = parcial.iloc[-1][4]
    adj_close = parcial.iloc[-1][5]
    if close != adj_close:
        split = 1
        split = 0

dfs = []
for df in frames:
    df = df.round(2)
summary = pd.DataFrame(list(zip(stocks, dates, returns, splits)),
               columns =['Stock', 'Initial Date','Returns', 'Splits'])

summary.sort_values(by=['Initial Date'], inplace=True, ascending=False)


Dataframe adjustments

We can see that only 3 stocks satisfy the two conditions:

  • Amazon (AMZN)
  • Google (GOOG)
  • Salesforce (CRM)

These 3 stocks have the same period of time, as we can see the earliest date is the same for each one. But let's check if there isn't any missing date (e.g: let's say we have prices for AMZN in 2015-01-04, but not for GOOG).

# Adjusting dataframes
df2 = df2[['Date','Close']]

df4 = df4[['Date','Close']]

df6 = df6[['Date','Close']]

# Merge Dataframes
final = df2.merge(df4, how='outer', on='Date')
final = final.merge(df6, how='outer', on='Date')

# Check for null values

There's no missing date. I create new dataframes for the 3 stocks with only date and close price. Let me add a column for the semester.

sel_stocks1 = [df2,df4,df6]

for i in range(len(sel_stocks1)):
    parcial = sel_stocks1[i]
    # Column for date
    parcial['Date'] = pd.to_datetime(df2['Date'])
    parcial.sort_values(by=['Date'], inplace=True, ascending=True)
    # Column for Semester
    parcial['Semester'] = parcial['Date'].dt.year.astype(str) + '-S'+ np.where(parcial['Date'],2,1).astype(str)
    # Round close prices
    parcial['Close'] = parcial['Close'].round(2)
# Let's drop the semester (2021-S2) as we only have few days of it
sel_stocks = []
for i in range(len(sel_stocks1)):
    parcial = sel_stocks1[i]
    parcial = parcial[parcial['Semester']!='2021-S2']

check = sel_stocks[1]

Data Visualization

Let's check how the stocks performed over the time. To do so I'll normalize prices such as all 3 stocks will start at price 100.

  • AI Chat
  • Code