Where are the oldest businesses

    1. The oldest businesses in the world

    This is Staffelter Hof Winery, Germany's oldest business, which was established in 862 under the Carolingian dynasty. It has continued to serve customers through dramatic changes in Europe such as the Holy Roman Empire, the Ottoman Empire, and both world wars. What characteristics enable a business to stand the test of time?

    Image: The entrance to Staffelter Hof Winery, a German winery established in 862. Image credit: Martin Kraft

    To help answer this question, BusinessFinancing.co.uk researched the oldest company still in business in almost every country and compiled the results into a dataset. Let's explore this work to better understand these historic businesses. Our datasets, all located in the datasets directory, contain the following information:

    businesses and new_businesses

    column         type     meaning
    business       varchar  Name of the business.
    year_founded   int      Year the business was founded.
    category_code  varchar  Code for the category of the business.
    country_code   char     ISO 3166-1 3-letter country code.

    countries

    column        type     meaning
    country_code  varchar  ISO 3166-1 3-letter country code.
    country       varchar  Name of the country.
    continent     varchar  Name of the continent that the country exists in.

    categories

    column         type     meaning
    category_code  varchar  Code for the category of the business.
    category       varchar  Description of the business category.

    Now let's learn about some of the world's oldest businesses still in operation!

    # Import the pandas library under its usual alias 
    import pandas as pd
    
    # Load the businesses.csv file as a DataFrame called businesses
    businesses = pd.read_csv("datasets/businesses.csv")
    
    # Sort businesses from oldest businesses to youngest
    sorted_businesses = businesses.sort_values("year_founded")
    
    # Display the first few lines of sorted_businesses
    sorted_businesses.head()

    2. The oldest businesses in North America

    So far we've learned that Kongō Gumi is the world's oldest continuously operating business, beating out the second oldest business by well over 100 years! It's a little hard to read the country codes, though. Wouldn't it be nice if we had a list of country names to go along with the country codes?

    Enter countries.csv, which is also located in the datasets folder. Having useful information in different files is a common problem: for data storage, it's better to keep different types of data separate, but for analysis, we want all the data in one place. To solve this, we'll have to join the two tables together.

    countries

    column        type     meaning
    country_code  varchar  ISO 3166-1 3-letter country code.
    country       varchar  Name of the country.
    continent     varchar  Name of the continent that the country exists in.

    Since countries.csv contains a continent column, merging the datasets will also allow us to look at the oldest business on each continent!

    # Load countries.csv to a DataFrame
    countries = pd.read_csv("datasets/countries.csv")
    
    # Merge sorted_businesses with countries
    businesses_countries = sorted_businesses.merge(countries, on="country_code")
    
    # Filter businesses_countries to include countries in North America only
    north_america = businesses_countries[businesses_countries["continent"]=="North America"]
    north_america.head()
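
The merge above relies on pandas' default inner join, so any business whose country_code has no match in countries would be dropped without warning. The sketch below illustrates that behavior on made-up toy rows; the business names and the "XXX" code are hypothetical placeholders, not values from the dataset.

```python
import pandas as pd

# Toy data: business names and the "XXX" code are hypothetical placeholders
toy_businesses = pd.DataFrame({
    "business": ["Business A", "Business B", "Business C"],
    "year_founded": [1534, 1623, 1700],
    "country_code": ["MEX", "USA", "XXX"],
})
toy_countries = pd.DataFrame({
    "country_code": ["MEX", "USA"],
    "country": ["Mexico", "United States"],
    "continent": ["North America", "North America"],
})

# Default how="inner": the row with the unmatched code "XXX" is dropped
inner = toy_businesses.merge(toy_countries, on="country_code")

# how="left" keeps every business; unmatched rows get NaN for country and continent
left = toy_businesses.merge(toy_countries, on="country_code", how="left")

print(len(inner), len(left))
```

Comparing the row counts before and after a merge is a quick way to check whether any rows were lost.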

    3. The oldest business on each continent

    Now we can see that the oldest company in North America is La Casa de Moneda de México, founded in 1534. Why stop there, though, when we could easily find out the oldest business on every continent?

    # Create continent, which lists only the continent and oldest year_founded
    continent = businesses_countries.groupby("continent").agg({"year_founded":"min"})
    print(continent)
    
    # Merge continent with businesses_countries on both continent and year_founded,
    # so a business founded in the same year on another continent cannot match
    merged_continent = continent.reset_index().merge(businesses_countries, on=["continent", "year_founded"])
    
    # Subset merged_continent so that only the four columns of interest are included
    subset_merged_continent = merged_continent[["continent", "country", "business", "year_founded"]]
    subset_merged_continent
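
Another way to pick the oldest row per group is to look up the index label of each group's minimum with idxmin and select those rows with .loc. This is only a sketch on hypothetical toy data, not the approach used in this notebook; note that idxmin returns just the first row when several rows share the minimum year.

```python
import pandas as pd

# Toy data: all names are hypothetical placeholders
toy = pd.DataFrame({
    "continent": ["Asia", "Asia", "Europe", "Europe"],
    "country": ["Country A", "Country B", "Country C", "Country D"],
    "business": ["Business A", "Business B", "Business C", "Business D"],
    "year_founded": [578, 1200, 803, 862],
})

# Index label of the earliest year_founded within each continent
oldest_idx = toy.groupby("continent")["year_founded"].idxmin()

# Select those rows and keep the four columns of interest
oldest_per_continent = toy.loc[oldest_idx, ["continent", "country", "business", "year_founded"]]
print(oldest_per_continent)
```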

    4. Unknown oldest businesses

    BusinessFinancing.co.uk wasn't able to determine the oldest business for some countries, and those countries are simply left out of businesses.csv and, by extension, businesses. However, the countries DataFrame that we created does include all countries in the world, regardless of whether the oldest business is known.

    We can compare the two datasets in one DataFrame to find out which countries don't have a known oldest business!

    # Use .merge() to create a DataFrame, all_countries
    all_countries = businesses.merge(countries, on="country_code", how="outer")
    
    # Filter to include only countries without oldest businesses
    missing_countries = all_countries[all_countries["business"].isna()]
    
    # Create a series of the country names with missing oldest business data
    missing_countries_series = missing_countries["country"]
    
    # Display the series
    missing_countries_series
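
As an alternative to filtering on .isna(), merge accepts indicator=True, which adds a _merge column recording whether each row came from the left table, the right table, or both. A minimal sketch on hypothetical toy data:

```python
import pandas as pd

# Toy data: names and codes are hypothetical placeholders
toy_businesses = pd.DataFrame({
    "business": ["Business A", "Business B"],
    "country_code": ["AAA", "BBB"],
})
toy_countries = pd.DataFrame({
    "country_code": ["AAA", "BBB", "CCC"],
    "country": ["Country A", "Country B", "Country C"],
})

# indicator=True adds a "_merge" column with values left_only, right_only, or both
merged = toy_businesses.merge(toy_countries, on="country_code", how="outer", indicator=True)

# "right_only" rows exist in toy_countries but have no matching business
missing = merged.loc[merged["_merge"] == "right_only", "country"]
print(missing.tolist())
```

This variant does not depend on the business column being free of genuine missing values, which the .isna() filter implicitly assumes.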

    5. Adding new oldest business data

    It looks like we've got some holes in our dataset! Fortunately, we've taken it upon ourselves to improve upon BusinessFinancing.co.uk's work and find oldest businesses in a few of the missing countries. We've stored the newfound oldest businesses in new_businesses, located at "datasets/new_businesses.csv". It has the exact same structure as our businesses dataset.

    new_businesses

    column         type     meaning
    business       varchar  Name of the business.
    year_founded   int      Year the business was founded.
    category_code  varchar  Code for the category of the business.
    country_code   char     ISO 3166-1 3-letter country code.

    All we have to do is combine the two so that we've got one more complete list of businesses!

    # Import new_businesses.csv
    new_businesses = pd.read_csv("datasets/new_businesses.csv")
    
    # Add the data in new_businesses to the existing businesses
    all_businesses = pd.concat([new_businesses, businesses])
    
    # Merge and filter to find countries with missing business data
    new_all_countries = all_businesses.merge(countries, on="country_code", how="outer")
    new_missing_countries = new_all_countries[new_all_countries["business"].isna()]
    
    # Group by continent and create a "count_missing" column
    count_missing = new_missing_countries.groupby("continent").agg({"country":"count"})
    count_missing.columns = ["count_missing"]
    count_missing
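
One detail worth knowing about pd.concat: it keeps each input's original index labels, so the combined DataFrame can contain duplicate labels. Passing ignore_index=True renumbers the rows instead. A small sketch with hypothetical toy rows:

```python
import pandas as pd

# Toy data: names and codes are hypothetical placeholders
old = pd.DataFrame({"business": ["Business A", "Business B"], "country_code": ["AAA", "BBB"]})
new = pd.DataFrame({"business": ["Business C"], "country_code": ["CCC"]})

# Without ignore_index the original labels are kept, so label 0 appears twice
stacked = pd.concat([new, old])

# With ignore_index=True the result gets a fresh 0..n-1 index
renumbered = pd.concat([new, old], ignore_index=True)

print(stacked.index.tolist(), renumbered.index.tolist())
```

Duplicate labels do not affect the merge and groupby steps above, but they can surprise later label-based lookups with .loc.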

    6. The oldest industries

    Remember our oldest business in the world, Kongō Gumi?

        business    year_founded  category_code  country_code
    64  Kongō Gumi  578           CAT6           JPN

    We know Kongō Gumi was founded in the year 578 in Japan, but it's a little hard to decipher which industry it's in. Information about what the category_code column refers to is in "datasets/categories.csv":

    categories

    column         type     meaning
    category_code  varchar  Code for the category of the business.
    category       varchar  Description of the business category.

    Let's use categories.csv to understand how many oldest businesses are in each category of industry.

    # Import categories.csv and merge to businesses
    categories = pd.read_csv("datasets/categories.csv")
    businesses_categories = businesses.merge(categories, on="category_code")
    
    # Create a DataFrame which lists the number of oldest businesses in each category
    count_business_cats = businesses_categories.groupby("category").agg({"business":"count"})
    
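    # Create a DataFrame which lists the cumulative years businesses from each category have been operating
    # (assumption: years in operation are measured up to the current year; summing year_founded would total the founding years instead)
    years_operating = businesses_categories.assign(years_operating=pd.Timestamp.now().year - businesses_categories["year_founded"])
    years_business_cats = years_operating.groupby("category").agg({"years_operating":"sum"})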
    
    # Rename columns and display the first five rows of both DataFrames
    count_business_cats.columns = ["count"]
    years_business_cats.columns = ["total_years_in_business"]
    display(count_business_cats.head(), years_business_cats.head())
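
The two groupby calls and the column renames can also be written as a single named aggregation. The sketch below uses hypothetical toy data, and the reference_year value is an assumption chosen purely for illustration.

```python
import pandas as pd

# Toy data: business and category names are hypothetical placeholders
toy = pd.DataFrame({
    "business": ["Business A", "Business B", "Business C"],
    "category": ["Category X", "Category X", "Category Y"],
    "year_founded": [1500, 1700, 1800],
})

reference_year = 2000  # assumed reference year, for illustration only
toy["years_in_business"] = reference_year - toy["year_founded"]

# Named aggregation: output_column=(input_column, aggregation function)
summary = toy.groupby("category").agg(
    count=("business", "count"),
    total_years_in_business=("years_in_business", "sum"),
)
print(summary)
```

Keeping both measures in one DataFrame makes it easier to compare categories by count and by cumulative years side by side.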

    7. Restaurant representation

    No matter how we measure it, looks like Banking and Finance is an excellent industry to be in if longevity is our goal! Let's zoom in on another industry: cafés, restaurants, and bars. Which restaurants in our dataset have been around since before the year 1800?

    # Filter using .query() for CAT4 businesses founded before 1800
    old_restaurants = businesses_categories.query('category_code == "CAT4" and year_founded < 1800')
    
    # Sort the DataFrame
    old_restaurants = old_restaurants.sort_values(by="year_founded")
    old_restaurants
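
.query() can also reference Python variables with the @ prefix, which avoids hard-coding the category code and cutoff year inside the query string. A minimal sketch on hypothetical toy data; the variable names code and cutoff are illustrative:

```python
import pandas as pd

# Toy data: business names are hypothetical placeholders
toy = pd.DataFrame({
    "business": ["Business A", "Business B", "Business C"],
    "category_code": ["CAT4", "CAT4", "CAT6"],
    "year_founded": [1750, 1850, 578],
})

code = "CAT4"   # category to keep
cutoff = 1800   # keep businesses founded before this year

# @name refers to a Python variable in the surrounding scope
result = toy.query("category_code == @code and year_founded < @cutoff").sort_values("year_founded")
print(result)
```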

    8. Categories and continents

    St. Peter Stifts Kulinarium is old enough that the restaurant is believed to have served Mozart - and it would have been over 900 years old even when he was a patron! Let's finish by looking at the oldest business in each category of commerce for each continent.
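
One way this final step might look, sketched on hypothetical toy data: group by both continent and category and take the minimum year_founded. With the real data, the input would presumably be a DataFrame that joins businesses with both countries and categories; that combined table is an assumption here, since it is not built above.

```python
import pandas as pd

# Toy data standing in for businesses joined with countries and categories
# (all names are hypothetical placeholders)
toy = pd.DataFrame({
    "continent": ["Asia", "Asia", "Europe", "Europe"],
    "category": ["Category X", "Category X", "Category X", "Category Y"],
    "business": ["Business A", "Business B", "Business C", "Business D"],
    "year_founded": [578, 1300, 862, 803],
})

# Earliest founding year for every (continent, category) pair
oldest_by_continent_category = toy.groupby(["continent", "category"]).agg({"year_founded": "min"})
print(oldest_by_continent_category)
```

The result is indexed by the (continent, category) pair, so a single combination can be looked up with .loc and a tuple.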