Notes for Pandas

    Summary Notes: Data Manipulation with pandas

    1. Inspecting a DataFrame:

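    • .head(): Returns the first rows of the DataFrame (five by default; pass n to change this).
    • .info(): Prints each column's name, data type, and non-null count, which reveals missing values.
    • .shape: An attribute (no parentheses) holding a tuple of (number of rows, number of columns).
    • .describe(): Calculates summary statistics (count, mean, standard deviation, min, quartiles, max) for each numeric column.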
    • Example: homelessness.head(), homelessness.info(), homelessness.shape, homelessness.describe()
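
A minimal sketch of the inspection calls above. The small homelessness DataFrame built here is a hypothetical stand-in with made-up values, and its column names are assumed from the examples later in these notes.

```python
import pandas as pd

# Hypothetical stand-in for the homelessness data; all values are made up.
homelessness = pd.DataFrame({
    "region": ["Pacific", "Mountain", "Pacific", "Mountain"],
    "state": ["Alaska", "Arizona", "California", "Colorado"],
    "individuals": [1000, 7000, 100000, 7500],
    "family_members": [500, 2500, 20000, 3000],
})

first_rows = homelessness.head(2)    # first 2 rows; head() alone returns 5
n_rows, n_cols = homelessness.shape  # attribute, so no parentheses
summary = homelessness.describe()    # statistics for the numeric columns only

homelessness.info()                  # prints names, dtypes, non-null counts
print(first_rows)
print(n_rows, n_cols)
print(summary)
```

Note that .info() prints its report and returns None, so there is nothing useful to assign from it.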

    2. Parts of a DataFrame:

    • .values: A two-dimensional NumPy array of values.
    • .columns: An index of column names.
    • .index: An index for rows (row numbers or names).
    • Example: homelessness.values, homelessness.columns, homelessness.index
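
A short sketch of the three parts listed above, again on a hypothetical DataFrame with made-up values.

```python
import pandas as pd

# Hypothetical stand-in for the homelessness data; all values are made up.
homelessness = pd.DataFrame({
    "state": ["Alaska", "Arizona", "California"],
    "individuals": [1000, 7000, 100000],
})

values = homelessness.values    # 2D NumPy array; mixed column types give dtype object
columns = homelessness.columns  # Index holding the column names
index = homelessness.index      # default row index: 0, 1, 2

print(values)
print(columns)
print(index)
```

All three are attributes rather than methods, so they are accessed without parentheses.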

    3. Sorting Rows:

    • Sorting by one column: df.sort_values("column_name")
    • Sorting by multiple columns: df.sort_values(["col_name1", "col_name2"])
    • Sort direction: ascending by default; pass ascending=False, or a list of booleans with one entry per sort column.
    • Example: homelessness.sort_values("individuals"), homelessness.sort_values(["region", "family_members"], ascending=[True, False])
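
The sorting calls above can be sketched as follows. The data is made up, and the column names are assumed to match those used in the subsetting examples in section 4.

```python
import pandas as pd

# Hypothetical stand-in for the homelessness data; all values are made up.
homelessness = pd.DataFrame({
    "region": ["Pacific", "Mountain", "Pacific", "Mountain"],
    "state": ["Alaska", "Arizona", "California", "Colorado"],
    "individuals": [1000, 7000, 100000, 7500],
    "family_members": [500, 2500, 20000, 3000],
})

# One column, ascending by default.
by_individuals = homelessness.sort_values("individuals")

# Two columns: region A to Z, then family_members largest first within each region.
by_region_then_family = homelessness.sort_values(
    ["region", "family_members"], ascending=[True, False]
)

print(by_individuals)
print(by_region_then_family)
```

sort_values returns a new DataFrame and leaves the original unchanged; the original row labels travel with the rows, so the index of the result is no longer in order.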

    4. Subsetting Columns:

    • Selecting a single column: df["column_name"] (returns a Series)
    • Selecting multiple columns: df[["col_name1", "col_name2"]] (the inner brackets are a list of names; returns a DataFrame with the columns in the order listed)
    • Example: individuals = homelessness["individuals"], state_fam = homelessness[["state", "family_members"]], ind_state = homelessness[["individuals", "state"]]
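
A runnable sketch of the subsetting examples above, using a hypothetical DataFrame with made-up values in place of the real data.

```python
import pandas as pd

# Hypothetical stand-in for the homelessness data; all values are made up.
homelessness = pd.DataFrame({
    "region": ["Pacific", "Mountain", "Pacific"],
    "state": ["Alaska", "Arizona", "California"],
    "individuals": [1000, 7000, 100000],
    "family_members": [500, 2500, 20000],
})

individuals = homelessness["individuals"]             # single brackets: a Series
state_fam = homelessness[["state", "family_members"]] # double brackets: a DataFrame
ind_state = homelessness[["individuals", "state"]]    # column order follows the list

print(type(individuals).__name__)
print(state_fam)
print(ind_state)
```

The only difference between state_fam and ind_state is which names appear in the list and in what order; the result always follows the list, not the original column order.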