What is Pandas?
Pandas is "an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language." - Python Data Analysis Library
Installation
# [Mac]
pip install pandas
# [Windows]
setx path "%path%;C:\Python27;"
pip install -U pandas
Once pandas is installed, go to this website to download pokemon.csv. This data set covers 721 Pokemon and lists their number, name, first and second type, and basic stats: HP, Attack, Defense, Special Attack, Special Defense, and Speed.
Then follow the instructions below to read the CSV with the pandas module.
import pandas as pd
df = pd.read_csv('your_path_to_pokemon.csv')
print(df)
How many rows are there?
print(len(df)) # Answer is 800.
Let's change the column names.
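If you don't have the file handy yet, here is a minimal sketch of the same row count on a tiny in-memory CSV. The three rows and their names are made up for illustration and are not taken from pokemon.csv.

```python
import io
import pandas as pd

# Toy CSV text standing in for pokemon.csv (illustrative rows, not the real file).
csv_text = """Name,Total,Legendary
Alpha,300,False
Beta,450,False
Gamma,600,True
"""

# read_csv accepts a file-like object as well as a path.
df = pd.read_csv(io.StringIO(csv_text))

n_rows, n_cols = df.shape  # shape is a (rows, columns) tuple
print(len(df))             # number of rows
print(df.shape)            # rows and columns together
```

len(df) counts rows only, while df.shape gives you rows and columns in one go.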
df.columns = ["ID", "Name", "Type_1", "Type_2", "Total", "HP", "Atk", "Def", "Sp_Atk", "Sp_Def", "Speed", "Generation", "Legendary"]
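Assigning to df.columns replaces every label at once, so the list has to match the number of columns. If you only want to change a few names, rename with a mapping is one alternative. This is a sketch on a toy frame; the "before" headers here are assumptions, so check df.columns on your own file first.

```python
import pandas as pd

# Hypothetical original headers; the real file's headers may differ.
df = pd.DataFrame({
    "Type 1": ["Grass", "Fire"],
    "Sp. Atk": [65, 60],
    "Name": ["A", "B"],
})

# Only the columns named in the mapping change; the rest are left alone.
renamed = df.rename(columns={"Type 1": "Type_1", "Sp. Atk": "Sp_Atk"})

print(list(renamed.columns))
print(list(df.columns))  # rename returns a new frame; df is unchanged
```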
Let’s check if it worked.
# Return first 5 records.
print(df.head(5))
# Return last 5 records.
print(df.tail(5))
Let’s filter by column.
# Method 1
print(df['Name'])
# Method 2 -- Usable only if column labels do not contain spaces, dashes, etc.
print(df.Name)
# Select multiple columns.
print(df[['Name', 'Generation', 'Legendary']])
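One thing worth knowing about the selections above: single brackets give you a Series, double brackets give you a DataFrame. A small sketch on a toy frame (the values are placeholders):

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["A", "B"],
    "Generation": [1, 2],
    "Legendary": [False, True],
})

one = df["Name"]                   # single label -> Series
one_df = df[["Name"]]              # list with one label -> DataFrame
many = df[["Name", "Legendary"]]   # list of labels -> DataFrame

print(type(one).__name__)
print(type(one_df).__name__)
print(many.shape)
```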
You can also set conditionals to filter.
# Filter by a series of booleans
print(df[df.Total > 400])
# Filter by multiple conditionals (Attack was renamed to Atk above)
print(df[(df.Atk > 130) & (df.Legendary == False)])
# Filter by string methods
print(df[df.Name.str.startswith("Char")])
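To see what is going on under the hood, here is a sketch on a four-row sample typed in by hand, using the renamed column labels. The comparison produces a boolean Series, and that Series is what picks the rows. The thresholds are arbitrary choices for the example.

```python
import pandas as pd

# Small hand-typed sample, not loaded from pokemon.csv.
df = pd.DataFrame({
    "Name": ["Charmander", "Charizard", "Squirtle", "Mewtwo"],
    "Total": [309, 534, 314, 680],
    "Atk": [52, 84, 48, 110],
    "Legendary": [False, False, False, True],
})

# A comparison yields one True/False per row.
mask = df["Total"] > 400
print(mask.tolist())

# Combine conditions with & (and), | (or), ~ (not); wrap each comparison in parentheses.
strong_common = df[(df["Atk"] > 50) & ~df["Legendary"]]
either = df[(df["Total"] > 600) | df["Name"].str.startswith("Char")]

print(strong_common["Name"].tolist())
print(either["Name"].tolist())
```

Python's plain and/or don't work here, because they try to reduce the whole Series to a single True or False.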
You can also use a column as the index.
df = df.set_index(["Type_1"])
print(df.head(10)) # Type_1 is now the index and is shown first, before the ID column.
print(df.loc["Steel"]) # Label-based referencing uses loc.
# Return your index to its original column form.
print(df.reset_index(["Type_1"]))
# Rearrange index in descending order
print(df.sort_index(ascending=False).head(5))
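Here is the whole index round trip as a sketch on a toy frame (names and totals are placeholders). Note that none of these calls change the frame in place; each one returns a new frame, so assign the result if you want to keep it.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Type_1": ["Steel", "Fire", "Steel", "Water"],
    "Total": [300, 400, 500, 600],
})

indexed = df.set_index("Type_1")                   # Type_1 becomes the row labels
steel = indexed.loc["Steel"]                       # every row labelled "Steel"
restored = indexed.reset_index()                   # Type_1 is a regular column again
sorted_desc = indexed.sort_index(ascending=False)  # labels in reverse alphabetical order

print(steel["Name"].tolist())
print(list(restored.columns))
print(sorted_desc.index.tolist())
```

After reset_index, the former index comes back as the first column, so the column order differs from the frame you started with.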
Done! 🙂