CSV Analysis

Goal

Convert a XLSX file to a CSV file. Read the CSV file and get counts per category.

First, download this file: OW Drink Preference. BTW, Blizzard Entertainment owns Overwatch. I created this XLSX for the purpose of this tutorial only.

csv-analysis1

Let's begin!

from pyexcel.cookbook import merge_all_to_a_book
import glob
import csv

# Converts XLSX to CSV.
merge_all_to_a_book("~/file.xslx"), "new_file.csv")
my_csv = "your_path_to_new_file.csv"

count = 0

beer = []
coffee = []
hard_liquor = []
oil = []
tea = []
wine = []
soft_drink = []

# Open and read the CSV.
with open(my_csv, 'rU') as csv_file:
    reader = csv.reader(csv_file)
    next(csv_file) # skips the headers
    for row in reader:
        drink_row = row[1].lower()
        other = True
        count += 1
        if 'beer' in drink_row:
            other = False
            beer.append(row)
        elif 'coffee' in drink_row:
            other = False
            coffee.append(row)
        elif 'hard liquor' in drink_row:
            other = False
            hard_liquor.append(row)
        elif 'oil' in drink_row:
            other = False
            oil.append(row)
        elif 'tea' in drink_row:
            other = False
            tea.append(row)
        elif 'wine' in drink_row:
            other = False
            wine.append(row)
        elif other:
            soft_drink.append(row)

Let's see the result.

print 'Number of Heroes: {count}'.format(count=count)
print 'Beer: {beer}'.format(beer=len(beer))
print 'Coffee: {coffee}'.format(coffee=len(coffee))
print 'Hard Liquor: {hard_liquor}'.format(hard_liquor=len(hard_liquor))
print 'Oil: {oil}'.format(oil=len(oil))
print 'Tea: {tea}'.format(tea=len(tea))
print 'Wine: {wine}'.format(wine=len(wine))
print 'Mountain Dew: {soft_drink}'.format(soft_drink=len(soft_drink))

Then you should see a result like this below.

csv-analysis2

All done 🙂

GitHubGitHubLinkedin