Interview Question
Just wanted to post an interview question I came across which might help those in need.
The task was: Use the language of your choice to write a program that reads a CSV file and outputs the contents of the file sorted by the “age” column. Missing data should sort to the end. Output should show all columns of data and may be sent to the console, an output file or a graphical interface at your discretion.
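The "missing data should sort to the end" requirement is the part that is easiest to get wrong, so here is a minimal sketch of one way to express it as a sort key. The helper name `age_sort_key`, the sample rows, and the decision to compare ages as numbers rather than as text are assumptions made for illustration; the task statement does not prescribe any of them.

```python
def age_sort_key(row, age_index):
    """Return a key that puts numeric ages first and missing ages last.

    Hypothetical helper: a blank or absent age cell counts as missing.
    A non-numeric age raises ValueError in this sketch.
    """
    value = row[age_index].strip() if age_index < len(row) else ""
    if not value:
        # Missing ages share one key, so the stable sort keeps their input order.
        return (1, 0.0)
    # Present ages are compared as numbers, so "9" sorts before "30".
    return (0, float(value))


# Made-up sample data: (name, age)
rows = [
    ("Ann", "30"),
    ("Bob", ""),
    ("Cid", "9"),
    ("Dee", "101"),
]
sorted_rows = sorted(rows, key=lambda row: age_sort_key(row, 1))
for row in sorted_rows:
    print(",".join(row))
```

Returning a tuple keeps the "missing or not" decision separate from the age itself, so no sentinel age value is needed.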
My answer to this question was to write a Python script that used Sphinx as a document generator.
The code went something like this:
"""
.. module:: readandsort
.. moduleauthor:: John Roach <johnroach1985@gmail.com>

The Task
********
Use the language of your choice to write a program that reads a CSV file and outputs the contents of the file sorted by the "age" column. Missing data should sort to the end. Output should show all columns of data and may be sent to the console, an output file or a graphical interface at your discretion.

The Solution
************
You can call this program as::

    $ python readandsort.py csv_to_be_sorted csv_output

Where csv_to_be_sorted is the full path of the CSV file to be sorted,
and csv_output is the full path where you want to put the sorted CSV file.

Just in case it isn't clear, here is a full example::

    $ python readandsort.py /home/john/Documents/original_data.csv /home/john/Documents/output.csv

Requirements
************
Python 2.7.*

Program Reference
*****************
"""
import sys
import csv
import os.path
import operator
def main():
    # Two arguments are required: the input CSV path and the output CSV path.
    if len(sys.argv) < 3:
        error("user_entered_no_data")
    else:
        if not sys.argv[1]:
            error("user_input_string_error")
        elif not os.path.isfile(sys.argv[1]):
            error("no_such_path_error")
        if not sys.argv[2]:
            error("user_output_string_error")
        elif not os.path.isfile(sys.argv[2]):
            warning("output_path_does_not_exist")
        else:
            warning("output_path_exists")
    input_csv = sys.argv[1]
    output_csv = sys.argv[2]
    print "Reading the file " + input_csv
    list_of_users_that_have_ages = []
    list_of_users_that_dont_have_ages = []
    key_column = None
    with open(input_csv, 'rU') as f:
        csv_iter = csv.reader(f)
        try:
            header = next(csv_iter, None)
            if header is None:
                sys.exit('READ ERROR : file %s is empty' % input_csv)
            # Find the "age" column, ignoring case and surrounding spaces.
            for i, name in enumerate(header):
                if name.strip().lower() == "age":
                    key_column = i
            if key_column is None:
                sys.exit('READ ERROR : file %s has no "age" column' % input_csv)
            for row in csv_iter:
                data_row = tuple(row)
                # A short row or a blank cell counts as a missing age.
                if len(row) > key_column and row[key_column].strip():
                    list_of_users_that_have_ages.append(data_row)
                else:
                    list_of_users_that_dont_have_ages.append(data_row)
        except csv.Error as e:
            sys.exit('READ ERROR : file %s, line %d: %s' % (input_csv, csv_iter.line_num, e))
    print "Sorting the data. This might take some time for large data."
    sorted_list = sort(list_of_users_that_have_ages, key_column)
    merged_list = [tuple(header)] + sorted_list + list_of_users_that_dont_have_ages
    with open(output_csv, "wb") as f:
        # Use the standard quote character so fields containing commas stay intact.
        fileWriter = csv.writer(f, delimiter=',', quotechar='"', quoting=csv.QUOTE_MINIMAL)
        try:
            for row in merged_list:
                fileWriter.writerow(row)
        except csv.Error as e:
            # csv writer objects have no line_num attribute, so only the file is reported.
            sys.exit('WRITE ERROR : file %s: %s' % (output_csv, e))
    print "Thank you for using Reading and Sorting CSV script! You can find the output at : " + output_csv
    sys.exit()
def error(error_type):
    """Print a message that explains the error, then exit.

    Right now the script simply exits via sys.exit() when it comes across an error; if I had time I would
    have added a way for the user to correct the problem.

    :param error_type: This is the error string. Why a string, you might ask... Because you can create more types of errors this way.
    :type error_type: str.
    :returns: prints out error message, ends program
    """
    if error_type == "user_input_string_error":
        print "You have to enter a CSV so we can sort it out!"
    elif error_type == "no_such_path_error":
        print "The CSV you want sorted could not be found. Please check the path."
    elif error_type == "user_output_string_error":
        print "You have to enter a CSV so we can print it out!"
    elif error_type == "user_entered_no_data":
        print "You have to enter both the CSV to be sorted and the output CSV."
    else:
        print "unknown error"
    # Exit with a non-zero status so the caller can tell the run failed.
    sys.exit(1)
def warning(warning_type):
    """Show a warning and let the user decide whether to continue.

    :param warning_type: This is the warning string.
    :type warning_type: str.
    :returns: prints out warning message, ends program according to user input
    """
    if warning_type == "output_path_does_not_exist":
        var = raw_input("The output path specified does not exist. Would you like us to create the file? y/n [y] ")
    elif warning_type == "output_path_exists":
        var = raw_input("The file already exists. Can we overwrite? y/n [y] ")
    else:
        return
    # An empty answer accepts the default, which is "y".
    if var.strip().lower() not in ("", "y"):
        print "GoodBye!"
        sys.exit()
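def sort(rows, key):
    """Sort a list of tuples by the age column, comparing ages as numbers.

    The work is done by Python's built-in list.sort(), which is an adaptive merge sort (Timsort).
    Ages that cannot be read as numbers are placed after the numeric ones.

    :param rows: A list of tuples
    :type rows: list
    :param key: This is the column number that holds the age.
    :type key: integer
    :returns: list
    """
    get_age = operator.itemgetter(key)

    def age_key(row):
        try:
            return (0, float(get_age(row)))
        except ValueError:
            return (1, get_age(row))

    rows.sort(key=age_key)
    return rows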
if __name__ == "__main__":
    main()
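The hand-rolled argument checks in main() could also be left to the standard argparse module, which prints usage help and rejects a wrong number of arguments on its own. The sketch below is one possible arrangement and not the script's actual interface: the function name `parse_args` and the `--force` flag are assumptions introduced here for illustration.

```python
import argparse
import os.path
import tempfile


def parse_args(argv):
    """Parse the two paths the script expects (hypothetical helper)."""
    parser = argparse.ArgumentParser(
        description="Sort a CSV file by its age column.")
    parser.add_argument("csv_to_be_sorted", help="path of the CSV file to read")
    parser.add_argument("csv_output", help="path of the CSV file to write")
    # Assumed flag, not in the original script: skip the overwrite question.
    parser.add_argument("--force", action="store_true",
                        help="overwrite csv_output without asking")
    args = parser.parse_args(argv)
    if not os.path.isfile(args.csv_to_be_sorted):
        # parser.error() prints the usage text and exits with status 2.
        parser.error("no such file: %s" % args.csv_to_be_sorted)
    return args


# A temporary file stands in for a real input CSV so the example can run anywhere.
with tempfile.NamedTemporaryFile(suffix=".csv") as tmp:
    args = parse_args([tmp.name, "out.csv", "--force"])
    print(args.csv_output)
    print(args.force)
```

In a real script the call would be `parse_args(sys.argv[1:])`, and the interactive questions in warning() would only be asked when `args.force` is false.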
Anyway, I hope you have fun with it!