Creating a dictionary from a csv file?

Help from @phil-frost was very helpful, was exactly what I was looking for.

I have made few tweaks after that so I'm would like to share it here:

def csv_as_dict(file, ref_header, delimiter=None):

    import csv
    if not delimiter:
        delimiter = ';'
    reader = csv.DictReader(open(file), delimiter=delimiter)
    result = {}
    for row in reader:
        print(row)
        key = row.pop(ref_header)
        if key in result:
            # implement your duplicate row handling here
            pass
        result[key] = row
    return result

You can call it:

myvar = csv_as_dict(csv_file, 'ref_column')

Where ref_colum will be your main key for each row.


You need a Python DictReader class. More help can be found from here

import csv

with open('file_name.csv', 'rt') as f:
    reader = csv.DictReader(f)
    for row in reader:
        print row

Create a dictionary, then iterate over the result and stuff the rows in the dictionary. Note that if you encounter a row with a duplicate date, you will have to decide what to do (raise an exception, replace the previous row, discard the later row, etc...)

Here's test.csv:

Date,Foo,Bar
123,456,789
abc,def,ghi

and the corresponding program:

import csv
reader = csv.reader(open('test.csv'))

result = {}
for row in reader:
    key = row[0]
    if key in result:
        # implement your duplicate row handling here
        pass
    result[key] = row[1:]
print(result)

yields:

{'Date': ['Foo', 'Bar'], '123': ['456', '789'], 'abc': ['def', 'ghi']}

or, with DictReader:

import csv
reader = csv.DictReader(open('test.csv'))

result = {}
for row in reader:
    key = row.pop('Date')
    if key in result:
        # implement your duplicate row handling here
        pass
    result[key] = row
print(result)

results in:

{'123': {'Foo': '456', 'Bar': '789'}, 'abc': {'Foo': 'def', 'Bar': 'ghi'}}

Or perhaps you want to map the column headings to a list of values for that column:

import csv
reader = csv.DictReader(open('test.csv'))

result = {}
for row in reader:
    for column, value in row.items():  # consider .iteritems() for Python 2
        result.setdefault(column, []).append(value)
print(result)

That yields:

{'Date': ['123', 'abc'], 'Foo': ['456', 'def'], 'Bar': ['789', 'ghi']}