Joining data from Excel to attribute table in QGIS without creating duplicates?

You will want to join the excel file to the shapefile. You will join them on a common attribute and the result will be a joined layer where each record hold the attributes of both the shapefile and the excel file.

Ok first you load your excel file and your vector later into the layers. I used some test data I made but your setup should look similar to below. enter image description here

Now right click on the layer (in the layers panel) and choose properties, then choose join. First hit the green + sign in the bottom left (Big Red Arrow in the picture below) and it will bring up a new Add Vector Join menu (like below) Here your join layer will be the excel file (so choose the proper excel file & sheet) the join field is the field in the excel file that holds the common attribute to the shapefile. The target field is the matching field in the shapefile. (in my case both common fields happened to be named unit, but if the field was called UNIT_ in the shapefile i would have used that instead of UNIT for my target field)
enter image description here There are a few other options you can mess with on the join, like if you only want to see certain fields, etc... anyway so now they are joined and if you go back and look at the attributes of the shapefile you will see it now has the corresponding attributes of the excel file like below.

enter image description here

Here are a few handy tutorials to show you step by step how to do it:

https://www.mapbox.com/tilemill/docs/guides/joining-data/

http://maps.cga.harvard.edu/qgis/wkshop/join_csv.php

http://qgis.spatialthoughts.com/2012/03/using-tabular-data-in-qgis.html

How to join external tables with a shapefile's attribute table in QGIS?

http://www.digital-geography.com/qgis-tutorial-ii-how-to-join-data-with-shapefiles/#.Vs9vpmHXKUk

and the tutorial ArMoraer mentioned in the comments.


I would do the following things to make life easier:

Before you do this, make a backup of your shapefile.

  1. In your shapefile, go to the layer properties / fields and turn on edit mode.
  2. select all Field except the ID field
  3. Delete all fields except the ID Field
  4. add your csv file as a layer in QGIS (Main menu/Layer/Add Layer/ Add delimited text layer)enter image description here (choose no geometry)
  5. in your shapefile, select properties/joins and choose both ID fields for source and target. As described in the other answer to this question.
  6. Save your modified Shapefile.

enter image description here

this is the Field tab I am referring to

enter image description here

Don't forget to toggle editing before and after deleting unnecessary field


Just to add another method, you can set up a Project macro which when loaded:

  1. Automatically joins your shapefile to your csv
  2. Updates the IP1 and IP2 fields
  3. Removes the joined fields leaving only the shapefile's fields (i.e. no dulplicates)

First, create a project if you haven't already done so and then go to the toolbar:

Project > Project Properties... > Macros

Then use the following code in the def openProject(): function and enter the names of your layers and the fields you want joined. I used "Example" and "spreadsheet" for my shapefile and csv file respectively with the field ID:

from PyQt4.QtCore import *
import qgis
from qgis.core import QgsMapLayerRegistry, QgsVectorJoinInfo

for layer in QgsMapLayerRegistry.instance().mapLayers().values():
    # Change to your shapefile name
    if layer.name() == "Example":
        qgis.utils.iface.setActiveLayer(layer)
        shp = qgis.utils.iface.activeLayer()

for layer in QgsMapLayerRegistry.instance().mapLayers().values():
    # Change to your csv name
    if layer.name() == "spreadsheet":
        qgis.utils.iface.setActiveLayer(layer)
        csv = qgis.utils.iface.activeLayer()

# Set up join parameters
shpField='ID'
csvField='ID'
joinObject = QgsVectorJoinInfo()
joinObject.joinLayerId = csv.id()
joinObject.joinFieldName = csvField
joinObject.targetFieldName = shpField
shp.addJoin(joinObject)

# Define fields to update and joined fields to copy values from
ip1 = shp.fieldNameIndex('IP1') 
ip1_join = shp.fieldNameIndex('spreadsheet_IP1')
ip2 = shp.fieldNameIndex('IP2') 
ip2_join = shp.fieldNameIndex('spreadsheet_IP2')

shp.startEditing()
for feat in shp.getFeatures():
    shp.changeAttributeValue(feat.id(), ip1, feat.attributes()[ip1_join])
    shp.changeAttributeValue(feat.id(), ip2, feat.attributes()[ip2_join])
shp.commitChanges()

# Remove join 
shp.removeJoin(csv.id())

Project macro

Make sure that the layers are not joined, save the project and enable macros by going to the toolbar:

Settings > General > Enable macros


Now when you close the project and edit the csv file, the next time you load the project, the fields should automatically be updated:

Editing csv file

Results