Duplicate Keys when Generating a JSON from a Dictionary in Python

TLDR: JSON stores all object keys as strings, while a Python dict distinguishes keys not only by their content but also by their datatype (see stackoverflow). When saving a dictionary to JSON and reloading it, you have to be careful, because the original numeric keys are implicitly converted into keys of datatype string.

Observation

In an experiment I process data in multiple runs, calculate appropriate metrics and save the results in a dictionary. I then save this dictionary in a JSON file. By loading the saved JSON back into a dictionary, I can continue my experiment seamlessly at any time. However, when I loaded the data as a dictionary, ran my experiment and saved the dictionary as JSON again, the file ended up with duplicate JSON keys.

Resolution

This behavior appears because keys in a Python dictionary are distinguished not only by their “content” but also by their datatype (the integer 1 and the string "1" are two different keys), while keys in JSON are always stored as strings. A small example illustrates the subtle dilemma.

import json

##########
# Create a simple dictionary with a numeric key
##########
d = {}
d[1] = "hungsblog.de"
# output: d = {1: 'hungsblog.de'}

##########
# Dump the dictionary into a JSON string
##########
j = json.dumps(d, indent=4)
# output: j =
#{
#    "1": "hungsblog.de"
#}

##########
# Load the JSON into a dictionary and assign the same numeric key
##########
d = json.loads(j)
d[1] = "hungsblog.de"
# output: d = {'1': 'hungsblog.de', 1: 'hungsblog.de'}

We first store a value under a numeric key in the Python dictionary. Then we save it as JSON and load it back as a dictionary. However, because the key is now loaded as a string instead of an integer, inserting what looks like the same key as before silently creates a second entry in the Python dictionary. When this dictionary is dumped again, both keys are written as "1", which produces the duplicate key in the JSON.
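
One possible way to avoid the duplicate is to convert the keys back to integers while loading. The following is only a minimal sketch: restore_int_keys is a hypothetical helper name, not part of the json module, and it assumes that every key that can be parsed as an integer was originally an integer. It is passed to json.loads via the object_hook parameter, which is called for every decoded JSON object.

```python
import json


def restore_int_keys(obj):
    """Hypothetical helper: turn keys such as "1" back into the integer 1.

    Assumption: every key that parses as an integer was an integer
    before the dictionary was dumped. All other keys stay strings.
    """
    restored = {}
    for key, value in obj.items():
        try:
            restored[int(key)] = value
        except ValueError:
            restored[key] = value
    return restored


d = {1: "hungsblog.de"}
j = json.dumps(d)

# Without the hook: the string key "1" and the integer key 1 coexist,
# and dumping the dictionary writes the key "1" twice.
mixed = json.loads(j)
mixed[1] = "hungsblog.de"
duplicated = json.dumps(mixed)

# With the hook: the key is an integer again, so assigning to d[1]
# overwrites the existing entry instead of adding a second one.
restored = json.loads(j, object_hook=restore_int_keys)
restored[1] = "hungsblog.de"
clean = json.dumps(restored)
```

If the dictionary legitimately contains string keys that look like numbers, this conversion would be wrong. In that case it is probably safer to always access the reloaded dictionary with string keys, for example d[str(1)].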

hungsblog | Nguyen Hung Manh | Dresden