GitHub - oxylabs/python-parse-json: A tutorial for parsing JSON data with Python
- What is JSON?
- Converting JSON string to Python object
- Converting JSON file to Python object
- Converting Python object to JSON string
- Writing Python object to a JSON file
- Converting custom Python objects to JSON objects
- Creating Python class objects from JSON objects
JSON is a common standard used by websites and APIs, and it is natively supported by modern databases such as PostgreSQL. This tutorial shows how to handle JSON data with Python.
For a detailed explanation, see our blog post.
What is JSON?
JSON, or JavaScript Object Notation, is a format that uses text to store data objects:
{
    "name": "United States",
    "population": 331002651,
    "capital": "Washington D.C.",
    "languages": [
        "English",
        "Spanish"
    ]
}
Converting JSON string to Python object
Let’s start with a simple example:
# JSON string
country = '{"name": "United States", "population": 331002651}'
print(type(country))
The output of this snippet will confirm that this is indeed a string:
<class 'str'>
We can call the json.loads() method and provide this string as a parameter.
import json
country = '{"name": "United States", "population": 331002651}'
country_dict = json.loads(country)
print(type(country))
print(type(country_dict))
The output of this snippet will confirm that the JSON data, which was a string, is now a Python dictionary.
<class 'str'>
<class 'dict'>
This dictionary can be accessed as usual:
print(country_dict['name'])
# OUTPUT: United States
Note that json.loads() does not always return a dictionary. The returned data type depends on the input string. For example, this JSON string produces a list, not a dictionary.
countries = '["United States", "Canada"]'
countries_list = json.loads(countries)
print(type(countries_list))
# OUTPUT: <class 'list'>
Similarly, if the JSON string contains true, it will be converted to the equivalent Python Boolean value, which is True.
import json
bool_string = 'true'
bool_type = json.loads(bool_string)
print(bool_type)
# OUTPUT: True
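The following table shows JSON types and the Python data types they become after conversion. For more details, see the Python docs.
JSON             Python
object           dict
array            list
string           str
number (int)     int
number (real)    float
true             True
false            False
null             None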
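As a quick check of these conversions, the sketch below decodes one sample of each JSON type and prints the name of the resulting Python type. The sample literals and the names samples and decoded_types are chosen only for illustration.

```python
import json

# One illustrative JSON literal per JSON type.
samples = ['{"a": 1}', '[1, 2]', '"text"', '42', '3.14', 'true', 'false', 'null']

# Decode each literal and record the name of the Python type it becomes.
decoded_types = {s: type(json.loads(s)).__name__ for s in samples}

for literal, type_name in decoded_types.items():
    print(f"{literal:12} -> {type_name}")
```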
Converting JSON file to Python object
Save the following JSON data as a new file and name it united_states.json:
{
    "name": "United States",
    "population": 331002651,
    "capital": "Washington D.C.",
    "languages": [
        "English",
        "Spanish"
    ]
}
Enter this Python script in a new file:
import json
with open('united_states.json') as f:
    data = json.load(f)
print(type(data))
Running this Python file prints the following:
<class 'dict'>
The dictionary keys can be checked as follows:
print(data.keys())
# OUTPUT: dict_keys(['name', 'population', 'capital', 'languages'])
Using this information, the value of name can be printed as follows:
print(data['name'])
# OUTPUT: United States
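The file-reading steps in this section can also be tried without creating a file by hand. The following is a minimal sketch that assumes it is acceptable to write the sample data to a temporary directory first; the tempfile usage and the names path and first_language are not part of the original tutorial.

```python
import json
import os
import tempfile

country_json = (
    '{"name": "United States", "population": 331002651, '
    '"capital": "Washington D.C.", "languages": ["English", "Spanish"]}'
)

# Write the sample data to a temporary file so the example runs anywhere.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "united_states.json")
    with open(path, "w", encoding="utf-8") as f:
        f.write(country_json)

    # json.load() reads from a file object; json.loads() reads from a string.
    with open(path, encoding="utf-8") as f:
        data = json.load(f)

# Nested values are reached by chaining keys and list indexes.
first_language = data["languages"][0]
print(first_language)
```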
Converting Python object to JSON string
Save this code in a new file as a Python script:
import json
languages = ["English", "French"]
country = {
    "name": "Canada",
    "population": 37742154,
    "languages": languages,
    "president": None,
}
country_string = json.dumps(country)
print(country_string)
When this file is run with Python, the following output is printed:
{"name": "Canada", "population": 37742154, "languages": ["English", "French"], "president": null}
Lists can be converted to JSON as well. Here is the Python script and its output:
import json
languages = ["English", "French"]
languages_string = json.dumps(languages)
print(languages_string)
# OUTPUT: ["English", "French"]
Conversion is not limited to dictionaries and lists. Values of type str, int, float, and bool, and even None, can be converted to JSON.
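As a small illustration of this point, the sketch below passes one value of each scalar type to json.dumps(). The sample values are arbitrary.

```python
import json

# One illustrative value per scalar type: str, int, float, bool, None.
values = ["hello", 42, 3.14, True, None]

# json.dumps() returns a JSON-formatted string for each value.
encoded = [json.dumps(v) for v in values]

for value, text in zip(values, encoded):
    print(repr(value), "->", text)
```

Note that the string keeps its double quotes in the JSON output, while True and None become the JSON literals true and null.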
Writing Python object to a JSON file
The method used to write a JSON file is dump():
import json
# A tuple is encoded as a JSON array.
languages = ("English", "French")
# A dictionary is encoded as a JSON object.
country = {
    "name": "Canada",
    "population": 37742154,
    "languages": languages,
    "president": None,
}
with open('countries_exported.json', 'w') as f:
    json.dump(country, f)
To make it more readable, we can pass one more parameter to the dump() function as follows:
json.dump(country, f, indent=4)
This time when you run the code, the file contents will be formatted with an indentation of 4 spaces:
{
    "name": "Canada",
    "population": 37742154,
    "languages": [
        "English",
        "French"
    ],
    "president": null
}
Converting custom Python objects to JSON objects
Save the following code as a Python script and run it:
import json
class Country:
    def __init__(self, name, population, languages):
        self.name = name
        self.population = population
        self.languages = languages

canada = Country("Canada", 37742154, ["English", "French"])
print(json.dumps(canada))
# OUTPUT: TypeError: Object of type Country is not JSON serializable
To convert the objects to JSON, we need to write a new class that extends JSONEncoder:
import json
class CountryEncoder(json.JSONEncoder):
    def default(self, o):
        if isinstance(o, Country):
            # JSON object would be a dictionary.
            return {
                "name": o.name,
                "population": o.population,
                "languages": o.languages
            }
        else:
            # Base class will raise the TypeError.
            return super().default(o)
This class can now be passed to json.dump() as well as json.dumps() through the cls parameter.
print(json.dumps(canada, cls=CountryEncoder))
# OUTPUT: {"name": "Canada", "population": 37742154, "languages": ["English", "French"]}
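For small cases, subclassing JSONEncoder is not the only option: json.dumps() and json.dump() also accept a default parameter, a function that is called for objects the encoder cannot serialize. The sketch below is one possible alternative, not the approach used in this tutorial; the function name encode_country is hypothetical.

```python
import json


class Country:
    def __init__(self, name, population, languages):
        self.name = name
        self.population = population
        self.languages = languages


def encode_country(o):
    """Hypothetical helper: return a JSON-serializable form of a Country."""
    if isinstance(o, Country):
        return {
            "name": o.name,
            "population": o.population,
            "languages": o.languages,
        }
    # Mirror the standard behavior for unsupported types.
    raise TypeError(f"Object of type {type(o).__name__} is not JSON serializable")


canada = Country("Canada", 37742154, ["English", "French"])

# default= is only consulted for objects json cannot encode on its own.
canada_json = json.dumps(canada, default=encode_country)
print(canada_json)
```

A function passed as default keeps the example short, while an encoder class may be easier to reuse when several custom types need support.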
Creating Python class objects from JSON objects
Using a custom encoder, we were able to write code like this:
# Create an object of class Country
canada = Country("Canada", 37742154, ["English", "French"])
# Use json.dump() to create a JSON file in writing mode
with open('canada.json','w') as f:
    json.dump(canada, f, cls=CountryEncoder)
If we try to parse this JSON file using the json.load() method, we will get a dictionary:
with open('canada.json','r') as f:
    country_object = json.load(f)
print(type(country_object))
# OUTPUT: <class 'dict'>
To get an instance of the Country class instead of a dictionary, we need to create a custom decoder:
import json
class CountryDecoder(json.JSONDecoder):
    def __init__(self, object_hook=None, *args, **kwargs):
        super().__init__(object_hook=self.object_hook, *args, **kwargs)

    def object_hook(self, o):
        decoded_country = Country(
            o.get('name'),
            o.get('population'),
            o.get('languages'),
        )
        return decoded_country
Finally, we can call the json.load() method and set the cls parameter to CountryDecoder class.
with open('canada.json','r') as f:
    country_object = json.load(f, cls=CountryDecoder)
print(type(country_object))
# OUTPUT: <class '__main__.Country'>
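One caveat with an object hook is that it is called for every JSON object in the input, including nested ones, so a hook that always builds a Country would also convert unrelated dictionaries. The sketch below shows one way to guard against that by passing a plain function as object_hook to json.loads(). The function name decode_country and the sample input with its "source" entry are hypothetical and serve only to illustrate the idea.

```python
import json


class Country:
    def __init__(self, name, population, languages):
        self.name = name
        self.population = population
        self.languages = languages


def decode_country(o):
    """Hypothetical helper: convert only dictionaries that look like a Country."""
    if o.keys() >= {"name", "population", "languages"}:
        return Country(o["name"], o["population"], o["languages"])
    # Any other JSON object is left as a plain dictionary.
    return o


# Illustrative input: one country object next to an unrelated object.
text = (
    '{"country": {"name": "Canada", "population": 37742154, '
    '"languages": ["English", "French"]}, "source": {"label": "example"}}'
)

result = json.loads(text, object_hook=decode_country)
print(type(result["country"]).__name__)
print(type(result["source"]).__name__)
```

Here only the inner object with all three expected keys becomes a Country instance, while the outer object and the "source" object remain dictionaries.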
If you wish to find out more about Reading & Parsing JSON Data With Python, see our blog post.
