If you are working with files and trying to open them to read their contents, you will sometimes hit errors that are not immediately obvious how to resolve.
In this blog post, we discuss a TypeError that is associated with opening a file in Python and reading in its contents.
So how may this issue appear?
To start off, you have a file with some contents in it that you would like to read. We have created a file called countries.txt, which we are going to read in and print. The contents of the file are as follows:
Name Capital
0 Ireland Dublin
1 Englad London
2 Wales Cardiff
3 France Paris
Separately, you are going to create a data frame holding data that matches what is in the text file. This is for comparative purposes only. The logic for this is as follows:
import pandas as pd
country_data = {'Name': ['Ireland', 'Englad', 'Wales', 'France'],
'Capital': ['Dublin','London','Cardiff', 'Paris'],}
# Convert the dictionary into DataFrame
df = pd.DataFrame(country_data)
print(df)
As a follow-on, you use a with open statement like the one below, which is meant to read the contents of the file and then print them.
with open(df, 'r', encoding='utf-8') as fileopen:
    data_in_file = fileopen.readlines()
print(data_in_file)
So when you run both of them you get the following error:
Traceback (most recent call last):
File "C:/Users/haugh/OneDrive/dataanalyticsireland/YOUTUBE/test/youtube.py", line 10, in <module>
with open(df, 'r', encoding='utf-8') as fileopen:
TypeError: expected str, bytes or os.PathLike object, not DataFrame
Name Capital
0 Ireland Dublin
1 Englad London
2 Wales Cardiff
3 France Paris
So what causes this problem?
The problem lies in where you have referenced ‘df’. In the with open statement, we have passed the wrong data type: ‘df’ is a data frame, but open() expects its first argument to be the path to a file, normally given as a string (the error message also lists bytes and os.PathLike objects as acceptable). The string in this instance should be ‘countries.txt’.
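To make the type requirement concrete, here is a minimal sketch using only the standard library. The helper name read_lines and the temporary file are assumptions made for illustration, and a plain dictionary stands in for the data frame so the example runs without pandas. Any object that is not a path should trigger the same kind of TypeError.

```python
import os
import tempfile
from pathlib import Path


def read_lines(path):
    """Hypothetical helper: return the lines of a text file.

    `path` must be a str, bytes or os.PathLike object, as open() requires.
    """
    with open(path, 'r', encoding='utf-8') as fileopen:
        return fileopen.readlines()


# Create a small throwaway file so the example runs anywhere.
tmp_dir = tempfile.mkdtemp()
file_name = os.path.join(tmp_dir, 'countries.txt')
with open(file_name, 'w', encoding='utf-8') as f:
    f.write('Name Capital\n0 Ireland Dublin\n')

lines_from_str = read_lines(file_name)         # a str path is accepted
lines_from_path = read_lines(Path(file_name))  # a pathlib.Path is accepted too

# A dict stands in for the data frame here: it is not a path,
# so open() raises a TypeError instead of opening anything.
not_a_path = {'Name': ['Ireland']}
try:
    read_lines(not_a_path)
    error_message = None
except TypeError as err:
    error_message = str(err)

print(lines_from_str)
print(error_message)
```

The point of the sketch is that open() only cares about the type of its first argument at this stage. The contents of the data frame never come into it.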
When we correct the logic as follows, the error disappears:
import pandas as pd
country_data = {'Name': ['Ireland', 'Englad', 'Wales', 'France'],
'Capital': ['Dublin','London','Cardiff', 'Paris'],}
# Convert the dictionary into DataFrame
df = pd.DataFrame(country_data)
print(df)
with open('countries.txt', 'r', encoding='utf-8') as fileopen:
    data_in_file = fileopen.readlines()
print(data_in_file)
Output:
Name Capital
0 Ireland Dublin
1 Englad London
2 Wales Cardiff
3 France Paris
[' Name Capital\n', '0 Ireland Dublin\n', '1 Englad London\n', '2 Wales Cardiff\n', '3 France Paris']
Process finished with exit code 0
What happens if I pass ‘df’ as a string?
If we simply pass ‘df’ as a string, as follows:
import pandas as pd
country_data = {'Name': ['Ireland', 'Englad', 'Wales', 'France'],
'Capital': ['Dublin','London','Cardiff', 'Paris'],}
# Convert the dictionary into DataFrame
df = pd.DataFrame(country_data)
print(df)
with open('df', 'r', encoding='utf-8') as fileopen:
    data_in_file = fileopen.readlines()
print(data_in_file)
We get the following output:
Name Capital
0 Ireland Dublin
1 Englad London
2 Wales Cardiff
3 France Paris
Traceback (most recent call last):
File "C:/Users/haugh/OneDrive/dataanalyticsireland/YOUTUBE/test/youtube.py", line 10, in <module>
with open('df', 'r', encoding='utf-8') as fileopen:
FileNotFoundError: [Errno 2] No such file or directory: 'df'
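In essence, the logic is looking to open a file, not a data frame. This time the argument is a valid string, so the TypeError is gone, but Python now looks for a file literally named ‘df’ and cannot find one. So the string you pass should always be the name of a file that actually exists, including its extension, for the program to complete with no errors.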
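One way to guard against the FileNotFoundError is to check that the file exists before opening it. The sketch below is only an illustration under stated assumptions: the helper name read_lines_or_none is hypothetical, and a temporary directory is used so the example runs on any machine.

```python
import tempfile
from pathlib import Path


def read_lines_or_none(file_name):
    """Hypothetical helper: return the file's lines, or None if it is missing."""
    path = Path(file_name)
    if not path.is_file():
        # Nothing to open, so report that instead of raising FileNotFoundError.
        return None
    with open(path, 'r', encoding='utf-8') as fileopen:
        return fileopen.readlines()


# Create a throwaway countries.txt so the example is self-contained.
tmp_dir = Path(tempfile.mkdtemp())
existing = tmp_dir / 'countries.txt'
existing.write_text('Name Capital\n0 Ireland Dublin\n', encoding='utf-8')

found = read_lines_or_none(existing)        # file exists, so lines are returned
missing = read_lines_or_none(tmp_dir / 'df')  # no file called 'df', so None

print(found)
print(missing)
```

Whether to return None or let the exception propagate is a design choice. Wrapping the open call in try and except FileNotFoundError is an equally reasonable alternative.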