The Ultimate Guide to Reading and Writing Files in Python
Python is one of the best languages for many of the elements of computer science including data science, server-side developments, and database handling.
Almost all the time, all these tasks require reading and writing information from and to external files.
Therefore, it should be in every Python developer’s repertoire to properly handle and deal with files.
In this tutorial, we will learn about how to perform various kinds of file handling tasks that might be required for tasks such as, but not limited to, writing a text file, logging data to a server log, and reading big data files.
Table of Contents
You can skip to a specific section of this guide to reading and writing files in Python using the table of contents below:
For today’s topic, it’s important that we have a proper idea of what a file is if we are to handle them the right way. If we look at any file, it can be identified as a set of bytes that consists of data. These files generally have three main components called ‘’header, ‘data’, ‘end of file’. As the name suggests, the header contains metadata such as file name, its type. Data consists of the contents of the body of the file that has been written by an author. Finally, the end of the file contains a character that signals the end of the whole file. There are thousands of registered file types and in this tutorial, we will mainly focus on the .txt format. We will now look at a few key concepts that are important when we are learning about file handling.
When a file is to be accessed, its location has to be known. We can specify using the file path. The path contains 3 major components. They are the folder path, the file name, and the extension. The extension is the portion that starts after the filename with a period(.) symbol. If the file we are interested in resides in the current working directory, then we are not required to specify the folder path as we can directly mention the file using only its filename and the extension.
Newline/End of File
In files, it is important that we do understand that there are line endings or newlines. When we consider different operating system environments, sometimes the way these line endings have been defined is different from each other. For instance, in Windows, carriage return (CR) and line feed (LF) markers are used together. However, in Unix/Linux environments, only the line feed marker is used. Therefore, developers should be aware of these kinds of subtle differences between different OS environments.
Character encoding is another aspect of files one should keep an out for. Encoding can be defined as the mechanism of translating the byte data to characters that are readable by humans. The most popular encoding methods are ASCII and Unicode. If you are to parse these without errors, the correct encoding type has to be known.
Now that we have acquired enough background knowledge, we are ready to move on. Before we move to the next section, go ahead and open up a Jupyter Notebook, or a Python script in your favorite IDE. Have a sample text file called ‘mytextfile.txt’ saved inside your working directory.
Let’s get started.
Opening a File
Opening a file is the very first thing we should do before we are able to read its content or to write new content to it. Opening a file in Python is intuitive. Let’s go ahead and open a file.
The mode here is open for reading and is specified by the ‘r’. The encoding has been identified as Unicode ‘UTF-8’.
Every time we open a file, we should always remember to close the file. Unless the file is closed, there could be unwanted resource leaks and might lead to unexpected behavior. Therefore, we should establish a way to close the file following the opening of it. There are two major ways of doing that. We will look at them now.
try , finally blocks
try:my_file=open('mytextfile.txt')#Further processing comes here
In this, we are opening the file inside a try block and then closing it inside a finally block. If this method looks lengthy, then you might be interested in knowing the next approach.
withopen('mytextfile.txt')asmy_file:#Further processing comes here
We are using the ‘with’ statement here followed by the open statement. This way, Python handles any issues internally and closes the file by itself. We recommend that you use this approach along with another argument inside the open function specifying why you are opening this file.
withopen('mytextfile.txt','r')asmy_file:#Further processing comes here
As we already know, ‘r’ specifies that this is in the read mode. There are a few more modes as follows.
‘r’ - Open for reading
‘w’ - Open for writing or overwriting
‘rb’ - Read using binary mode
‘wb’ - Write using binary mode
‘a’ - append
We will now go ahead and see how we can read the content in a file.
Reading a File
The content of mytextfile.txt is as follows.
How you been
Good to see you
My old friend
As you can see, there are four lines in our text file.
There are a few approaches we can use to read the above file as follows
This function takes in an optional size argument that specifies the number of bytes to be read. If size = -1 is passed or if no arguments are passed, the entire file is read.
If we look at the code, we can see that we are instructing the interpreter to read 5 bytes from the file that has returned us with 5 characters from the file we opened. If we are to further understand how this works, look at the following.
We have instructed to perform the read() function twice and read 5 bytes each time. From the output, we can understand that it has returned 5 characters in two chunks. Although in the 2nd line, there are only 4 letters, the space character has also been taken into consideration when the function returned the 2nd chunk.
This function also takes in an optional size argument. It specifies the number of characters from a line that should be read. If size = -1 is passed or if no arguments are passed, the entire line is read.
If we look at the code, we are instructing the interpreter to read and print 1, 2 characters from the first line respectively and then finally print all the remaining characters of the same line. (Notice the newline character as well)
This function reads the rest of the lines from the file and returns them as a list.
The output is the same as the above. However, it should be kept in mind to pass the end parameter to the print function as set it as the empty string instead of the default newline character. We are doing this as each line of our files already contain a newline character and we do not want to include an additional newline character by the print function.
We will now look at how we can write to files.
Writing to a File
If you recall, we did have multiple functions to read files. More or less the same, we do have two functions as below that can be used to write to files.
We are passing the text we want to write to the file
If you look at what we have done here, we are first reading the content of our original file mytextfile.txt and, saving it in the content_list object. Then later, we are opening a mytextfile_reversed.txt file that gets created as if it does not exist already. Then we are writing the elements in the reverse list as lines.
If you go ahead and open the my_new_file.txt, you will see that we have written the lines of our original file to this new file in the reversed order.
We can write a sequence to the file using this function.
Let’s run the same code as above but only change the parameter we are passing from ‘w’ to ‘a’ that stands for append. We are ultimately expecting this instruction to append the reversed list from our original file again to this mytextfile_reverse.txt to the end.
If we execute the above and open the file, we will find that its content has been changed to the following.
My old friend
Good to see you
How you been
My old friend
Good to see you
How you been
Opening Two Files Simultaneously
It would not be surprising if you ever wondered how we can work with two (or more) files at the same time. Let’s say we are to read content from one file and write to the other file in one go. Let’s try to do that.
We will go ahead and create two files as follows.
We will add content as I am file 1 and I am file 2 respectively.
If we read the new_text_2.txt, we will read its content as follows.
I am file 2
I am file 1
Therefore, we can say that we have successfully read the content from one file and appended it to the other simultaneously.
Reading an Image
Before we do this, find an image (In this, we are opening a png image), and save it in the working directory. Then go ahead and open it as follows. Make sure to change the file name to the name of your image.
If you want to see how the file looks in binary, go ahead and print the whole while using read() method. This can be useful when we want to read files and prepare them to be emailed or to change its metadata.
In today’s tutorial, we learned about how to handle files using Python. We opened and closed files while reading from and writing content to them. We learned a lot about files, file types, character encoding, and other related aspects to manipulating files using Python. If you want to learn more about file handling using Python, feel free to go ahead and read the official documentation as you may find lots of resources that might be useful to you as a Python developer.