Python Read File Line by Line Fast
Let me start directly by asking: do we really need Python to read big text files? Wouldn't our normal word processor or text editor suffice for that? When I mention large here, I mean extremely large files!
Well, let's see some evidence on whether we would need Python for reading such files or not.
Obtaining the File
In order to carry out our experiment, we need an extremely large text file. In this tutorial, we will be obtaining this file from the UCSC Genome Bioinformatics downloads website. The file we will be using in particular is the hg38.fa.gz file, which as described here, is:
"Soft-masked" assembly sequence in one file. Repeats from RepeatMasker and Tandem Repeats Finder (with period of 12 or less) are shown in lower case; non-repeating sequence is shown in upper case.
I don't want you to worry if you didn't understand the above statement, as it is related to genetics terminology. What matters in this tutorial is the concept of reading extremely large text files using Python.
Go ahead and download hg38.fa.gz (please be careful, the file is 938 MB). You can use 7-zip to unzip the file, or any other tool you prefer.
After you unzip the file, you will get a file called hg38.fa. Rename it to hg38.txt to obtain a text file.
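If you would rather stay in Python for this step too, the standard library can do the decompression. The following is only a sketch: the function name decompress_gzip and the chunk_size parameter are made up for illustration, and it assumes hg38.fa.gz has already been downloaded.

```python
import gzip
import shutil


def decompress_gzip(source_path, target_path, chunk_size=1024 * 1024):
    """Stream-decompress the gzip file source_path into target_path.

    The data is copied in chunks of chunk_size bytes, so the whole
    file never has to fit in memory at once.
    """
    with gzip.open(source_path, 'rb') as compressed, \
            open(target_path, 'wb') as target:
        shutil.copyfileobj(compressed, target, chunk_size)
```

Calling decompress_gzip('hg38.fa.gz', 'hg38.txt') should then write the text file directly under its final name, so no separate rename is needed.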
Opening the File the Traditional Way
What I mean here by the traditional way is using our word processor or text editor to open the file. Let's see what happens when we try to do that.
I first tried using Microsoft Word to open the file, and got a message from Word instead of the document.
Opening the file also failed with WordPad and Notepad on a Windows-based machine, although it did open using TextEdit on a Mac OS X machine.
But you get the point, and having some guaranteed way to open such extremely large files would be a nice idea. In this quick tip, we will see how to do that using Python.
Reading the Text File Using Python
In this section, we are going to see how we can read our large file using Python. Let's say we wanted to read the first 500 lines from our big text file. We can simply do the following:
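    # Copy the first 500 lines of hg38.txt into output.txt.
    # The with statement closes both files when the block ends.
    with open('hg38.txt', 'r') as input_file, open('output.txt', 'w') as output_file:
        for _ in range(500):
            line = input_file.readline()
            output_file.write(line)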
Notice that we read 500 lines from hg38.txt, line by line, and wrote those lines to a new text file output.txt, which should look as shown in this file.
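The same idea can be written by iterating over the file object itself, which hands back one line at a time without loading the rest of the file. Here is a minimal sketch using itertools.islice; the function name copy_first_lines and its parameters are hypothetical, chosen only for this example.

```python
from itertools import islice


def copy_first_lines(source_path, target_path, line_count):
    """Copy at most line_count lines from source_path to target_path.

    Returns the number of lines actually copied, which is smaller than
    line_count when the source file is shorter.
    """
    copied = 0
    with open(source_path, 'r') as source, open(target_path, 'w') as target:
        # Iterating over a file object yields one line at a time,
        # and islice stops the iteration after line_count lines.
        for line in islice(source, line_count):
            target.write(line)
            copied += 1
    return copied
```

With this in place, copy_first_lines('hg38.txt', 'output.txt', 500) would do the same job as the loop above, and it also stops cleanly if the file has fewer than 500 lines.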
But suppose we wanted to navigate through the text file directly, without extracting it line by line and sending that to another text file, especially since this way seems more flexible.
Navigating Through Large Text Files
Although the above step allowed us to read large text files by extracting lines from that large file and sending those lines to another text file, directly navigating through the large file without the need to extract it line by line would be preferable.
We can simply do that using Python to read the text file through the terminal screen as follows (navigating through the file 50 lines at a time):
    input_file = open('hg38.txt', 'r')
    while True:
        # Show the next 50 lines; each line already ends with a newline
        for _ in range(50):
            print(input_file.readline(), end='')
        user_input = input('Type STOP to quit, otherwise press the Enter/Return key ')
        if user_input == 'STOP':
            break
    input_file.close()
As you can see from this script, you can now read and navigate through the large text file immediately using your terminal. Whenever you want to quit, you just need to type STOP (case sensitive) in your terminal.
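One thing the loop above never notices is the end of the file, because readline() simply returns an empty string once there is nothing left to read. A possible variation, written for Python 3, is sketched below; read_pages and browse are hypothetical names introduced for this example, not part of any library.

```python
from itertools import islice


def read_pages(source_path, page_size=50):
    """Yield successive lists of up to page_size lines from source_path."""
    with open(source_path, 'r') as source:
        while True:
            page = list(islice(source, page_size))
            if not page:
                # An empty page means the end of the file was reached.
                return
            yield page


def browse(source_path, page_size=50):
    """Print the file page by page until STOP is typed or the file ends."""
    for page in read_pages(source_path, page_size):
        print(''.join(page), end='')
        answer = input('Type STOP to quit, otherwise press the Enter/Return key ')
        if answer == 'STOP':
            break
```

Calling browse('hg38.txt') would behave like the script above, except that it returns on its own after the last page. Only one page of lines is held in memory at any time.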
I'm sure that you will notice how smooth Python makes it to navigate through such an extremely large text file without having any issues. Python is again proving itself to be a language striving to make our lives easier!
Source: https://code.tutsplus.com/tutorials/quick-tip-how-to-read-extremely-large-text-files-using-python--cms-25992