Month: May 2014

Read a text file, line by line


How do you read a text file line by line?


File.open(filename)    { |f| while line = f.gets; puts line.chomp; end }
File.open(filename)    { |f| f.each_line { |line| puts line } }
File.foreach(filename) { |line| puts line }
IO.foreach(filename)   { |line| puts line }
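
The streaming variants above can be sketched on a small file as follows. The sample file and its contents are made up for illustration, and the `chomp: true` option assumes Ruby 2.4 or later.

```ruby
require "tempfile"

# Hypothetical sample file, created only so the sketch is runnable.
sample = Tempfile.new("lines")
sample.write("first\nsecond\nthird\n")
sample.close

# Stream the file one line at a time; the whole file is never loaded at once.
# chomp: true (Ruby 2.4+) strips the line terminator from each line.
streamed = []
lineno = 0
File.foreach(sample.path, chomp: true) do |line|
  lineno += 1
  streamed << "#{lineno}: #{line}"
end
puts streamed

# Same idea with an explicit file handle; the block form closes it for us.
handle_lines = []
File.open(sample.path) { |f| f.each_line { |line| handle_lines << line.chomp } }
puts handle_lines
```

Both loops see the same three lines; the first one also numbers them from 1.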

How do you load the whole file first, then iterate over each line?


File.readlines(filename).each { |line| puts line }
File.read(filename).split(/\r?\n/).each { |line| puts line }
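
The two loading variants do not return quite the same thing, which the following sketch tries to show. The sample file is hypothetical: it mixes `\r\n` and `\n` endings and finishes with a blank line.

```ruby
require "tempfile"

# Hypothetical sample: mixed line endings and one blank line at the end.
sample = Tempfile.new("mixed")
sample.binmode
sample.write("one\r\ntwo\nthree\n\n")
sample.close

# readlines keeps the line terminator on every element,
# and the final blank line is still there as its own element.
with_newlines = File.readlines(sample.path)

# split removes the terminators, and String#split also drops
# trailing empty strings, so the final blank line disappears.
split_lines = File.read(sample.path).split(/\r?\n/)

# String#chomp removes "\n", "\r\n" or "\r", keeping the blank line as "".
chomped = with_newlines.map(&:chomp)

p with_newlines.size
p split_lines
p chomped
```

If trailing blank lines matter, `readlines` followed by `chomp` is the safer of the two.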

Benchmark

See https://gist.github.com/glurp/9217b45e3def78c218b3
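
A minimal sketch of how such a comparison could be timed is shown here. It is not the code from the gist: the file size (20,000 lines), the `candidates` hash and the line-counting workload are assumptions chosen to keep the example short.

```ruby
require "tempfile"

# Hypothetical test file: 20,000 short lines.
file = Tempfile.new("bench")
20_000.times { |i| file.puts("line #{i} of some sample text") }
file.close
path = file.path

# Each candidate counts the lines it reads, so the work is comparable.
candidates = {
  "while gets" => lambda {
    n = 0
    File.open(path) { |f| n += 1 while f.gets }
    n
  },
  "f.each_line" => lambda {
    n = 0
    File.open(path) { |f| f.each_line { n += 1 } }
    n
  },
  "File.foreach" => lambda {
    n = 0
    File.foreach(path) { n += 1 }
    n
  },
  "File.readlines"  => -> { File.readlines(path).size },
  "File.read.split" => -> { File.read(path).split(/\r?\n/).size }
}

counts  = {}
timings = {}
candidates.each do |name, reader|
  # A monotonic clock is not affected by system clock adjustments.
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  counts[name] = reader.call
  timings[name] = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  puts format("%-16s Duration: %.6f sec", name, timings[name])
end
```

A single run on a small file is noisy; repeating each candidate several times and keeping the best time would give steadier numbers.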

Linux:


Create file : a big file
File size=97000 KB
Read file line by line
  while gets   Duration: 5.824797630310059 sec
  f.each_line  Duration: 2.9154045581817627 sec
  File.foreach Duration: 3.3510355949401855 sec
  IO.foreach   Duration: 3.3840091228485107 sec
  File.readlines  Duration: 3.601409435272217 sec
  File.read.split Duration: 6.609829664230347 sec
  File.readlines  Duration: 3.1886909008026123 sec
  File.read.split Duration: 6.332392692565918 sec
Create file : a little file
File size=16.2978515625 KB
Read file line by line
  while gets   Duration:  by iteration 9.529590606689453 micro s.
  f.each_line  Duration:  by iteration 4.6253204345703125 micro s.
  File.foreach Duration:  by iteration 5.245208740234375 micro s.
  IO.foreach   Duration:  by iteration 6.377696990966797 micro s.
  File.readlines  Duration:  by iteration 4.470348358154297 micro s.
  File.read.split Duration:  by iteration 10.356903076171875 micro s.

Windows:


Create a big file
File size=104000 KB

Read file line by line
while gets   Duration: 5.799330711364746 sec
f.each_line  Duration: 3.1441800594329834 sec
File.foreach Duration: 3.768216133117676 sec
IO.foreach   Duration: 3.5362019538879395 sec
File.readlines  Duration: 3.7572150230407715 sec
File.read.split Duration: 5.539316892623901 sec


Create a little file
File size=17.4736328125 KB
Read file line by line
while gets   Duration:  by iteration 9.999275207519531 micro s.
f.each_line  Duration:  by iteration 9.999275207519531 micro s.
File.foreach Duration:  by iteration 9.999275207519531 micro s.
IO.foreach   Duration:  by iteration 10.004043579101562 micro s.
File.readlines  Duration:  by iteration 9.999275207519531 micro s.
File.read.split Duration:  by iteration 10.008811950683594 micro s.

Awk-like processing?

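# For each line, call the action of the first rule whose regexp matches,
# passing the line and its 0-based line number.
def awk(filename, rules)
  File.readlines(filename).each_with_index do |line, lineno|
    rules.each do |pattern, action|
      next unless pattern.is_a?(Regexp)
      if line =~ pattern
        action.call(line, lineno)
        break # first matching rule wins
      end
    end
  end
end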

awk(filename, {
  /A/  => proc { |line, no| p "A:#{line}" },
  /a/  => proc { |line, no| p "a:#{line}" },
  /\d/ => proc { |line, no| p "d:#{line}" },
  /^$/ => proc { p "empty" }
})
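
To see which rule fires for which line, here is a small worked run. The helper is restated so the sketch runs on its own; the sample file, the `fired` array and the parameter names are assumptions added for illustration.

```ruby
require "tempfile"

# The awk helper, restated so this sketch is self-contained.
def awk(filename, rules)
  File.readlines(filename).each_with_index do |line, lineno|
    rules.each do |pattern, action|
      next unless pattern.is_a?(Regexp)
      if line =~ pattern
        action.call(line, lineno)
        break # first matching rule wins
      end
    end
  end
end

# Hypothetical input: "Alpha" contains both "A" and "a",
# and "xyz" matches none of the rules.
sample = Tempfile.new("awk")
sample.write("Alpha\nbanana\n42\n\nxyz\n")
sample.close

# Record [line number, rule label] instead of printing.
fired = []
awk(sample.path, {
  /A/  => proc { |_line, no| fired << [no, "A"] },
  /a/  => proc { |_line, no| fired << [no, "a"] },
  /\d/ => proc { |_line, no| fired << [no, "d"] },
  /^$/ => proc { |_line, no| fired << [no, "empty"] }
})
p fired
```

Because rules are tried in hash insertion order and the loop breaks on the first match, "Alpha" is handled by the `/A/` rule only, and "xyz" triggers nothing.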