Read and Replace Content in a CSV

Data scrubbing is something we all have to deal with from time to time. Ruby is a great language for it: most of the time it’s already installed on your operating system, and a task like this needs nothing beyond the standard library. This little example reads a CSV file and rewrites the date in each row in a new format.

Given an original CSV file:

 text;text;text;Mon 14 Nov 2016 13:07:30                                                                                                                                                                                                
 text;text;text;Mon 15 Nov 2016 13:07:30                                                                                                                                                                                                
 text;text;text;Mon 16 Nov 2016 13:07:30 

We are going to scrub this and change the date to the format DD-MM-YYYY HH:MM:SS. Create a file named parser.rb with the contents below. It assumes that the original data is in a file called original.csv in the same directory:

require 'date'

# Open the output file in a block so it is closed automatically.
File.open('new.csv', 'w') do |new_file|
  # Read original.csv one line at a time.
  File.foreach('original.csv') do |line|
    fields = line.chomp.split(';')

    # The fourth field holds the timestamp; rewrite it as DD-MM-YYYY HH:MM:SS.
    fields[3] = DateTime.parse(fields[3]).strftime('%d-%m-%Y %H:%M:%S')

    new_file.puts fields.join(';')
  end
end
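
To check what the loop does to a single line without touching any files, the per-line step can be pulled into a small method. This is only a minimal sketch: the method name scrub_line is hypothetical, and it assumes the timestamp is always the fourth field, as in the sample data.

```ruby
require 'date'

# Hypothetical helper: reformat the timestamp in the fourth field of one
# semicolon-separated line. Assumes every line has at least four fields.
def scrub_line(line)
  fields = line.chomp.split(';')
  fields[3] = DateTime.parse(fields[3]).strftime('%d-%m-%Y %H:%M:%S')
  fields.join(';')
end

puts scrub_line("text;text;text;Mon 14 Nov 2016 13:07:30\n")
# prints: text;text;text;14-11-2016 13:07:30
```

Trying this on a handful of real lines is a quick way to confirm that DateTime.parse reads your dates the way you expect.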

Now all we have to do is run the script with ruby parser.rb, and it will create a new file, new.csv, in the same directory with our scrubbed output!
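
Splitting on ; by hand can break if a field is quoted or contains a semicolon itself. One possible alternative, sketched here under the same assumptions (semicolon-separated input, timestamp in the fourth column), is Ruby's csv library with the col_sep option. The method name scrub_csv is hypothetical and not part of the script above. On Ruby 3.4 and later, csv is a bundled gem rather than a default one, so a project using Bundler may need to list it in its Gemfile.

```ruby
require 'csv'
require 'date'

# Hypothetical helper: copy input_path to output_path, reformatting the
# timestamp in the fourth column of every row.
def scrub_csv(input_path, output_path)
  CSV.open(output_path, 'w', col_sep: ';') do |out|
    CSV.foreach(input_path, col_sep: ';') do |row|
      row[3] = DateTime.parse(row[3]).strftime('%d-%m-%Y %H:%M:%S')
      out << row
    end
  end
end

# Usage, assuming original.csv exists in the current directory:
scrub_csv('original.csv', 'new.csv') if File.exist?('original.csv')
```

For data as simple as the sample, the plain split version is enough. The csv version mainly helps once fields start to contain quotes or separators.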