Is grep slow? Sure it is. Read on.
Just a few days back there was an electrical outage and a lot of applications went down.
In one case, a lot of customer orders could not be processed, so we had to intervene manually: extract the orders (XML) from a log file and re-feed them to another system. The task was simple. I had the order numbers in orders.txt, and I had to write a shell script that would grep the log for the XML containing each order number, extract that XML, and create one file per order.
But the problem was that the log file I was searching was huge: 5 GB in total. Each grep took at least 4-5 minutes to find one order and create an XML file for it. Clearly this was not a solution, as I had to find a thousand orders in those log files and the matter was critical for the end customer.
If my calculation was right, I would have had to spend:
4 mins = 1 XML
60 mins = 1 hr = 15 XMLs
At this rate, getting all 1000 XMLs would have taken about 67 hours (1000 / 15), so roughly 3 days of CPU time, and closer to 3-4 days at 5 minutes per order (not to mention the pain and frustration). In other words, we would all have been screwed over and over again by the customer for 3-4 days.
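The estimate can be double-checked with a few lines of Ruby. This is only a back-of-the-envelope sketch using the figures from the text (1000 orders at 4 to 5 minutes each); the helper name `estimated_hours` is made up for illustration.

```ruby
# Rough run-time estimate: total minutes converted to hours.
# `estimated_hours` is a hypothetical helper, not part of any library.
def estimated_hours(orders, minutes_per_order)
  orders * minutes_per_order / 60.0
end

[4, 5].each do |minutes|
  hours = estimated_hours(1000, minutes)
  # Print hours and the equivalent number of 24-hour days.
  puts format("%d min/order -> %.1f hours (%.1f days)", minutes, hours, hours / 24)
end
```

At 4 minutes per order this comes to about 67 hours, and at 5 minutes to about 83 hours, which is where the "3-4 days" figure comes from.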
A one-liner saved us.
I used a Ruby command to first pull out the lines matching the relevant generic string, and then created the order XML files using a normal shell script like the one described above. I thought I would keep this for future reference.
Wonderful: Ruby took just a few minutes to match the regular expression against the 5 GB log file, and after that I only had to search for the orders in a much smaller intermediate file.
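The original one-liner is not reproduced here, so what follows is only a minimal sketch of the same two-pass idea, not the actual command. It assumes each order XML sits on a single log line and contains a common marker such as `<Order`; the marker, the file names, and the helper names `prefilter` and `extract_orders` are all hypothetical.

```ruby
require "fileutils"

# Hypothetical marker shared by every order XML line; adjust to the real log format.
ORDER_MARKER = /<Order\b/

# Pass 1: stream the big log once and keep only the lines that look like order XML.
# Returns the number of lines written to the intermediate file.
def prefilter(log_path, filtered_path, marker = ORDER_MARKER)
  kept = 0
  File.open(filtered_path, "w") do |out|
    File.foreach(log_path) do |line|
      next unless line.match?(marker)
      out.write(line)
      kept += 1
    end
  end
  kept
end

# Pass 2: for each order number, take the first line in the small intermediate
# file that contains it and write that line to <out_dir>/<order>.xml.
# Note: this is a plain substring match, so order "123" would also match "1234".
def extract_orders(orders_path, filtered_path, out_dir)
  FileUtils.mkdir_p(out_dir)
  found = []
  File.foreach(orders_path, chomp: true) do |raw|
    order = raw.strip
    next if order.empty?
    File.foreach(filtered_path) do |line|
      next unless line.include?(order)
      File.write(File.join(out_dir, "#{order}.xml"), line)
      found << order
      break
    end
  end
  found
end

# Example usage (file names are placeholders):
#   prefilter("app.log", "orders_only.log")
#   extract_orders("orders.txt", "orders_only.log", "out")
```

The point of the design is that the expensive scan of the 5 GB file happens only once; every per-order search then runs against the reduced file.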
This saved us 3 days and did wonders in just half an hour.
Credit for the one-liner goes to Garry Tan; that is where I found this wonderful Ruby command.