Thursday 2 June 2022

How to use perl -pi -e to modify a lot of different files at once. Also grep, sort, uniq and wc basics.

A PROCESS for fixing up multiple files.

Sometimes you have a bunch of files and you need to make the same or a similar change hundreds of times.

e.g. remove all the "Stat not found" from the files in /logs/stats/

[o@t12 ~]$ perl -pi -e "s/Stat not found/0/" /logs/stats/*_absent_subscriber
[o@t12 ~]$ grep " not " /logs/stats/*_absent_subscriber
[o@t12 ~]$ grep " not " /logs/stats/*
[o@t12 ~]$ ssh o@t11 'perl -pi -e "s/Stat not found/0/" /logs/stats/*_absent_subscriber'
[o@t12 ~]$ ssh o@t11 "grep ' not ' /logs/stats/*"
[o@t12 ~]$ ssh o@v21 "grep ' not ' /logs/stats/*"
[o@t12 ~]$ ssh o@v22 "grep ' not ' /logs/stats/*"

e.g. in cconf-dir, find all references to an item and replace or rename them

e.g. in source code, renaming some common function, commenting it out (or back in), or removing it

 

Perl's "in-place edit" mode, perl -pi -e, is the useful tool here. (perl -p -i -e also works, but not perl -pie: -i takes an optional backup-suffix argument like .bak, so the e would be swallowed as the suffix.)

see `perldoc perlrun` or  https://perldoc.perl.org/perlrun#i

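 -p = wraps the code in a loop that reads each line of the files named on the command line, runs the code on it, then prints it
 
 -i = in-place edit of files passed on command-line (give a suffix, e.g. -i.bak, to keep a backup of each original)
 
 -e = perl command/script one-liner to run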

from stackoverflow.com:

We can use the B::Deparse backend processor to see what Perl code is being executed like this

$ perl -MO=Deparse -pi.bak -e 's/^\s*(self.tc.waitForCCR|self.sut.waitForCCR|tc.waitForCCR|sut.waitForCCR)\(\)//' lib/cat/smsc/*.py

 shows the equivalent Perl program to be

BEGIN { $^I = ".bak"; }
LINE: while (defined($_ = readline ARGV)) {
    s/^\s*(self.tc.waitForCCR|self.sut.waitForCCR|tc.waitForCCR|sut.waitForCCR)\(\)//;
}
continue {
    die "-p destination: $!\n" unless print $_;
}
-e syntax OK

see also https://stackoverflow.com/questions/32225091/speed-up-a-series-of-perl-pi-commands for other usages

 
e.g. perl -pi -e 's/old_string/new_string/g' file_pattern

Perl is actually not that scary: the syntax is close to C, and its regular expressions are largely a superset of what you already know from grep and sed.

You could do this with awk but the syntax is weirder and harder to learn.

As a demo for this we will look at tc_qa test source directory grepping and changing calls to waitForCCR.

ALSO we will see how to use:

grep with counting, using diff and meld to verify changes...

https://learnbyexample.github.io/learn_perl_oneliners/one-liner-introduction.html

https://en.wikipedia.org/wiki/Perl

 

Overview of procedure


1.
grep and count with wc ... to see extent of work to be done

2.0
Always backup everything first!

And backup after you think you have done a good batch of useful work.

Just in case.

  tar -zcvf backup_xxx.tgz tc_qa   # or whatever list of dirs and files you want to back up

2.
work out command expression ...
perl -pi -e to replace bulk of similar things
other edits with vi/emacs/...

3. set the hounds free, do the replace - on batches e.g. tc_qa/tests/mmsc  or tc_qa/cat/smsc  ....
review, grep and count, meld, cvs diff
for any anomalies edit file directly and adjust ...
   and maybe adjust the grep expression or paths or the perl replace expression to deal with future similar cases

 

Demo of Procedure

### 1. grep and count with wc ... to see extent of work to be done ###

# Eyeball lines found and count 

tc_qa$ find . -type f -exec grep waitForCCR {} + |less

tc_qa$ find . -type f -exec grep waitForCCR {} + |wc -l
3425

Breakdown of some useful grep args:

grep -c  #  count the matching lines in each file
grep -h  #  show the matching line without the file name
grep -H  #  show file name and matching line
grep -l  #  show just the names of files that contain a match

tc_qa$ find . -type f -exec grep waitForCCR {} +
tc_qa$ find . -type f -exec grep -c waitForCCR {} +
tc_qa$ find . -type f -exec grep -h waitForCCR {} +
tc_qa$ find . -type f -exec grep -H waitForCCR {} +
tc_qa$ find . -type f -exec grep -l waitForCCR {} +
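
To make the difference between those flags concrete, here is a small sketch on two scratch files. The file names a.py and b.py and their contents are invented for the demo.

```shell
tmp=$(mktemp -d)
printf 'self.sut.waitForCCR()\nx = 1\nself.tc.waitForCCR()\n' > "$tmp/a.py"
printf 'y = 2\n' > "$tmp/b.py"

# Run inside the scratch directory so the reported names stay short.
counts=$(cd "$tmp" && grep -c waitForCCR a.py b.py)   # per-file counts
bare=$(cd "$tmp" && grep -h waitForCCR a.py b.py)     # matches, no names
named=$(cd "$tmp" && grep -H waitForCCR a.py)         # name:match, even for one file
files=$(cd "$tmp" && grep -l waitForCCR a.py b.py)    # names of files with a match

printf '%s\n--\n%s\n--\n%s\n--\n%s\n' "$counts" "$bare" "$named" "$files"
rm -rf "$tmp"
```

Note that -c also reports files with zero matches (b.py:0 here), which is handy for spotting files you expected to match but did not.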

tc_qa$ find . -type f -exec grep -h waitForCCR {} + |sort -i |uniq -c -i

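# NOTE on -i: sort -i ignores non-printing chars and uniq -i ignores case (as does grep -i). None of them ignore whitespace, so differently indented lines are counted separately.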
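
If you want the tally to treat differently indented copies of the same call as one, one option is to strip leading whitespace before sorting. This is a sketch on invented sample files, not part of the original procedure.

```shell
tmp=$(mktemp -d)
printf '    self.sut.waitForCCR()\n' > "$tmp/a.py"
printf '\tself.sut.waitForCCR()\n        self.tc.waitForCCR()\n' > "$tmp/b.py"

# Strip leading blanks first so indentation differences do not split the tally,
# then sort and count; awk only tidies uniq's padded count column.
tally=$(grep -h waitForCCR "$tmp"/*.py \
    | sed 's/^[[:space:]]*//' \
    | sort \
    | uniq -c \
    | awk '{print $1, $2}')

printf '%s\n' "$tally"
rm -rf "$tmp"
```

The result is a short list of distinct call spellings with counts, which is exactly the list your replace expression has to cover.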

BE PARANOID. PARANOID is GOOD.

 1. You don't want to miss things you need to change due to whitespace or positioning or syntax differences

 2. You want to be very specific in the thing you do want to change 

     e.g. want to remove waitForCCR calls in source code, 

       BUT NOT waitForCCR mentions in CHANGES 

       BUT NOT waitForCCR function definition and non-call references

 

#grep -C n -A n -B n  to look at context
tc_qa$ find . -type f -exec grep -C 3 waitForCCR {} + |less
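
A tiny sketch of the context flags on an invented file: -B shows lines before the match, -A lines after, and -C both. The sample content (including run_test) is hypothetical.

```shell
tmp=$(mktemp -d)
printf 'cfg.write()\n# Wait for cconf replication\nself.sut.waitForCCR()\nrun_test()\n' > "$tmp/a.py"

before=$(grep -B 1 waitForCCR "$tmp/a.py")   # match plus 1 line before
after=$(grep -A 1 waitForCCR "$tmp/a.py")    # match plus 1 line after
printf '%s\n--\n%s\n' "$before" "$after"
rm -rf "$tmp"
```

Looking one line back is how you spot the comments that will be left orphaned once the call is removed.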

 

### 2.0 BACKUP full work area

Again, PARANOID is GOOD.

A backup can restore all or part of your work.

If using git or something with local commits you can just locally commit changes as you go.

# e.g.
tar -zcf backup_tcqa.tgz tc_qa

# and verify:
tar -ztvf backup_tcqa.tgz


# tar -f <tarfile> use tarfile as the archive, instead of stdin/stdout or the default tape device

# tar -z use gzip compression (-j for bzip2, -J for xz)

# tar -c create tar archive

# tar -t view Table of contents

# tar -x extract tar archive

# tar -v be verbose
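
The claim that a backup can restore part of your work is worth seeing once. This sketch builds a miniature tc_qa tree in a temp directory (the file lib/a.py is made up), backs it up, breaks a file, and pulls only that file back out.

```shell
tmp=$(mktemp -d)
mkdir -p "$tmp/tc_qa/lib"
printf 'original\n' > "$tmp/tc_qa/lib/a.py"

# Create and list the backup (run from the parent so member paths are relative).
(cd "$tmp" && tar -zcf backup_tcqa.tgz tc_qa)
listing=$(cd "$tmp" && tar -ztf backup_tcqa.tgz)

# Simulate a bad edit, then pull just that one file back out of the archive.
printf 'mangled\n' > "$tmp/tc_qa/lib/a.py"
(cd "$tmp" && tar -zxf backup_tcqa.tgz tc_qa/lib/a.py)
restored=$(cat "$tmp/tc_qa/lib/a.py")

printf '%s\nrestored content: %s\n' "$listing" "$restored"
rm -rf "$tmp"
```

The member name given to tar -x has to match the path exactly as tar -t lists it.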

 

### 2. work out command expression ...

pick one or two files
tc_qa$ find . -type f -name "*.py" -exec grep -H waitForCCR {} + |less

./lib/cat/smsc/cat_mt_mt_sms_ems.py:        self.sut.waitForCCR()
./lib/cat/smsc/cat_mt_mt_sms_ems.py:        self.tc.waitForCCR()
./lib/cat/smsc/cat_mt_mt_sms_ems.py:        self.sut.waitForCCR()
./lib/cat/smsc/cat_mt_mt_sms_ems.py:        self.sut.waitForCCR()
./lib/cat/smsc/cat_mt_mt_sms_ems.py:        self.sut.waitForCCR()

# use -i.bak to make a backup as you go 
perl -pi.bak -e 's/self.tc.waitForCCR()//;s/tc.waitForCCR()//;s/sut.waitForCCR()//' lib/cat/smsc/cat_mt_mt_sms_ems.py
ls -alstr lib/cat/smsc/cat_mt_mt_sms_ems.py*
diff -u lib/cat/smsc/cat_mt_mt_sms_ems.py{.bak,}

 

### WHOOPS! the unescaped () is an empty group, so the literal parentheses get left behind. Restore and adjust expression

cp -p lib/cat/smsc/cat_mt_mt_sms_ems.py{.bak,}

# or restore from cvs if you fluff that up!   rm lib/cat/smsc/cat_mt_mt_sms_ems.py && cvs up lib/cat/smsc/cat_mt_mt_sms_ems.py

perl -pi.bak -e 's/^\s*(self.tc.waitForCCR|tc.waitForCCR|sut.waitForCCR)\(\)//' lib/cat/smsc/cat_mt_mt_sms_ems.py
diff -u lib/cat/smsc/cat_mt_mt_sms_ems.py{.bak,}

 

### WHOOPS! the self.sut.waitForCCR() lines are now missed. Restore and adjust expression

cp -p lib/cat/smsc/cat_mt_mt_sms_ems.py{.bak,}
perl -pi.bak -e 's/^\s*(self.tc.waitForCCR|self.sut.waitForCCR|tc.waitForCCR|sut.waitForCCR)\(\)//' lib/cat/smsc/cat_mt_mt_sms_ems.py
diff -u lib/cat/smsc/cat_mt_mt_sms_ems.py{.bak,}

# okay, that looks close to right. 

 

### It would be nice to remove entire line instead of leaving blank lines where waitForCCR() calls used to be.

cp -p lib/cat/smsc/cat_mt_mt_sms_ems.py{.bak,}

perl -pi.bak -e 's/^\s*(self.tc.waitForCCR|self.sut.waitForCCR|tc.waitForCCR|sut.waitForCCR)\(\)\s*#*.*$//' lib/cat/smsc/cat_mt_mt_sms_ems.py

diff -u lib/cat/smsc/cat_mt_mt_sms_ems.py{.bak,}

# Hummm, NICE. That looks good now.

 

# BE PARANOID.   CHECK changes e.g. using meld or diff

# also can use meld like diff ...

meld lib/cat/smsc/cat_mt_mt_sms_ems.py{.bak,}

## COMPARE against cvs

tc_qa$ cvs diff -u lib/cat/smsc/cat_mt_mt_sms_ems.py

 

## e.g. RESTORE FROM CVS if needed:

tc_qa$ rm lib/cat/smsc/cat_mt_mt_sms_ems.py
tc_qa$ cvs up -d -P lib/cat/smsc/cat_mt_mt_sms_ems.py

 

#### 2.1. Deal with comments and miscellaneous stuff ... 

 

2.1.1 As you page through cvs diff -u reviewing each line, if you see a stray comment or custom edit needed then just open that file and do the edit.

         # Wait for cconf replication
-        self.sut.waitForCCR()
+

e.g. get rid of now-superfluous comments like that.

perl -pi.bak -e 's/^\s*#\s*(Wait for cconf replication|Wait for CCR).*$//' lib/cat/smsc/cat_mt_mt_sms_ems.py

 

We can put our multiple replace commands into one script, e.g. vi tidyUpCCR.pl

We add a SEMI-COLON at end of each line - perl syntax to denote end of command.

s/^\s*(self.tc.waitForCCR|self.sut.waitForCCR|tc.waitForCCR|sut.waitForCCR)\(\)\s*#*.*$//;

s/^\s*#\s*(Wait for cconf replication|Wait for CCR).*$//;

 

And run like this:

perl -pi.bak2 tidyUpCCR.pl lib/cat/smsc/cat_mt_mt_sms_ems.py

tc_qa$ cvs diff -u lib/cat/smsc/cat_mt_mt_sms_ems.py

 

#### 3. ONCE HAPPY, set the hounds free by combining find and the perl -pi -e:

## cautiously at first, let's look at the first 10 files ... (again BEING PARANOID is GOOD)


tc_qa$ find . -type f -name "*.py" -exec grep -l waitForCCR {} + |head
./lib/mhlib/CimdLib.py
./lib/mhlib/SmsHubLib.py
./lib/mhlib/SmppLib.py
./lib/cat/smsc/cat_mo_to_esme_with_segmented_chinese_text.py
./lib/cat/smsc/cat_mtmt_lonely_mtfsm_transit.py
./lib/cat/smsc/cat_ue_sms_over_ip_to_short_number.py
./lib/cat/smsc/cat_mo_to_mt_DCS.py
./lib/cat/smsc/cat_ccsd_blocking_against_mo_spoof_by_a_vmsc_gt.py
./lib/cat/smsc/cat_smart_control.py
./lib/cat/smsc/cat_segmented_message_no_delivery_report.py

## check what we will change ...
tc_qa$ find . -type f -name "*.py" -exec grep -l waitForCCR {} + |head |xargs grep waitForCCR

## set the hounds free on first 10 files:
### just by expression tc_qa$ find . -type f -name "*.py" -exec grep -l waitForCCR {} + |head |xargs perl -pi.bak -e 's/^\s*(self.tc.waitForCCR|self.sut.waitForCCR|tc.waitForCCR|sut.waitForCCR)\(\)\s*#*.*$//' 

### better using the script:

tc_qa$ find . -type f -name "*.py" -exec grep -l waitForCCR {} + |head |xargs perl -pi.bak tidyUpCCR.pl

# or in batches, by directory:

tc_qa$ find lib/tests/mmsc -type f -name "*.py" -exec grep -l waitForCCR {} + |head |xargs perl -pi.bak tidyUpCCR.pl

cvs up -d -P lib/tests/mmsc

cvs diff -u lib/tests/mmsc |less

### HUMM, interesting,
 * after storeCConfItem: we want to remove, ok
 * after cfg.write(): at first it looked like we want to keep these (and there are some cfg.write() calls with no waitForCCR after them) ---- but NO, actually they should also be removed (double-check when reviewing the code)
 * there are some general waitForCCR() calls at the start of tests; should they go? It depends on what is in the setup functions
 * there are some #Wait for CCR comments which are superfluous anyway
 * there's the odd waitForCCR, e.g. at the start of setUp in cat_smart_control.py ... why is that there?

### REVIEW:
grep again and count
meld to compare
also cvs diff -u if files in cvs
OR other source control review before commit

 

### NEXT STAGE of REVIEW: TEST IT!

rsync -avzhP FROM TO

rsync each file directly into a QA container, or onto the host and then into the container, and run the regression tests

if tests are good .. then BE PARANOID ... but you might be close to being able to commit the changes

 

TIP: do bite-sized chunks of changes and ONLY TEST and COMMIT from a completely clean workspace

e.g. don't commit 10 changed files after changing 12 and testing with those 12 changes

It makes sure you eyeball everything that is committed and also that everything works together.