Optimization in preparation for P12C (August 2012)

From Daya Bay
(Redirected from Profile r17658)
Jump to navigation Jump to search

Also see Optimization

profiling on mac

20120801

  • Profile r17658 (used for KUP starting 25 July 2012) on my mac desktop (os x 10.6.8)
  • Install optimized trunk at r17658

20120802

  • Make installation optimized by default
  • test installation $ ./dybinst trunk tests
  • DBI tests fail due to stale password in .my.cnf
  • re-test after freshening password ./dybinst trunk tests db dbivalidate
    • two failures due to lack of data file used in tests Details here.
  • get the raw data file from pdsf: scp djaffe@pdsf.nersc.gov:/eliza16/dayabay/data/exp/dayabay/2012/daq/Neutrino/0304/daq.Neutrino.0022647.Physics.EH1-Merged.SFO-1._0205.data .
  • What is run for KUP? Command at link has errant "-m".
    • Here is what I ran and profiled: nuwa.py -n -1 @$KUP11ASHARE/runReco @$KUP11ASHARE/runTags @$KUP11ASHARE/runODM @$KUP11ASHARE/runFilters -o recon.root ../data/daq.Neutrino.0022647.Physics.EH1-Merged.SFO-1._0205.data
  • Put the statistics in #1025 Keep Up Production Job Optimization, a perpetual ticket created by Dan.

20120803

  • Yasu says that ChargeTemplatePosTool has been improved at r17689 so I should profile a later version of trunk to check if #1129 Optimize ChargeTemplatePosTool is done.
  • Do analogous dance with ./dybinst -z 17726 trunk all

20120806

  • Profiling $ nuwa.py -n -1 @$KUP11ASHARE/runReco @$KUP11ASHARE/runTags @$KUP11ASHARE/runODM @$KUP11ASHARE/runFilters -o recon_r17726.root ../data/daq.Neutrino.0022647.Physics.EH1-Merged.SFO-1._0205.data >& r17726.log 2>&1 < /dev/null &
  • Don't see any performance improvement. Details in #1129
  • Yasu said he forgot update trunk, so repeat procedure
$ mkdir r17731
$ cd r17731
$ svn export http://dayabay.ihep.ac.cn/svn/dybsvn/installation/trunk/dybinst/dybinst
A    dybinst
Export complete.
$ export CMTCONFIG="i386-darwin-gcc42-opt"
$ ./dybinst -z 17731 trunk all

20120807

  • r17731 is definite improvement for ChargeTemplatePosTool. Close #1129 after adding details.
  • However I notice that nuwa job is now using >5GB memory, this caused Instruments (the mac app with the time profiler) to crash.
  • Investigate memory usage (see FAQ:How_much_memory_is_nuwa_using?).
$ nuwa.py  -n 5000 -m "MemoryCheck -o memory.root" @$KUP11ASHARE/runReco  @$KUP11ASHARE/runTags @$KUP11ASHARE/runODM @$KUP11ASHARE/runFilters -o recon_r17731_memorycheck.root  ../data/daq.Neutrino.0022647.Physics.EH1-Merged.SFO-1._0205.data >& r17731_memorycheck.log 2>&1 < /dev/null &
[1] 57801
  • Not enough data analyzed, try again:
]$ nuwa.py  -n -1 -m "MemoryCheck -o memory.root" @$KUP11ASHARE/runReco  @$KUP11ASHARE/runTags @$KUP11ASHARE/runODM @$KUP11ASHARE/runFilters -o recon_r17731_memorycheck.root  ../data/daq.Neutrino.0022647.Physics.EH1-Merged.SFO-1._0205.data >& r17731_memorycheck.log 2>&1 < /dev/null &
[1] 58006
  • Results indicate resident memory usage is below 1 GB. Figure shows memory usage in GB vs execution cycle.

R17731 resident memory in GB from MemoryCheck.png

20120808

RollingGain optimization

  • Zhe reports that he modified RollingGain to address #1231 at r17740. Test it.
  • Usual dance
$ export CMTCONFIG="i386-darwin-gcc42-opt"[djaffe@bnlku24 work]$ svn export 
$ mkdir r17740
$ cd r17740
$ svn export http://dayabay.ihep.ac.cn/svn/dybsvn/installation/trunk/dybinst/dybinst
A    dybinst
Export complete.
[djaffe@bnlku24 r17740]$ ./dybinst -z 17740 trunk all
  • Substantial improvement see #1231

RPC cable cache optimization

  • Jiaheng reports he addressed #1230 at r17745.
  • Usual dance (started after r17740 installation)
$ mkdir r17745
$ cd r17745
$ export CMTCONFIG="i386-darwin-gcc42-opt"[djaffe@bnlku24 r17745]$ svn export http://dayabay.ihep.ac.cn/svn/dybsvn/installation/trunk/dybinst/dybinst
A    dybinst
Export complete.
$ ./dybinst -z 17745 trunk all
  • Major improvement. See #1230

20120809

  • Most items in #1025 addressed. Try to make overall comparison of CPU time usage improvement. Can't use --audit, so compare start time of ApplicationManager with time that log file was closed.
  • For r17745:
    • Start: 9 Aug 2012 08:32:58.
    • End: 9 Aug 2012 10:23
    • Elapsed time 111 minutes.
  • Original r17658 installation was modified with r17735. Do-over:
$ mv r17658 r17658_with_r17725mod
$ mkdir  r17658 
$ export CMTCONFIG="i386-darwin-gcc42-opt"
$ cd r17658
$ svn export http://dayabay.ihep.ac.cn/svn/dybsvn/installation/trunk/dybinst/dybinst
A    dybinst
Export complete.
$ ./dybinst -z 17658  trunk all

20120810

  • Build of r17658 completed. Run the job
nuwa.py  -n -1 @$KUP11ASHARE/runReco  @$KUP11ASHARE/runTags @$KUP11ASHARE/runODM @$KUP11ASHARE/runFilters -o recon_r17658.root  ../data/daq.Neutrino.0022647.Physics.EH1-Merged.SFO-1._0205.data >& r17658.log 2>&1 < /dev/null &
[1] 57065
  • Start: Aug 10 08:39:20 2012
  • End: Aug 10 11:01
  • Elapsed time: 142 minutes
  • Comparisons with r17745:
    • 142/111 = 1.28
    • 111/142 = 0.78
  • The estimate for P12C production time was (roughly) 30 days. This is now reduced by about one week to ~23.5 days.

20120905

  • Test optimized trunk at r18013 on mac. This is a guess at the revision to be used for P12C.
  • svn export http://dayabay.ihep.ac.cn/svn/dybsvn/installation/trunk/dybinst/dybinst
  • export CMTCONFIG="i386-darwin-gcc42-opt"
  • ./dybinst -z 18013 trunk all
  • make opt by default
  • ./dybinst trunk tests
  • OK
  •  $  nuwa.py -n -1 @$KUP11ASHARE/runReco @$KUP11ASHARE/runTags @$KUP11ASHARE/runODM @$KUP11ASHARE/runFilters -o recon_r18013.root ../data/daq.Neutrino.0022647.Physics.EH1-Merged.SFO-1._0205.data >& r18013.log 2>&1 < /dev/null &
  • Done
    • start Sep 6 10:38:39 2012
    • end Sep 6 12:41
    • elapsed 2:03 = 123 minutes
  • Compares favorably to r17658 performance (142 minutes), so projected time for P12C is unchanged

Data file size info

Email from Cheng-Ju:

The file size has been constant at about 1GB per file. However, it looks like we have about 1.7 times more readout per file now than a year ago. 

From March 2012: daq.Neutrino.0022647.Physics.EH1-Merged.P12B-P._0205 has 755681 readouts
From Sept 2011: daq.NoTag.0014427.Physics.EH1-Merged.P12B-P._0001 has 434215 readouts

I haven't checked the content, but the difference in readout size may be related to the flash ADC data.
  • FADC data was prescaled in Dec(?)2011. I suspect that is the reason for the increase in readouts (same data volume but less data per readout). Confirmed by Miao: Yes. 40% data volume reduction due to prescaling of FADC is consistent with 70% increase of readouts.

Profiling Statics from nuwa.py --audit on PDSF

Showing only processes that used one second or more of CPU time.

ibdBoxCutFigsFarAD4:execute           INFO Time User   : Tot= 1.64  [s] Ave/Min/Max= 2.17(+- 46.5)/    0/1e+03 [us] #=757006
ibdBoxCutFigsFarAD2:execute           INFO Time User   : Tot= 1.74  [s] Ave/Min/Max=  2.3(+- 47.9)/    0/1e+03 [us] #=757006
ibdBoxCutFigsLingAoAD1:execute        INFO Time User   : Tot= 1.76  [s] Ave/Min/Max= 2.32(+- 48.1)/    0/1e+03 [us] #=757006
ibdBoxCutFigsFarAD3:execute           INFO Time User   : Tot= 1.73  [s] Ave/Min/Max= 2.29(+- 47.8)/    0/1e+03 [us] #=757006
ibdBoxCutFigsLingAoAD2:execute        INFO Time User   : Tot= 1.77  [s] Ave/Min/Max= 2.34(+- 48.3)/    0/1e+03 [us] #=757006
ibdBoxCutFigsFarAD1:execute           INFO Time User   : Tot=  1.8  [s] Ave/Min/Max= 2.38(+- 48.7)/    0/1e+03 [us] #=757006
ibdBoxCutFigsDayaBayAD2:execute       INFO Time User   : Tot= 2.22  [s] Ave/Min/Max= 2.93(+-   54)/    0/1e+03 [us] #=757006
Ge68Data:execute                      INFO Time User   : Tot= 2.27  [s] Ave/Min/Max=    3(+- 54.7)/    0/1e+03 [us] #=757006
DybArchiveOutput:finalize             INFO Time User   : Tot= 2.16  [s]                                             #=  1
Co60Data:execute                      INFO Time User   : Tot= 2.37  [s] Ave/Min/Max= 3.13(+- 55.8)/    0/1e+03 [us] #=757006
keep_Event_Data_Physics_ADETWind...   INFO Time User   : Tot= 2.65  [s] Ave/Min/Max= 3.49(+-   59)/    0/1e+03 [us] #=757006
keep_Event_Data_Calib_ADLED:execute   INFO Time User   : Tot= 2.68  [s] Ave/Min/Max= 3.54(+- 59.4)/    0/1e+03 [us] #=757006
keep_Event_Data_Calib_Co60:execute    INFO Time User   : Tot=  2.7  [s] Ave/Min/Max= 3.57(+- 59.6)/    0/1e+03 [us] #=757006
keep_Event_Data_Physics_NeutronS...   INFO Time User   : Tot= 2.68  [s] Ave/Min/Max= 3.54(+- 59.4)/    0/1e+03 [us] #=757006
keep_Event_Data_Calib_Ge68:execute    INFO Time User   : Tot= 2.69  [s] Ave/Min/Max= 3.56(+- 59.6)/    0/1e+03 [us] #=757006
keep_Event_Tag_Calib_Co60:execute     INFO Time User   : Tot=  2.7  [s] Ave/Min/Max= 3.57(+- 59.7)/    0/1e+03 [us] #=757006
keep_Event_Data_Physics_Coincide...   INFO Time User   : Tot= 2.73  [s] Ave/Min/Max=  3.6(+- 59.9)/    0/1e+03 [us] #=757006
keep_Event_Tag_Physics_NeutronH:...   INFO Time User   : Tot= 2.74  [s] Ave/Min/Max= 3.62(+-   60)/    0/1e+03 [us] #=757006
keep_Event_Tag_Calib_ADLED:execute    INFO Time User   : Tot= 2.74  [s] Ave/Min/Max= 3.62(+- 60.1)/    0/1e+03 [us] #=757006
keep_Event_Tag_Physics_Coinciden...   INFO Time User   : Tot= 2.72  [s] Ave/Min/Max=  3.6(+- 59.8)/    0/1e+03 [us] #=757006
keep_Event_Data_Physics_NeutronS...   INFO Time User   : Tot= 2.77  [s] Ave/Min/Max= 3.65(+- 60.3)/    0/1e+03 [us] #=757006
keep_Event_Tag_Muon_MuonAny:execute   INFO Time User   : Tot= 2.76  [s] Ave/Min/Max= 3.65(+- 60.3)/    0/1e+03 [us] #=757006
keep_Event_Tag_Muon_FirstMuonTri...   INFO Time User   : Tot= 2.77  [s] Ave/Min/Max= 3.66(+- 60.4)/    0/1e+03 [us] #=757006
keep_Event_CalibReadout_CalibRea...   INFO Time User   : Tot= 2.78  [s] Ave/Min/Max= 3.67(+- 60.5)/    0/1e+03 [us] #=757006
keep_Event_Tag_Calib_Ge68:execute     INFO Time User   : Tot= 2.78  [s] Ave/Min/Max= 3.67(+- 60.4)/    0/1e+03 [us] #=757006
keep_Event_Tag_Physics_ADETWindo...   INFO Time User   : Tot= 2.78  [s] Ave/Min/Max= 3.67(+- 60.5)/    0/1e+03 [us] #=757006
keep_Event_Tag_Muon_MuonAll:execute   INFO Time User   : Tot= 2.78  [s] Ave/Min/Max= 3.68(+- 60.5)/    0/1e+03 [us] #=757006
keep_Event_Rec_AdScaled:execute       INFO Time User   : Tot= 2.79  [s] Ave/Min/Max= 3.68(+- 60.6)/    0/1e+03 [us] #=757006
keep_Event_Data_Physics_ADSingle...   INFO Time User   : Tot= 2.78  [s] Ave/Min/Max= 3.67(+- 60.5)/    0/1e+03 [us] #=757006
keep_Event_Tag_Physics_MuonLoose...   INFO Time User   : Tot=  2.8  [s] Ave/Min/Max=  3.7(+- 60.7)/    0/1e+03 [us] #=757006
keep_Event_Tag_Physics_Coinciden...   INFO Time User   : Tot= 2.81  [s] Ave/Min/Max= 3.72(+- 60.9)/    0/1e+03 [us] #=757006
keep_Event_Tag_Physics_MuonTight...   INFO Time User   : Tot= 2.82  [s] Ave/Min/Max= 3.73(+- 60.9)/    0/1e+03 [us] #=757006
keep_Event_Data_Physics_MuonData...   INFO Time User   : Tot= 2.81  [s] Ave/Min/Max= 3.71(+- 60.8)/    0/1e+03 [us] #=757006
keep_Event_Data_Physics_MuonRecS...   INFO Time User   : Tot= 2.81  [s] Ave/Min/Max= 3.71(+- 60.8)/    0/1e+03 [us] #=757006
keep_Event_Tag_Physics_NeutronSp...   INFO Time User   : Tot= 2.84  [s] Ave/Min/Max= 3.75(+- 61.1)/    0/1e+03 [us] #=757006
keep_Event_Tag_Muon_Retrigger:ex...   INFO Time User   : Tot= 2.82  [s] Ave/Min/Max= 3.72(+- 60.9)/    0/1e+03 [us] #=757006
keep_Event_Tag_Physics_NeutronGd...   INFO Time User   : Tot= 2.86  [s] Ave/Min/Max= 3.78(+- 61.4)/    0/1e+03 [us] #=757006
keep_Event_Data_Physics_Coincide...   INFO Time User   : Tot=  2.8  [s] Ave/Min/Max=  3.7(+- 60.7)/    0/1e+03 [us] #=757006
keep_Event_Rec_MuonSimple:execute     INFO Time User   : Tot= 2.87  [s] Ave/Min/Max= 3.79(+- 61.4)/    0/1e+03 [us] #=757006
keep_Event_Rec_PoolSimple:execute     INFO Time User   : Tot= 2.83  [s] Ave/Min/Max= 3.74(+- 61.1)/    0/1e+03 [us] #=757006
keep_Event_Data_Physics_Spallati...   INFO Time User   : Tot= 2.88  [s] Ave/Min/Max= 3.81(+- 61.6)/    0/1e+03 [us] #=757006
keep_Event_Tag_Physics_NeutronSp...   INFO Time User   : Tot= 2.87  [s] Ave/Min/Max= 3.79(+- 61.5)/    0/1e+03 [us] #=757006
keep_Event_Rec_RpcSimple:execute      INFO Time User   : Tot= 2.92  [s] Ave/Min/Max= 3.85(+- 61.9)/    0/1e+03 [us] #=757006
keep_Event_Tag_Physics_ADSingles...   INFO Time User   : Tot= 2.95  [s] Ave/Min/Max=  3.9(+- 62.3)/    0/1e+03 [us] #=757006
ibdBoxCutFigs_DayaBayAD1:execute      INFO Time User   : Tot= 3.32  [s] Ave/Min/Max= 4.39(+- 66.1)/    0/1e+03 [us] #=757006
NeutronSpTightData:execute            INFO Time User   : Tot= 3.83  [s] Ave/Min/Max= 5.07(+-   71)/    0/1e+03 [us] #=757006
clearAlg:execute                      INFO Time User   : Tot= 4.17  [s] Ave/Min/Max= 5.51(+-   74)/    0/1e+03 [us] #=757006
keep_Event_Rec_AdSimple:execute       INFO Time User   : Tot=  4.3  [s] Ave/Min/Max= 5.68(+- 75.2)/    0/1e+03 [us] #=757006
FlasherIdAlg:execute                  INFO Time User   : Tot= 4.71  [s] Ave/Min/Max= 6.22(+- 78.6)/    0/1e+03 [us] #=757006
SpallationFigsAlg:execute             INFO Time User   : Tot=  5.6  [s] Ave/Min/Max=  7.4(+- 85.7)/    0/2e+03 [us] #=757006
NeutronSpLooseData:execute            INFO Time User   : Tot= 5.76  [s] Ave/Min/Max= 7.61(+- 86.9)/    0/1e+03 [us] #=757006
ADLEDData:execute                     INFO Time User   : Tot= 5.89  [s] Ave/Min/Max= 7.78(+- 87.9)/    0/1e+03 [us] #=757006
ADSingleData:execute                  INFO Time User   : Tot= 5.87  [s] Ave/Min/Max= 7.75(+- 87.7)/    0/1e+03 [us] #=757006
keep_Event_Data_CalibStats:execute    INFO Time User   : Tot= 5.54  [s] Ave/Min/Max= 3.66(+- 60.3)/    0/1e+03 [us] #=1514012
AdReconFigs:execute                   INFO Time User   : Tot= 9.59  [s] Ave/Min/Max= 12.7(+-  112)/    0/2e+03 [us] #=757006
NeutronSpLooseTag:execute             INFO Time User   : Tot= 9.59  [s] Ave/Min/Max= 12.7(+-  112)/    0/1e+03 [us] #=757006
MuonRecC:execute                      INFO Time User   : Tot= 10.6  [s] Ave/Min/Max=   14(+-  117)/    0/1e+03 [us] #=757006
NeutronCap:execute                    INFO Time User   : Tot=   12  [s] Ave/Min/Max= 15.9(+-  125)/    0/1e+03 [us] #=757006
MuonData:execute                      INFO Time User   : Tot= 12.5  [s] Ave/Min/Max= 16.5(+-  127)/    0/1e+03 [us] #=757006
Ge68Tag:execute                       INFO Time User   : Tot= 9.14  [s] Ave/Min/Max= 12.1(+-  109)/    0/1e+03 [us] #=757006
random_seeder:execute                 INFO Time User   : Tot= 14.1  [s] Ave/Min/Max= 18.7(+-  135)/    0/1e+03 [us] #=757006
ADCoincTagging:execute                INFO Time User   : Tot=   15  [s] Ave/Min/Max= 19.9(+-  140)/    0/2e+03 [us] #=757006
ADLEDTag:execute                      INFO Time User   : Tot= 18.4  [s] Ave/Min/Max= 24.3(+-  154)/    0/1e+03 [us] #=757006
ToolSvc.ChargePointTool:initialize    INFO Time User   : Tot=   20  [s]                                             #=  1
PoolSimple:initialize                 INFO Time User   : Tot=   20  [s]                                             #=  1
Co60Tag:execute                       INFO Time User   : Tot= 12.1  [s] Ave/Min/Max=   16(+-  125)/    0/1e+03 [us] #=757006
MuonCalibFigs:execute                 INFO Time User   : Tot= 31.9  [s] Ave/Min/Max= 42.1(+-  277)/    0/5.9e+04 [us] #=757006
MuonTightAlg:execute                  INFO Time User   : Tot= 34.8  [s] Ave/Min/Max=   46(+-  210)/    0/1e+03 [us] #=757006
PoolSimple:execute                    INFO Time User   : Tot= 37.5  [s] Ave/Min/Max= 49.6(+-  217)/    0/1e+03 [us] #=757006
RpcSimple:execute                     INFO Time User   : Tot= 43.2  [s] Ave/Min/Max=   57(+-3.06e+04)/    0/2.66e+07 [us] #=757006
RollingGain:finalize                  INFO Time User   : Tot= 44.7  [s]                                             #=  1
SpallData:execute                     INFO Time User   : Tot= 45.7  [s] Ave/Min/Max= 60.3(+-  238)/    0/1e+03 [us] #=757006
MuonLooseAlg:execute                  INFO Time User   : Tot= 49.7  [s] Ave/Min/Max= 65.7(+-  248)/    0/1e+03 [us] #=757006
NeutronSpTightTag:execute             INFO Time User   : Tot= 57.3  [s] Ave/Min/Max= 75.7(+-  288)/    0/2e+03 [us] #=757006
ADSingleTag:execute                   INFO Time User   : Tot= 61.7  [s] Ave/Min/Max= 81.5(+-  274)/    0/1e+03 [us] #=757006
AdScaled:execute                      INFO Time User   : Tot= 62.2  [s] Ave/Min/Max= 82.2(+-  275)/    0/2e+03 [us] #=757006
CoincidenceTight:execute              INFO Time User   : Tot=  142  [s] Ave/Min/Max=  188(+-  849)/    0/7e+03 [us] #=757006
AdCalibFigs:execute                   INFO Time User   : Tot=  158  [s] Ave/Min/Max=  209(+-  407)/    0/8e+03 [us] #=757006
AdReadoutFigs:execute                 INFO Time User   : Tot=  304  [s] Ave/Min/Max=  402(+-3.48e+03)/    0/1e+06 [us] #=757006
CalibStatsAlg:execute                 INFO Time User   : Tot=  413  [s] Ave/Min/Max=0.545(+-0.529)/    0/    5 [ms] #=757006
RollingGain:execute                   INFO Time User   : Tot=  450  [s] Ave/Min/Max=0.595(+-0.903)/    0/   25 [ms] #=757006
CalibAlg:execute                      INFO Time User   : Tot= 8.84[min] Ave/Min/Max=0.701(+-0.741)/    0/    8 [ms] #=757006
AdSimple:execute                      INFO Time User   : Tot= 11.4[min] Ave/Min/Max=0.904(+- 1.29)/    0/   27 [ms] #=757006
DybArchiveOutput:execute              INFO Time User   : Tot= 28.1[min] Ave/Min/Max= 2.23(+- 8.63)/    0/1.03e+03 [ms] #=757006
ChronoStatSvc                         INFO Time User   : Tot= 91.6[min]                                             #=  1
*****Chrono*****                      INFO *******