howtos:profiling:gperftools [2015/06/19 12:19] – Add notes about profiling with gperftools techee
howtos:profiling:gperftools [2020/03/16 13:58] (current) – OS X to macOS 邢家朋
Line 16:
  - To start profiling, run <code>
LD_PRELOAD=/usr/lib/libprofiler.so CPUPROFILE=/tmp/geany.prof geany
</code> LD_PRELOAD preloads the profiler's library, CPUPROFILE sets the location of the profiler's output, and geany is the name of the binary to be executed. The path to the libprofiler library may differ between distributions; I have also seen it under /usr/lib/libprofiler.so.0.
  - Do whatever you want to profile in Geany. When finished, close Geany.
  - At this point, /tmp/geany.prof contains the CPU profile information. To view it graphically, run <code>
Line 26:
In the above graph we can see that the skipEverything() function takes 55.2% of the total time, so it looks like a good candidate for optimization.
  
**Note**: I keep getting <nowiki>__nss_hosts_lookup</nowiki> in the plot even though no host lookup is actually being performed. It can be ignored; it probably appears because some system function whose symbol name isn't known is being called. It is sufficient to look at the function that calls <nowiki>__nss_hosts_lookup</nowiki>, which may be the one that needs optimization.
  
**Note**: I wasn't able to get profiles with a reasonable level of detail from the GTK libraries provided by the system, so if there is a GTK-related problem, it's best to compile GTK yourself.
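
Since the libprofiler path used in the start-up step varies between distributions, a small helper can look for it before preloading. This is only a sketch: the function name ''find_libprofiler'' and the extra directories in the usage comment are assumptions, not something gperftools provides.

```shell
# Hypothetical helper: print the first libprofiler shared object found
# in the directories given as arguments, or return non-zero if none exists.
find_libprofiler() {
    for dir in "$@"; do
        # Check the unversioned name first, then versioned names such as .so.0
        for candidate in "$dir"/libprofiler.so "$dir"/libprofiler.so.*; do
            if [ -e "$candidate" ]; then
                printf '%s\n' "$candidate"
                return 0
            fi
        done
    done
    return 1
}

# Possible usage (the directory list is an assumption, adjust for your system):
#   lib=$(find_libprofiler /usr/lib /usr/lib64 /usr/lib/x86_64-linux-gnu) &&
#       LD_PRELOAD="$lib" CPUPROFILE=/tmp/geany.prof geany
```

If the helper finds nothing, the library is probably not installed or lives in a directory that was not listed.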
  
==== Profiling only part of the process runtime ====
Line 49:
google-pprof --web /usr/local/bin/geany /tmp/geany.prof.0
</code> Notice the extension 0 in the profile file name: you can record multiple profiles during a single run by sending the signal several times, and the individual profiles are numbered by an extension starting from 0.
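
When several numbered profiles have been recorded in one run, it can be handy to walk through them in order. The following is a minimal sketch under the assumption that the files are numbered consecutively from 0, as described above; ''list_profiles'' is a hypothetical helper name.

```shell
# Hypothetical helper: print base.0, base.1, ... for as long as the files exist.
# It stops at the first missing number.
list_profiles() {
    base=$1
    n=0
    while [ -e "$base.$n" ]; do
        printf '%s\n' "$base.$n"
        n=$((n + 1))
    done
}

# Possible usage: write a text report next to each recorded profile.
#   for p in $(list_profiles /tmp/geany.prof); do
#       google-pprof --text /usr/local/bin/geany "$p" > "$p.txt"
#   done
```

The usage comment uses the text output instead of ''--web'' so that the reports can be compared side by side without opening a browser for each one.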
==== macOS ====
There's a bug in older versions of macOS (or OS X) which prevents getting good results from the profiler. There's a (scary) way to fix this by [[https://godoc.org/rsc.io/pprof_mac_fix| binary patching the OS X kernel]] (I've used it and it worked, but there are no guarantees at all).
  
==== Other options ====
gperftools has many additional options. I found the above sufficient for profiling Geany, but some cases may need an extra parameter (such as filtering the graph to a subset or sampling more frequently). See the CPU profiler [[http://gperftools.googlecode.com/git/doc/cpuprofile.html| documentation]] for more info.
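
As an illustration of the extra parameters mentioned above, the sampling rate can be raised through the ''CPUPROFILE_FREQUENCY'' environment variable (samples per second, 100 by default). The wrapper below is only a sketch: the function name and the ''LIBPROFILER'' variable are assumptions made for this example, and the default library path is the one used earlier on this page.

```shell
# Hypothetical wrapper: run a command under the CPU profiler with a custom
# sampling frequency.
#   $1 = samples per second, $2 = profile output file, rest = command to run
# LIBPROFILER (an assumed variable, not part of gperftools) overrides the
# library path.
profile_with_frequency() {
    freq=$1
    out=$2
    shift 2
    LD_PRELOAD=${LIBPROFILER-/usr/lib/libprofiler.so} \
    CPUPROFILE_FREQUENCY=$freq \
    CPUPROFILE=$out \
    "$@"
}

# Possible usage: sample 500 times per second, then restrict the graph to
# call paths matching a function name with pprof's --focus option.
#   profile_with_frequency 500 /tmp/geany.prof geany
#   google-pprof --web --focus=skipEverything /usr/local/bin/geany /tmp/geany.prof
```

A higher frequency gives finer-grained data for short operations, at the cost of more profiling overhead.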