#56793 closed defect (fixed)
macports base: installing a port with many files appears to hang; file registration loop slows dramatically for file counts > 10,000
Reported by: | yan12125 (Chih-Hsuan Yen) | Owned by: | mascguy (Christopher Nielsen) |
---|---|---|---|
Priority: | Normal | Milestone: | MacPorts 2.7.1 |
Component: | base | Version: | 2.5.3 |
Keywords: | Cc: | jmroot (Joshua Root) | |
Port: |
Description (last modified by mascguy (Christopher Nielsen))
I'm trying to install texlive-fonts-extra, which has 87,000 files in its destroot/ folder. port -d install texlive-fonts-extra
appears hang - no messages for several minutes and the CPU usage keeps high at around ~98%.
I found that the installation process is running the loop at:
If the loop is inevitable, how about adding some messages saying that the port process is busy doing something? It will improve the user experience greatly.
/cc the author of the aforementioned loop.
Attachments (4)
Change History (32)
comment:1 Changed 4 years ago by gpolitis (George Politis)
comment:2 Changed 4 years ago by mascguy (Christopher Nielsen)
Cc: | mascguy added |
---|
comment:3 Changed 4 years ago by mascguy (Christopher Nielsen)
I'm currently dealing with this too, for port texlive-fonts-extra
. Except in my case, the process has been running for more than 20 minutes.
Since the port installs nearly 90,000 files, it's reasonable for the process to take some time. But ideally not that long.
Ryan/Josh, do you folks know if it would be feasible to migrate the logic in question, to native code? Or are there other potential options to improve the performance of this loop?
I'd gladly help with the development effort, though I'd need some hints/direction.
comment:4 Changed 4 years ago by mascguy (Christopher Nielsen)
If there's no way to improve the performance of this, would it be possible to add a progress bar?
comment:5 Changed 4 years ago by mascguy (Christopher Nielsen)
Also, just curious: Have we considered using a dictionary, rather than an array, for the code in question?
I haven't (yet) debugged this enough to understand where the time is being spent. But based on research, dictionaries provide significantly faster performance, vs. an array. And it can't hurt...
comment:6 follow-up: 7 Changed 4 years ago by jmroot (Joshua Root)
I doubt the array is the performance bottleneck there; seems much more likely to be constructing all the file objects. An interface to set the property for a list of files might help. But let's not act on guesses when it comes to performance: profile!
comment:7 Changed 4 years ago by mascguy (Christopher Nielsen)
Replying to jmroot:
I doubt the array is the performance bottleneck there; seems much more likely to be constructing all the file objects. An interface to set the property for a list of files might help. But let's not act on guesses when it comes to performance: profile!
Agreed, profiling is the way to go. And port texlive-fonts-extra
makes a great test case!
As a starting point, I added ui_debug
statements to the file registration loop, and this is what I'm seeing: Initially the loop was quite fast. However, things gradually start slowing. And as we approach the 30,000th file, the loop has slowed dramatically.
By the time registration hits the 60,000th file, the process has gotten even slower.
In any case, I'll re-run this, and calculate timings. Stay tuned...
comment:8 follow-up: 9 Changed 4 years ago by jmroot (Joshua Root)
I would suggest learning to use the profiler
package in tcllib.
comment:9 Changed 4 years ago by mascguy (Christopher Nielsen)
Changed 4 years ago by mascguy (Christopher Nielsen)
Attachment: | port-install-texlive-fonts-extra-timings.txt added |
---|
Changed 4 years ago by mascguy (Christopher Nielsen)
Attachment: | port-install-texlive-fonts-extra-sampled.txt added |
---|
comment:10 Changed 4 years ago by mascguy (Christopher Nielsen)
Attached timing info, as well as a sample of the MacPorts tclsh process. The file registration work for texlive-fonts-extra
took a little over 28 minutes, and total port installation time was approximately 30 minutes.
If you review the timing file (port-install-texlive-fonts-extra-timings.txt), you'll see the performance is very non-linear: It's extremely fast at first, but grows progressively slower. By the end, only a few hundred file registrations are occurring during every 15-second interval.
As for profiling, I started by adding package require Tclx
at the top of portinstall.tcl
, but that resulted in the following error:
DEBUG: unknown user id: /opt/local/var/macports/build/_opt_local_var_macports_sources_rsync.macports.org_macports_release_tarballs_ports_texlive-fonts-extra/texlive-fonts-extra
Is the profiler in the Tclx package, the one to go with? And if so, any ideas? If not, what should we be using?
comment:11 Changed 4 years ago by mascguy (Christopher Nielsen)
Based on a cursory look at the process sample, I'm wondering if there is a less-than-optimal hashtable implementation somewhere? Or a case where the hashtable (within sqlite3, or wherever) needs to be initialized with a much larger default size?
Ultimately this looks like a classic case where a hashtable has a very small number of buckets, resulting in a high rate of key clashes.
comment:12 Changed 4 years ago by mascguy (Christopher Nielsen)
Cc: | mascguy removed |
---|---|
Owner: | set to mascguy |
Status: | new → assigned |
comment:13 Changed 4 years ago by mascguy (Christopher Nielsen)
Summary: | Installing a port with many files appears appears hang → macports base: installing a port with many files appears to hang |
---|---|
Version: | 2.5.3 → 2.7.0 |
comment:14 Changed 4 years ago by mascguy (Christopher Nielsen)
Summary: | macports base: installing a port with many files appears to hang → macports base: installing a port with many files appears to hang; file registration loop slows dramatically for file counts > 10,000 |
---|
comment:15 Changed 4 years ago by mascguy (Christopher Nielsen)
Description: | modified (diff) |
---|
comment:16 Changed 4 years ago by mascguy (Christopher Nielsen)
FYI, I'm rerunning the timing test, with the following modification to the file registration loop in portinstall.tcl
:
if {[info exists installPlist]} { ui_msg "Registering files - START" set clicks_open_total 0 set clicks_binary_total 0 # register files $regref map $installPlist foreach f [array names portinstall::file_is_binary] { ui_debug "Registering file: ${f}" set clicks_open_start [clock clicks -microseconds] set fileref [registry::file open [$regref id] $f] set clicks_open_end [clock clicks -microseconds] set clicks_open [expr { ${clicks_open_end} - ${clicks_open_start} }] set clicks_binary_start [clock clicks -microseconds] $fileref binary $portinstall::file_is_binary($f) set clicks_binary_end [clock clicks -microseconds] set clicks_binary [expr { ${clicks_binary_end} - ${clicks_binary_start} }] set clicks_open_total [expr { ${clicks_open_total} + ${clicks_open} }] set clicks_binary_total [expr { ${clicks_binary_total} + ${clicks_binary} }] ui_debug "File Time: open: ${clicks_open}, binary: ${clicks_binary}" } ui_msg "Registering files - END" ui_msg "Total Time: open: ${clicks_open_total}, binary: ${clicks_binary_total}" }
I'll attach the full log once it completes.
Changed 4 years ago by mascguy (Christopher Nielsen)
Attachment: | port-install-texlive-fonts-extra-install-timing-log.txt.gz added |
---|
comment:17 Changed 4 years ago by mascguy (Christopher Nielsen)
The full port install log, containing timings, is attached; filename: port-install-texlive-fonts-extra-install-timing-log.txt.gz
Comparing the runtime of line set fileref [registry::file open [$regref id] $f]
from the first two files, to the last:
:msg:install Registering files - START :debug:install Registering file: /opt/local/share/texmf-texlive/fonts/vf/rozynski/comicneue/ComicNeueAngular-tlf-t1.vf :debug:install File Time: open: 231, binary: 55 :debug:install Registering file: /opt/local/share/texmf-texlive/fonts/type1/public/hfbright/hfbrmi8.pfb :debug:install File Time: open: 57, binary: 30 [...] :debug:install Registering file: /opt/local/share/texmf-texlive/fonts/tfm/google/roboto/RobotoCondensed-BoldItalic-tlf-sc-ly1.tfm :debug:install File Time: open: 38728, binary: 153 :debug:install Registering file: /opt/local/share/texmf-texlive/fonts/tfm/public/ebgaramond/EBGaramond-Bold-osf-sc-lgr.tfm :debug:install File Time: open: 38677, binary: 144 :msg:install Registering files - END :msg:install Total Time: open: 1669120474, binary: 11160268
comment:18 Changed 4 years ago by jmroot (Joshua Root)
One possibility is that the database doesn't have a good index for this operation.
comment:19 Changed 3 years ago by mascguy (Christopher Nielsen)
There's also a possibility that the bottleneck might be within the TCL realm.
If you can help me get profiling setup, I can provide a more definitive answer.
comment:20 Changed 3 years ago by jmroot (Joshua Root)
Something like this works:
-
macports1.0/macports.tcl
src lib 1906 1906 } 1907 1907 1908 1908 set workername [interp create] 1909 $workername eval [list package require profiler] 1910 $workername eval [list profiler::init] 1909 1911 1910 1912 set mport [ditem_create] 1911 1913 lappend macports::open_mports $mport
-
port1.0/portinstall.tcl
src lib 457 457 } 458 458 459 459 _cd $oldpwd 460 puts [::profiler::sortFunctions exclusiveRuntime] 460 461 return 0 461 462 }
man n profiler
for more info on what you can do with it.
comment:21 Changed 3 years ago by jmroot (Joshua Root)
However I have a pretty good idea what the (main) problem is based on your sample.
comment:22 Changed 3 years ago by mascguy (Christopher Nielsen)
Beautiful, that's just what I need! I'll comment out my timing code, rerun the test, and report back with results. Thanks Josh!
Changed 3 years ago by mascguy (Christopher Nielsen)
Attachment: | port-install-texlive-fonts-extra-profile1.txt added |
---|
comment:23 Changed 3 years ago by mascguy (Christopher Nielsen)
Attached profiler output, from another install of port texlive-fonts-extra
; filename: port-install-texlive-fonts-extra-profile1.txt
Let me know if that's enough to go on, or if there's anything else I can do. And looking forward to hearing whether it's an index issue, or something else.
comment:24 Changed 3 years ago by jmroot (Joshua Root)
Looking good.
sudo port archive texlive-fonts-extra 8.17s user 2.55s system 84% cpu 12.706 total
comment:25 Changed 3 years ago by jmroot (Joshua Root)
Resolution: | → fixed |
---|---|
Status: | assigned → closed |
comment:26 Changed 3 years ago by jmroot (Joshua Root)
Milestone: | → MacPorts 2.7.1 |
---|
comment:27 Changed 3 years ago by jmroot (Joshua Root)
Version: | 2.7.0 → 2.5.3 |
---|
comment:28 Changed 3 years ago by mascguy (Christopher Nielsen)
Finally had a chance to validate this change, with a clean MacPorts installation. And WOW, what a difference!
Josh, thank you so much for your help and assistance, as well as the quick fix!
Thank you for reporting this issue, I would have killed the process if I hadn't seen this. Eventually, after several minutes, the installation finished successfully. A message saying that the port process is busy doing something as the OP suggests would have been nice.