Opened 21 months ago

Last modified 21 months ago

#67044 new defect

Registry DB is 1.2GB and makes a 10.4.11 system unusable

Reported by: evanmiller (Evan Miller)
Owned by:
Priority: Normal
Milestone:
Component: base
Version: 2.8.1
Keywords:
Cc:
Port:

Description

Sometime in the last few months, MacPorts has become unusable on my Tiger system. When attempting to run any port command, the registry:open function causes the kernel_task to consume all available system memory (1 GB) and mark it Inactive. This memory is not returned to the "Free" pool after the process exits, causing other processes to swap heavily until a system restart.

While there seems to be some kind of OS bug in play here, I'd like to see if there is some kind of workaround available. I see that the registry.db and registry.db-wal files are over 600MB apiece – since the memory leak occurs during registry:open, and there appear to be some changes to registry reading in MacPorts 2.8.0, I strongly suspect these files to be the culprit. Is there a way to reduce the size of the registry, or "rebuild" it somehow?

At present it looks like I have to start a new installation from scratch or stop using MacPorts on this system.

Change History (11)

comment:1 Changed 21 months ago by kencu (Ken)

I recall a massive registry file like that on Tiger a few years ago.

If you do decide to start fresh, I wound up creating a local ports archive, reinstalling MacPorts, and then using the archive as the binary source to reinstall the ports I wanted.

This was of course 1000x faster than rebuilding all the ports.

comment:2 Changed 21 months ago by jmroot (Joshua Root)

The registry.db is an ordinary sqlite3 database. You can therefore use the sqlite3 command to inspect it for anything unusual, create and restore backups, and run maintenance commands like VACUUM and wal_checkpoint.
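
As a rough illustration of that kind of inspection, the sketch below uses Python's standard sqlite3 module instead of the sqlite3 command-line tool. The `describe_db` and `make_scratch_db` helpers and the scratch `files` table are hypothetical names chosen for this example, and it runs against a throwaway database, not a real registry.db.

```python
import os
import sqlite3
import tempfile


def describe_db(path):
    """Collect page statistics and per-table row counts for an SQLite file."""
    conn = sqlite3.connect(path)
    try:
        def pragma(name):
            # Each of these PRAGMAs returns a single row with a single value.
            return conn.execute(f"PRAGMA {name}").fetchone()[0]

        tables = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        return {
            "journal_mode": pragma("journal_mode"),
            "page_size": pragma("page_size"),
            "page_count": pragma("page_count"),
            "freelist_count": pragma("freelist_count"),  # whole pages not in use
            "rows": {t: conn.execute(f'SELECT COUNT(*) FROM "{t}"').fetchone()[0]
                     for t in tables},
        }
    finally:
        conn.close()


def make_scratch_db(path, n_rows=1000):
    """Create a small throwaway database (hypothetical schema, not the registry's)."""
    conn = sqlite3.connect(path)
    with conn:  # commits on success
        conn.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, path TEXT)")
        conn.executemany("INSERT INTO files (path) VALUES (?)",
                         ((f"/opt/local/scratch/{i}",) for i in range(n_rows)))
    conn.close()


if __name__ == "__main__":
    scratch = os.path.join(tempfile.mkdtemp(), "scratch.db")
    make_scratch_db(scratch)
    print(describe_db(scratch))
```

A table with an unexpectedly large row count, or a large `freelist_count` relative to `page_count`, would be the sort of "anything unusual" worth looking at first.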

comment:3 Changed 21 months ago by evanmiller (Evan Miller)

Thanks for the tips. VACUUM gives an error about VERSION not being collated (?). Interrupting a long-running process shows that the disk activity is from a WAL checkpoint. I will see if I can revert some of the recent fullfsync and WAL changes to see if that fixes things locally.

comment:4 Changed 21 months ago by evanmiller (Evan Miller)

Using sqlite3 I was able to checkpoint and reset the WAL file. However, running a simple port command resulted in an attempted VACUUM, and the WAL file is now back up to 600MB. I am guessing that something is not working as intended here.

comment:5 Changed 21 months ago by evanmiller (Evan Miller)

I've made the following changes locally and have a usable installation again:

  • Disable needs_vacuum – this avoids writing out a 600MB WAL file any time something is deleted from the registry. The current vacuum policy is probably too aggressive given the cost of the operation.
  • Disable fullfsync
  • Change SQLITE_CHECKPOINT_PASSIVE to SQLITE_CHECKPOINT_TRUNCATE

I'll leave the information here in case others find it useful and will prep a PR if I think it can be generalized.
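
To make the third change concrete, here is a minimal sketch of how the two checkpoint modes differ in their effect on the WAL file size. It uses Python's sqlite3 module and the `PRAGMA wal_checkpoint(...)` form, not the C API constants named above, and runs on a scratch database; `checkpoint_demo` and `wal_size` are hypothetical helper names, not MacPorts code.

```python
import os
import sqlite3
import tempfile


def wal_size(db_path):
    """Size in bytes of the -wal file next to db_path (0 if it does not exist)."""
    wal = db_path + "-wal"
    return os.path.getsize(wal) if os.path.exists(wal) else 0


def checkpoint_demo(db_path, n_rows=5000):
    """Return WAL sizes (after writes, after PASSIVE, after TRUNCATE)."""
    # isolation_level=None puts the connection in autocommit mode, so
    # transactions are controlled explicitly with BEGIN/COMMIT.
    conn = sqlite3.connect(db_path, isolation_level=None)
    try:
        conn.execute("PRAGMA journal_mode=WAL").fetchall()
        conn.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, path TEXT)")
        conn.execute("BEGIN")
        conn.executemany("INSERT INTO files (path) VALUES (?)",
                         ((f"/opt/local/scratch/{i}",) for i in range(n_rows)))
        conn.execute("COMMIT")
        after_writes = wal_size(db_path)

        # PASSIVE copies frames into the main database but leaves the
        # WAL file on disk at its current size.
        conn.execute("PRAGMA wal_checkpoint(PASSIVE)").fetchall()
        after_passive = wal_size(db_path)

        # TRUNCATE additionally truncates the WAL file to zero bytes.
        conn.execute("PRAGMA wal_checkpoint(TRUNCATE)").fetchall()
        after_truncate = wal_size(db_path)
        return after_writes, after_passive, after_truncate
    finally:
        conn.close()


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "scratch.db")
    print(checkpoint_demo(path))
```

With no other connections open, the last value should be zero while the first two are not. This demo has a single connection, so it does not exercise the case where other processes are reading the database.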

comment:6 in reply to:  3 Changed 21 months ago by ryandesign (Ryan Carsten Schmidt)

Replying to evanmiller:

VACUUM gives an error about VERSION not being collated (?).

MacPorts uses an SQLite extension that implements a custom sort order (a "collation") for version numbers. When MacPorts uses the registry, it loads that extension. If you want to use programs other than MacPorts to interact with the registry file, you have to load the MacPorts SQLite extension first.
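
For readers unfamiliar with collations, the sketch below shows the general mechanism using Python's `sqlite3.Connection.create_collation`. It is not the MacPorts extension: the comparison function here is a deliberately simplified stand-in (an assumption for illustration), and the `ports` table is a hypothetical one-column schema.

```python
import re
import sqlite3


def _tokens(version):
    """Split '1.10rc2' into [(1, 1), (1, 10), (0, 'rc'), (1, 2)]."""
    # Numeric runs become (1, int) and alphabetic runs become (0, str), so
    # tokens of different kinds are ordered by the leading tag and never
    # compared int-to-str.
    return [(1, int(t)) if t.isdigit() else (0, t)
            for t in re.findall(r"[0-9]+|[A-Za-z]+", version)]


def compare_versions(a, b):
    """Simplified version comparison; NOT the algorithm MacPorts uses."""
    ta, tb = _tokens(a), _tokens(b)
    return (ta > tb) - (ta < tb)  # negative, zero, or positive


def open_with_version_collation(path=":memory:"):
    """Open a connection and register a collation named VERSION on it."""
    conn = sqlite3.connect(path)
    conn.create_collation("VERSION", compare_versions)
    return conn


def sorted_versions(versions):
    """Sort version strings through SQLite using the VERSION collation."""
    conn = open_with_version_collation()
    try:
        conn.execute("CREATE TABLE ports (version TEXT COLLATE VERSION)")
        conn.executemany("INSERT INTO ports VALUES (?)", ((v,) for v in versions))
        return [row[0] for row in
                conn.execute("SELECT version FROM ports ORDER BY version")]
    finally:
        conn.close()


if __name__ == "__main__":
    print(sorted_versions(["1.10", "1.9", "1.2"]))
```

A collation is registered per connection, which is why a tool that opens the file without registering `VERSION` cannot run statements that need to compare values in such a column.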


comment:7 Changed 21 months ago by ryandesign (Ryan Carsten Schmidt)

The size of the registry should be proportional to the amount of data in it, which would depend on how many ports you have installed, how many files those ports install, and so on. For example, on one of my machines running Monterey with 960 ports installed, my registry.db is 117MB. How many ports do you have installed? Do any of them contain a very large number of files? You should be able to reduce the size of the registry by uninstalling inactive ports and any other ports you don't need.

comment:8 in reply to:  7 Changed 21 months ago by evanmiller (Evan Miller)

Replying to ryandesign:

The size of the registry should be proportional to the amount of data in it, which would depend on how many ports you have installed, how many files those ports install, and so on. For example, on one of my machines running Monterey with 960 ports installed, my registry.db is 117MB. How many ports do you have installed? Do any of them contain a very large number of files? You should be able to reduce the size of the registry by uninstalling inactive ports and any other ports you don't need.

I was able to reduce the size of the registry DB to 80MB with port uninstall inactive. The other steps seem to be necessary to prevent the overall OS memory leak – maybe the OS is backing both files in Inactive memory for some reason?

I would suggest not setting needs_vacuum every time something is deleted from the registry, which is the current behavior. Vacuum appears to rewrite the entire registry database, once to the WAL file, and then to the actual DB file. Previously this was resulting in 1.2 GB of disk writes for many everyday port commands.

comment:9 Changed 21 months ago by kencu (Ken)

you would not think inactive ports would take up a lot of space in the database… must be all the filepaths using up that room… but no apparent need to have stored the filepaths of inactive ports, is there?

comment:10 in reply to:  9 Changed 21 months ago by evanmiller (Evan Miller)

Replying to kencu:

you would not think inactive ports would take up a lot of space in the database… must be all the filepaths using up that room… but no apparent need to have stored the filepaths of inactive ports, is there?

When I started, the files table contained 1.2M rows. Now it's down to ~180,000.

comment:11 Changed 21 months ago by jmroot (Joshua Root)

Putting off vacuuming until some threshold of wasted space is reached is something I've considered; it just hasn't happened yet because it requires code to figure out how much wasted space there is. Similarly, I considered using truncate checkpoints, but that requires extra code to make sure it works correctly when other processes are reading the db.
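
One possible shape for such a threshold check is sketched below in Python, not in the language MacPorts base is written in. It assumes that the ratio of `PRAGMA freelist_count` to `PRAGMA page_count` is an acceptable estimate of wasted space. That ratio counts only wholly free pages and ignores partially filled ones, so it is a rough lower bound. The function names and the 25% default threshold are hypothetical.

```python
import os
import sqlite3
import tempfile


def wasted_fraction(conn):
    """Fraction of database pages that are on the freelist (wholly unused)."""
    page_count = conn.execute("PRAGMA page_count").fetchone()[0]
    free_pages = conn.execute("PRAGMA freelist_count").fetchone()[0]
    return free_pages / page_count if page_count else 0.0


def vacuum_if_wasteful(conn, threshold=0.25):
    """Run VACUUM only when the free-page fraction reaches the threshold."""
    if wasted_fraction(conn) < threshold:
        return False
    conn.execute("VACUUM")  # must not be inside an open transaction
    return True


def make_fragmented_db(path, n_rows=20000, keep=100):
    """Build a scratch database, then delete most rows to create free pages."""
    conn = sqlite3.connect(path, isolation_level=None)  # autocommit mode
    conn.execute("PRAGMA auto_vacuum = NONE")  # keep freed pages on the freelist
    conn.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, path TEXT)")
    conn.execute("BEGIN")
    conn.executemany("INSERT INTO files (path) VALUES (?)",
                     ((f"/opt/local/scratch/{i}",) for i in range(n_rows)))
    conn.execute("COMMIT")
    conn.execute("DELETE FROM files WHERE id > ?", (keep,))
    return conn


if __name__ == "__main__":
    conn = make_fragmented_db(os.path.join(tempfile.mkdtemp(), "scratch.db"))
    print("before:", wasted_fraction(conn))
    print("vacuumed:", vacuum_if_wasteful(conn))
    print("after:", wasted_fraction(conn))
    conn.close()
```

Under this policy a small deletion would leave the fraction below the threshold and skip the rewrite entirely, while a bulk cleanup such as removing many inactive ports would still trigger it.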
