/t/ - Technology

Discussion of Technology


(4.11 KB 300x100 simplebanner.png)

Hydrus Network General #11 Anonymous Board volunteer 02/05/2025 (Wed) 22:34:45 No. 17183
This is a thread for releases, bug reports, and other discussion for the hydrus network software.

The hydrus network client is an application written for Anon and other internet-fluent media nerds who have large image/swf/webm collections. It browses with tags instead of folders, a little like a booru on your desktop. Users can choose to download and share tags through a Public Tag Repository that now has more than 2 billion tag mappings, and advanced users may set up their own repositories just for themselves and friends. Everything is free and privacy is the first concern.

Releases are available for Windows, Linux, and macOS, and it is now easy to run the program straight from source.

I am the hydrus developer. I am continually working on the software and try to put out a new release every Wednesday by 8pm EST. Past hydrus imageboard discussion, and these generals as they hit the post limit, are being archived at >>>/hydrus/ .

Hydrus is a powerful and complicated program, and it is not for everyone. If you would like to learn more, please check out the extensive help and getting started guide here: https://hydrusnetwork.github.io/hydrus/

Previous thread >>>/hydrus/22247
Edited last time by hydrus_dev on 04/19/2025 (Sat) 18:45:34.
Anyone running into rule34.us SSL cert errors lately?
Is there any place with documentation for custom sort-by rules? I can't seem to find any. I wanted to make a custom sort rule that sorts by creator, then series, then either the time the file was created or the time it was uploaded to a website, and then the rest of the gibberish like volume, chapter, and page.
>>17679 file>options>sort/collect has namespace file sorting and default collection sort settings, is that what you need?
(4.10 KB 596x136 image.png)

>>17682 That's what I was talking about, yeah. I just wanted to know if there is documentation for the keywords I can use to create new custom ones here, or at least modify the existing ones. What bothers me with some of my images is that they aren't always sorted in the "right" order when you look at, for example, comics or images with variants. I figured I could make sure it always gets it right by changing the custom sort order to sort images by creator, then the series the image is from, then the time or date of creation/upload, and then other stuff, but I don't know what keywords to input there and can't find info in the manual either. That's what I meant.
Is there a way to always ignore one specific tag in one tag repository as if it wasn't there? Specifically, I never want to see the PTR series:mythology tag because it's e621 trash and actively disrupts series:* searches, but I don't want to ctrl+a, delete all PTR records of it. Same would go for the PTR "anthropomorphism" siblings, which are patently retarded.
>>17686 They're just the namespaced tags anon. If I understand what you want it should be creator-series-import time-other but I don't think you can use import time, I think it has to be a generic tag namespace.
>>17688 Aw, alright then. Thanks anyway.
I'm trying Hydrus on nhentai but it's not downloading. Is it because of a VPN?
Is there anything I can do for a site that returns the same error status code no matter what, even when I enter a direct link to an image? I'm trying to make a downloader for whole threads and I get a 469 error code, but even with a direct link to an image I get a 469. It seems to always return the homepage HTML.
Is hydrus still unable to download from 8chan? I guess the API is gone?
>>17696 The splash page was created for multiple purposes, one of which was to prevent someone from posting illegal images, archiving their post, then reporting the archive to said archival site, and repeating this until this site is blacklisted from being archived. However, archives are grabbing images again, so this is no longer prevented, but the splash page still breaks attempts to scrape files, such as with hydownloader.
>>17697 Damn, I wish I could at least pull from the API.
(5.17 KB 507x15 16-01:53:56.png)

>>17698 >it works now Either I imported the wrong cookies last time, and I didn't fuck it up the second time, or the admin changed the site just for me. If it's the latter- uh, thanks admin.
I had a good week. I did a mix of small work, fixing some new and old bugs and cleaning code. The release should be as normal tomorrow.
Is there an elegant way to handle different domains for the same site? I noticed that 8chan.se & 8chan.cc don't work in Hydrus, despite being the same as 8chan.moe.
>>17714 I think you need to copy and modify a url class, and add the relevant urls as examples to the parsers, unless either of those reads or prepends domain names.
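Until URL Classes support multiple domains natively, one workaround outside hydrus is to normalise alias domains before handing URLs to the client. A minimal Python sketch, assuming 8chan.se and 8chan.cc really are plain aliases of 8chan.moe (the alias map below is an assumption, not something hydrus ships):

[code]
from urllib.parse import urlsplit, urlunsplit

# assumed equivalences; extend as needed
DOMAIN_ALIASES = {
    "8chan.se": "8chan.moe",
    "8chan.cc": "8chan.moe",
}

def normalise_domain(url: str) -> str:
    """Rewrite known alias domains to the one the url classes expect."""
    parts = urlsplit(url)
    host = DOMAIN_ALIASES.get(parts.hostname or "", parts.netloc)
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))

print(normalise_domain("https://8chan.se/t/res/17183.html"))
# -> https://8chan.moe/t/res/17183.html
[/code]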
Hey guys, I have a weird issue. I'm not an expert at this whole hydrus thing by any means, but I did try my best to figure it out on my own before posting here. Here's my problem: a LOT (but not ALL) of coomer.su posts are 403-ing despite having sent cookies over. The kicker is... they fail (403) in a DETERMINISTIC way. If one succeeds, it will keep succeeding, and if one fails, it will always keep failing no matter what. I seriously doubt this is a cookies issue.

What I found through extensive testing is that when I take a link that failed (403) and open it in my browser, as expected, it also 403s, BUT the url I open in hydrus is NOT the same one that pops up in my browser. For example: in hydrus, in my file log, I click on one that failed. In hydrus the url displays as https://coomer.su/data/a7/bc/a7bc747c24357df8a585a366dbd80c71e81ebcfd76ea8263971b4ea276c5c914.jpg but when I open it in my browser I get https://n1.coomer.su/data/a7/bc/a7bc747c24357df8a585a366dbd80c71e81ebcfd76ea8263971b4ea276c5c914.jpg Notice the "n1." subdomain at the beginning. This sometimes also appears as "n2.", "n3.", or "n4.", seemingly at random.

I've found that if I manually change the subdomain of the url of the failed file import between these 4 subdomains, I'll eventually get one that actually works! In the above example, if I change "n1.coomer.su/data/a7/..." to "n2.coomer.su/data/a7/...", I no longer get a 403!!!!

I have ZERO clue what is going on and hoped that someone here with more knowledge could work out what's happening and point me in the right direction. For context, I am using the coomer.su downloader I imported from the community github found here: https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts/tree/master/Downloaders/Kemono%20%26%20Coomer
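For anyone who wants to reproduce this outside hydrus, the workaround described above amounts to retrying the file URL across the observed media subdomains until one succeeds. A rough sketch, assuming requests and the n1-n4 list this anon reports:

[code]
import requests

SUBDOMAINS = ["", "n1.", "n2.", "n3.", "n4."]  # variants observed above

def fetch_with_fallback(path: str, cookies: dict | None = None) -> bytes:
    """path is everything after the domain, e.g. 'data/a7/bc/<hash>.jpg'."""
    for sub in SUBDOMAINS:
        url = f"https://{sub}coomer.su/{path}"
        r = requests.get(url, cookies=cookies, timeout=30)
        if r.ok:
            return r.content
        print(f"{r.status_code} from {url}, trying next subdomain")
    raise RuntimeError(f"every subdomain refused {path}")
[/code]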
Strange issue I've run into with jxls: Hydrus appears to max out my CPU (7700X) when reading them, either during import to generate thumbnails etc., or when browsing full-size images. It also somehow takes priority over everything else and causes the PC to stutter, including currently playing audio streams from other programs. Hydrus is set to default normal priority in Task Manager, and manually setting it to low fixes this with zero impact on performance. Is there anything that can be done to mitigate this or increase decode performance? I assume we're tied to whatever Python dependency we use? I ask since IrfanView etc. appears to be able to decode jxls with much less effort. If it matters at all, the images are all encoded at effort level 5-7 and are lossless.
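If you want to check whether the decode itself is the bottleneck, it can be timed outside hydrus with the same Pillow plugin the client is believed to use (pillow-jpegxl-plugin, per the dev's reply below; the file path is a placeholder):

[code]
import time

import pillow_jxl  # noqa: F401 -- registers the JPEG XL codec with Pillow on import
from PIL import Image

start = time.perf_counter()
img = Image.open("sample.jxl")  # placeholder path to one of your files
img.load()  # Pillow decodes lazily, so force the full decode here
elapsed = time.perf_counter() - start
print(f"{img.size[0]}x{img.size[1]} decoded in {elapsed:.2f}s")
[/code]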
>>17717 I had my fair share of debugging the coomer downloader this week and ended up where you are, though I think the coomer.su/data link should get forwarded to the correct url (so n1, n2, or something). What I found is that '/data' is missing from a lot of URLs for me when I try to import a gallery, for example. I edited the parser so that it adds '/data' to the pursuable URL that it finds, and I get a lot fewer 403s now, but still a lot. e.g. this is a 403 for me, as parsed by the downloader: https://coomer.su/fd/a4/fda40d4ee7c09b34154e204934f1450881df82ea48bb364b08c66903b298aa18.jpg but this is not: https://coomer.su/data/fd/a4/fda40d4ee7c09b34154e204934f1450881df82ea48bb364b08c66903b298aa18.jpg Maybe someone smart can fix it.
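The fix described here is just a path rewrite. As plain Python (hydrus itself does it with a string converter in the parser UI, not code), it would look something like:

[code]
def fix_coomer_url(url: str) -> str:
    """Insert the missing '/data' segment the parser sometimes drops."""
    prefix = "https://coomer.su/"
    if url.startswith(prefix) and not url.startswith(prefix + "data/"):
        return prefix + "data/" + url[len(prefix):]
    return url

broken = "https://coomer.su/fd/a4/fda40d4ee7c09b34154e204934f1450881df82ea48bb364b08c66903b298aa18.jpg"
print(fix_coomer_url(broken))
# -> https://coomer.su/data/fd/a4/fda40d4ee7c09b34154e204934f1450881df82ea48bb364b08c66903b298aa18.jpg
[/code]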
(7.46 KB 512x132 fixed_coomer.png)

>>17717 >>17720 (me) I did some more tweaking and added a string converter for attachments and primary images, prepending the text '/data'. Now it seems to find the correct URLs; I parsed an entire gallery without a single 403 error. I'm not sure if this breaks downloading from kemono, but on coomer it seems to work great now. I attached the post api parser I modified. Could you try it and see if it works?
https://www.youtube.com/watch?v=10YHsVc01IY

windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v618/Hydrus.Network.618.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v618/Hydrus.Network.618.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v618/Hydrus.Network.618.-.macOS.-.App.zip

No Linux build this week, sorry! If you are a Linux user and want to help me out, there's a test build with instructions here: https://github.com/hydrusnetwork/hydrus/releases/tag/v618-ubuntu-test-01

I had a good week mostly fixing some old and new bugs.

Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html

highlights

I accidentally broke the 'force metadata refresh' job (from the right-click->urls menu) last week! Sorry for the trouble, should be fixed now.

I overhauled the hydrus IPFS service plugin. It works much more simply now, with less experimental and ancient code (nocopy is now disabled, sorry!), and directory pinning is fixed to use their new API. I disabled the native multihash downloader also, but I think I'll bring it back in some more sophisticated way that uses the normal downloader UI--if you are a long-time IPFS user, let me know what you'd like. Updated help is here: https://hydrusnetwork.github.io/hydrus/ipfs.html

I removed a couple crazy system predicate preferences under options->file search. The more-confusing-than-useful one that did 'hide inbox/archive preds if one has count zero' is simply gone, and the one that did 'hide system:everything if client has more than 10k files' is just replaced with a simpler 'show system:everything (default on)'. Thanks for the feedback on these.

If you have the 'archived file delete lock' on, the various file maintenance jobs that check if files are missing/invalid now navigate your situation better! If you had file maintenance paused, try turning it back on.

next week

More bugs to work on, and I want to reserve some time for duplicates auto-resolution work so that doesn't slip.
>>17722 I'm sad to say it doesn't seem to be working for me. I'm not quite sure which downloader component I'm supposed to be looking at. When I import the image, hydrus says it is adding "kemono post api parser", so I tried clicking on "manage parsers" in the downloader components and tried to compare by eye, but couldn't find a difference. Do note that I am not very good at this whole parser-making thing. Of course I also tested this on an actual gallery after I imported the image, but it still kept giving me 403s :( I'm not sure if this is user error on my part or if what you gave doesn't work, as I cannot really read the imported modified parser to check for sure. Could it possibly be that I need to delete my existing parsers before replacing them with this new one...? I'm sorry I couldn't be of more help.
>>17762 It's the kemono.su post api parser. If you edit it then you should be able to see the components of the parser. This image might be a little chaotic but that's what I modified for the primary and secondary attachment files. Could you send a screenshot of the file list with the 403 errors? I just wanna look at the URLs that it parsed for you.
Hello fellow retards. How do I get watchers for 8chan working? It gets stuck on the disclaimer.
>>18046 You need the cookies from a session that has gone past the ToS. If you use hydrus companion you can send them directly.
>import files with tags via .txt sidecar
>tags have underscores in them
damn... did hydrus change how it handles underscores? I remember it just treating them as spaces no matter what. I poked around and found the option to display underscores as spaces, but if I turn that on and have one image with 'test tag' and another with 'test_tag', then type test in the search box, I get "test tag" twice (they show up as completely different tags). Is there a way to do a one-time zap to turn underscores in tags into spaces?
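No built-in one-time zap exists yet (see the dev's reply below), but if the tags are still sitting in .txt sidecars they can be cleaned before import. A minimal sketch, assuming one tag per line; note it will also mangle tags where the underscore is deliberate, like ^_^ emoticon tags:

[code]
from pathlib import Path

SIDECAR_DIR = Path("to_import")  # placeholder folder of image.jpg.txt sidecars

for sidecar in SIDECAR_DIR.glob("*.txt"):
    tags = sidecar.read_text(encoding="utf-8").splitlines()
    cleaned = [tag.replace("_", " ") for tag in tags]
    sidecar.write_text("\n".join(cleaned), encoding="utf-8")
[/code]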
Is it normal to have a 700MB main db file with 97% free pages, which got down to 27MB after vacuuming (which was quite funny, ngl)? I have around 50k images. And what does "free pages" actually mean? I only know a little bit about sqlite, so that might be something I haven't gotten around to yet.
(63.69 KB 533x784 1.png)

(27.29 KB 444x537 2.png)

>>17731
>More bugs to work on
Hey Hydev, I found some weird behavior which might not be intended, but you have to tell me if it is. So here is the deal. First I need to say that I'm still in the testing phase and my client is set up like this:
1. I added the PTR.
2. I have no siblings for 'my tags' and only a handful of parents/children for 'my tags', for testing purposes.
3. In 'manage where tag siblings and parents apply', under the 'my tags' tab in the 'parents application' box, I have 'my tags' and added 'PTR' underneath, so that all the PTR parents/children apply to my tags also.

So I tested whether the PTR tags applied to my tags on a test file, and it worked like it should. Then I checked what the parents submenu on right-clicking the 'studio:dc comics' tag would show. As you can see in the uploaded image (1.png), there are 3 blocks of tag domains: --my tags, PTR--, which contains the most (9 written + 228 more), while --PTR-- and --my tags-- have 18 and 23 entries. Now I would assume that if everything were correct, everything would be under --my tags, PTR--, so both domains have all the parents/children? Why are there entries in the --PTR-- and --my tags-- blocks at all? I am pretty sure everything was synchronized at that moment, since my client isn't turned off and my PC also not, so it should be. But I didn't check at the time, tbh.

What I did when I saw those 3 blocks: I right-clicked the tag selection box and chose 'maintenance' -> 'regenerate tag display'. Since the warning box says that it would only regenerate for the selected entries and would only take some minutes, I started the process. But unfortunately it started to regenerate everything in the PTR tab in the 'tag sync display' window. I didn't realise at first, only later when there were around 30k+ parents left to sync. While in the middle of syncing, I checked the parents submenu of studio:dc comics again (check 2.png) and saw that a lot of them had swapped over to --my tags--, --PTR-- was gone, and --my tags, PTR-- had only a few left. I thought it would proceed to swap everything to --my tags-- and that maybe it is supposed to be like that, so I waited like 30 hours (!) till the sync was done and checked again. The result was that it showed me the 3 blocks of entries like in 1.png, so kinda where it all started.

So I wanna know:
1. Was me starting the maintenance job doing unnecessary work without changing anything in the end? Yeah, I know it says 'WARNING EXPERIMENTAL'.
2. Should there be 3 blocks of entries? Seems wrong, I think. Shouldn't everything be under --my tags, PTR--, so everything applies to both? Because if you type the entries from the --PTR-- block into the 'manage tags' window (my tags tab) of a file, the autocomplete/suggestions do not show and highlight the parents like they do if you type in an entry from the --my tags, PTR-- or --my tags-- blocks.
3. How to fix it otherwise, if not with this maintenance job?
4. Unrelated to all the above: the 'show whole chains' checkbox in the 'manage tag parents' settings seems to not do anything for me, so I don't know what it is for. Can you give a quick tutorial/example of what I have to do in order to see what it actually does when activated?
>>17855 lmao, sorry for not writing earlier, but by the time I saw this I had already managed to figure it out on my own. I quite literally reproduced all the modifications you made for myself locally. I went in and added the prepend text for the primary and secondary files, just like you did in this post. Funnily enough, I came to this solution on my own, lol. So good news: it DOES work now for me. The only issue left that I don't understand is why the downloader png didn't work... in case lurkers or other people in this thread would want access to this as well, yk?

Now, knowing what I know, I went back and deduced what might have been the problem. The parser that I modified manually was called "kemono.su post api parser", BUT next to it I noticed there were 2 others with a similar name, called "kemono.su post api parser (1)" and "kemono.su post api parser (2)". I remember that I tried importing your png TWICE, and when I went into these two parsers I found that they were indeed ALREADY modified to prepend "/data" to all primary and secondary files. So it seems the issue was that since I already had a parser with the name "kemono.su post api parser", when I imported your image the newly imported parser got automatically renamed "kemono.su post api parser (1)", and so on. Since both the "coomer.su all-in-one creator lookup" and the "coomer.su onlyfans creator lookup" were configured internally to use "kemono.su post api parser", the result ended up being exactly the same, because the correct, new parser to use would have been "kemono.su post api parser (1)". Since I MANUALLY changed the working behavior of "kemono.su post api parser", the already existing creator lookups finally started working correctly. Mystery solved.

Not sure what conclusion to draw from this for future generations tho... maybe delete your existing parser before importing one with the exact same name? It seems strange to me that hydrus chose to rename the newly imported parser instead of overwriting...
>>17662 >>17664 I am sorry for the frustration--that looks like a real pain. Unfortunately, the thumbnail grid is a very 'Qt' widget, in many ways. Although I do handle some scrolling and mouse moving stuff, for the most part the actual grunt work of firing off scroll events and so on is all done through Qt. I hate to abdicate responsibility, but I haven't changed this code in a very long time, so if this has recently changed for you, I have to first assume it is something like a new mouse driver or Window Manager (X, Wayland) update that is now conflicting with Qt event processing.

Since you are on 6.7.3, are you on the built release or running from source? If running from source, or happy to try running from source (https://hydrusnetwork.github.io/hydrus/running_from_source.html), could you try rebuilding your venv and choosing the (a)dvanced install, and then selecting the (t)est Qt? This should give you 6.8.2.1 or so. If the behaviour differs or is fixed, then we could write this up as some Qt bug that has since been fixed.

If that doesn't fix it, or you cannot run from source, can I ask you to load up a tall multi-column list, maybe something like a subscription file log or network->data->bandwidth list when set to show all history, and then tell me if this scrolling problem is repeated there? Also, same deal for the normal 'selection tags' taglist, when it is like ten pages tall? The taglist uses similar tech to the thumb grid, where it is a virtual scroll area that I render myself; the multi-column list is completely native Qt. If scrolling is busted in both, that suggests it is Qt (or some application-wide event handling thing I am doing wrong somehow); but if multi-column lists are fine, that suggests my custom rendering tech is eating/merging mouse events. Oh yeah, and can you please do the media viewer too? Just load up a media viewer with like twenty files and see if scrolling through it is smooth if you move the mouse. The media viewer works entirely on my custom shortcut handling routine, no Qt tech, so whether it works or not gives us some more info.

>>17669 Yeah, I broadly agree. I was going to explore implementing this through en masse sibling rules, but I scaled back my 'lots of siblings' plans, like for namespace siblings, when I ran into unexpected computational complexity developing the new system. You are talking about replace, and I think that's the correct place for this--a hard replace where I basically write a routine that says 'delete the underscore tag, add the prettier one', rather than virtualised and undoable siblings. I am still gearing up plans for hard replace, but it will come. It should be much simpler than siblings and parents, pretty much just a clever dialog like 'migrate tags' that fires off bigass async jobs with some BACKUP FIRST red text all over it.

>>17674 I'd still like to do this natively sometime. Some way of saying 'open these externally' and passing a list of paths to the external exe you want.

>>17675 I'd love to. Current plan is to revamp that whole dialog to use newer string processing/conversion tech, and then integrate sidecar tech better. Sidecars work technically well and do URLs and notes and stuff, but they are a fucking nightmare to actually use because the UI is hell. If you haven't tried them yet but feel brave, have a poke around the 'sidecars' tab and look here: https://hydrusnetwork.github.io/hydrus/advanced_sidecars.html
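The 'open these externally' idea is, at bottom, just launching a chosen program with a list of selected file paths; something like this sketch, where the exe path and file list are placeholders:

[code]
import subprocess

def open_externally(exe: str, paths: list[str]) -> None:
    """Launch an external viewer/editor with a batch of files, without blocking."""
    subprocess.Popen([exe, *paths])

open_externally("C:/Program Files/IrfanView/i_view64.exe", ["a.jpg", "b.png"])  # placeholders
[/code]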
>>17689 Sorry for the limitation here. I'm hoping to expose the secondary sort on normal search pages in future, so you'll be able to do two-stage sorting a bit easier, but I can't think about supporting super clever crunchy sort logic like you want yet.

>>17678 CDNs will sometimes do this to effect rangebans, I think. I think they terminate the SSL handshake in some rude way in lieu of a proper http response, in order to save resources? Or maybe bamboozle spiders? If you are on a VPN, maybe try changing region?

>>17687 Try right-clicking the tag and then saying hide->(tag) from here. There are finer-grained options under tags->manage tag display and search (which that menu option dumbly populates). It won't remove the tag from manage tags dialogs, but it'll remove it from normal views.

>>17694 I don't think I've ever heard of a 469 before--it sounds like a joke, maybe? You might like to try putting the URL in the manage parsers test panel to see if there is any body to the message (it might have an error text or something), or help->debug->network actions->fetch a url, which does basically the same thing. If the site doesn't like you hitting up direct URLs, it is probably a cookie or referrer issue. You might like to try fetching the same URL in your browser with dev mode on and watching the network traffic. Is a referral URL sent when you do normal browsing, and is it something hydrus isn't/could be replicating? If you do the fetch in a private browser window (i.e. no cookies), do you get the same 469? Does it have any extra info in the response headers?

>>17714 >>17716 In future I want URL Classes to support multiple domains. There's some technical pain-in-the-ass stuff I have to do to make it work behind the scenes, but there are several situations where the same engine or site itself runs on multiple domains, and you end up having to spam copies with the current system. I had this when I did the e621, e6ai, e926 stuff recently. Relatedly, I'd like some better en masse URL renaming or redirecting or normalising tech. It'd be nice to handle domain name changes or merging legacy http -> https duplicates with a clever dialog or URL Class settings that would just say 'ok, take all these weird urls and rename them in this clever way to this nicer format'.
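To make the 469 debugging steps concrete, this is the sort of side-by-side comparison you can run outside hydrus. The URL, referrer, and cookie values are placeholders; swap in the real ones from your browser's dev tools:

[code]
import requests

URL = "https://example.com/images/12345.jpg"  # placeholder direct file URL

variants = {
    "bare": {},
    "with referer": {"headers": {"Referer": "https://example.com/thread/678"}},  # placeholder
    "with browser cookies": {"cookies": {"session": "PASTE_FROM_BROWSER"}},  # placeholder
}

for name, kwargs in variants.items():
    r = requests.get(URL, timeout=30, **kwargs)
    body_start = r.text[:80].replace("\n", " ")
    print(f"{name}: {r.status_code} {r.headers.get('Content-Type')} {body_start!r}")
[/code]

If only the cookie or referer variant succeeds, that tells you what the downloader needs to replicate; if they all fail identically, the block is probably something else, like IP range.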
>>17718 I noticed this too with some of my test files--larger ones would eat a lot of CPU and memory, but it wasn't as bad as you have seen. That sucks if it is hitching your whole system. This is the library we use for it: https://github.com/Isotr0py/pillow-jpegxl-plugin It provides encode/decode and registers with Pillow as a plugin. It is in Rust, I think a python wrapper around this https://github.com/inflation/jpegxl-rs , which is bindings on libjxl itself. My assumption was the extra memory and CPU was due to the higher-level languages here, and particularly because this is new tech that just hasn't been optimised yet. I could see how the lossless decoder is shit simply because there haven't been many real-world use cases yet. My plan was to wait and assume that new versions (and, hopefully, an eventual native plugin integration into Pillow) would simply improve performance.

EDIT: I tried generating an effort 6, fairly large, lossless jpegxl, and while I didn't get audio hitching, it hitched a youtube video in the background while loading. That sucks! Unfortunately, I am not sure if I can do much about this. I am planning to re-engineer my render pipeline to handle large images better, but new libraries like this may not even benefit from that. I think I have to assume that this library has been written with some aggressive thread yielding or whatever because they made it for command line encoding or something, and it hasn't been smoothed out yet. Wait and see is probably the correct strategy. Hydrus is set up to get the latest version of this library every week, so fingers crossed performance improves in future.

>>18162 Sorry to say I don't think I have changed the behaviour here, and I don't think I've had underscore-replacement tech in sidecars, so my guess is your incoming tags were nicer quality somehow before, or you had string processing steps to clean them up manually. Underscores have been a bit of a problem for a long time. I've been foolish in trying to support as much as possible when it comes to allowed tag characters and so on. I do want to have a big red button that merges them all to whitespace equivalents, and options to massage all incoming tags with rules like 'ditch all underscores according to these patterns' to stop future instances. I just wrote about it a bit earlier here >>18230 in response to >>17669 . Not yet, but I want it in future.

>>18188 I wrote like three replies to this with long-winded technical bullshit, but I'm still not sure exactly what happened, so I'm just going to keep it simple. It might be that I have a sync calculation bug in my code, and I will look into it. Your actual questions:

1) Might have been, but if there is a logical problem in my sync code, this was the correct way to fix it. I do not know why it appeared to resync everything in your PTR--it should just do everything that is connected to 'DC'. Now, DC might have tens of thousands of connections, so maybe this triggered 20-60% of all the sibling rules (I'm sure pokemon is connected to DC some way, and that to a hundred other things).

2) Might be a presentation problem--there's some technical weirdness in the program between the 'actual' sync and the 'ideal' sync, and that menu currently shows the actual, which I think it shouldn't. Regarding the three blocks part--do you have any/many DC-based siblings or parents in your 'my tags', or are all these tag relationships coming from the PTR?
If your 'my tags' disagrees with the PTR (for instance, if it has 'catwoman' as a child, probably because of some sibling mapping disagreement), then I will recognise that the three domains do not share all the same ideal and inferred tags. If your 'manage where siblings and parents apply' rules just said 'PTR' on both 'my tags' and 'PTR', or indeed if they said 'my tags, then PTR', then eventually these two lists would harmonise completely, but if the two services differ, then putting 'my tags' in there will cause different lists. Most importantly, the UI here is shit and I can do better. I will think about this.

3) Not easily, and only with bigger commands that truly trigger a 100% complete resync of everything.

4) Basically, if you put in 'series:batman', the dialog is supposed to load all the direct descendants (character:batman) and ancestors (studio:dc). But dc has children in other branches, let's say superman, and character:batman may have other parents, let's say 'superhero' or something. Those 'cousins', and all the very complicated nest of tertiary and n-ary branches that spill out from that and probably connect to pokemon and azur lane one way or another, may be useful to see, or they may be spam. The checkbox is supposed to show that, but the dialog's pair-based view is still a horrible way to look at the directed graphs here.

Thank you for this feedback. I will do a bit of poking around here and see if I can do some quick improvements to the UI and so on to make things clearer and help our future debugging for this stuff. I'll look into the 'show whole chains' thing too.
>>18172 That is very unusual, but if you had recently deleted a LOT of data, it would be normal. Think of a SQLite database file like a hard drive that can grow in size but not shrink. The file is split into many 'pages' (like 1-4KB each or so), just like your disk, and there's some metadata like a page bitmap that says which pages are in use and which are not. When you delete data, if a page no longer has anything in it, it is added to the list of free pages. When we need to write extra data, if it doesn't fit into an existing page, SQLite then asks itself if it has a free page or if it should expand the filesize to give itself more free pages. It is a little more complicated than this (afaik it doesn't edit pages in-place but instead writes new valid data, updates page pointers, and adds out-of-date pages to the freelist), but that's the basic idea.

Vacuum basically says 'ok, create an entirely new database file and write all the data from the old file in a super efficient way to the new file, filling up pages as much as possible on this first write run'. Not only do we create a new file with no free pages (since no deletes have occurred), but all the pages are pretty much maximally filled since they haven't been edited and shuffled about. It is similar to a defrag, with the additional bonus that any free space is truncated.

If your client.db lost 97% of its internal stuff, that's a lot of shit that's gone! client.db tends to store pretty important stuff that doesn't get deleted or recalculated. Like, you can delete a file, but for every 'current files' row that is deleted, a 'deleted files' row is added, and stuff like basic file metadata is never deleted. client.caches.db will increase or decrease like 20% if you add or remove a file service, but can you remember deleting anything in particular from your client recently? I can't even think what it would be. Some potentially bloaty things like notes are stored in client.db, I'm pretty sure, but there's no way to delete the master records atm iirc. Maybe parents and siblings, with the PTR, somehow? Did you recently delete the PTR?

Otherwise, have you recently had database damage? I assume everything feels normal, if you haven't noticed giant amounts of metadata missing from your collection. If you have never had a big meaty database with a lot of metadata, I suppose it could be that SQLite (by which, presumably, I mean me) used 650MB of client.db space for some pseudo-temporary storage, but I can't think what it would be.

If you happen to have an old backup from before any big delete you did, you might like to download your respective 'command line tools' from here https://sqlite.org/download.html and then run sqlite_analyzer on your old and new client.db files. It isn't a big deal, but if you want to poke around more and learn more about SQLite, that tool produces nice document reports on the size of tables and stuff.

If everything seems fine, you are good. The pages are free, so you can do your vacuum and keep using the client. If your db bloats up again, let me know. Let me know how you get on regardless!

EDIT: Oh wait, subscriptions or GUI Sessions, maybe? Did you clear out a giganto search recently?
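The numbers described above are directly visible with the stdlib sqlite3 module. A small sketch; point it at a copy of client.db, never the live file of a running client:

[code]
import sqlite3

con = sqlite3.connect("client.db")  # use a copy, not the db of a running client
page_size = con.execute("PRAGMA page_size").fetchone()[0]
page_count = con.execute("PRAGMA page_count").fetchone()[0]
freelist = con.execute("PRAGMA freelist_count").fetchone()[0]

print(f"file size : {page_size * page_count / 1024**2:.1f} MB")
print(f"free pages: {freelist} ({100 * freelist / page_count:.0f}% of the file)")

con.execute("VACUUM")  # rebuilds the file and truncates the free space
[/code]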
(342.88 KB 1243x630 2025-04-20 06_10_21-.jpg)

>>18264 Thanks, that's a lot of useful info. I don't think I did any major deletions recently, or ever. I'm usually pretty paranoid about losing/messing up data, so yeah. Does hydrus write these things in the logs? I don't keep gigantic search pages around for long either (I like to keep things snappy and responsive). And from looking at my backup history, the file has been roughly the same size for at least 2 months. I don't use the PTR either. The only major thing I can think of is moving my entire 150GB media library to another location. Don't know if that could be it. Anyway, I will pull my backup and compare them. Might be a good chance to practice some more sqlite, as I've been getting into it recently. Will keep you posted if I find anything interesting.
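For that comparison, per-table row counts are a quick first pass before reaching for sqlite_analyzer. A sketch with placeholder paths; COUNT(*) can be slow on big tables:

[code]
import sqlite3

def table_rows(path: str) -> dict[str, int]:
    """Open the db read-only and count rows in every table."""
    con = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
    tables = [r[0] for r in con.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    return {t: con.execute(f'SELECT COUNT(*) FROM "{t}"').fetchone()[0] for t in tables}

old, new = table_rows("backup/client.db"), table_rows("client.db")  # placeholder paths
for table in sorted(set(old) | set(new)):
    if old.get(table, 0) != new.get(table, 0):
        print(f"{table}: {old.get(table, 0)} -> {new.get(table, 0)}")
[/code]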
(8.08 KB 307x138 Screenshot (269).png)

An idea I had to deal with galleries that have these high image counts: for gallery downloaders, I had the idea of setting some kind of threshold that automatically makes a 2nd or more query entry, like a part 2, part 3, and so on, once a certain threshold of images is reached. Say, in the gallery downloader, users would set it to make a new query after 1000 images is reached, something like
>query [artist]
>once it reaches 1,000 images
>query [artist part 2] is made, starting with the 1,001th image
and so forth. This would then keep Hydrus from killing itself trying to open 10k images when you go to check on some random downloaded gallery. I guess my idea is to set up options to break up large work into smaller pieces. I'd rather deal with
>query artist 2k images
>query artist part 2 2k images
>query artist part 3 2k images
than
>query artist 6k images
And I kind of wonder if this can be done with tabs as well when searching for something with a certain threshold set, i.e. searching for a tag with over 1,000 images, or search + limit set to 1,000 with a threshold set to 500 per tab:
>threshold set to open 500 at a time
>make 2 tabs, 500 each, where it focuses on 1 tab at a time
I wonder if that would also help reduce strain on the program.
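The batching logic itself is trivial; the hard part is the downloader-page UI around it. A sketch of just the splitting step, with illustrative names only:

[code]
def split_into_parts(items: list[str], threshold: int = 1000) -> list[list[str]]:
    """Break one big query's results into part 1, part 2, ... of at most `threshold` items."""
    return [items[i:i + threshold] for i in range(0, len(items), threshold)]

urls = [f"https://example.com/post/{n}" for n in range(6000)]  # placeholder
for part_number, batch in enumerate(split_into_parts(urls), start=1):
    print(f"part {part_number}: {len(batch)} items")
[/code]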
(16.75 KB 669x343 3.png)

(32.92 KB 669x819 4.png)

(73.34 KB 579x726 5.png)

>>18260
>Now, DC might have tens of thousands of connections, so maybe this triggered 20-60% of all the sibling rules (I'm sure pokemon is connected to DC some way, and that to a hundred other things).
The PTR has 35,342 pairs, the 'manage tag parents' window tells me. When I noticed that my SSD was working hard, I checked the sync status, which was around 94-95% and said '30k+ parents to sync'. Later I took the screenshot (3.png). It says only 'parents', no 'siblings'. So I THINK it pretty much resynced all parents/children.

>>18260
>2) Might be a presentation problem--there's some technical weirdness in the program between the 'actual' sync and the 'ideal' sync, and that menu currently shows the actual, which I think it shouldn't.
But don't we want the 'actual' synced ones to show? Because the tags that are only shown in the --PTR-- block don't apply any parents/children when entered in the 'my tags' domain on any file. At least like that we know there is a sync problem. But I'm not sure what you mean by 'ideal sync' either, so you would know better :D

>>18260
>Regarding the three blocks part--do you have any/many DC-based siblings or parents in your 'my tags', or are all these tag relationships coming from the PTR? If your 'my tags' disagrees with the PTR (for instance, if it has 'catwoman' as a child, probably because of some sibling mapping disagreement), then I will recognise that the three domains do not share all the same ideal and inferred tags.
I only had a handful of non-DC-related parents like 'test1/test2' in 'my tags' and no siblings when I encountered the 3 blocks. Later I added/created a handful of related stuff (test stuff like studio:dc comics 2/dc comics 3/dc comics 4), but it didn't change anything regarding the 3 blocks. Now I have deleted everything from the 'my tags' parents dialog and it still shows me the 3 blocks. So now I have no siblings and parents, and the 'manage where siblings and parents apply' dialog looks as seen in 4.png. The PTR tab has 'PTR' in both boxes; I didn't change anything there.

Also I checked other tags with many parents like 'series:star wars' or 'studio:marvel', but also normal blue tags that have 20+ parents. They often have those 3 blocks, and the most are always in the --my tags, PTR-- block, but for 'pokémon' 2/3 are in the --PTR-- block (5.png), weirdly enough. So every tag from that block would not apply the parents/children if I entered them in the 'my tags' domain.

- Sync status for 'my tags' says '52,640 rules, all synced!'. Though I am not 100% sure whether it had any rules here before I started the 'regenerate tag display' maintenance job. Also I don't know why it's 50k+ rules and not 35,342, which is the number of PTR pairs which apply to 'my tags' now. I guess rules and pairs are not the same?
- Sync status for 'PTR' says '611,995 rules, all synced!'. 35k parent pairs + 558k sibling pairs don't equal 611k rules, so I guess here also you can't count them together to get the number of rules, right? But 3.png would say otherwise -> 584,477 rules applied + 26,667 parents to sync = 611,144 (rules).

Is anyone here who has parents set up like I did in 4.png? If yes, could you check whether the parents submenu on a tag also shows you different blocks of domains like in 5.png? Thx!

>>18260
>The checkbox is supposed to show that, but the dialog's pair-based view is still a horrible way to look at the directed graphs here.
Well, turns out that the 'show all pairs' checkbox is not supposed to be active when doing that, lol.
Now it works. I thought it would filter the whole 35k list down to a smaller one. And of course you have to have some tag entered in one of the boxes below, as the tooltip says. Maybe you can make it so that 'show whole chains' can only be activated when 'show all pairs' is deactivated? That would only have pros (like helping confused people like me) and no cons, right? Thanks for your help!

