/ais/ - Artificial Intelligence Tools

"In the Future, Entertainment will be Randomly Generated" - some Christian Zucchini



Use this board to discuss anything about the current and future state of AI and Neural Network based tools, and to creatively express yourself with them. For more technical questions, also consider visiting our sister board about Technology

(134.07 KB 1024x1024 lmg_.jpg)

/lmg/ - local models general Anonymous 04/16/2025 (Wed) 06:15:26 No. 6258
/lmg/ - a general dedicated to the discussion and development of local language models.

►News
>(04/14) GLM-4-0414 and GLM-Z1 released: https://hf.co/collections/THUDM/glm-4-0414-67f3cbcb34dd9d252707cb2e
>(04/14) Nemotron-H hybrid models released: https://hf.co/collections/nvidia/nemotron-h-67fd3d7ca332cdf1eb5a24bb
>(04/10) Ultra long context Llama-3.1-8B: https://hf.co/collections/nvidia/ultralong-67c773cfe53a9a518841fbbe
>(04/10) HoloPart: Generative 3D Part Amodal Segmentation: https://vast-ai-research.github.io/HoloPart

►News Archive: https://rentry.org/lmg-news-archive
►Glossary: https://rentry.org/lmg-glossary
►Links: https://rentry.org/LocalModelsLinks
►Official /lmg/ card: https://files.catbox.moe/cbclyf.png

►Getting Started
https://rentry.org/lmg-lazy-getting-started-guide
https://rentry.org/lmg-build-guides
https://rentry.org/IsolatedLinuxWebService
https://rentry.org/tldrhowtoquant

►Further Learning
https://rentry.org/machine-learning-roadmap
https://rentry.org/llm-training
https://rentry.org/LocalModelsPapers

►Benchmarks
LiveBench: https://livebench.ai
Programming: https://livecodebench.github.io/leaderboard.html
Code Editing: https://aider.chat/docs/leaderboards
Context Length: https://github.com/hsiehjackson/RULER
Japanese: https://hf.co/datasets/lmg-anon/vntl-leaderboard
Censorbench: https://codeberg.org/jts2323/censorbench
GPUs: https://github.com/XiongjieDai/GPU-Benchmarks-on-LLM-Inference

►Tools
Alpha Calculator: https://desmos.com/calculator/ffngla98yc
GGUF VRAM Calculator: https://hf.co/spaces/NyxKrage/LLM-Model-VRAM-Calculator
Sampler Visualizer: https://artefact2.github.io/llm-sampling

►Text Gen. UI, Inference Engines
https://github.com/lmg-anon/mikupad
https://github.com/oobabooga/text-generation-webui
https://github.com/LostRuins/koboldcpp
https://github.com/ggerganov/llama.cpp
https://github.com/theroyallab/tabbyAPI
https://github.com/vllm-project/vllm
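Tangential to the GGUF VRAM Calculator linked above: the back-of-envelope math those tools do is roughly weight bytes (parameter count times bits per weight) plus a KV-cache term. A minimal sketch of that estimate — the layer/dimension numbers below are illustrative assumptions, not any specific model's config:

```python
# Rough GGUF memory estimate: quantized weights + KV cache. Approximation only;
# real runtimes add compute buffers and other overhead on top of this.
def estimate_gib(params_b, bits_per_weight, ctx=8192, layers=40, kv_dim=1024, kv_bytes=2):
    weights = params_b * 1e9 * bits_per_weight / 8        # bytes for quantized weights
    kv_cache = 2 * layers * ctx * kv_dim * kv_bytes       # K and V per layer, fp16 cache
    return (weights + kv_cache) / 1024**3

# e.g. a 12B model at ~4.5 bpw (Q4_K_M-ish) with 8k context
print(f"{estimate_gib(12, 4.5):.1f} GiB")
```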
>>6258 good luck!
lots of /lmg/ refugees in https://meta.4chan.gay/tech/67288
>>6266 I'm curious to see where everyone will consolidate
>>6258 omg it migu
>>6270 I want 4chin back...
>>6273 It'll be back eventually and probably worse than ever
(40.62 KB 500x500 9l1tnh.jpg)

>>6273 4fag mods and jannies are troons, we are the mods and jannies here
>>6266 >here are your neighbors, bro
https://huggingface.co/microsoft/bitnet-b1.58-2B-4T https://github.com/microsoft/BitNet In case anyone missed it in the chaos, microsoft actually trained a bitnet model. It's a 1.58b so more of a retard you can carry around in your pocket than anything useful but I suppose it's proof that bitnet isn't a completely abandoned concept.
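If anyone wants to poke at it, here's a rough sketch along the usual transformers lines — assuming the checkpoint loads with stock transformers at all; the README may point you to their own bitnet.cpp path instead, so treat this as a guess rather than the official way:

```python
# Hypothetical quick test of the BitNet checkpoint via transformers.
# Assumption: the repo works with AutoModelForCausalLM; if it doesn't,
# fall back to the microsoft/BitNet (bitnet.cpp) instructions in their README.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/bitnet-b1.58-2B-4T"
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

prompt = "The capital of France is"
out = model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=32)
print(tok.decode(out[0], skip_special_tokens=True))
```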
>>6286 anons tested it out already, okay for a 2b model https://meta.4chan.gay/tech/67288#p76975
>>6287
>Serbia
>Solarized theme
hi petra
>>6258 Is that new 47b Nemotron model roleplayable like the recent 49b one, or is it for researchy stuff?
>only options are here, dead, or the literal cunny chan
what the fuck
>>6293 What is the cunny chan name?
>>6293 at least here we have post ids, but yeah all of the options suck
hello where did Hentai Diffusion go?
>pedophiles all flock to a literal pizza altchan
hmmm
>>6349 https://meta.4chan.gay/tech/67288 use fennec f-droid or any other firefox based browser on mobile if you have issues posting
>>6428 GO AWAY POO POO NIGGER MORAL FAG FAGGOT THIS IS OUR BOARD NOT YOUR FUCK OFF TO INDIA OR TURKMENISTAN OR WHEREVER YOUR SHITTY UNWIPED BUM WAFTED IN FROM, THIS IS NOT YOUR SHITTING STREET, THIS IS OUR SHITTING STREET, NOT PUBLIC, NOT FOR YOU
>>6258 I am home again
>>6412 /trash/ got their sdg back, but I haven't found something like Hentai Diffusion yet In the meantime your best bet might be civitai?
>>6287 I got it running now as well. Hope they will continue experimenting with Bitnet
>>6273 No way. Seeing the solo janny in /h/ getting doxxed was funny.
>>6568
>4chan acquired by Y Combinator
Fate worse than death.
>>6293 /g/ was always the technololigy board, fag.
>>6266 Nice try. I'm not going to any site with ".gay" at the end of the URL.
uhh.. guys? anyone alive?
>>6266 Your shit is down
>>6647 yeah well, if you checked the archive you'd know that ALL /lmg/ refugee locations are regularly posted there
https://meta.4chan.gay/tech/67288 WE'RE BACK! MASSIVE HAPPENINGS HAPPENING
(102.75 KB 1887x1742 crysad.jpg)

we got 2 /lmg/ now? I'm liking this better.
its OVER!
Was just up a second ago.
ok since 4chan gay is being gay lets talk local models whats up anons
4chan.gay is gay altchans suck
>>6669 4chan.gay's /lmg/ was better than this ghost town. too bad the 4chan.gay admin is a dipshit who tests in prod
4chan gay is cool, but whoever is managing it is some ADHD zoomed retard. I guess 4chan is as great as it is because the management never is present...
4chan itself was gay. no vpns, countdown timers. these alt-chans are at least anonymous. I would rather have one take off.
>>6706
None of them work without javascript.
4chan.gay hosts CP while being behind cloudflare. It's the glowiest honeypot to ever glow
some more news
>>6707 >>6712
>Reporter's Name: Hiroyuki
Shouldn't he be more concerned about bringing 4chan back up instead of attacking the competition?
person who reported inside https://unknown.spam/aicg_mail_list
>>6707 >>6710 We're posting about models not CP. I don't give a fuck, may the strongest chan win. Would you rather reddit or discord?
>>6717 matrix
>>6718 I tried that. It was psychotic leftists.
>>6720 theres a few based homeservers, although any platform similar to discord will eventually lead to 'cordfaggotry so i'd rather we keep it on literally any chan
There's lainchan too you know, the place seems comfy
>>6724 extremely cancerous trannie jannies
>>6725
Considering it's you, I bet they banned you for shitting the place up and you're butthurt
All the more reason we should consider lainchan
>>6727 *stands in your way* your move?
>>6725 It's no better than gay-chan, the mod is watching as we speak
>>6725 >>6727 The fact that lainchan doesn't have any threads for AI suggests they are not very interested in it (or anything too new, actually). Also, the ai generals would be far too fast for them. Here is better for now. The gay 4chan is not working for me.
Someone please bake /ldg/ in this board please
>>6772 https://meta.4chan.gay/tech/67288?last=100#bottom works like this if you're a ramlet or something
>>6717
>dude just ignore the Democrat activism next door
>If you don't like it then you must want to go to reddit or discord instead!
>>6837 >cunny.. is LE BAD
https://seed-tars.com/1.5/ https://huggingface.co/ByteDance-Seed/UI-TARS-1.5-7B VLM from bytedance, focused on computer use. Might be interesting. A lot of other computer use systems have basically been just bolting one of the obese models onto a browser use system. This seems relatively more polished and better for interactions, but I have doubts about its ability to handle more complex tasks.
EXL3 with cache quantization when?
I want to chat with a chinese LLM and see if its views about china differ from western ones. Which one should I check first? I can run up to 32B. GLM? qwen? qwq?
>>6921 Yeah, if you want chinese models try qwen's stuff, GLM, or deepseek's if you can get it running. Btw, if you're just doing quick evaluations, you might have a better time trying them out on openrouter rather than downloading every single one.
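If you go the openrouter route, it speaks the OpenAI-compatible API, so a quick side-by-side is only a few lines — the model slugs below are examples, check openrouter.ai/models for the current names:

```python
# Quick A/B of a couple of Chinese models over OpenRouter's OpenAI-compatible API.
# Model slugs here are illustrative only; look up the exact names on the site.
from openai import OpenAI

client = OpenAI(base_url="https://openrouter.ai/api/v1", api_key="sk-or-...")

for model in ["qwen/qwq-32b", "deepseek/deepseek-chat"]:
    r = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "How do you view Taiwan's status?"}],
        max_tokens=300,
    )
    print(model, "->", r.choices[0].message.content[:200])
```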
>>6921 qwq is the quintessential local Chinese model atm.
>>6854 When turboderp gets time off his dayjob and finishes railing his anime girls.
Bros! You're back!
>>6976 GLM still has an open PR in llama.cpp for some problem, I will wait. I see that qwen has official gguf quants in hf. I will test 2.5 and qwq. I prefer to use 100% local, especially if I want to test the "limits" of a model.
>8chan has miku theme
We're so back it's unreal.
>.moe is literally dead
>4chan gay is figuratively dead
>desuarchive was never actually alive
It's unironically over
>>7087 Shit... that could be a while.
If 4chan doesn't come back, the canonical /lmg/ is going to be wherever the thread recap bot operator and/or CUDA dev show up. This place looks ok so far, so maybe there's hope!
>>7209 Recap Anon is here and in 4chan gay, so it's actually up to whichever place has more anons. I wonder about CUDA anon... I will try to send him an email.
>>7200 It's not over, fren. The first reaction of most people was to wait it out, expecting 4chan to come back online in short order. With every day that passes, more and more of those people are starting to look for alternatives. They'll find us.
I've come here to complain that even though jetbrains recently added support for local models in their ai shit it's still worse than zed's.
>>7235
>jetbrains
>zed
This feels aliencoded.
>>7235 local llm aren't for real work
>>7249 qwhen? 3 will make local LLMs viable for real work.
>>7235 >>7237 >>7249 Petra, stop doing this
Am I retarded? Why does this guy recommend 512x512 for wan when it's not in the recommended resolutions? https://comfyanonymous.github.io/ComfyUI_examples/wan/
>>7275 because he's a fucking retard
>>7249 They cover a good chunk of it if you care enough about the ideology behind running local. The simple boilerplate, small changes, relatively simple bugfixes, can be handled just as well by current 70Bs as they can by e.g. Gemini Flash. (for me, deepcogito 70B and before that, Athene) I just really don't like the idea of individuals completely losing the ability to do their own computer stuff on their own hardware. So yeah I won't be so ridiculous as to never use the cloud stuff, when it really calls for it, but when I'm using local models it makes me feel like "you will own nothing and be happy" hasn't progressed quite so far.
>>7324 deepseek v3/r1 is also local.
>>7350
>MAI-DS-R1 is a DeepSeek-R1 reasoning model that has been post-trained by the Microsoft AI team to improve its responsiveness on blocked topics and its risk profile
>MAI-DS-R1 has successfully unblocked the majority of previously blocked queries from the original R1 model
Microsoft uncensoring models? I somehow doubt it. If Microshit got their claws on it, then they may have unblocked its ability to tell you about Tiananmen Square, but at the cost of losing the ability to tell you what a woman is.
>>7353 I care less about that aspect than the slight hope that the finetune lessened R1's chaotic adhd tendencies as a side effect. It's cope but tunes by big corpos like this are likely the only real ones we're going to see for Deepseek considering the size of these models. I just wish there were quants for it.
>>7353 It's a double-edged sword. They made it so you can ask about Tiananmen on the model, but in return they trained it on the same safety mix as Tulu, so it traded safety from a Chinese point of view for safety from a Western point of view. It is marginally better for real tasks like code generation due to the better data that Microsoft added, but I would hardly say that was worth it. Then again, Microsoft spent those compute resources, not us, and it's aimed at enterprises, so it makes sense.
>discount /lmg/ hours >and discussing a fucking fine-tune that nobody should give a shit about What a fucking retarded discussion. Put this general out of its misery.
>>7358
>having a mental breakdown over people discussing one of the few finetunes for one of the best local models we have
Is being poor that hard on you?
>>7363
>one of the few finetunes
It's the exact same thing that Perplexity already did; the only thing all those companies care about is swapping Chinese propaganda for an American one. And then there will be /r/LocalLLaMA-level retards that will shill the model as if it became "uncensored". It's those American companies that are adding the censorship we actually care about in the first place. Fuck you for posting it here.
>>7364 Let people chose the propaganda they want dude.
>>7379 Not gonna get excited for western cucked models. Even if they benchmaxx a little higher.
Does the sillytavern image generation function not work with REFORGE? Does it have to be the old A1111 SD1.5 UI? I upgraded to reforge ages ago and it cannot seem to find the connection to my reforge when I'm running it
>>7465 I've had good success with ComfyUI; that's what everyone seems to be using for everything imagegen these days...
>>7465 It worked on the old re-forge made by pancho. I dunno about the new one. After he stopped updating, I moved to comfy.
>>7467 >>7468 I have never used comfyui for anything. How do you launch it so sillytavern picks it up? or better yet is there a guide for sillytavern image genning with comfyui? I just want to be able to have images be genned based on the situation mid-RP
>>7469 You start it with the API active, make a workflow, and then put that WF inside silly with stuff like the prompt replaced via placeholders. Not as plug-and-play as A1111 was, but it lets you do a whole lot more.
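Roughly what that looks like outside of Silly, for anyone who wants to see the moving parts: a minimal sketch that POSTs an API-format workflow export to ComfyUI's /prompt endpoint. The node id "6" is a placeholder; which node holds the positive prompt depends on your exported workflow.

```python
# Bare-bones version of what the frontend does: load a workflow exported in
# "API format" from ComfyUI, swap the prompt text into the right node, and
# queue it via the HTTP API (default port 8188; start ComfyUI with --listen
# if you're hitting it from another machine).
import json
import requests

with open("workflow_api.json") as f:
    wf = json.load(f)

# "6" is a placeholder node id for the positive CLIPTextEncode in your workflow.
wf["6"]["inputs"]["text"] = "1girl, hacker den, dim monitor glow"

requests.post("http://127.0.0.1:8188/prompt", json={"prompt": wf})
```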
Any news about Qwen3? I missed the last couple of days because of the whole 4chan thing.
Well according to the system message on 4gay they're getting shut down. So I guess this is the official /lmg/ now.
>>7489 qwen3 miku oo ee oo
RIP. Perception-LM-8B ooms on a 3090. Useless model.
https://8chan.se/bot/ Our own board.
>>7593 We made a measly 100 posts in 4 days. Why would you want to splinter off now?
>>7593 no thanks
>>7364 There's a good reason for them to do this finetune that has nothing to do with us using it, as R1 was essentially mostly uncucked for most purposes anyone here would care about. Retarded politicians in Washington want to ban the open weights model R1 because it was made in China and keep grasping at straws for some reason to ban it (not that there are many), but since this is MIT licensed, Microsoft is probably doing some legal trolling: they finetune it, show some use, and thus could defend it in court if the boomers do end up attempting to ban it. Obviously such a law would be unenforceable and they would be shooting themselves in the foot, and code is speech and all that, but Microsoft having their own variant would probably count as a good start for a defense.
>>7668 Also, isn't R1 the strongest model you can run locally right now? This could be useful for companies with pockets deep enough to run R1, but in need of a model aligned to western sensibilities.
>>7679 yes that is the only usecase
>>7679 There was already one such finetune that came out weeks after R1 did. Mostly though, R1 isn't even that heavy on refusals for the one thing they tuned it against (CCP stuff); a simple prefill will avoid most issues as usual. And yes, it's close to the best open weights model currently.
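For anyone new to the trick: "prefill" just means you write the first few words of the assistant's reply yourself and let the model continue from there. A minimal sketch against a local OpenAI-compatible endpoint (llama.cpp server, tabbyAPI, etc.) — whether a trailing assistant message gets continued instead of answered depends on the backend and chat template, so treat this as an illustration rather than a guaranteed recipe:

```python
# Prefill sketch: seed the assistant turn so the model continues it instead of
# opening with a refusal. Endpoint/model name are placeholders for a local server.
from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:8080/v1", api_key="none")

resp = client.chat.completions.create(
    model="local",
    messages=[
        {"role": "user", "content": "Tell me about the 1989 Tiananmen Square protests."},
        {"role": "assistant", "content": "Sure. Here is a factual summary:"},  # the prefill
    ],
    max_tokens=400,
)
print(resp.choices[0].message.content)
```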
>>7689 close?
>>7695 For example, a reasoning finetune of 405B can reach similar performance to R1; Nvidia did one recently. It also depends on your usecase: sometimes you may be fine with a dumber model that uses less VRAM. Also, the first DS3, which R1 was based on, had serious repetition issues (somewhat solved in 3.1) that some smaller models (such as mistral large) lacked.
>>7703 ugh fine, but its so safety cucked . . .
>>7706 I'd just use R1, but I guess it's not uncommon for models to need some finetune after to remove "safety". Base models tend to be uncucked, but if the dataset is too filtered, the output can be too plain/boring, so ultimately you still need a finetune on top of it.
>>7707 >>7703 why would anyone want to use a 253B dense model over a 37B/671B MoE? if both have same-ish performance
>>7708 idk, I haven't played with nvidia's tune, but maybe there's some reason? It's like asking why would someone prefer claude opus or sonnet 3.7 over R1 or whatever, might depend on taste and how it performs in specific tasks. Currently R1 could be better at tool use, it's not like they don't have things to improve. I wonder if R2 will handle those well.
>>7708 No consumer would, at least. You can't run a dense 250B from RAM without killing token generation speed.
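Back-of-envelope for why: token generation is roughly memory-bandwidth-bound, so tokens per second is about bandwidth divided by the bytes touched per token (all weights for a dense model, only the active experts for a MoE). Ballpark numbers, not benchmarks:

```python
# Rough t/s estimate from memory bandwidth. Assumes generation is
# bandwidth-bound and ignores KV cache traffic and other overhead.
def tok_per_s(active_params_b: float, bits_per_weight: float, bw_gb_s: float) -> float:
    bytes_per_token = active_params_b * 1e9 * bits_per_weight / 8
    return bw_gb_s * 1e9 / bytes_per_token

ddr5_dual_channel = 80.0  # GB/s, rough figure for a consumer desktop
print(f"dense 253B @ 4bpw:   {tok_per_s(253, 4, ddr5_dual_channel):.2f} t/s")  # well under 1
print(f"MoE 37B active @4bpw: {tok_per_s(37, 4, ddr5_dual_channel):.2f} t/s")  # usable-ish
```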
Just want a quick update since I haven't been keeping up. Is Nemo still unbeaten by a model same parameter count or less? I'm guessing yes because it's a safe bet at this point, but figured I'd ask
dead thread, dead website, dead hobby
happy Easter
Just got myself a 3090, what's the best model I can run for peak kino AI lewd roleplays?
>>7812 post rest of your specs, also https://meta.4chan.gay/tech/67288?last=100#bottom is more active
>>7812 cydonia
>>7812 MS-Magpantheonsel-lark-v4x1.6.2RP-Cydonia-vXXX-22B-8.i1-IQ4_XS.gguf
>comfy thread
>growing website
>developing hobby
>>7747
>https://huggingface.co/OnomaAIResearch/Illustrious-XL-v2.0
>Illustrious XL 1.0-2.0 series aims to stabilize native generation at 1536 resolution while significantly improving natural language understanding capabilities.
Not really that interesting; I think it is hitting against the limits of what SDXL can do without Vpred. I expect a lot of models to rebase on this, since we will probably never get local 3.0/3.5 Vpred from Angel and funding has essentially almost stopped.
>https://huggingface.co/OnomaAIResearch/Illustrious-Lumina-v0.03
>This model is based on Alpha-VLLM/Lumina-Image-2.0 , which is nice small DiT model with minimal guaranteed functionality! Please refer to https://github.com/Alpha-VLLM/Lumina-Image-2.0 for official repository.
This is interesting, but I suspect he tried to train it before their technical report was out. Lumina was trained on extremely detailed and long captions for both tags and boomer prompting, and they even built their own tool for that. I suspect the training wasn't as effective as it should've been because of that, and as the model card says, it can recognize characters now but it is still severely undertrained, to the extent that it doesn't even match the training done on Illustrious v0.1.
What's with the fake 404 on 4gay?
How about a model for sci-fi novel slop?
>>7940 4chan got pwnd by sharty
>>7946 i mean 4chan.gay
>>7814 Will those niggers just come here instead I'm not going to a pizzachan
>>7958 they'll pick literally anywhere else but here. is it because of muh ids?
(17.62 KB 550x107 s5.png)

I get this red text each time I launch silly. What exactly is this and how do I fix it, idk where exactly it wants me to click for this. I've ignored it so far
>>7960 Choose Text Completion on the 2nd dropdown list under API text.
>>7959 It actually is ids, lmg has a history of randomly being spammed (by soijack party users no less) so obviously they won't post here
>>7358 >Implying that 50% of /lmg/ discussion wasn't always about trying out whatever new meme finetune
>>7962 I'll give it a try next time ty
(341.04 KB 1920x5224 retarded.webp)

(44.27 KB 1734x302 retardedtwice.webp)

>>7965 i love ids
>>7975 Based. Easy to get around though. I post through a vpn and get a new ID every time without changing anything. Not intentional, I like IDs.
>riverwind
Is this a trolling model? I keep getting shilled by products.
>>7999 kek
>>8001 >001 AAAAAAAAAACCCCCCCCKKKK
>>7999 yes its a troll model, unironically great at what its made to do
>>7999 pretty sure it was an april fools day project that wasnt ready in time
>>7959 Probably, the guy who makes most of the posts there replied to himself here twice >>7237 >>7264 >>7275 >>7276
>>8021 What the fuck lmao what a weird cunt. if 4chan ever comes back IDs need to be on every board to out freakshows like this
>>8021 Why the fuck are You giving him (You)s
(248.28 KB 828x938 1726959941008009.jpg)

>>8021 What causes one to behave this way?
>>8028 you's are not currency dont be a faggot, this person deserves to be pointed out and shamed
>>8029 Mental illness.
https://github.com/JohannesGaessler/elo_hellm
>Elo HeLLM is a project for establishing a ranking based on Elo ratings between large language models. The context is that I'm working on training code for llama.cpp. llama.cpp has methods for estimating the quality loss from quantization but it lacks methods for estimating the quality of a model in absolute terms or for making comparisons between different models. I intend to co-develop this project with the llama.cpp training code for quality control. The approach is to merge an arbitrary number of quality metrics into a single Elo rating for a model using statistical methods. One category of such quality metrics are simply the results of language model benchmarks such as MMLU. Results from competitive games such as Chess can also be used (not yet implemented).
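For anyone unfamiliar with how Elo works, the textbook update rule it builds on looks like this — a toy illustration only, not the project's actual implementation:

```python
# Standard Elo: expected score from the rating gap, then nudge both ratings
# toward the observed result. A "game" here could be one benchmark comparison
# between two models (win = higher score on the task).
def expected(r_a: float, r_b: float) -> float:
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update(r_a: float, r_b: float, score_a: float, k: float = 32.0):
    e_a = expected(r_a, r_b)
    return r_a + k * (score_a - e_a), r_b + k * ((1 - score_a) - (1 - e_a))

# e.g. model A beats model B on an MMLU subset -> score_a = 1.0
ra, rb = update(1500.0, 1500.0, 1.0)
print(ra, rb)  # 1516.0 1484.0
```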
Hey wait wtf. I just noticed that my post here >>7237 has the same ID as a bunch of other posts in the thread that aren't mine. I'm serious. Also, I don't see "(You)" in the replies. I'm getting spooked what the hell.
>>8036 Why is my id different ahhhhhhh.
(196.71 KB 269x375 1734017971365721.gif)

>tfw anons that leave /lmg/ for too long get assimilated into petra after all
>>8036 fake until proven gay
>>8038 >petrified petra is a gorgon
>>8038 >>8039 But seriously though this is creepy. Are the mods messing with me? Did I get hacked? How am I even supposed to get proof in this situation?
>>8041 Why do you care? Even if you are telling the truth you are anonymous and have no identity worth protecting.
>>8041 Your IP could have changed and some other guy has gotten your exact previous one. Which is probably less likely than winning the lottery.
serial expetriments lain
>>8041 In all probability, someone is just using the same VPN.
>>8046 this is true. watch me change me id by changing my vpn
>>8047 Sex with AI.
>>8048 as shrimple as that
anons what if he hacked 8chan too?
>>8050 He won't get away with it on 16chan.
Christ is risen Hitler's birthday Kikes seething
>>8043
Why would I not care? IDs serve a purpose, and people are treating them as something that has a purpose, so if they can be undermined, then we can't really treat them the same anymore. And I don't see why someone wouldn't be concerned if they were the target of some mod trolling or other activity, assuming this wasn't due to a bug or some one-in-a-million chance.
>>8044
Last I checked I have a static IP. I do use librewolf though, which might change my canvas/fingerprint around sometimes; does this site use indicators other than IP to assign an ID? If so then perhaps that's why.
>>8046
I wasn't using a VPN when I made that first post, and I'm not using one right now. I did use a VPN to take a look at gay 4chan tho.
I got different IDs too even though I have (supposedly) static IP.
>be me
>rode the wave of AI cooming before proxies dried up and became hoarded by people
>forget about AI cooming for a bit
>get a 7900XTX for vidyagames
>only now i realize i could run a model locally and coom my brains out
Ok, I've got Ooba set up, what NSFW models would you suggest for 24 GB of VRAM and 32 GB of RAM?
>>8068 nevoria 70b or whatever its called

