forked from capjamesg/hugging-face-papers-rss
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel><title>Hugging Face Posts</title><link>https://huggingface.co/</link><description>This is a web-scraping RSS feed for Hugging Face trending posts.</description><generator>rfeed v1.1.1</generator><docs>https://github.com/svpino/rfeed/blob/master/README.md</docs><item><title>Experimental global target bits‑per‑weight quantization of Qwen/Qwen3.6-27B and Qwen/Qwen3.6-35B-A3B.</title><link>https://huggingface.co/posts/eaddario/861558852118345</link><description>Experimental global target bits‑per‑weight quantization of Qwen/Qwen3.6-27B and Qwen/Qwen3.6-35B-A3B. Unlike standard llama.cpp quantizations that rely on fixed type heuristics (e.g., Q4_K_M), the Target BPW approach optimizes per-tensor precision where it matters most, and produces high-quality models that meet a precise global file-size target. Key Advantages: - VRAM Maximization: Can generate high-quality models sized exactly to fit hardware constraints (e.g., fitting the model into exactly 24GB VRAM). - Data-Driven Precision: Quantization mix is determined by actual weight error sensitivity rather than hardcoded rules, often yielding better PPL/KLD size trade-offs. Full benchmarks (PPL, KLD, ARC, GPQA, MMLU, etc.) and methodology in the models' cards. 
eaddario/Qwen3.6-27B-GGUF eaddario/Qwen3.6-35B-A3B-GGUF</description><pubDate>Mon, 04 May 2026 15:11:15 GMT</pubDate><guid isPermaLink="true">https://huggingface.co/posts/eaddario/861558852118345</guid></item><item><title>SciCrafter measured something AI practitioners have intuited: frontier agents are improving at executing inside well-framed problems, but lag at framing the problem in the first place.</title><link>https://huggingface.co/posts/salma-remyx/889764886790464</link><description>SciCrafter measured something AI practitioners have intuited: frontier agents are improving at executing inside well-framed problems, but lag at framing the problem in the first place. GPT-5.2, Gemini-3-Pro, and Claude Opus 4.5 all plateaued near 26% on a new Minecraft benchmark for probing AI capabilities in the discovery-to-application loop. So the authors ran targeted interventions: * Hints about what to investigate doubled performance. * A structured experimentation template added 7-14 more points. * Structured consolidation beat free-form summaries by 6 points. * Curriculum context beat independent task-solving. These interventions helped the agent frame what’s worth investigating, and structure what gets learned so it compounds. The bottleneck for AI in scientific workflows is upstream of execution. Their findings are congruent with the design patterns we've adopted at Remyx AI to help AI teams close the development loop scientifically. Agents work well inside structured...</description><pubDate>Mon, 04 May 2026 15:11:15 GMT</pubDate><guid isPermaLink="true">https://huggingface.co/posts/salma-remyx/889764886790464</guid></item><item><title>Şifahane, a dual-inference medical classification demo, is now live on Spaces. It features side-by-side Turkish BERT and Qwen2.5 architectures for real-time evaluation of the "Classifier vs. LLM" trade-offs, all within a single space. 
The system utilizes a fine-tuned Turkish BERT for high-speed, cost-effective inference and the Qwen2.5-7B model for flexible multi-task reasoning, with support for department classification, condition analysis, urgency assessment, and rationale generation across 12 medical departments.</title><link>https://huggingface.co/posts/cihatyldz/143573654584943</link><description>Şifahane, a dual-inference medical classification demo, is now live on Spaces. It features side-by-side Turkish BERT and Qwen2.5 architectures for real-time evaluation of the "Classifier vs. LLM" trade-offs, all within a single space. The system utilizes a fine-tuned Turkish BERT for high-speed, cost-effective inference and the Qwen2.5-7B model for flexible multi-task reasoning, with support for department classification, condition analysis, urgency assessment, and rationale generation across 12 medical departments. 🧠 BERT model: https://lnkd.in/dCUUASqq 📊 Dataset: https://lnkd.in/dGK9y24w 🤗 Demo: https://lnkd.in/dtWjCCPF</description><pubDate>Mon, 04 May 2026 15:11:15 GMT</pubDate><guid isPermaLink="true">https://huggingface.co/posts/cihatyldz/143573654584943</guid></item><item><title>Day 3 - 05/02/2026</title><link>https://huggingface.co/posts/Crownelius/258416991176465</link><description>Day 3 - 05/02/2026 Scamp ships, hits the wall. New plan... Scamp came back from training today... Didn't go so well, I'm still unsure... Fast benchmark, temperature 0.7, top_p 0.9: - "Capital of France is" produced "covered by the Crown" (grammatical, factually wrong) - "23 + 19 = ?" produced "23. Answer: 23. Answer: 23..." (loops, math broken) - "def fibonacci(n):" produced a list of letters It speaks English. It can't reason. At 8K vocab and 50M params, it was never going to. Next build: 412M MoE-3E. Three experts (math, language, code), top-1 routing, random init, let specialization emerge from gradient signal alone. Tried seeded Branch-Train-MiX first then dropped it. 
Adds compute for no clear win when the router will find its own attractors anyway. Big lesson today came from limit testing on A100 80GB. Surprise, every planned phase ran out of memory even on 80GB. Root cause: at vocab 262144 (Gemma 3 standard), the output logits dominate during forward and backward. Fix: Liger...</description><pubDate>Mon, 04 May 2026 15:11:15 GMT</pubDate><guid isPermaLink="true">https://huggingface.co/posts/Crownelius/258416991176465</guid></item><item><title>By trying to disprove the Omega H2 battery I have discovered;</title><link>https://huggingface.co/posts/AbstractPhil/850797268513183</link><description>By trying to disprove the Omega H2 battery I have discovered; * Each topology formed by the H2 battery is deviant, none have a uniformly shared substrate of behavior. They are each uniquely independent per training set all with perfect recon. * Image recon can be tracked and mapped, yielding a consistently mapped and response 16.77m vocabulary potential. In the current spectrum testing at around 5 million unicode bytes. * The model scale shows patch size is related to how much data you want the model to represent within the model itself, and this has yet to see a capacity to this day. The MSE recons and yields - and the more data fed, the more they yield. * The scaling principle shows that the model indefinitely scales upward and each level of the model can be iteratively captured upward to form deviant and uniformly consistent repeatable pathways of implicit codewise response, not just arbitrary bitwise recall. Meaningful implicit learned utility. * Image recon patch size should...</description><pubDate>Mon, 04 May 2026 15:11:15 GMT</pubDate><guid isPermaLink="true">https://huggingface.co/posts/AbstractPhil/850797268513183</guid></item><item><title>Multimodal-Edge Demo, a node-based inference canvas demo, is now live on Spaces. 
It features node-based Transformers for fast inference across 10+ edge-device multimodal models on the Hub, all within a single space. The series includes models from Qwen3.5, Qwen3-VL, Gemma 4, and the LFM 2.5 VL model series, with support for reasoning and grounding tasks.</title><link>https://huggingface.co/posts/prithivMLmods/636036299853646</link><description>Multimodal-Edge Demo, a node-based inference canvas demo, is now live on Spaces. It features node-based Transformers for fast inference across 10+ edge-device multimodal models on the Hub, all within a single space. The series includes models from Qwen3.5, Qwen3-VL, Gemma 4, and the LFM 2.5 VL model series, with support for reasoning and grounding tasks. 🤗 Demo: prithivMLmods/Multimodal-Edge-Node 🔗 GitHub: https://github.com/PRITHIVSAKTHIUR/Multimodal-Edge-Node ✅ Multimodal Apps Collections: https://huggingface.co/collections/prithivMLmods/hall-of-multimodal-apps 🤗 > To learn more, visit the app page or the respective model pages.</description><pubDate>Mon, 04 May 2026 15:11:15 GMT</pubDate><guid isPermaLink="true">https://huggingface.co/posts/prithivMLmods/636036299853646</guid></item><item><title>✅ Article highlight: *Runtime Admissibility and Barrier Objects* (art-60-226, v0.1)</title><link>https://huggingface.co/posts/kanaria007/853173732847613</link><description>✅ Article highlight: *Runtime Admissibility and Barrier Objects* (art-60-226, v0.1) TL;DR: This article turns runtime admissibility into a first-class object family. A governed runtime should not rely on scattered booleans, warning banners, or hidden branches to decide whether an effect may proceed. It should evaluate the requested effect under an explicit *barrier object*, emit a normalized verdict, record the resulting runtime posture, and preserve the full lineage if the path later degrades, reopens, or reenters. 
Read: kanaria007/agi-structural-intelligence-protocols Why it matters: • turns “was this allowed?” into a replayable governance question • makes runtime gating portable and auditable instead of implementation-specific branching • distinguishes degraded postures that are operationally different even when they normalize to the same exported verdict • prevents history laundering by requiring explicit reopen and reentry lineage What’s inside: • the core idea that a *barrier*...</description><pubDate>Mon, 04 May 2026 15:11:15 GMT</pubDate><guid isPermaLink="true">https://huggingface.co/posts/kanaria007/853173732847613</guid></item><item><title>[DAY TWO] PROJECT CROWFEATHER - 5/1/2026</title><link>https://huggingface.co/posts/Crownelius/763437334277546</link><description>[DAY TWO] PROJECT CROWFEATHER - 5/1/2026 Que sera, what will he be? Step 47,500 of 100,000. Loss hovering around 2.76 on 6.2B tokens. Throughput steady at 87k per second on the A100. Not a GH200, but she gets it done. Still haven't named him. Scamp has a rascally charm. Quentin sounds like he'd wear a bow tie and think hard before speaking. Taking votes. Phase two is what's keeping me up. Datasets everywhere and I can't pick. I'm fusing Google and DeepSeek's ideas: Gemma 4's alternating sliding and global attention, DeepSeek V4's Muon optimizer and WSD scheduler, Gemma 2's logit soft cap, and PaLM's z-loss. Sounds like peanut butter on a hamburger, but the loss curve says it works. Tribe_v2 has real potential but needs more scaffolding than a barn raising before I throw it in. One thing's certain though. This model's gonna be a thinker. Not a Wikipedia parrot. Something that chews before it answers. Finally got a use for my less popular datasets too. 
Some Opus-4.5-Writing-Style for...</description><pubDate>Mon, 04 May 2026 15:11:15 GMT</pubDate><guid isPermaLink="true">https://huggingface.co/posts/Crownelius/763437334277546</guid></item><item><title>🌌 Introducing Model Galaxy — a Living, Multimodal Fork of the HF Model Atlas</title><link>https://huggingface.co/posts/SeaWolf-AI/178991568320786</link><description>🌌 Introducing Model Galaxy — a Living, Multimodal Fork of the HF Model Atlas 👉 Try it: FINAL-Bench/model-galaxy This Space is a fork of the brilliant Eliahu/Model-Atlas, the official demo of "Charting and Navigating Hugging Face's Model Atlas" (Horwitz et al., arXiv 2503.10633). Their pre-computed HF model graph is the foundation of every node and edge you see, and we are deeply grateful for its open release. The original atlas is a static snapshot of early 2025. Model Galaxy turns it into a living, multimodal map. We injected the 2026 trending originals that did not exist when the atlas was frozen — DeepSeek-V4, Hy3-preview, GLM-5.1, Kimi-K2, gpt-oss, Nemotron-3 Super / Nano / Omni, Hermes-4.3, Qwen3-Coder-Next, Llama-3.3, Granite-4.1, plus the latest multimodal releases (FLUX.2, ERNIE-Image, HunyuanImage / Video, LTX-2.3, Wan2.2, Kokoro-82M, VoxCPM2, Voxtral-TTS, whisper-v3-turbo, Gemma-4, Qwen3-Omni, Phi-4-mm) — each with proper base_model lineage edges. We also added the...</description><pubDate>Mon, 04 May 2026 15:11:15 GMT</pubDate><guid isPermaLink="true">https://huggingface.co/posts/SeaWolf-AI/178991568320786</guid></item><item><title>I submitted a "Learning to Act and Cooperate for Distributed Black-Box Consensus Optimization" Paper by Zi-Bo Qin, Feng-Feng Wei, Tai-You Chen, Wei-Neng Chen to Daily Papers on huggingface.</title><link>https://huggingface.co/posts/rajkumarrawal/456013984828233</link><description>I submitted a "Learning to Act and Cooperate for Distributed Black-Box Consensus Optimization" Paper by Zi-Bo Qin, Feng-Feng Wei, Tai-You Chen, Wei-Neng Chen to Daily Papers on huggingface. 
A trajectory-driven framework uses large language models to guide agent behavior and cooperation patterns in distributed black-box consensus optimization, improving solution quality and efficiency. Learning to Act and Cooperate for Distributed Black-Box Consensus Optimization (2605.00691)</description><pubDate>Mon, 04 May 2026 15:11:15 GMT</pubDate><guid isPermaLink="true">https://huggingface.co/posts/rajkumarrawal/456013984828233</guid></item></channel></rss>