Developer Daily

Top Stories

Hit Score 95
Hacker News

vLLM Scales DeepSeek Serving to 2.2k tok/s

vLLM reaches 2.2k tokens per second serving DeepSeek models by using wide expert parallelism (wide-EP) on H200 GPUs, a showcase of large-scale AI inference efficiency.

AI · Infrastructure · GPU

Hit Score 92
Hacker News

40-Line Fix Closes 400x Performance Gap

A deep dive into how QuestDB optimized JVM thread user-time tracking, closing a 400x performance gap with a surprisingly small code change (a sketch of the relevant JVM API follows below).

Performance · JVM · Java
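For context, here is a minimal sketch of the standard JVM facility the article revolves around: per-thread user-time sampling via java.lang.management.ThreadMXBean. This is not QuestDB's actual 40-line fix, only an illustration of the kind of API whose per-call cost such a fix targets.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Generic illustration of per-thread user-time sampling on the JVM.
// ThreadMXBean reports nanosecond-resolution CPU time spent in user mode
// for a given thread; polling it frequently is where costs can add up.
public class ThreadUserTimeSample {
    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();

        if (!threads.isThreadCpuTimeSupported()) {
            System.out.println("Per-thread CPU time is not supported on this JVM");
            return;
        }
        threads.setThreadCpuTimeEnabled(true);

        long before = threads.getCurrentThreadUserTime(); // user-mode time, ns

        // Burn some CPU in user mode so the delta is visible.
        long acc = 0;
        for (int i = 0; i < 50_000_000; i++) {
            acc += i;
        }

        long after = threads.getCurrentThreadUserTime();
        System.out.printf("acc=%d, user time spent: %.3f ms%n",
                acc, (after - before) / 1_000_000.0);
    }
}
```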

More Updates