Inbox

https://stanford-cs329s.github.io/
https://arxiv.org/abs/2510.08731
https://github.com/karpathy/nanochat
https://sebastianraschka.com/llms-from-scratch/
https://huggingface.co/learn/diffusion-course/unit0/1
https://huggingface.co/docs/diffusers/main/en/index
https://bbycroft.net/llm
https://www.bentoml.com/blog/nvidia-data-center-gpus-explained-a100-h200-b200-and-beyond
https://www.bentoml.com/blog/which-inference-platform-is-right-for-enterprise-ai
https://www.bentoml.com/blog/deepseek-ocr-contexts-optical-compression-explained
https://www.bentoml.com/llm/getting-started/choosing-the-right-gpu
https://www.bentoml.com/llm-perf/
https://www.bentoml.com/blog/announcing-llm-optimizer
https://bentoml.com/llm/inference-optimization/kv-cache-offloading
https://bentoml.com/llm/
https://bentoml.com/llm/inference-optimization/llm-performance-benchmarks
https://www.doubleword.ai/resources/behind-the-stack-ep-1-what-should-i-be-observing-in-my-llm-stack
https://www.bentoml.com/blog/3x-faster-llm-inference-with-speculative-decoding
https://www.bentoml.com/blog/amd-data-center-gpus-mi250x-mi300x-mi350x-and-beyond
https://newsletter.semianalysis.com/p/the-gpu-cloud-clustermax-rating-system-how-to-rent-gpus
https://www.baseten.co/resources/guide/the-baseten-inference-stack/
https://www.baseten.co/blog/how-baseten-achieved-2x-faster-inference-with-nvidia-dynamo/#how-baseten-uses-nvidia-dynamo
CS230 Deep Learning
A hands-on course for real AI Engineers
The AI Engineering Playbook
The Ultra-Scale Playbook: Training LLMs on GPU Clusters
SkyPilot: An Intercloud Broker for Sky Computing
GPU-Enabled Platforms on Kubernetes
https://developer.nvidia.com/blog/an-introduction-to-speculative-decoding-for-reducing-latency-in-ai-inference/
https://cgnarendiran.github.io/blog/kv-caching-mla-is-attention-all-you-really-need/
https://cgnarendiran.github.io/blog/hnsw-graph-based-vector-search/
https://cgnarendiran.github.io/blog/lora-efficient-fine-tuning-llms/
https://alidarbehani.com/2025/08/24/beyond-gpus-mastering-ultra-scale-llm-training/
https://alidarbehani.com/2025/08/28/beyond-gpus-mastering-ultra-scale-llm-training-part-2/
https://newsletter.semianalysis.com/p/amazons-ai-resurgence-aws-anthropics-multi-gigawatt-trainium-expansion
https://writings.stephenwolfram.com/2023/02/what-is-chatgpt-doing-and-why-does-it-work/
https://writings.stephenwolfram.com/2024/08/whats-really-going-on-in-machine-learning-some-minimal-models/

https://clickhouse.com/blog/breaking-free-from-rising-observability-costs-with-open-cost-efficient-architectures
https://www.uber.com/en-FI/blog/building-ubers-data-lake-batch-data-replication-using-hivesync/
https://medium.com/@isanghao/io-bound-or-compute-bound-in-ai-c9c541cd6696
https://medium.com/@oril_/transforming-mobile-development-with-backend-driven-ui-c65df97baa79
https://developer.nvidia.com/blog/nvidia-blackwell-ultra-sets-new-inference-records-in-mlperf-debut/
https://developer.nvidia.com/blog/inside-nvidia-blackwell-ultra-the-chip-powering-the-ai-factory-era/?ncid=so-link-929079&linkId=100000379792615
https://developer.nvidia.com/blog/smart-multi-node-scheduling-for-fast-and-efficient-llm-inference-with-nvidia-runai-and-nvidia-dynamo/
https://mlcommons.org/2025/09/deepseek-inference-5-1/
https://medium.com/@piyushkashyap045/tokens-and-embeddings-5d65c7543dea
https://finance.yahoo.com/news/redis-acquire-real-time-data-130000238.html
https://www.spaceo.ai/case-study/ai-agent-cost-optimization/
https://www.uber.com/en-IT/blog/from-predictive-to-generative-ai/
https://stytch.com/blog/best-authentication-services/
https://www.unite.ai/what-every-data-scientist-should-know-about-graph-transformers-and-their-impact-on-structured-data/
https://tamerlan.dev/how-i-manage-my-dotfiles-using-gnu-stow/
https://veitner.bearblog.dev/gpu-l2-cache-persistence/
https://www.duckbillgroup.com/blog/figmas-300k-daily-aws-bill-isnt-the-scandal-you-think-it-is/
https://medium.com/@knish5790/fine-tuning-large-language-models-llms-in-2025-623567db84e9
https://www.philschmid.de/agents-2.0-deep-agents
https://medium.com/fresha-data-engineering/the-good-the-bad-and-the-automq-5aa7a8748e71
https://medium.com/@sahilkatiyar2024/inside-the-mind-of-a-cnn-architecture-explained-simply-7b1168a628c7
https://oaqlabs.com/2025/10/12/kernel-level-gpu-optimization-for-transformer-attention-a-technical-deep-dive/

Dify - Open-source LLM app platform
Build, evaluate, and deploy LLM apps with visual workflows, RAG, agents, datasets, and observability.
llm platform, workflow builder, agents, rag, open-source, evaluation, monitoring

DBOS - Transactional serverless runtime
Database-as-OS model providing durable workflows and ACID transactions for cloud applications.
serverless, workflows, transactions, durability, distributed systems

Flowise - Drag-and-drop LLM workflow builder
Open-source visual builder for creating LLM workflows, agents, and chatbots with no-code interface.
no-code, workflow builder, agents, chatbots, open-source, visual builder

Zep - Memory and vector store for LLM apps
Long-term session memory, embeddings, and semantic search to power RAG and personalized assistants.
memory, vector database, embeddings, semantic search, rag, llm memory

Maze - User research and testing platform
Conduct user research and usability testing with remote, unmoderated testing tools and analytics.
user research, usability testing, analytics, remote testing, user experience

zrok - Secure tunneling and sharing platform
Zero-trust networking for secure sharing and tunneling with built-in authentication and access controls.
tunneling, secure sharing, zero-trust, networking, access control

Syncable - Real-time data synchronization
Real-time data synchronization platform for building collaborative applications with offline support.
data sync, real-time, collaborative apps, offline support, synchronization

CodeRabbit - AI-powered code review
AI-powered code review platform that provides intelligent feedback and suggestions for pull requests.
code review, ai assistant, pull requests, code quality, automated feedback

AIBrix - vLLM inference optimization
High-performance inference optimization toolkit for large language models with vLLM integration.
llm inference, performance optimization, vllm, inference acceleration, model serving

WorkOS - Enterprise authentication platform
Enterprise-grade authentication and user management APIs for developers building B2B applications.
enterprise auth, sso, user management, b2b apis, authentication

NocoDB - Open-source Airtable alternative
Open-source no-code database platform that turns any database into a smart spreadsheet interface.
no-code database, airtable alternative, open-source, spreadsheet interface, database management

RustFS - Rust-based distributed object storage
High-performance, S3-compatible distributed object storage written in Rust, with a focus on safety and concurrency.
object storage, s3-compatible, rust, performance, safety, concurrency

Clerk - Authentication and user management
Complete authentication and user management platform for React, Next.js, and modern web applications.
authentication, user management, react, nextjs, web apps

Liquid AI - Foundation models and neural networks
Advanced AI research company developing foundation models and neural network architectures.
foundation models, neural networks, ai research, machine learning, deep learning

RooCode - AI coding agent for VS Code
Open-source AI coding agent that runs as a VS Code extension and can read, edit, and run code across a project.
ai coding agent, vs code extension, open-source, code generation, developer tools

StarWatcher - AI-powered GitHub analytics
AI-powered analytics platform for tracking GitHub repositories, stars, and open-source project insights.
github analytics, repository tracking, open-source insights, ai analytics, project monitoring

Portkey - AI gateway and LLM operations
AI gateway platform for managing, monitoring, and scaling LLM applications with observability features.
ai gateway, llm operations, monitoring, observability, model management

Clay - Data enrichment and automation
Data enrichment and automation platform for sales and marketing teams with AI-powered insights.
data enrichment, sales automation, marketing tools, ai insights, lead generation

Eraser - Documentation and diagrams
Collaborative platform for creating technical documentation, diagrams, and architectural designs.
documentation, diagrams, collaboration, technical writing, architecture design

StreamYard - Live streaming platform
Browser-based live streaming platform for creating professional broadcasts and webinars.
live streaming, webinars, broadcasting, online events, content creation

Streamlabs - Streaming software and tools
Comprehensive streaming software suite with alerts, overlays, and monetization tools for creators.
streaming software, content creation, alerts, overlays, monetization

SingleStore - Distributed database platform
High-performance distributed database for real-time analytics and transactional workloads.
distributed database, real-time analytics, high performance, transactional, data processing

Lightning AI - Machine learning platform
End-to-end machine learning platform for training, deploying, and scaling AI models with PyTorch.
machine learning, pytorch, model training, ai platform, model deployment

CoreWeave - GPU cloud computing
Specialized GPU cloud platform optimized for AI, machine learning, and high-performance computing workloads.
gpu cloud, ai computing, machine learning infrastructure, high performance computing, cloud gpu

Tracto - AI workflow automation
AI-powered workflow automation platform for streamlining business processes and task management.
workflow automation, ai automation, business processes, task management, process optimization

Datasaur - Data labeling platform
Collaborative data labeling platform for machine learning projects with AI-assisted annotation tools.
data labeling, machine learning, annotation tools, ai-assisted, collaborative platform

Hyperstack - GPU cloud platform
GPU cloud infrastructure platform designed for AI, machine learning, and compute-intensive applications.
gpu cloud, ai infrastructure, machine learning, compute platform, cloud computing

Character.AI - Create and chat with AI characters
Platform to create and interact with AI characters for assistance, entertainment, and roleplay experiences.
ai characters, chatbots, roleplay, consumer ai, conversational ai