Follow (Blogs/Podcasts/Sites)
AI Tools
Providers
https://www.packet.ai/
https://www.spheron.ai/
https://www.atlascloud.ai/
https://glambase.app/
https://www.gputrader.io/
https://fal.ai/
https://mithril.ai/
https://www.baseten.co/
https://www.modular.com/
https://www.coreweave.com/
https://forum.datacrunch.io/
https://www.anyscale.com/
https://vast.ai/
https://www.clarifai.com/
https://www.shadeform.ai/
https://hosted.ai/
https://cast.ai/
https://modal.com/
https://tensormesh.ai/
https://cocoon.org/
https://www.porter.run/
https://gcore.com/
Local Inference
Platforms
Inference
GPU Sharing
Well, let me try to give a more generic answer: vGPU virtualizes the GPU by time-sharing its graphics and compute resources, which requires a license and a special driver in the hypervisor as well as in the guest OS. vGPU allows a lot of flexibility in how fractions of the GPU are assigned to users, from a full GPU for a single user up to 12/16/24 fragments, one per user…
Compared to MIG, vGPU might have less predictable, sometimes longer latency, because a user/job may need to wait until its turn to use the GPU resources comes around again…
MIG assigns a fixed fraction of all GPU resources to a user/tenant/job, but is much less flexible in how the fraction size per user/job can be changed. MIG is only available on the high-end datacenter GPUs and can split the GPU between a single or up to 7 users/jobs/instances, but to change the assignment, all jobs need to be stopped and the GPU needs to be reconfigured and more or less reset…
So basically, with MIG you get lower, more predictable latency and more predictable resources/results, and in exchange you give up the easier and more flexible manageability of vGPU…
Hope this helps as some guidance.
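The latency trade-off described above can be illustrated with a toy model. This is only a sketch under loud assumptions: a strict round-robin scheduler with equal slices, all tenants always busy, and job runtime scaling linearly with the GPU fraction. It does not reproduce NVIDIA's actual vGPU schedulers or MIG profiles, and the function names `time_sliced_latency_ms` and `partitioned_latency_ms` are hypothetical.

```python
import math


def time_sliced_latency_ms(work_ms: float, tenants: int, slice_ms: float) -> tuple[float, float]:
    """Best/worst completion latency under a toy round-robin time-slicing model.

    ASSUMPTION: every tenant is always busy and gets equal slices of the full GPU.
    work_ms is the job's runtime when it has the whole GPU to itself.
    """
    if work_ms <= 0 or slice_ms <= 0 or tenants < 1:
        raise ValueError("work_ms and slice_ms must be > 0, tenants >= 1")
    slices_needed = math.ceil(work_ms / slice_ms)
    # Time spent waiting while every other tenant takes one slice.
    others_turn_ms = (tenants - 1) * slice_ms
    # Best case: the job arrives exactly when its own slice starts.
    best = work_ms + (slices_needed - 1) * others_turn_ms
    # Worst case: the job arrives just after its slice ended.
    worst = best + others_turn_ms
    return best, worst


def partitioned_latency_ms(work_ms: float, fraction: float) -> float:
    """Completion latency on a fixed partition (MIG-like).

    ASSUMPTION: runtime scales linearly with the fraction of the GPU owned.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    return work_ms / fraction


if __name__ == "__main__":
    # A short 1 ms job, 4 tenants, 2 ms slices versus a fixed 1/4 partition.
    print(time_sliced_latency_ms(1.0, tenants=4, slice_ms=2.0))
    print(partitioned_latency_ms(1.0, fraction=0.25))
```

In this model the time-sliced latency depends on when the job happens to arrive relative to its slice, while the partitioned latency is the same for every arrival time, which is the predictability difference the answer describes.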
GPU Telemetry
Kubernetes
Performance Optimization
https://developer.nvidia.com/nsight-systems
https://docs.csc.fi/apps/nsys/
https://github.com/gpu-mode/lectures
https://cudaforfun.substack.com/p/outperforming-cublas-on-h100-a-worklog
https://github.com/deepaksatna/NVIDIA-Nsight-Systems-Profiling-for-Distributed-LLM-Training
https://alexarmbr.github.io/2024/08/10/How-To-Write-A-Fast-Matrix-Multiplication-From-Scratch-With-Tensor-Cores.html
NVIDIA
https://nvidia.custhelp.com/app/answers/detail/a_id/3751/~/useful-nvidia-smi-queries
https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/
https://docs.nvidia.com/deploy/cuda-compatibility/why-cuda-compatibility.html
https://docs.nvidia.com/deeplearning/frameworks/support-matrix/index.html
https://docs.nvidia.com/deeplearning/frameworks/user-guide/index.html#runcont
https://docs.nvidia.com/deploy/nvidia-smi/index.html
https://docs.nvidia.com/deploy/driver-persistence/index.html
https://docs.nvidia.com/gameworks/content/gameworkslibrary/coresdk/nvapi/group__gpupstate.html
https://github.com/NVIDIA/aistore
https://developer.nvidia.com/deepstream-sdk
https://github.com/NVIDIA/DALI
https://triton-inference-server.github.io/model_navigator/
NVIDIA CUDA
Use Cases
Infrastructure
Jupyter Services
AI Image & Video
https://kive.ai/
https://www.photogenius.ai/
https://browzwear.com/
https://imgsys.org/
https://openart.ai/home
https://higgsfield.ai/
https://klingai.com/global/
https://www.mimicpc.com/
https://getimg.ai/
https://www.facefusion.co/
https://www.comfy.org/
D-ID - Create Interactive Avatars to Engage Your Audience
Synthesia - Create studio-quality videos with AI avatars and voiceovers in 140+ languages. It’s as easy as making a slide deck.
Image Upscaler - Upscale your images by up to 600% using AI
Stable Diffusion Web - Create high-quality images with AI
RenderNet - Create high-quality images with AI
RunwayML - Create high-quality images with AI
Leonardo - Create high-quality images with AI
https://www.krea.ai/
https://www.affinity.studio/
https://www.freepik.com/
https://wavespeed.ai/
https://runware.ai/
ML System Design
https://www.theunwindai.com/
https://www.evidentlyai.com/ml-system-design
https://www.zenml.io/llmops-database
https://www.zenml.io/blog/llmops-in-production-another-419-case-studies-of-what-actually-works
https://wandb.ai/mostafaibrahim17/ml-articles/reportlist
https://bytebytego.com/courses/machine-learning-system-design-interview/visual-search-system
https://github.com/zhaochenyang20/Awesome-ML-SYS-Tutorial
https://github.com/Engineer1999/A-Curated-List-of-ML-System-Design-Case-Studies
https://github.com/IaroslavElistratov/ml-systems-course
https://github.com/mercari/ml-system-design-pattern
https://github.com/khangich/machine-learning-interview