Posts

Showing posts from 2024

My Windows software development ecosystem is complicated

Modern development is surprisingly complicated. My personal software development environment suffers from a severe case of urban sprawl. While working on some container IaC scripting for ML and the cloud, I had to pop into a Linux environment, and mapping that environment made me realize I have a crazy set of different specialized sandboxes. Software systems continue to grow and become more complicated, and software engineering platforms are growing right alongside them. YouTube: https://youtu.be/67i43rBkk1c Why this page exists: this page is a link root for the video and makes it easy to refer to this diagram. Revision history: created 2024-08.

Manually validating compatibility and running NVIDIA (NIM) container images

NVIDIA NIMs are ready-to-run, pre-packaged containerized models. The NIMs and their included models are available in a variety of profiles supporting different compute hardware configurations. You can run a NIM in an interrogatory mode that reports which of its model profiles are compatible with your GPU hardware, and then run the NIM with the matching profile. Sometimes there are still problems, and we have to add tuning parameters to fit in memory or change data types; in my case, the data type change works around a bug in the NIM startup detection code. This article still needs polish; it has more than a few rough edges. NVIDIA NIMs are semi-opaque: you cannot build your own NIM, and NIM construction details are not described by NVIDIA. Examining NVIDIA Model Container Images: the first step is to select models we think can fit and run on our NVIDIA GPU hardware, investigating the different model types by visiting the appropriate NVI…
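A rough sketch of the two-step workflow described above, assuming a NIM container's list-model-profiles command and the NIM_MODEL_PROFILE environment variable: the image name, profile ID, and the NIM_MAX_MODEL_LEN memory-tuning knob below are illustrative placeholders, not values taken from the post.

    # Step 1: ask the NIM which of its model profiles fit the local GPUs.
    # (image name is an example; substitute whichever NIM you pulled from nvcr.io)
    docker run --rm --gpus all \
      -e NGC_API_KEY \
      nvcr.io/nim/meta/llama3-8b-instruct:latest \
      list-model-profiles

    # Step 2: run the NIM pinned to one of the compatible profiles it reported.
    # NIM_MAX_MODEL_LEN is one example of a tuning parameter for fitting in
    # GPU memory; treat the exact variable names as assumptions to verify
    # against the NIM's documentation.
    docker run --rm --gpus all \
      -e NGC_API_KEY \
      -e NIM_MODEL_PROFILE="<profile-id-from-step-1>" \
      -e NIM_MAX_MODEL_LEN=8192 \
      -p 8000:8000 \
      nvcr.io/nim/meta/llama3-8b-instruct:latest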

Rocking an older Titan RTX 24GB as my local AI Code assist on Windows 11, Ollama and VS Code

This is about using a Turing-generation NVIDIA Titan RTX GPU to locally run code assist LLMs for use in VS Code. This slightly older card has 24GB of VRAM, making it a great local LLM host. The Titan RTX is a two-slot, dual-fan card, currently about the same price as a refurbished Ampere NVIDIA 3090 Ti 24GB. There are a bunch of ways to host code support LLMs; we are using an early release of Ollama as our LLM service and the continue.dev VS Code extension as the language service inside VS Code. This was tested on an 8-core AMD Ryzen with 64GB of memory and the Titan RTX. Related blog articles and videos covering VSCode and local LLMs: Get AI code assist VSCode with local LLMs using Ollama and the Continue.dev extension - Mac; Get AI code assist VSCode with local LLMs using LM Studio and the Continue.dev extension - Windows.
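A minimal sketch of the wiring described above: pull a code-oriented model into the local Ollama service, then point the Continue extension at it from its config file. The model choice and the config.json schema reflect the extension's format at the time and are assumptions, not copied from the post.

    # Pull a code-oriented model into the local Ollama service
    # (model choice is an example; anything that fits in 24GB VRAM works).
    ollama pull deepseek-coder:6.7b

    # Point the Continue VS Code extension at the local Ollama server.
    # Path and keys follow the extension's config.json of that era; on Windows
    # the file lives under %USERPROFILE%\.continue\ - verify against current docs.
    cat > ~/.continue/config.json <<'EOF'
    {
      "models": [
        {
          "title": "DeepSeek Coder (local)",
          "provider": "ollama",
          "model": "deepseek-coder:6.7b"
        }
      ]
    }
    EOF

With that in place, the Continue sidebar in VS Code lists the local model and chat/edit requests are served entirely from the Titan RTX rather than a hosted API.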