NVIDIA AI Workbench is a containerized ML playground
NVIDIA AI Workbench creates and manages containerized ML environments that isolate ML projects on local and remote machines. You no longer have to switch environments or remember which version of Python or Anaconda is in your machine's global environment. NVIDIA simplifies initial configuration by providing predefined image definitions containing Python, PyTorch, and other tools that can be used with or without NVIDIA GPUs.
Actual development is done in browser-based tools like JupyterLab notebooks. Workbench spins up local proxies that port-forward into the development container.
NVIDIA Workbench runs in a WSL instance
NVIDIA Workbench runs in its own WSL instance. Each project runs in its own Docker container. You can inspect the Workbench WSL instance by opening a shell into it. Run the following command in a Windows terminal window.
- wsl -d NVIDIA-Workbench
Workbench projects live in the WSL instance under
- /home/workbench/nvidia-workbench/
Creating A New Project
The project creation wizard gathers basic information and then asks which preconfigured environment should be used. You select an environment definition and a GitHub or local repository when you create a workspace. Workbench then builds the container image and either populates the source tree from Git or creates a new, empty source tree. The NGC catalog contains environment definitions for Python, PyTorch, and RAPIDS, all with CUDA support.
This creates an empty project with no code.
There is no way to add AI Workbench support to an existing project at this time (5/2024).
Local Projects in Local Containers
AI Workbench can manage multiple projects and their associated environments. Each project has its own container image. When a project is created, Workbench generates a project specification that describes the resources required to build the image. It then builds the image, which can be reused. The image is built on each machine the project is run or developed on.
A project file browser opens whenever you double-click on a project. This is just a view into the local WSL Workbench file system, or into the remote Linux file system if you are using a remote server.
You can see the status of the containerized project in the status bar at the bottom of the screen. The project image may be built, or built and running.
Clicking the green button hydrates the image: it is deployed into Docker on the target machine, local or remote.
NVIDIA Workbench opens a browser when the container is running. The browser points at a local proxy that forwards to the AI Workbench container's exposed port.
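The proxy idea above can be sketched in a few lines. This is only an illustration of local port forwarding, not Workbench's actual implementation; the port numbers and function names are made up.

```python
# Minimal sketch of a local port-forwarding proxy, illustrating the
# idea (not NVIDIA's implementation): bytes arriving on a local port
# are relayed to the container's exposed port and back.
import socket
import threading

def pipe(src, dst):
    """Copy bytes from src to dst until src closes."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    finally:
        try:
            dst.close()
        except OSError:
            pass

def forward_once(listen_port, target_host, target_port):
    """Accept a single connection and relay it to the target."""
    server = socket.socket()
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("127.0.0.1", listen_port))
    server.listen(1)
    client, _ = server.accept()
    upstream = socket.create_connection((target_host, target_port))
    # One thread per direction: browser -> container and back.
    threading.Thread(target=pipe, args=(client, upstream), daemon=True).start()
    pipe(upstream, client)
    server.close()
```

A browser pointed at the listen port sees whatever service the target port serves, which is exactly the trick that lets a local browser drive a Jupyter server inside a container.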
Workbench Examples
NVIDIA hosts Workbench examples in public GitHub repositories. Most of these examples will fit on standard video cards.
Local and Remote Control
The local Windows (in my case) installation of AI Workbench manages containerized development environments on local or remote machines. The installation process installs Docker or Podman locally. Workbench can only communicate with a remote development environment if AI Workbench is also installed there. Each location can have multiple projects, and each project runs in its own container.
Linux Remote Server
A local AI Workbench instance can manage an AI Workbench instance running remotely on a Linux server. The local instance communicates with a remote service, and you can select local or remote projects using the GUI's host dropdown. This capability relies on SSH.
- You need to have SSH enabled on the remote server.
- Actual software development is done using a web browser hitting a local endpoint that is port-forwarded to the remote server.
You need to install AI Workbench locally and on the remote Linux machine. The remote server installation is command-line and text-based. The development environment is always containerized, and AI Workbench installs Docker on the remote server if it is not already present.
mkdir -p $HOME/.nvwb/bin && \
curl -L https://workbench.download.nvidia.com/stable/workbench-cli/$(curl -L -s https://workbench.download.nvidia.com/stable/workbench-cli/LATEST)/nvwb-cli-$(uname)-$(uname -m) --output $HOME/.nvwb/bin/nvwb-cli && \
chmod +x $HOME/.nvwb/bin/nvwb-cli && \
sudo -E $HOME/.nvwb/bin/nvwb-cli install
You can watch the installation process in the text-based GUI.
Linux Remote References
- https://docs.nvidia.com/ai-workbench/user-guide/latest/installation/ubuntu-remote.html
- https://docs.nvidia.com/ai-workbench/user-guide/latest/reference/locations/remote.html
- https://docs.docker.com/build/buildkit/#getting-started (changes to daemon.json)
If you get an error building a remote container related to buildx:
AI Workbench builds a Docker image every time you create a new environment from a template; that container is your project. The build process requires buildx, which may not be installed on your remote Linux system. My Ubuntu 22.04 machine had Docker installed but not buildx.
sudo apt install docker-buildx
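The BuildKit reference above mentions changes to daemon.json. Per the Docker BuildKit docs, you can also enable BuildKit for the legacy builder by setting a feature flag in /etc/docker/daemon.json (a sketch; merge this key into your existing file rather than replacing it, then restart the Docker daemon):

```json
{
  "features": {
    "buildkit": true
  }
}
```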
Version alignment between local and server
The AI Workbench versions on the local machine and on the remote Linux server should be the same. If you install the latest version of AI Workbench on your remote machine, you may need to update the local AI Workbench as well.
Opening a remote project
The server-side Linux AI Workbench creates a new container each time you spin up a project environment. Every Jupyter environment is its own container, for local and remote projects alike. The following image shows the logs for a successful container build.
That container is run when you open the project. The screen to the right shows a containerized project; you can click on it to see the details.
Developers start working on the project by clicking the green Connect button, which opens a browser notebook session.
Configuration issues when adding remote locations at home
AI Workbench expects to reach the remote server by a fully qualified domain name (FQDN). A bare hostname like hp-z820 won't work. I tried using .local as the domain, but that didn't work either, so I used the server's IP address.
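A quick sanity check before adding a remote location is to verify that the name you plan to use actually resolves from your machine. This is a small sketch, not part of Workbench; the hostname you pass in is whatever you intend to type into the remote-location dialog.

```python
# Check whether a remote-location name resolves to an IPv4 address.
# If it returns None, use the server's IP address in AI Workbench
# instead of the name.
import socket

def resolves(name):
    """Return the resolved IPv4 address, or None if the lookup fails."""
    try:
        return socket.gethostbyname(name)
    except socket.gaierror:
        return None
```

For example, resolves("hp-z820") returning None on the laptop means Workbench won't reach the server by that name either.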
Videos
Nerd Details: Workbench and SSH
The Workbench GUI executes remote commands via SSH: it talks to the remote server, creates workspaces, and starts containers over SSH. The following netstat captures illustrate this communication. The local laptop is Powerspec-g708 and the remote Linux server is hp-z820.
Idle on the server after SSH-ing into the remote machine:
(base) joe@hp-z820:~$ sudo netstat -atp | grep ssh
tcp 0 0 0.0.0.0:ssh 0.0.0.0:* LISTEN 1094/sshd: /usr/sbi
tcp 0 0 hp-z820:60436 hp-z820:ssh ESTABLISHED 16648/ssh
tcp 0 0 hp-z820:ssh hp-z820:60436 ESTABLISHED 16649/sshd: joe [pr
tcp6 0 0 [::]:ssh [::]:* LISTEN 1094/sshd: /usr/sbi
Inbound SSH on the server after opening an AI Workbench remote location:
(base) joe@hp-z820:~$ sudo netstat -atp | grep ssh
tcp 0 0 0.0.0.0:ssh 0.0.0.0:* LISTEN 1094/sshd: /usr/sbi
tcp 0 0 localhost:47610 localhost:10001 ESTABLISHED 33830/sshd: joe
tcp 0 0 localhost:40398 localhost:10001 ESTABLISHED 33830/sshd: joe
tcp 0 0 hp-z820:ssh Powerspec-g708:61068 ESTABLISHED 33724/sshd: joe [pr
tcp 0 0 hp-z820:60436 hp-z820:ssh ESTABLISHED 16648/ssh
tcp 0 0 hp-z820:ssh hp-z820:60436 ESTABLISHED 16649/sshd: joe [pr
tcp 0 0 localhost:47650 localhost:10001 ESTABLISHED 33830/sshd: joe
tcp 0 0 localhost:47626 localhost:10001 ESTABLISHED 33830/sshd: joe
tcp 0 0 localhost:47634 localhost:10001 ESTABLISHED 33830/sshd: joe
tcp 0 0 localhost:40386 localhost:10001 ESTABLISHED 33830/sshd: joe
tcp6 0 0 [::]:ssh [::]:* LISTEN 1094/sshd: /usr/sbi
Inbound SSH on the server after opening a remote Jupyter notebook via the web browser:
(base) joe@hp-z820:~$ sudo netstat -atp | grep ssh
tcp 0 0 0.0.0.0:ssh 0.0.0.0:* LISTEN 1094/sshd: /usr/sbi
tcp 0 0 localhost:44384 localhost:webmin ESTABLISHED 37430/sshd: joe
tcp 0 0 localhost:56986 localhost:10001 ESTABLISHED 37430/sshd: joe
tcp 0 0 hp-z820:ssh Powerspec-g708:61080 ESTABLISHED 37331/sshd: joe [pr
tcp 0 0 localhost:56990 localhost:10001 ESTABLISHED 37430/sshd: joe
tcp 0 0 localhost:33990 localhost:10001 ESTABLISHED 37430/sshd: joe
tcp 0 0 hp-z820:60436 hp-z820:ssh ESTABLISHED 16648/ssh
tcp 0 0 localhost:44356 localhost:webmin ESTABLISHED 37430/sshd: joe
tcp 0 0 localhost:44404 localhost:webmin ESTABLISHED 37430/sshd: joe
tcp 0 0 localhost:33984 localhost:10001 ESTABLISHED 37430/sshd: joe
tcp 0 0 localhost:44368 localhost:webmin ESTABLISHED 37430/sshd: joe
tcp 0 0 localhost:44394 localhost:webmin ESTABLISHED 37430/sshd: joe
tcp 0 0 hp-z820:ssh hp-z820:60436 ESTABLISHED 16649/sshd: joe [pr
tcp 0 0 localhost:44386 localhost:webmin ESTABLISHED 37430/sshd: joe
tcp 0 0 localhost:44362 localhost:webmin ESTABLISHED 37430/sshd: joe
tcp 0 0 localhost:57002 localhost:10001 ESTABLISHED 37430/sshd: joe
tcp6 0 0 [::]:ssh [::]:* LISTEN 1094/sshd: /usr/sbi
Inbound SSH on the server after closing the remote AI Workbench window:
(base) joe@hp-z820:~$ sudo netstat -atp | grep ssh
tcp 0 0 0.0.0.0:ssh 0.0.0.0:* LISTEN 1094/sshd: /usr/sbi
tcp 0 0 hp-z820:60436 hp-z820:ssh ESTABLISHED 16648/ssh
tcp 0 0 hp-z820:ssh hp-z820:60436 ESTABLISHED 16649/sshd: joe [pr
tcp6 0 0 [::]:ssh [::]:* LISTEN 1094/sshd: /usr/sbi
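One way to read these captures is to tally the ESTABLISHED connections by the service or port they target: the Jupyter capture shows several tunnels to port 10001 (the Workbench service on the server) plus the webmin forwards, and the after-close capture shows them all torn down. A small sketch, using a few lines copied from the Jupyter capture above:

```python
# Tally ESTABLISHED connections by the target service/port (the 5th
# netstat column). SAMPLE lines are copied from the Jupyter capture.
from collections import Counter

SAMPLE = """\
tcp 0 0 localhost:44384 localhost:webmin ESTABLISHED 37430/sshd: joe
tcp 0 0 localhost:56986 localhost:10001 ESTABLISHED 37430/sshd: joe
tcp 0 0 localhost:56990 localhost:10001 ESTABLISHED 37430/sshd: joe
tcp 0 0 localhost:44356 localhost:webmin ESTABLISHED 37430/sshd: joe
"""

def tunnel_targets(netstat_text):
    """Count ESTABLISHED lines by the service/port they connect to."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        if "ESTABLISHED" in fields:
            counts[fields[4].rsplit(":", 1)[-1]] += 1
    return counts
```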
Revision History
Created 2024/05