Smart GPU: The Library That Makes GPU Computing Actually Simple

I’ve been diving deep into AI and machine learning lately - I even went back to school for it (see my post: Why I Went Back to School for AI (Even With a Full Plate)). I’ve worked on a number of machine learning projects, and one problem keeps coming up: GPU/CPU compatibility.

Photo by Đào Hiếu on Unsplash

You know the drill. You write some code on your beefy GPU machine, it runs like lightning, and you’re feeling pretty good about yourself. Then you try to deploy it on a CPU-only server, and everything breaks. Or worse, your teammate on a Mac tries to run your code, and it’s a complete disaster.

This happened to me one too many times, so I built Smart GPU - a Python library that makes GPU computing actually simple.

The Problem: Why GPU Computing Is a Mess

Let me show you what I mean. Here’s what you typically have to write to support both CPU and GPU environments:

# The old way - a complete mess
import numpy as np
import pandas as pd

try:
    import cupy as cp
    import cudf
    USE_GPU = True
except ImportError:
    USE_GPU = False

# Now you need duplicate code paths everywhere
if USE_GPU:
    # GPU code path
    arr = cp.array([1, 2, 3, 4, 5])
    df = cudf.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
    result = cp.sum(arr)
else:
    # CPU code path - same logic, different libraries
    arr = np.array([1, 2, 3, 4, 5])
    df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
    result = np.sum(arr)

This approach has several problems:

  1. Code Duplication: You’re writing the same logic twice
  2. Maintenance Nightmare: Every change needs to be made in two places
  3. Environment Complexity: Different users have different hardware setups
  4. Deployment Issues: Code that works on your machine fails elsewhere
  5. Platform Limitations: GPU libraries only work on Linux with NVIDIA GPUs

The Solution: Write Once, Run Anywhere

Smart GPU provides a unified interface that automatically detects your environment and uses the best available resources. Here’s what the same code looks like with Smart GPU:

# The Smart GPU way - clean and simple
from smart_gpu import gpu_utils, array, DataFrame

# Same code works everywhere
arr = array([1, 2, 3, 4, 5])
df = DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
result = gpu_utils.np.sum(arr)

# Automatically uses GPU if available, CPU otherwise
print(f"Using: {'GPU' if gpu_utils.is_gpu_mode else 'CPU'}")

Or even simpler with direct imports:

from smart_gpu import gpu_utils

# Import np and pd directly - they automatically switch between GPU/CPU
np = gpu_utils.np
pd = gpu_utils.pd

# Use exactly like regular NumPy/Pandas
arr = np.array([1, 2, 3, 4, 5])
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
result = np.sum(arr)

That’s it. No more try/except blocks. No more duplicate code paths. No more “works on my machine” issues.
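
For example, here’s a minimal sketch of a helper written once against the unified interface. The summarize function is a hypothetical example of mine, not part of the library, and it only relies on the gpu_utils.np accessor shown above:

from smart_gpu import gpu_utils

def summarize(values):
    """Basic stats that run on GPU or CPU, depending on the environment."""
    np = gpu_utils.np  # CuPy in GPU mode, NumPy otherwise
    arr = np.array(values)
    return {
        'mean': float(np.mean(arr)),
        'std': float(np.std(arr)),
        'max': float(np.max(arr)),
    }

print(summarize([1, 2, 3, 4, 5]))  # same result whichever backend is active

The float() calls turn backend-specific scalars into plain Python numbers, so the return value looks identical no matter which library did the work.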

How It Works: Automatic Detection and Smart Fallbacks

Smart GPU uses a multi-layered detection system:

  1. Hardware Detection: Checks if you have NVIDIA GPU hardware
  2. Library Detection: Verifies if CuPy and CuDF are installed
  3. Environment Variables: Respects user preferences (e.g., SMART_GPU_FORCE_CPU=true)
  4. Graceful Fallbacks: Automatically falls back to CPU mode when GPU isn’t available

The library provides a GPUUtils class that acts as a smart wrapper around NumPy/Pandas and CuPy/CuDF:

from smart_gpu import GPUUtils

# Create a custom instance
gpu_utils = GPUUtils(gpu_mode=True)  # Force GPU mode

# Create arrays and DataFrames
arr = gpu_utils.array([1, 2, 3, 4, 5])
df = gpu_utils.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Synchronize GPU operations
gpu_utils.synchronize()

# Convert to CPU format if needed
cpu_arr = gpu_utils.to_cpu(arr)
cpu_df = gpu_utils.to_cpu(df)

Real-World Benefits: Why This Matters

Faster Development

You write code once and it works everywhere. No more debugging environment-specific issues.

Reduced Bugs

Eliminates the “works on my machine” problem. Your code runs the same way on laptops, servers, and cloud environments.

Easier Deployment

Same codebase works across different platforms:

  • Linux with GPU: Full GPU acceleration
  • Linux without GPU: CPU fallback
  • macOS: CPU mode (GPU libraries not available)
  • Windows: CPU mode (GPU libraries not available)

Better Performance

Automatically leverages GPU acceleration when available, without any code changes.
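
If you want to measure the difference yourself, a hypothetical micro-benchmark like the one below is one way to do it. It assumes that GPUUtils() auto-detects the backend when called with no arguments and that synchronize() is a harmless no-op in CPU mode:

import time
from smart_gpu import GPUUtils

def timed_sum(utils, n=10_000_000):
    xp = utils.np
    arr = xp.arange(n, dtype=xp.float32)
    start = time.perf_counter()
    total = xp.sum(arr)
    utils.synchronize()  # wait for any GPU kernel to finish before stopping the clock
    return float(total), time.perf_counter() - start

cpu_utils = GPUUtils(gpu_mode=False)  # forced CPU baseline
auto_utils = GPUUtils()               # GPU if the environment supports it, CPU otherwise

print("CPU: ", timed_sum(cpu_utils))
print("Auto:", timed_sum(auto_utils))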

Simplified CI/CD

No need for separate CPU and GPU test environments. Your tests run the same way everywhere.
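
As a sketch of what that looks like in practice, a single pytest case like this hypothetical one runs unchanged on a GPU runner and a CPU-only runner, because to_cpu normalizes the result before the assertion:

# tests/test_pipeline.py (hypothetical example)
import numpy as np
from smart_gpu import array, to_cpu

def test_sum_matches_reference():
    data = array([1, 2, 3, 4, 5])  # CuPy array on GPU runners, NumPy array elsewhere
    result = to_cpu(data)          # always a NumPy array from here on
    assert int(np.sum(result)) == 15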

Installation and Usage

Basic Installation (CPU-only)

pip install smart-gpu

With GPU Support (Linux only)

pip install smart-gpu[gpu]

Quick Start

from smart_gpu import gpu_utils, array, DataFrame, to_cpu

# Check if GPU mode is active
print(f"GPU mode: {gpu_utils.is_gpu_mode}")

# Create arrays - automatically uses GPU if available
data = array([1, 2, 3, 4, 5])
print(f"Array type: {type(data)}")

# Create DataFrames - automatically uses GPU if available
df = DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})
print(f"DataFrame type: {type(df)}")

# Convert back to CPU if needed
cpu_data = to_cpu(data)
cpu_df = to_cpu(df)

Manual Mode Control

Sometimes you need more control over when to use GPU vs CPU:

from smart_gpu import set_gpu_mode, get_gpu_mode

# Force CPU mode
set_gpu_mode(False)

# Force GPU mode (if available)
set_gpu_mode(True)

# Check current mode
print(f"Current mode: {'GPU' if get_gpu_mode() else 'CPU'}")

You can also use environment variables:

# Force CPU mode
export SMART_GPU_FORCE_CPU=true

# Explicitly enable GPU mode
export USE_GPU=true

# Explicitly disable GPU mode
export USE_GPU=false

Platform Support

Platform    GPU Support         CPU Support
Linux       ✅ NVIDIA + CUDA    ✅
macOS       ❌                  ✅
Windows     ❌                  ✅

The Technical Details: How I Built It

The core of Smart GPU is the GPUUtils class, which provides a unified interface to both CPU and GPU libraries. Here’s a simplified version of how it works:

class GPUUtils:
    def __init__(self, gpu_mode=None):
        # Auto-detect GPU mode if not specified
        if gpu_mode is None:
            gpu_mode = auto_detect_gpu_mode()
        
        self._gpu_mode = gpu_mode
        
        # Import appropriate libraries
        if self._gpu_mode:
            try:
                import cupy as cp
                import cudf
                self.np = cp
                self.pd = cudf
            except ImportError:
                # Fall back to CPU if GPU libraries aren't available
                import numpy as np
                import pandas as pd
                self.np = np
                self.pd = pd
                self._gpu_mode = False
        else:
            import numpy as np
            import pandas as pd
            self.np = np
            self.pd = pd
    
    @property
    def is_gpu_mode(self):
        return self._gpu_mode
    
    def array(self, data, **kwargs):
        return self.np.array(data, **kwargs)
    
    def DataFrame(self, data, **kwargs):
        return self.pd.DataFrame(data, **kwargs)
    
    def to_cpu(self, data):
        """Convert GPU data to CPU format"""
        if hasattr(data, 'get'):  # CuPy array
            return data.get()
        elif hasattr(data, 'to_pandas'):  # CuDF DataFrame
            return data.to_pandas()
        return data

The library also includes comprehensive detection logic:

import os

def detect_gpu_hardware():
    """Detect NVIDIA GPU hardware"""
    try:
        import subprocess
        result = subprocess.run(['nvidia-smi'], capture_output=True, text=True)
        return result.returncode == 0
    except (FileNotFoundError, subprocess.SubprocessError):
        return False

def is_gpu_available():
    """Check if GPU libraries are available"""
    try:
        import cupy
        import cudf
        return True
    except ImportError:
        return False

def auto_detect_gpu_mode():
    """Auto-detect optimal GPU mode"""
    # Check environment variables first
    if os.getenv('SMART_GPU_FORCE_CPU') == 'true':
        return False
    if os.getenv('USE_GPU') == 'false':
        return False
    if os.getenv('USE_GPU') == 'true':
        return True
    
    # Auto-detect based on hardware and libraries
    return detect_gpu_hardware() and is_gpu_available()
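
Continuing from the functions above, here’s a quick illustration (my own, not library documentation) of how the override precedence plays out:

import os

# Force CPU even on a machine with a working GPU stack
os.environ['SMART_GPU_FORCE_CPU'] = 'true'
print(auto_detect_gpu_mode())  # False: this override is checked before hardware detection

# Explicitly opt in to GPU mode
os.environ.pop('SMART_GPU_FORCE_CPU')
os.environ['USE_GPU'] = 'true'
print(auto_detect_gpu_mode())  # True, without consulting nvidia-smi at all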

Testing and Quality Assurance

The library includes comprehensive unit tests with 87% code coverage:

  • GPU Detection Tests: Platform detection, hardware detection, library availability
  • Mode Switching Tests: Environment variable overrides, manual mode control
  • GPUUtils Class Tests: Array creation, DataFrame creation, data conversion
  • Convenience Functions Tests: Global functions and utilities
  • Logging Tests: Logger configuration and behavior

You can run the tests with:

# Run all tests
python -m pytest tests/ -v

# Run tests with coverage
python -m pytest tests/ --cov=smart_gpu --cov-report=term-missing

# Generate HTML coverage report
python -m pytest tests/ --cov=smart_gpu --cov-report=html
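
For a flavor of the mode-switching tests, a hypothetical case might look like the following. It assumes the set_gpu_mode/get_gpu_mode helpers and the GPUUtils constructor behave as shown earlier in this post:

import numpy as np
from smart_gpu import GPUUtils, set_gpu_mode, get_gpu_mode

def test_force_cpu_mode():
    set_gpu_mode(False)
    assert not get_gpu_mode()

def test_cpu_instance_wraps_numpy():
    utils = GPUUtils(gpu_mode=False)
    assert utils.np is np  # CPU mode should expose plain NumPy
    assert not utils.is_gpu_mode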

Lessons Learned: What I’d Do Differently

Building Smart GPU taught me a few things about open source development:

1. Start Simple, Iterate Fast

The initial version was much more complex than it needed to be. I tried to solve every possible use case upfront. The current version is much simpler and more focused.

2. Testing Is Crucial

With a library that needs to work across different platforms and environments, comprehensive testing is essential. The test suite has caught numerous edge cases I wouldn’t have thought of.

3. Documentation Matters

Good documentation is as important as good code. I spent a lot of time making sure the README and examples are clear and comprehensive.

4. Community Feedback Is Valuable

Even though this is a relatively small library, the feedback from users has been incredibly helpful in improving the API and fixing bugs.

The Future: What’s Next

I’m planning several improvements for future versions:

  • Better Error Messages: More helpful error messages when GPU libraries aren’t available
  • Performance Monitoring: Built-in performance comparison between CPU and GPU modes
  • More Libraries: Support for other GPU libraries beyond CuPy and CuDF
  • Memory Management: Better memory management for GPU operations
  • Async Support: Support for asynchronous GPU operations

Conclusion: Why This Matters

Smart GPU solves a real problem that many data scientists and machine learning engineers face. It’s not about making GPU computing faster - it’s about making it simpler.

The library eliminates the complexity of supporting multiple environments, reduces code duplication, and makes deployment more reliable. It’s a small tool, but it solves a big pain point.

If you’re working on data science or machine learning projects that need to run in different environments, give Smart GPU a try. It might just save you from the next “works on my machine” disaster.

You can find the project on GitHub: https://github.com/ardydedase/smart-gpu

And install it with:

pip install smart-gpu

Have you faced similar GPU/CPU compatibility issues? What solutions have you found? I’d love to hear your experiences and feedback on Smart GPU.