
# tested and working #3

Open · wants to merge 5 commits into main
50 changes: 50 additions & 0 deletions .dockerignore
@@ -0,0 +1,50 @@
```
# Git
.git
.gitignore

# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg

# Virtual Environment
venv/
ENV/

# IDE
.idea/
.vscode/
*.swp
*.swo

# Environment variables
.env
.env.*

# Docker
.docker/

# Logs
*.log

# Local development
.DS_Store
Thumbs.db
```
33 changes: 33 additions & 0 deletions Dockerfile
@@ -0,0 +1,33 @@
```dockerfile
# Use an official Python runtime as a parent image
FROM python:3.10-slim

# Set the working directory in the container
WORKDIR /app

# Install curl for DBFS API calls
RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*

# Copy requirements first to leverage Docker cache
COPY requirements.txt .

# Install any needed packages specified in requirements.txt
# Use --no-cache-dir to reduce image size
RUN pip install --no-cache-dir -r requirements.txt

# Optional: Clean up build dependencies to reduce image size
# RUN apt-get purge -y --auto-remove build-essential

# Copy the rest of the application code into the container at /app
COPY . .

# Make port 8000 available to the world outside this container
# (MCP servers typically run on port 8000 by default)
EXPOSE 8000

# Define environment variables (these will be overridden by docker run -e flags)
ENV DATABRICKS_HOST=""
ENV DATABRICKS_TOKEN=""
ENV DATABRICKS_HTTP_PATH=""

# Run main.py when the container launches
CMD ["python", "main.py"]
157 changes: 157 additions & 0 deletions databricks_mcp_tools.md
@@ -0,0 +1,157 @@
# Databricks MCP Server Tools Guide

This guide outlines the available tools and resources provided by the Databricks MCP server.

## Server Configuration

The server is configured in the MCP settings file with:
```json
{
  "mcpServers": {
    "databricks-server": {
      "command": "python",
      "args": ["main.py"],
      "disabled": false,
      "alwaysAllow": ["list_jobs"],
      "env": {},
      "cwd": "/Users/maheidem/Documents/dev/mcp-databricks-server"
    }
  }
}
```
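
For local testing outside an MCP host, the same entry point can be started directly. A minimal sketch, assuming the dependencies from requirements.txt are installed:

```bash
cd /Users/maheidem/Documents/dev/mcp-databricks-server
pip install -r requirements.txt   # one-time dependency install
python main.py                    # starts the MCP server process
```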

## Available Tools

### 1. run_sql_query
Execute SQL queries on the Databricks SQL warehouse.

**Parameters:**
- sql (string): SQL query to execute

**Example:**
```xml
<use_mcp_tool>
<server_name>databricks-server</server_name>
<tool_name>run_sql_query</tool_name>
<arguments>
{
  "sql": "SELECT * FROM my_database.my_table LIMIT 10"
}
</arguments>
</use_mcp_tool>
```

**Returns:** Results in markdown table format
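
For instance, the example query above might come back rendered like this (row values are illustrative, not real data):

```
| id | name    | created_at          |
|----|---------|---------------------|
| 1  | example | 2024-01-01 00:00:00 |
| 2  | sample  | 2024-01-02 00:00:00 |
```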

### 2. list_jobs
List all Databricks jobs. This tool is on the alwaysAllow list, so it can be invoked without per-call approval.

**Parameters:** None

**Example:**
```xml
<use_mcp_tool>
<server_name>databricks-server</server_name>
<tool_name>list_jobs</tool_name>
<arguments>
{}
</arguments>
</use_mcp_tool>
```

**Returns:** Job list in markdown table format with columns:
- Job ID
- Job Name
- Created By
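
A hypothetical rendering (values illustrative):

```
| Job ID | Job Name    | Created By       |
|--------|-------------|------------------|
| 123    | nightly_etl | user@example.com |
```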

### 3. get_job_status
Get the status of a specific Databricks job.

**Parameters:**
- job_id (integer): ID of the job to get status for

**Example:**
```xml
<use_mcp_tool>
<server_name>databricks-server</server_name>
<tool_name>get_job_status</tool_name>
<arguments>
{
  "job_id": 123
}
</arguments>
</use_mcp_tool>
```

**Returns:** Job runs in markdown table format with columns:
- Run ID
- State
- Start Time
- End Time
- Duration
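
For example (values illustrative):

```
| Run ID | State   | Start Time          | End Time            | Duration |
|--------|---------|---------------------|---------------------|----------|
| 456    | SUCCESS | 2024-01-01 02:00:00 | 2024-01-01 02:15:00 | 15m 0s   |
```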

### 4. get_job_details
Get detailed information about a specific Databricks job.

**Parameters:**
- job_id (integer): ID of the job to get details for

**Example:**
```xml
<use_mcp_tool>
<server_name>databricks-server</server_name>
<tool_name>get_job_details</tool_name>
<arguments>
{
  "job_id": 123
}
</arguments>
</use_mcp_tool>
```

**Returns:** Detailed job information in markdown format including:
- Job Name
- Job ID
- Created Time
- Creator
- Tasks (if any)
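
A hypothetical rendering (field values illustrative; the exact layout is produced by the server):

```
Job Name: nightly_etl
Job ID: 123
Created Time: 2024-01-01 00:00:00
Creator: user@example.com
Tasks:
  - ingest (notebook_task)
  - transform (notebook_task)
```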

## Available Resources

### schema://tables
Lists available tables in the Databricks SQL warehouse.

**Example:**
```xml
<access_mcp_resource>
<server_name>databricks-server</server_name>
<uri>schema://tables</uri>
</access_mcp_resource>
```

**Returns:** List of tables with their database and schema information.
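
A hypothetical response (table names illustrative; the exact format depends on the server implementation):

```
my_database.default.my_table
my_database.default.events
```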

## Usage Flow

```mermaid
graph TD
    A[Start Server] --> B[MCP System Auto-connects]
    B --> C[Tools Available]
    C --> D[Use Tools]

    subgraph "Tool Usage Flow"
        D --> E[List Jobs]
        E --> F[Get Job Details]
        F --> G[Get Job Status]
        G --> H[Run SQL Query]
    end
```

## Requirements

The server requires the following environment variables to be set:
- DATABRICKS_TOKEN
- DATABRICKS_HOST
- DATABRICKS_HTTP_PATH

These are already configured in the .env file.
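
A minimal .env sketch (all values are placeholders):

```bash
DATABRICKS_HOST=https://<your-workspace>.cloud.databricks.com
DATABRICKS_TOKEN=<your-personal-access-token>
DATABRICKS_HTTP_PATH=/sql/1.0/warehouses/<warehouse-id>
```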