Creating a Lakehouse Service: Part 2: Creating K8s Resources
This is part two of a longer series and here is the link to the previous part.
sitrep
Ok, so after part one, we have:
- an init script that can be used to “pre-configure” our DuckDB session
- a working k8s pod configuration
- a custom DuckDB-based image that runs our init script and launches the UI
- an nginx sidecar that actually exposes the DuckDB UI
- a configured Traefik IngressRoute to access the nginx-exposed port
But this is only for one pod. We want this to be “scalable”, so that many users can each have their own pod/session while all connecting to the same Iceberg tables.
The API
As alluded to in the prologue/background post to this series, the ultimate goal
here is to have some form of command or UI element/button that actually launches
the DuckDB pod/session. I chose to have an always-on API pod running in the k8s
cluster. This API would be reachable by anyone on the network (it has its own
IngressRoute) and would have some pretty simple endpoints to spin up or tear
down these DuckDB pods & related resources.
Making an Image
The actual API portion of this (I’m using FastAPI) is fairly simple. We need
two endpoints:
- POST to create the resources
- DELETE to delete the resources
This was my first time playing with the Kubernetes Python API, and I really
really enjoyed using it. The kubernetes.config.load_incluster_config()
function is so cool and makes everything else easy to use.
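For reference, the in-cluster setup boils down to a few lines; core_v1 and custom_api here are the client objects the endpoint code below uses:
from kubernetes import client, config

# Inside the cluster, this reads the pod's service account token and
# CA cert automatically -- no kubeconfig needed.
config.load_incluster_config()

core_v1 = client.CoreV1Api()            # pods, services
custom_api = client.CustomObjectsApi()  # Traefik IngressRoute CRDs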
I wrote separate functions for creating the different resources. This
helped me see how the YAML specs from the previous part aligned with the Python
API. Here is the build_pod function as an example; settings like the target
namespace were populated from environment variables via os.getenv().
import datetime
import os

from kubernetes import client

# Populated from an environment variable, per the os.getenv() note above.
DUCKUI_NAMESPACE = os.getenv("DUCKUI_NAMESPACE")


def build_pod(
    session_id: str, username: str, catalog: str, s3_key: str, s3_secret: str
):
    labels = {
        "app": "duckui",
        "session-id": session_id,
        "duckui-user": username,
        "duckui-session": "true",
    }
    annotations = {
        "duckui.bsale.me/created_at": datetime.datetime.utcnow().isoformat() + "Z",
    }
    env = [
        client.V1EnvVar(name="SESSION_ID", value=session_id),
        client.V1EnvVar(name="USERNAME", value=username),
        client.V1EnvVar(name="S3_ACCESS_KEY_ID", value=s3_key),
        client.V1EnvVar(name="S3_SECRET_ACCESS_KEY", value=s3_secret),
        client.V1EnvVar(name="CATALOG_ENDPOINT", value=catalog),
    ]
    # Main container: the custom DuckDB image from part one.
    container = client.V1Container(
        name="duckui",
        image="bsale/duckui:latest",
        stdin=True,
        env=env,
        ports=[client.V1ContainerPort(container_port=4213)],
    )
    # Nginx sidecar, with its config mounted from a shared ConfigMap.
    proxy_volume_mount = client.V1VolumeMount(
        name="nginx-conf",
        mount_path="/etc/nginx/conf.d",
    )
    proxy_volume = client.V1Volume(
        name="nginx-conf",
        config_map=client.V1ConfigMapVolumeSource(name="duckui-nginx-conf"),
    )
    proxy = client.V1Container(
        name="duckui-proxy",
        image="nginx:alpine",
        ports=[client.V1ContainerPort(container_port=8080)],
        volume_mounts=[proxy_volume_mount],
    )
    pod_spec = client.V1PodSpec(containers=[container, proxy], volumes=[proxy_volume])
    metadata = client.V1ObjectMeta(
        name=f"duckui-{session_id}",
        namespace=DUCKUI_NAMESPACE,
        labels=labels,
        annotations=annotations,
    )
    return client.V1Pod(api_version="v1", kind="Pod", metadata=metadata, spec=pod_spec)
And then, with the different build_* functions defined, I created the actual
FastAPI endpoint. I used Pydantic models for the request and response bodies,
but this likely didn’t matter too much since they are pretty simple anyway (just
a bunch of strings).
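For reference, a minimal sketch of those models, with the fields inferred from how the endpoint below uses them:
from pydantic import BaseModel


class SessionCreateRequest(BaseModel):
    username: str
    s3_key: str
    s3_secret: str
    catalog: str  # e.g. the Iceberg REST catalog endpoint


class SessionResponse(BaseModel):
    session_id: str
    url: str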
@app.post("/sessions", response_model=SessionResponse)
def create_session(req: SessionCreateRequest):
session_id = uuid.uuid4().hex[:6] # e.g. "abc123"
pod = build_pod(
session_id,
username=req.username,
s3_key=req.s3_key,
s3_secret=req.s3_secret,
catalog=req.catalog,
)
svc = build_service(session_id)
ingress_body, host = build_ingressroute(session_id)
try:
# 1) pod
core_v1.create_namespaced_pod(namespace=DUCKUI_NAMESPACE, body=pod)
# 2) service
core_v1.create_namespaced_service(namespace=DUCKUI_NAMESPACE, body=svc)
# 3) ingressroute CRD
custom_api.create_namespaced_custom_object(
group="traefik.io",
version="v1alpha1",
namespace=TLS_SECRET_NAMESPACE,
plural="ingressroutes",
body=ingress_body,
)
except Exception as e:
# attempt cleanup here if something failed mid-way
raise HTTPException(status_code=500, detail=str(e))
url = f"https://{host}"
return SessionResponse(session_id=session_id, url=url)
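The matching DELETE endpoint (used later to tear sessions down) mirrors this; here is a minimal sketch, assuming the Service and IngressRoute reuse the pod’s duckui-<session_id> name:
@app.delete("/sessions/{session_id}")
def delete_session(session_id: str):
    # Assumes all three resources follow the duckui-<session_id> convention.
    name = f"duckui-{session_id}"
    try:
        custom_api.delete_namespaced_custom_object(
            group="traefik.io",
            version="v1alpha1",
            namespace=TLS_SECRET_NAMESPACE,
            plural="ingressroutes",
            name=name,
        )
        core_v1.delete_namespaced_service(name=name, namespace=DUCKUI_NAMESPACE)
        core_v1.delete_namespaced_pod(name=name, namespace=DUCKUI_NAMESPACE)
    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))
    return {"session_id": session_id, "deleted": True}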
This allowed me to define my Dockerfile as follows (I’m using uv for package
management, which is why I’m using the uv image as the base):
FROM ghcr.io/astral-sh/uv:python3.10-alpine
WORKDIR /app
COPY pyproject.toml ./
RUN uv sync
COPY app ./app
EXPOSE 8000
CMD ["uv", "run", "uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
Fun with DNS
The actual API creation was very simple, and I was pleasantly surprised by how fast everything deployed. Since the complexities with the pod & related resources had already been ironed out, this part was much simpler.
However, the issues I was running into were not directly related to this
project! Since each session gets an IngressRoute that points specifically to its
pod, we need a DNS entry for the URL we generate (with the randomized session
ID). For development purposes, making these entries manually wasn’t too much
overhead.
When I started to ramp up creating and deleting the pods, however, this quickly
became pretty annoying. So I elected to make a wildcard DNS entry
for *.local.bsale.me on my DNS server that points to Traefik.
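For context, this is roughly what the build_ingressroute function from the endpoint produces. The host pattern matches the URLs the API returns, but the routing rule, service port, and TLS details here are assumptions for the sketch:
def build_ingressroute(session_id: str):
    # Falls under the wildcard DNS entry above; the session URL becomes
    # e.g. https://duckui-977a26.local.bsale.me
    host = f"duckui-{session_id}.local.bsale.me"
    body = {
        "apiVersion": "traefik.io/v1alpha1",
        "kind": "IngressRoute",
        "metadata": {
            "name": f"duckui-{session_id}",
            "namespace": TLS_SECRET_NAMESPACE,
        },
        "spec": {
            "entryPoints": ["websecure"],
            "routes": [
                {
                    "match": f"Host(`{host}`)",
                    "kind": "Rule",
                    "services": [
                        {
                            # Per-session Service, fronting the nginx sidecar.
                            # Cross-namespace refs need Traefik's
                            # allowCrossNamespace option enabled.
                            "name": f"duckui-{session_id}",
                            "namespace": DUCKUI_NAMESPACE,
                            "port": 8080,
                        }
                    ],
                }
            ],
            # TLS secret name is an assumption for this sketch.
            "tls": {"secretName": "local-bsale-me-tls"},
        },
    }
    return body, host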
Creating a New Pod/Session
This is as easy as issuing a couple of requests:
Creating a new pod:
curl -X POST https://duckui-api.local.bsale.me/sessions -d '{"username": "bsale", "s3_key": "<s3_key>", "s3_secret": "<s3_secret>", "catalog": "http://x.y.z.a:8181/catalog"}' -H "Content-Type: application/json"
This returns output like
{"session_id":"977a26","url":"https://duckui-977a26.local.bsale.me"}
And then deleting the session is as simple as
curl -X DELETE https://duckui-api.local.bsale.me/sessions/977a26
Summary
I think we have a pretty extensible base here: if more parameters are needed at some point in the future, we can pretty easily add them to our Pydantic models and to the functions that generate the different objects within the k8s cluster.
In the future, I could see adding specific parameters/defaults for resource requests, or having some “t-shirt” size options that apply pre-defined resource presets.
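As a rough, hypothetical sketch of that idea (the preset values are made up for illustration), using the same Kubernetes client objects as above:
# Hypothetical "t-shirt" presets; values are illustrative only.
SIZE_PRESETS = {
    "s": client.V1ResourceRequirements(
        requests={"cpu": "500m", "memory": "1Gi"},
        limits={"cpu": "1", "memory": "2Gi"},
    ),
    "m": client.V1ResourceRequirements(
        requests={"cpu": "1", "memory": "2Gi"},
        limits={"cpu": "2", "memory": "4Gi"},
    ),
    "l": client.V1ResourceRequirements(
        requests={"cpu": "2", "memory": "4Gi"},
        limits={"cpu": "4", "memory": "8Gi"},
    ),
}

# build_pod would then take a size parameter and set it on the DuckDB container:
#   container = client.V1Container(..., resources=SIZE_PRESETS[size])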