Karakun Developer Hub

Shipping marimo WASM Notebooks as Browser-Based Engineering Tools with Spring Boot

2026-06-17T00:00:00+00:00

Interactive engineering workflows often need more than static forms or fully automated pipelines. In EXOKNOX, curve fitting and extrapolation require engineers to compare algorithms, tune parameters, and inspect results before simulation.

This article explains how we integrated marimo WASM notebooks as browser-based Python tools using Pyodide, Spring Boot security, shared Python wheels, and REST API integration.

The Engineering Data Problem
Why marimo for Browser-Based Python Tools
What We Built: marimo WASM Apps in Spring Boot
How We Built the marimo WASM Deployment
Trade-offs and Constraints of marimo WASM
Benefits of Browser-Based Python Engineering Tools
Conclusion: When marimo WASM Fits
Let’s Discuss

The Engineering Data Problem

EXOKNOX is a platform for managing functional engineering data. Before simulation, engineers start with measured hysteresis data: repeated loading and unloading curves from physical tests. For downstream simulation, these data have to be reduced to a representative single curve and extrapolated beyond the measured force range.

This step cannot be fully automated. The correct result depends on engineering judgment: different smoothing, fitting, and extrapolation strategies can produce curves that are mathematically plausible but physically wrong. Engineers need to compare different algorithms, tune parameters, inspect the result, and repeat until the curve is suitable for simulation.
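
To make the task concrete, here is a minimal sketch of one naive strategy: average the loading and unloading branches on a common force grid, then extend the result linearly from the last two points. The function names, the sample values, and the choice of averaging plus linear extrapolation are assumptions for illustration only. They are not the algorithms used in EXOKNOX, and they are exactly the kind of mathematically plausible choice an engineer would still have to review.

```python
from bisect import bisect_left


def interpolate(xs: list[float], ys: list[float], x: float) -> float:
    """Linear interpolation on strictly increasing xs; clamps outside the range."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_left(xs, x)
    x0, x1 = xs[i - 1], xs[i]
    y0, y1 = ys[i - 1], ys[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def mean_curve(force_grid, loading, unloading):
    """Average the loading and unloading branches on a common force grid."""
    return [
        0.5 * (interpolate(*loading, f) + interpolate(*unloading, f))
        for f in force_grid
    ]


def extrapolate_linear(xs, ys, x):
    """Extend the curve beyond its last point using the slope of the last segment."""
    slope = (ys[-1] - ys[-2]) / (xs[-1] - xs[-2])
    return ys[-1] + slope * (x - xs[-1])


# Hypothetical measurements: (forces in N, displacements in mm).
loading = ([0.0, 10.0, 20.0], [0.0, 1.0, 2.4])
unloading = ([0.0, 10.0, 20.0], [0.2, 1.4, 2.4])

grid = [0.0, 10.0, 20.0]
reduced = mean_curve(grid, loading, unloading)
beyond = extrapolate_linear(grid, reduced, 30.0)  # outside the measured range
```

A different smoothing or extrapolation rule would produce a different value for `beyond` from the same measurements, which is why the result has to be inspected rather than trusted blindly.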

At the same time, we wanted to move away from EXOKNOX’s Java-based Eclipse RCP frontend.

So the requirement was clear: we needed a lightweight, browser-based Python tool with an interactive UI that could read and write EXOKNOX data.

Why marimo for Browser-Based Python Tools

Marimo is a reactive Python notebook framework. Unlike Jupyter, marimo notebooks are pure Python files — no JSON, no hidden state. Cells are reactive: when a value changes, all dependent cells re-execute automatically. It ships a clean web UI and can be deployed either as a running server application or as a WebAssembly (WASM) application that executes entirely in the browser via Pyodide.
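
The reactive model can be illustrated without marimo itself. The following toy sketch is not marimo's implementation: the `Cell` class and the `run_from` function are hypothetical names, and the cells are assumed to be listed in dependency order already. It only shows the idea that changing one value re-runs the cells that depend on it, directly or indirectly, and nothing else.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Cell:
    name: str                      # the variable this cell defines
    inputs: tuple[str, ...]        # the variables it reads
    compute: Callable[..., float]  # the cell body


def run_from(changed: str, cells: list[Cell], values: dict[str, float]) -> list[str]:
    """Re-run every cell that depends, directly or indirectly, on `changed`.

    Assumes `cells` is already listed in dependency order.
    """
    stale = {changed}
    executed = []
    for cell in cells:
        if stale.intersection(cell.inputs):
            values[cell.name] = cell.compute(*(values[i] for i in cell.inputs))
            stale.add(cell.name)
            executed.append(cell.name)
    return executed


cells = [
    Cell("smoothed", ("raw", "window"), lambda raw, window: raw / window),
    Cell("fitted", ("smoothed",), lambda smoothed: smoothed * 2),
    Cell("label", ("unit",), lambda unit: unit),
]
values = {"raw": 12.0, "window": 3.0, "unit": 1.0}

values["window"] = 4.0                        # the user moves a slider
executed = run_from("window", cells, values)  # "label" is left untouched
```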

The important requirements for us were: notebooks had to be versionable as normal source files, UI state had to be reproducible, and the same notebook had to support fast local development as well as browser-only deployment. Marimo fit that better than a traditional Jupyter workflow because the notebook is ordinary Python source and the dependency graph is explicit. A custom Vue or React UI would have offered more control, but at a much higher implementation cost for exploratory engineering workflows.

What We Built: marimo WASM Apps in Spring Boot

Before diving into the challenges, here’s what the final system looks like. The frontend/scripting module delivers two interactive data-analysis tools — Curve Editor and Load Fitting — as self-contained Python applications that run entirely in the browser. No Python server is needed at runtime.

The module is organized around two layers:

frontend/scripting/
├── build.gradle.kts          ← Orchestrates the entire Python + WASM build
├── common/                   ← Shared Python library (wheel)
│   ├── pyproject.toml
│   └── src/common/
│       ├── api/              ← HTTP client to access the backend REST API
│       └── curveprocessing/  ← Curve processing functions
└── marimoapps/
    ├── curveeditor/          ← marimo notebook app
    └── curvefitting/         ← marimo notebook app

common is a plain Python package built as a wheel (.whl). It contains all business logic and is shared across both apps — as an editable uv workspace dependency during local development and as a pre-built wheel loaded at runtime inside the browser.

marimoapps contains the marimo notebooks. Each app declares common as a uv workspace dependency so that during development they share a single source tree. For WASM export, the common wheel is bundled alongside the app and loaded at runtime via micropip.

The high-level architecture is straightforward:

Browser
  └── marimo WASM app (Python running in Pyodide)
        ├── Fetches functional data via EXOKNOX REST API
        └── Writes results back via EXOKNOX REST API

Spring Boot Server
  ├── Serves marimo notebooks as static resources
  ├── Enforces OIDC authentication
  └── Provides the EXOKNOX REST API

There is no marimo server process and no Python runtime on the backend. Just static files, served securely, executing in the client’s browser.

The resulting tools are embedded as web pages in the browser:

Curve Editor

Curve Fitting

Getting to this architecture required solving three concrete challenges.

How We Built the marimo WASM Deployment

Moving from proof of concept to production meant solving deployment, API, and development workflow constraints one by one.

Challenge 1 — Secure Static Notebook Deployment Without a marimo Server

The obvious deployment for marimo is as a server: you run marimo run notebook.py and marimo starts a WebSocket-backed application server that executes Python on the backend. We evaluated this and rejected it for two reasons.

Security surface. A running marimo server executes arbitrary Python code from notebooks that are essentially customer property. Even sandboxed, this is an attack vector we preferred not to manage.

Infrastructure complexity. For on-premise installations — which many EXOKNOX customers require — spinning up and managing a persistent marimo server (or per-session containers in Kubernetes) places requirements on the customer’s infrastructure that we cannot guarantee.

The WASM approach removes the need to execute notebook Python on the backend. That significantly reduces the server-side attack surface, although the browser-side notebook still has to be treated like any authenticated frontend code. The Python runtime lives in the browser, execution is sandboxed by the browser’s security model, and the server is stateless. Backend access is secured by OIDC authentication, and the resulting session is handled entirely by the browser.

From Notebook to Static WebAssembly Assets

marimo can export a notebook as a self-contained WASM application:

marimo export html-wasm notebook.py -o dist/notebook.html --mode run

The output is a static directory with an index.html, the notebook code, and the assets needed by the marimo WASM runtime. We serve this from Spring Boot as static content, protected behind Spring Security’s OAuth2 login flow.

@Bean
SecurityFilterChain securityFilterChain(HttpSecurity http,
                                        GrantedAuthoritiesMapper grantedAuthoritiesMapper,
                                        OpaqueTokenIntrospector opaqueTokenIntrospector,
                                        JwtAuthenticationConverter jwtAuthenticationConverter,
                                        SecurityFilter securityFilter) throws Exception {

    // takes care of HTTP authorization - authorizationCustomizer secures the protected paths
    http.authorizeHttpRequests(this::authorizationCustomizer);

    // takes care of login (authentication flow)
    http.oauth2Login(oauth2 -> oauth2.userInfoEndpoint(userInfo -> userInfo.userAuthoritiesMapper(grantedAuthoritiesMapper)));

    // takes care of bearer tokens in the HTTP header - resourceServerCustomizer handles opaque and jwt tokens
    http.oauth2ResourceServer(oauth2 -> resourceServerCustomizer(oauth2, opaqueTokenIntrospector, jwtAuthenticationConverter));

    // builds the configured filter chain
    return http.build();
}

The notebook is never accessible without authentication.

Wiring the Build with Gradle and uv

We needed the WASM export to happen automatically as part of the standard Gradle build — not as a manual step. The build.gradle.kts uses the community plugin com.pswidersk.python-uv-plugin to drive uv commands from Gradle tasks, plus org.openapi.generator to generate Pydantic model classes from the backend’s OpenAPI specs.

The pipeline runs in four phases on every build:

Phase 1 — OpenAPI Code Generation. The backend service modules expose REST APIs defined by OpenAPI YAML files. Gradle scans those specs and runs OpenAPI Generator to produce Pydantic model classes. We generate only the model layer, not the transport layer, because the generated clients assume a normal CPython HTTP stack, while the WASM runtime needs browser-based fetch through pyodide.http. The generated models are synced into common/src/exoknox__client/models/, giving the shared library strongly typed data structures for every backend API response.

Phase 2 — Common Library Build. uv build --managed-python produces common/dist/common-0.1.0-py3-none-any.whl. This is a pure-Python, platform-neutral artifact that the browser will later fetch and install.

Phase 3 — Per-App WASM Export. For each app discovered by scanning marimoapps/*/pyproject.toml, Gradle creates a task chain:

uvBuild — builds the app package with uv.
uvWasm — runs marimo export html-wasm, which bundles the notebook with the Pyodide Python runtime and produces a self-contained dist/wasm/ directory.
uvWheelCommon — copies the common wheel into dist/wasm/public/, making it fetchable by the browser at a relative URL.
copyWasmApplication and buildWasm — copy the output to build/wasm//.

The top-level buildWasm task aggregates all per-app tasks. The standard Gradle build task depends on buildWasm, so the full pipeline runs on every build with no extra steps.
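
The discovery step in Phase 3 can be sketched in a few lines. The real build does this in Gradle Kotlin DSL, so the Python below is only an illustration of the logic: the function names `discover_apps` and `export_plan` are hypothetical, the notebook file name `notebook.py` is an assumption, and in the actual build the command would be launched through uv by the Gradle plugin.

```python
import tempfile
from pathlib import Path


def discover_apps(marimoapps_dir: Path) -> list[str]:
    """Return the names of all app directories that contain a pyproject.toml."""
    return sorted(p.parent.name for p in marimoapps_dir.glob("*/pyproject.toml"))


def export_plan(marimoapps_dir: Path, notebook: str = "notebook.py") -> dict[str, list[str]]:
    """Map each discovered app to the export command that would run in its directory."""
    command = ["marimo", "export", "html-wasm", notebook, "-o", "dist/wasm", "--mode", "run"]
    return {app: list(command) for app in discover_apps(marimoapps_dir)}


# Build a throwaway directory layout to try the functions out.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("curveeditor", "curvefitting"):
        (root / name).mkdir()
        (root / name / "pyproject.toml").write_text("[project]\n")
    (root / "notes").mkdir()  # no pyproject.toml, so it is not treated as an app
    apps = discover_apps(root)
    plan = export_plan(root)
```

Because apps are discovered rather than listed by hand, adding a new notebook app only requires a new directory with its own pyproject.toml.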

Phase 4 — JAR Packaging. processResources includes the WASM output in the Spring Boot JAR:

from("marimoapps/$appName/dist/wasm") { into("wasm/$appName") }

Spring Boot then serves these static resources, making the apps accessible at:

/exoknox/marimo/curveeditor/index.html
/exoknox/marimo/loadfitting/index.html

Challenge 2 — REST API Integration from the Browser

A marimo WASM notebook runs entirely in the browser. It has no direct access to databases or backend services; it can only make HTTP requests. The data access model is exactly the same as for any other frontend application. The common package contains the Python modules that provide access to the EXOKNOX REST API; they are bundled as a wheel and loaded into the WASM app.

Browser-Based HTTP Requests with Pyodide

We built a Python module, http_client.py, that uses Pyodide’s pyodide.http module to send HTTP requests to the backend. The notebook fetches data from the EXOKNOX REST API using the user’s existing browser session. In WASM mode, credentials="include" lets the browser attach the same authenticated session cookies it would use for the rest of the application.

import pyodide.http as http

async def get_request(url: str) -> str:
    # credentials="include" makes the browser attach the session cookies
    response = await http.pyfetch(url, method="GET", credentials="include")
    # pyfetch does not raise on HTTP errors, so check the status explicitly
    if not 200 <= response.status < 300:
        raise Exception(f"Error fetching {url} : {response.status}")
    return await response.string()

The write path mirrors the read path:

async def post_request(url: str, body: str) -> str:
    response = await http.pyfetch(
        url,
        method="POST",
        credentials="include",
        body=body,
        headers={"Content-Type": "application/json"}
    )
    # pyfetch does not raise on HTTP errors, so check the status explicitly
    if not 200 <= response.status < 300:
        raise Exception(f"Error posting to {url} : {response.status}")
    return await response.string()

A Typed REST API Layer

On top of the raw HTTP calls, exoknox_api.py provides functions that encapsulate the REST API endpoints, giving notebooks clean, typed access to backend data.

from common.api.http_client import get_request

async def read_dataset(dataset_id: str, base_url: str) -> ChannelsDTO:
    url = f"{base_url}/channels?dataSetId={dataset_id}"
    data = await get_request(url)
    return ChannelsDTO.from_json(data)

When the user has completed their analysis — fitted a curve, computed new values, reviewed the result in an interactive chart — a save action posts the result:

async def save_dataset(scripting_request: ScriptingResultRequestDTO, base_url: str) -> ScriptingResultResponseDTO:
    url = f"{base_url}/scripting-result"
    data = await post_request(url, scripting_request.to_json())
    return ScriptingResultResponseDTO.from_json(data)
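
The `to_json` / `from_json` round trip that these functions rely on can be shown with a standard-library stand-in. In the real project the models are Pydantic classes produced by OpenAPI Generator; the dataclass below is a hypothetical substitute, and its field names are assumptions chosen to resemble a request with a dataset ID and two value lists.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ScriptingResultRequest:
    """Hypothetical stand-in for a generated request model."""
    dataSetId: str
    x: list[float]
    y: list[float]

    def to_json(self) -> str:
        """Serialize the model to the JSON string sent as the POST body."""
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, raw: str) -> "ScriptingResultRequest":
        """Rebuild the model from a JSON string, for example a response body."""
        return cls(**json.loads(raw))


request = ScriptingResultRequest(dataSetId="42", x=[0.0, 1.0], y=[0.0, 2.5])
body = request.to_json()
round_trip = ScriptingResultRequest.from_json(body)
```

Keeping serialization on the model means the API functions only deal with URLs and typed objects, never with raw dictionaries.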

Notebook Integration

Each marimo notebook contains a dedicated cell to load data on startup. It reads the dataset ID from URL query parameters, derives the base URL from the notebook’s current location, and hands back either the loaded channels or an error message:

@app.cell
async def _(mo, exoknox_api):
    from urllib.parse import urlparse

    # dataset ID is passed in the page URL, e.g. ...?datasetid=<id>
    qp = mo.query_params()
    datasetid = qp.get("datasetid", "")

    # keep only scheme and host of the page the notebook is served from
    nb = mo.notebook_location()
    parsed = urlparse(str(nb))
    base_url = f"{parsed.scheme}://{parsed.netloc}"

    try:
        channels = await exoknox_api.read_dataset(datasetid, base_url)
        loading_error_message = ""
    except Exception as error:
        loading_error_message = f"Error loading data: {error}"
        channels = None
    # datasetid is returned as well because the save cell depends on it
    return (base_url, channels, datasetid, loading_error_message)
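
The URL handling in this cell can be tried outside the notebook with plain `urllib.parse`. This is a sketch under assumptions: the helper names and the host name are hypothetical, and the query string is parsed by hand here instead of through `mo.query_params()`.

```python
from urllib.parse import parse_qs, urlparse


def derive_base_url(notebook_url: str) -> str:
    """Return scheme and host (with port) of the page the notebook is served from."""
    parsed = urlparse(notebook_url)
    return f"{parsed.scheme}://{parsed.netloc}"


def dataset_id_from_url(page_url: str) -> str:
    """Return the first datasetid query parameter, or an empty string if absent."""
    values = parse_qs(urlparse(page_url).query).get("datasetid", [])
    return values[0] if values else ""


# Hypothetical host name; the path follows the deployment layout of the apps.
page = "https://exoknox.example.com:8443/exoknox/marimo/curveeditor/index.html?datasetid=42"
base_url = derive_base_url(page)
dataset_id = dataset_id_from_url(page)
```

Note that this derivation keeps only scheme, host, and port. Any path prefix of the page URL is discarded, so the API paths appended later have to be absolute from the server root.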

Saving is equally straightforward. A save button triggers a cell that posts results back only when the button is actually pressed:

@app.cell
async def _(mo, exoknox_api, base_url, save_button, datasetid, x_fitted, y_fitted):
    mo.stop(not save_button.value)  # if button was not pressed, stop here
    from exoknox_scripting_result_client.models import ScriptingResultRequestDTO
    with mo.status.spinner(title="Saving Data Set..."):
        try:
            result = await exoknox_api.save_dataset(
                base_url=base_url,
                scripting_request=ScriptingResultRequestDTO(dataSetId=datasetid, x=x_fitted, y=y_fitted)
            )
            saving_error_message = ""
        except Exception as error:
            result = None
            # keep the cause so the UI can show why saving failed
            saving_error_message = f"Error saving data: {error}"
    return result, saving_error_message

Challenge 3 — Development Mode vs. Production WASM Mode

This was the most practically fiddly challenge. During development, a running marimo server is the right environment: fast feedback, full Python library support, no Pyodide compilation step. In production, the notebook runs in WASM under Pyodide.

Start marimo in edit mode with uv run marimo edit curve_fitting.py. In this mode, you build your notebook interactively: add and edit cells, add UI elements, and see results update immediately. Changes propagate automatically to dependent cells, so there’s no manual rerun flow. Your edits are saved to the underlying Python file as you work, so the notebook is live and persistent at the same time.

But this is not the same environment as the production setup that uses Pyodide. These two environments differ in two important ways:

Import availability. Pyodide supports a substantial subset of the scientific Python ecosystem (NumPy, SciPy, Pandas, Matplotlib), but not every library. Anything with C extensions that Pyodide has not pre-compiled is unavailable. The Python version in each pyproject.toml must match the Python version provided by the Pyodide runtime used by marimo’s WASM export. In our setup that means pinning Python to ==3.12.*, because the prebuilt Pyodide wheels we rely on are built for that runtime.

Available APIs. Browser-based async execution has different constraints from server-side CPython. In particular, HTTP calls need to go through browser fetch APIs exposed by Pyodide (pyodide.http), rather than requests, httpx or a normal socket-based client.

Our solution was to isolate the environment-specific code behind a thin detection layer defined at the top of each notebook:

@app.cell
def _(mo):
    from pathlib import Path
    # In WASM, this is the URL of the web page; otherwise it is the directory of the notebook
    nb = mo.notebook_location()
    wasm_marimo = not isinstance(nb, Path)
    return (wasm_marimo,)
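
Shared library code that has no access to the `mo` object can use a different check. The sketch below relies on the two run-time signals Pyodide documents for this purpose, `sys.platform == "emscripten"` and the presence of the `pyodide` module. The function names are hypothetical, and this is an alternative to the notebook-level flag, not the approach used in our notebooks.

```python
import sys


def running_in_pyodide() -> bool:
    """Return True when the interpreter is Pyodide (Python compiled to WebAssembly)."""
    return sys.platform == "emscripten" or "pyodide" in sys.modules


def transport_name(wasm: bool) -> str:
    """Name the HTTP transport a client would pick for the given environment."""
    return "pyodide.http.pyfetch" if wasm else "urllib.request"


in_browser = running_in_pyodide()
transport = transport_name(in_browser)
```

Passing the flag explicitly, as the notebooks do, keeps the client functions easy to test because both branches can be exercised from either environment.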

We then thread wasm_marimo through to the HTTP client functions. In http_client.py, the flag drives two completely different transport implementations:

async def fetch_data(url: str, wasm_marimo: bool) -> str:
    if wasm_marimo:
        # In WASM: the browser provides the session cookie to access the server.
        # Imported here because pyodide.http only exists inside Pyodide.
        import pyodide.http as http
        response = await http.pyfetch(url, method="GET", credentials="include")
        if not 200 <= response.status < 300:
            raise Exception(f"Error fetching {url} : {response.status}")
        return await response.string()

    # Running locally: calls need a bearer token to access the backend
    token = _get_access_token()  # login if necessary

    import urllib.error
    import urllib.request
    request = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    try:
        with urllib.request.urlopen(request) as response:
            return response.read().decode("utf-8")
    except urllib.error.URLError as error:  # HTTPError is a subclass of URLError
        raise Exception(f"Error fetching {url} : {error}") from error

This pattern adds a modest amount of boilerplate per notebook. We accepted it as the cost of a comfortable development experience. The alternative — always developing against a local WASM build — would have meant a slow compile cycle on every change.

Browser Bootstrap with Pyodide and micropip

When a user opens the app, the browser downloads and instantiates the Pyodide WASM binary. The notebook then detects its environment via the wasm_marimo flag described above. If running in WASM, it uses micropip — Pyodide’s in-browser package manager — to install the common library wheel from the same origin, together with other libraries used by the notebook:

if wasm_marimo:
    base_url = mo.notebook_location() 
    import micropip
    common_url = f"{base_url}/public/common-0.1.0-py3-none-any.whl"
    await micropip.install([common_url, "plotly", "anywidget"])

This is the key to the whole architecture — what we call the wheel-in-public pattern. The shared common library is built as a platform-neutral wheel and placed in the public/ subdirectory of the WASM output during the Gradle build. The browser fetches it at startup via a relative URL and installs it with micropip, achieving code reuse across both apps without any server-side Python. After that, the app runs fully client-side, calling the backend REST API from the browser using the OAuth-aware HTTP client in common/api/.
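
What makes a wheel platform-neutral is visible in its file name, which follows the pattern distribution-version-python tag-abi tag-platform tag. The sketch below parses that name and builds the URL the browser would fetch. The function names, the host name, and the second wheel name are hypothetical, and the parser assumes a name without the optional build tag.

```python
def parse_wheel_filename(filename: str) -> dict[str, str]:
    """Split a wheel file name into its components (no build tag expected)."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    distribution, version, python_tag, abi_tag, platform_tag = filename[:-4].split("-")
    return {
        "distribution": distribution,
        "version": version,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


def is_pure_python(filename: str) -> bool:
    """A wheel tagged none-any contains no compiled, platform-specific code."""
    tags = parse_wheel_filename(filename)
    return tags["abi"] == "none" and tags["platform"] == "any"


def wheel_url(notebook_location: str, filename: str) -> str:
    """Build the URL of a wheel placed in the public/ directory next to the app."""
    return f"{notebook_location.rstrip('/')}/public/{filename}"


common = "common-0.1.0-py3-none-any.whl"
native = "fastcurves-1.0.0-cp312-cp312-manylinux_2_17_x86_64.whl"  # hypothetical

common_is_pure = is_pure_python(common)
native_is_pure = is_pure_python(native)
url = wheel_url("https://exoknox.example.com/exoknox/marimo/curveeditor/", common)
```

A wheel like the second one is built for a specific CPython version and operating system, so it would not be installable from a plain URL inside Pyodide in the same way.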

Trade-offs and Constraints of marimo WASM

No architectural decision is free. These are the constraints we accepted:

Pyodide’s library limitations. If a script requires a library that Pyodide has not compiled, it cannot run in WASM. So far this has not been a problem: NumPy, SciPy, and Pandas cover our use cases. The same constraint is the reason we could not use the complete client generated by OpenAPI Generator, because its transport layer does not go through pyodide.http.

Startup latency. Starting up the marimo app takes some time because the browser first has to initialize Pyodide and load notebook dependencies.

Performance. WASM Python is slower than native Python. For the data sizes we work with (thousands to low tens of thousands of data points), this is unnoticeable. For genuinely large datasets, data should be downsampled on the server.

Notebook source is embedded in the HTML. The WASM export includes the Python source. Spring Security protects access, but anyone authenticated can view source. This is an accepted trade-off; the notebooks contain customer-specific logic that customers themselves should be able to see.

Notebook architecture limits app complexity. With marimo notebooks, it is not easy to build larger, more complex applications. We therefore use this approach for focused, interactive analysis tools rather than full-featured application surfaces.

Dual-mode boilerplate. The wasm_marimo flag is a small but real maintenance surface. We mitigated this by keeping it minimal and consistent across notebooks.

Benefits of Browser-Based Python Engineering Tools

Fast time to value. We can build new interactive analysis tools as Python notebooks instead of full frontend features.
Easy customer-specific extensions. It is straightforward to adapt notebooks to individual requirements.
Strong plotting capabilities. Rich, interactive visualizations are available out of the box.
Practical engineering UI components. marimo includes useful prebuilt elements for technical workflows.
No backend Python execution. With marimo WASM, code runs in a fully browser-sandboxed environment.
Simple deployment model. The production artifact is static content packaged into the existing Spring Boot application.

Conclusion: When marimo WASM Fits

Marimo’s WASM deployment mode gave us something we could not easily get elsewhere: a fully interactive Python data environment that runs in the browser, requires no server-side Python runtime, and integrates naturally with an existing Spring Boot security model.

The combination of reactive notebooks, Pyodide’s scientific Python stack, and standard REST-based data access covers the vast majority of customer-specific scripting use cases we encounter — at a fraction of the implementation cost of our previous approach. The full stack looks like this:

Concern	Technology
Notebook authoring	marimo
Python runtime in browser	Pyodide (via marimo `html-wasm` export)
In-browser package loading	`micropip`
Python package management	`uv`
Shared logic distribution	Pure-Python wheel (`common-0.1.0-py3-none-any.whl`)
API type safety	`OpenAPI` Generator → `Pydantic` models
Build orchestration	Gradle 9 with `python-uv-plugin`
Deployment	Spring Boot static resource serving

For teams considering a similar architecture, the deciding questions are simple: do your dependencies run in Pyodide, and are your data volumes suitable for browser-side execution? If yes, marimo WASM offers a compelling deployment model: interactive Python tools shipped as static assets, protected by the same authentication and API layer as the rest of the application.

Let’s discuss!

Do you have questions about browser-based Python tools, marimo WASM, Pyodide, or Spring Boot integration? Feel free to reach out. I’m always happy to exchange knowledge, ideas, and experiences.

Open Source Doesn’t Need Another Pull Request. It Needs Triage.

2026-06-09T00:00:00+00:00

Most engineers think contributing to open source starts when you write code.

But on busy open source projects, the most valuable contribution is often not another pull request. It is triage: clarifying issues, connecting related work, identifying incomplete fixes, and helping maintainers decide what should happen next.

Large issue trackers are not just backlogs. They are the project’s shared memory. When that memory is vague, outdated, or misleading, contributors duplicate work, maintainers merge partial fixes, and users keep running into problems the project may already have half-solved elsewhere.

This article explains why open source triage is engineering work, how it helps maintainers, how to distinguish related issues from true duplicates, and how AI coding agents can support triage without replacing human judgment.

Triage is debugging the issue tracker
Why this matters more than most people think
Related isn’t the same as duplicate
The 5-minute workflow I wish more people used
What good triage comments sound like
The fastest ways to make triage worse
AI makes human triage more important, not less
Final thoughts

Before getting into the workflow, here is the kind of situation where triage matters.

Imagine a successful open source project with thousands of open issues. Somewhere in that backlog are three related reports:

one says the Web UI only accepts images
another asks for document uploads, such as PDF and Word files
a third asks for support for uploading any file type

In the pull request queue, two related fixes already exist:

one changes only the file picker in the frontend
another changes both the file picker and the backend

Now one engineer finds the first issue and starts writing a third pull request because the bug looks easy to fix. At the same time, a maintainer sees the frontend-only PR, assumes it solves the whole problem, and merges it. The UI now looks fixed, but the backend still drops the files. Multiple people have spent their evening on the same problem, and the issue tracker is now misleading everyone.

The most expensive open source bug is often not the hardest one. It is the one that gets fixed three times.

You might think this is just a communication problem. In a company, people might notice each other’s work in a stand-up or Slack channel. Open source does not work like that.

On large open source projects, contributors work across time zones and personal schedules. Maintainers cannot manually connect every duplicate issue, related pull request, partial fix, and stale report. If they did, they would have no time left to review the work that actually needs to be merged.

That is why missing links between issues and pull requests are not a minor inconvenience. If one engineer claims an issue but another finds a duplicate elsewhere, they may never realize that someone is already halfway through the fix, or that two pull requests already address the same problem.

I almost did exactly that. While using OpenClaw on my Android phone, I noticed that tapping the paperclip in the Web UI only let me choose images, while Telegram let me upload any file. Since OpenClaw is an AI coding assistant, I asked it to investigate whether this was a bug. It found the technical cause quickly and immediately asked whether it should prepare a pull request.

Instead, I asked it to check the issue tracker and pull requests first. That changed everything.

Several related issues and two pull requests already existed. One PR changed only the frontend. The other changed both frontend and backend. The key detail was that we already knew the backend dropped these files, so a UI-only fix would create a feature that looked complete but still failed.

At that point, writing another fix was the least useful thing I could do. The useful contribution was mapping the existing work so maintainers could see the overlap, close duplicates, and focus on the pull request that actually solved the whole problem. That is triage.

Triage is debugging the issue tracker

At first glance, triage sounds boring.

It sounds like paperwork. It sounds like process. It sounds like the thing you do when you can’t contribute code. I think that’s backwards.

On a busy open source project, triage is one of the most important contributions you can make, because it changes what everyone else does next.

The best short definition I’ve come up with is this:

Triage is debugging the issue tracker until the next action becomes obvious.

Getting there usually means answering questions like:

Is this reproducible?
Is there already a better issue for it?
Is there already a PR for it?
Does that PR solve the whole problem, or only part of it?
Is this still relevant, or has later work already changed the behavior?

A good triage comment usually doesn’t try to do everything.

It does one job: it reduces uncertainty.

If you only remember one thing from this article, remember this:

A short, accurate comment is better than a long, uncertain one.

Why this matters more than most people think

Duplicate work is easier than people realize

As we saw with the OpenClaw example, the asynchronous nature of open source makes it incredibly easy for two people to spend their evening on the exact same problem without realizing it.

That sounds small until it happens again and again.

Then it becomes a tax on everyone involved.

A merged PR isn’t the same as a solved issue

A PR title can sound complete. The green “Merged” badge feels like a finish line. But a merged PR doesn’t automatically mean the whole problem is gone.

Recently, a severe issue was reported in OpenClaw: sending a binary file via Telegram caused the bot to dump raw, unsanitized bytes into the context. A single file could blow up the prompt to around 460,000 tokens. This wasn’t just a bug; it posed a massive risk of resource exhaustion and cost amplification.

Shortly after the issue was reported, an OpenClaw contributor opened a PR to fix it. Because the issue affected prompt handling and could dramatically increase token usage, a maintainer merged the change quickly. Given the severity of the problem and the volume of incoming work, I completely understood why. But when I looked at the diff, the fix seemed surprisingly small for the scope of the problem.

Normally, your time is better spent triaging open issues than rechecking merged PRs. In this case, though, the diff left me unsure whether the entire problem had actually been solved, so I deployed the updated OpenClaw branch locally and tried to reproduce the issue myself.

Uploading a 100 KB EPUB file immediately blew up my local prompt to 231,000 tokens.

The PR author had fixed part of the issue, but not all of it. The PR also skipped the repository’s verification checklist, so nobody had explicitly confirmed the fix worked outside the code review itself.

If the missing verification had been obvious earlier, someone else could have tested the change before it was merged. Whether a human ignores a template or an AI omits it entirely, maintainers lose important context for judging how much trust to place in a fix.

After I patched the remaining upload leak, I kept digging. Experience tells me that where there is one bug, there are usually neighbors. Instead of stopping at the narrowest interpretation of the bug, I tried replying on Telegram to a message that contained a previously sent binary file. Sure enough, it pulled the raw bytes into the context again. It was a different path, but the same broader problem.

I packaged both fixes into a new PR (#66877), which the maintainers merged an hour later.

The real lesson here is about what code review looks like under pressure. In a perfect world, every PR would be tested locally before it is merged. In reality, maintainers often have to rely on diffs, contributor claims, and community feedback to decide whether a fix is ready.

This is exactly where triage steps in. You don’t have to write the code to save the day. If you take an open PR, test it locally, and leave a comment saying, “I deployed this branch and followed the reproduction steps, but the issue is still present,” you just saved the project from shipping a broken feature. Catching an incomplete fix before it gets merged makes you a hero to the maintainers.

In my case, the broken PR was merged within minutes because it was an urgent security fix, leaving no window for the community to verify the code before it landed. But normally, PRs sit in the review queue for days or weeks. That gives you plenty of time to pull the branch, test the fix yourself, and raise that exact flag.

However, if the code actually works, commenting, “I deployed this locally and confirmed: on main the issue happens, but on this branch it is completely resolved,” is extremely valuable for a maintainer. Doing the manual verification that maintainers don’t have time for is one of the most valuable triage contributions you can make.

A messy issue tracker lies to people

An unclear issue tracker doesn’t just look untidy. It actively changes what people decide to do.

Someone spends an evening reproducing a bug that already has a PR. A maintainer assumes the problem is solved because the title sounds right. A contributor opens a duplicate issue for a “PDF upload error” because the original report was vaguely titled “mobile attachment bug” and nobody ever added the specific keywords or error codes that would have made it show up in a search.

Bridging these gaps doesn’t require you to be the lead architect. It just means applying a technical perspective to look past the surface-level description. In my case, it was easy to assume that because Telegram already allowed all file types, the backend was fine, which made a frontend-only fix look like the complete answer. Triage is that “Wait a minute” moment where you pause to verify whether that assumption is actually true.

Whether you’re identifying a shared root cause between two different-looking bugs, or noticing that a PR only masks a symptom instead of fixing the logic, you’re using engineering judgment. That’s why keeping the issue tracker trustworthy isn’t admin work; it’s engineering work.

It’s one of the best ways to start contributing

Many engineers assume they need deep knowledge of the codebase before they can contribute to open source.

That’s understandable, but it’s often wrong.

You don’t need to know every service, build step, and deployment detail to notice that:

the reproduction steps are missing
the version is missing
the PR description doesn’t match the changed files
two PRs address the same issue but aren’t linked to the issue yet
a PR fixes an issue that hasn’t been linked
two issues describe the exact same problem from slightly different angles
a PR only touches the frontend because it looks as though the backend is already “done” (as I initially assumed). You don’t need to know the code to ask: “Since Telegram already allows all file types, are we sure the Web UI uses the exact same API path, or are we just hoping it does?”

That level of careful reading is an immediate, high-value contribution. On a project with a large review queue, the last thing maintainers need is another item added to it. Even a one-line “quick fix” adds to the noise. Connecting the dots is more valuable because it helps clear the backlog instead of adding to it.

One of the easiest mistakes to make if you’re new to open source is treating related items as duplicates. Several things can touch the same part or functionality of the software without being the same issue.

In my OpenClaw example, all of these were about file uploads:

the web UI only accepts images
one issue asks for support for document files
a broader issue wants support for any kind of file
one PR changes only the frontend
another changes frontend and backend
there was also a question about whether uploading files actually worked at all

Those items are clearly connected, but they aren’t identical. If you treat them as a single “bucket” and start closing them simply based on which one arrived first, you create a chain reaction of waste:

1. The “Incomplete Fix” Trap

It’s easy to think, “Obviously, I would keep the PR that fixes both the frontend and the backend.” But in reality, triagers and maintainers rarely have the time to deeply compare the code of every duplicate. Usually, they just see two PRs with similar titles that claim to fix the same problem.

Ideally, the PR author would leave a note saying, “I’m opening this because PR #123 is an incomplete fix.” But in practice, most contributors don’t realize they should search for existing, unlinked PRs before writing code. They usually have no idea the older PR even exists.

If you just assume the two PRs are identical and blindly close the newer one as a duplicate, you might accidentally bury the actual solution. If the older, partial fix gets merged, the feature will still be broken. But since the “right” PR was already closed, nobody will think to look for it. Weeks from now, a third contributor will end up spending hours just to rewrite the exact same backend code that was already sitting in the PR you closed.

2. The Scope Trap

If you close the request for “any file type” in favor of the narrower “documents support,” you have unintentionally limited the project’s potential. It creates a confusing experience where users can send any file type through the Telegram bot, but only images and documents through the Web UI. This mismatch happens when a triager takes an issue author’s request too literally. Issue authors often think about their immediate needs, like uploading a PDF, but a good triager has to translate that narrow complaint into a broader system requirement. If you just assume the issue author’s specific example is the whole story, you guarantee that developers will have to rework the same code the moment someone else tries to upload a code file.

3. The “Cannot Reproduce” Trap

The most dangerous mistake in triage is assuming a bug doesn’t exist just because you can’t reproduce it yourself. It’s incredibly common to read an issue report, try it out, see it working perfectly, and immediately close the issue as non-reproducible. But unless the issue author is an LLM, they probably aren’t hallucinating. If you can’t trigger the bug, it’s almost always because you don’t fully understand the issue author’s environment, or because they left out a specific detail that felt too “obvious” to mention. When you experience this, the best thing you can ask yourself is: “What am I missing?”

You wouldn’t believe how many issues get closed this way, leaving real, systemic bugs hiding in the codebase to frustrate users for years. Maintainers don’t do this because they don’t care. With thousands of issues to fix, they just don’t have the time to chase down missing details for every vague report.

This is exactly where you can step in. By asking clarifying questions, recreating the issue author’s environment as closely as possible, or testing different configurations, you provide the exact help maintainers often don’t have time to provide themselves. When you can’t reproduce a bug, don’t ask, “Should I close this?” Ask, “What am I missing?”

Good triage means looking for the missing variable instead of closing the issue at the first obstacle.

There’s one more trap that causes the same kind of waste, even when no duplicate is involved.

4. The “Confident Diagnosis” Trap

Some issue reports do more than describe a problem. They also explain what the issue author believes caused it. That’s helpful, but it can also hide the next thing you should verify.

Imagine an issue that says:

The database doesn’t save profile changes.

I changed my display name, clicked Save, saw a success message, refreshed the page, and the old name was back.

I checked the user record in the database and found that it still contains the old display name, so I suspect there’s an issue with writing the change to the database.

The issue author may be right. Maybe the backend really doesn’t persist the change to the database.

But the problem could also be somewhere else: the frontend never sent the changed field, the backend rejected the change but the frontend still showed a success message, the backend wrote to a different record, or the write was attempted but rolled back.

Good faith means assuming the issue author is trying to help. It doesn’t mean assuming their diagnosis is automatically correct.

A useful triage comment could say:

Thanks for the clear steps and for checking the database.

You mentioned that after clicking Save, you see a success message, but the user record in the database still contains the old display name.

I’d like to confirm where in the flow the change gets lost. Could you check the browser network tab when clicking Save and see whether the request contains the changed display name and whether the response is successful?

That would help narrow this down: either the frontend doesn’t send the change, the backend rejects it but the frontend still shows success, the backend writes to a different record, or the write is attempted but rolled back.

This kind of comment takes the issue seriously without accepting the first diagnosis too quickly. It tests the simple explanations first, but still leaves room for the issue author to have information you don’t have yet.

That balance matters. If you assume the issue author is wrong, your comment can sound dismissive and make them defensive. If you assume the issue author’s diagnosis is right, you may skip the simplest explanation and turn a misunderstanding into a misclassified bug, a misplaced feature request, or a much larger investigation than necessary.

Don’t waste a misunderstanding

And if the issue author comes back and says, “You were right, I misunderstood how this works,” don’t treat that as the end of the story.

That misunderstanding may still be useful.

This isn’t limited to configuration misunderstandings. Whenever an issue ends without any change to code, docs, examples, error messages, or repository guidance, pause for a moment: is there a small change that would have helped the issue author find the answer before needing to open an issue?

You can ask:

Thanks for confirming, glad it works now.

One more thought: this sounds like something the docs could make clearer. What should the docs say so the next person can find the answer before needing to open an issue?

If the issue author suggests a clearer wording, don’t let that disappear in the thread. If you have time, turn it into a small docs PR and link the PR back to the original issue. You don’t need to be a maintainer to do that.

If you don’t have time to make the docs change yourself, create a follow-up issue instead. Link the original discussion and write down what was confusing, so someone else can pick it up later.

Even if the original issue has already been closed, this can still be valuable. If comments are still open, you can ask the question there. If not, you can still open a docs issue that points back to the original discussion.

A misunderstanding isn’t always just user error. Sometimes it’s evidence that the project is teaching the right thing in the wrong way.

That’s still triage. You took one confusing issue and turned it into something the project can learn from.

In the end, good triage sits in the middle. It keeps the differences that matter and removes the duplication that doesn’t. Sometimes the real question isn’t just “what links to what?” It’s “what kind of problem is this, actually?” In my case, the Web UI behavior looked like a bug because Telegram allowed arbitrary file types. But after reading more closely, it also looked plausible that the Web UI had simply been implemented as an image-only flow on purpose. Good triage makes that kind of distinction visible instead of pretending it’s obvious.

The 5-minute workflow I wish more people used

You don’t need a large, complicated process. You need one simple enough that you’ll actually use it. If a workflow feels like a chore, you’ll skip it the second you get busy. If it’s natural and intuitive, it actually gets followed.

1. Read the whole thing

Don’t triage from the title. Read the body, screenshots, reproduction steps, version information, linked issues, recent comments, and, for PRs, the changed files.

A surprising amount of poor triage comes from people reacting to names instead of content. As you read, slow down whenever the issue author moves from “this happened” to “therefore this is the cause.”

This describes what happened:

I changed my display name, clicked Save, saw a success message, refreshed the page, and the old name was back.

This is a possible cause:

There must be an issue with writing to the database.

The issue author may be right. But the cause could also be the frontend request, backend validation, a different database record, a rollback, stale cached data, or a draft state.

That doesn’t mean the issue author is wrong. It just tells you what to check next.

2. Ask whether there is enough information to act

Before doing detective work, ask a simpler question: Is there even enough detail here to classify the problem?

For an issue, that usually means:

exact version
reproduction steps
expected behavior
actual behavior
environment
logs or screenshots, when relevant

For a PR, it can also mean:

scope
linked issues
tests
migration notes
whether the changed files actually match the claim

If the basics are missing, asking for them may already be the most useful thing you can do.
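The issue checklist above can be turned into a rough first pass. The following is a minimal sketch, assuming simple keyword matching is good enough to spot obviously missing basics; the function name, the checklist labels, and the keyword patterns are all hypothetical and would need adapting to the project's own issue template. It only flags what to ask for; it doesn't replace reading the whole report.

```python
import re

# Hypothetical keyword patterns for each basic detail (assumption: the
# report mentions these words when the detail is present).
ISSUE_CHECKLIST = {
    "exact version": r"\bversion\b",
    "reproduction steps": r"\bsteps to reproduce\b|\breproduc\w*|\brepro\b",
    "expected behavior": r"\bexpected\b",
    "actual behavior": r"\bactual\b",
    "environment": r"\benvironment\b|\boperating system\b|\bos\b|\bbrowser\b",
}


def missing_details(issue_body: str) -> list[str]:
    """Return the checklist items that the issue body never mentions."""
    return [
        label
        for label, pattern in ISSUE_CHECKLIST.items()
        if not re.search(pattern, issue_body, re.IGNORECASE)
    ]


if __name__ == "__main__":
    body = (
        "Version: 2026.3.24\n"
        "Steps to reproduce: click the upload button and pick a PDF.\n"
        "Expected: the file is attached to the message.\n"
    )
    # Lists the details a triage comment could ask for.
    print(missing_details(body))
```

For the sample body, the sketch reports that the actual behavior and the environment are missing, which is exactly what a first triage comment would ask for.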

3. Search for existing context

Before you comment, build a tiny map in your head by searching for:

the same symptom
a broader issue in the same area
a deeper issue behind the symptom
an existing PR that may already cover it
a PR that may only cover part of it
newer releases or merged PRs that may already have changed the behavior

That small map is often enough to stop you from commenting too early or opening something that never needed to exist.

4. Decide the one job of your comment

Before writing anything, finish this sentence:

The job of this comment is to…

For example:

ask for missing details
link this issue to a broader one
point out that the PR is partial
tell readers where the real implementation work is happening
explain that two related issues aren’t duplicates

If your comment tries to do five jobs, it will usually do none of them well.

5. Only state what you have verified

This is the rule I trust most: Don’t guess. Only state exactly what you’ve personally checked.

Not the longest explanation. Not the most confident-sounding assumption. Just the verified facts.

Comment on the issue when the main point is that the problem needs clarification, is narrower or broader than another issue, or already has a relevant PR.

Comment on the PR when the main point is that the proposed fix is partial, broader than the linked issue, or overlapping with other work.

Only comment on both the issue and the PR if the two audiences (the issue authors reporting the bug and the developers reviewing the code) genuinely need different information.

And whatever you do, match your certainty to what you actually verified.

Don’t write:

Fixed by #123.

because the title sounds right. Write it only if you checked the diff and are confident it really solves the issue.

If you’re not there yet, softer wording is better:

This may be addressed by #123.

That sounds like a small difference. In triage, it isn’t.

On GitHub, if you write Fixes #123 in a PR description, the linked issue will usually get closed automatically once the PR is merged. If you’re wrong, the bug stays in production, users get frustrated, and someone has to open a new issue weeks later. That false confidence is expensive.
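To make that risk visible before you post, you can check which issue references in a PR description use a closing keyword. The following is a minimal sketch: the keyword list follows GitHub's documented closing keywords, but the regular expression is a simplified approximation of GitHub's matching, and the function name is hypothetical.

```python
import re

# Closing keywords as documented by GitHub; matching is case-insensitive.
CLOSING_KEYWORDS = (
    "close", "closes", "closed",
    "fix", "fixes", "fixed",
    "resolve", "resolves", "resolved",
)

# Simplified pattern: keyword, optional colon, whitespace, then "#<number>".
# Cross-repository references such as "owner/repo#123" are deliberately ignored.
_CLOSING_REF = re.compile(
    r"\b(?:" + "|".join(CLOSING_KEYWORDS) + r")\b:?\s+#(\d+)",
    re.IGNORECASE,
)


def issues_closed_on_merge(pr_description: str) -> list[int]:
    """Return the issue numbers this description would likely auto-close."""
    return sorted({int(number) for number in _CLOSING_REF.findall(pr_description)})


if __name__ == "__main__":
    print(issues_closed_on_merge("Fixes #123"))
    print(issues_closed_on_merge("This may be addressed by #123."))
```

The first description reports issue 123, while the softer wording reports nothing. That is the practical difference between the two phrasings: only the first one changes the state of the issue tracker on merge.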

What good triage comments sound like

The best triage comments are usually short, concrete, and slightly boring in the best possible way.

They don’t try to sound clever. They don’t try to sound authoritative. They remove confusion.

A useful triage comment usually does three things:

Lead with the conclusion.
Explain why.
Then stop.

That doesn’t mean the comment has to be tiny. It means the comment should contain the information needed to make the next decision without requiring everyone else to reconstruct your reasoning.

Here are four real patterns from the OpenClaw issues and PRs that inspired this post.

Linking a narrower issue to a broader one

From Issue #50337:

This issue is similar to #56344.

This issue is about allowing documents to be uploaded in addition to images through the Web UI, while #56344 is about allowing all file types. I prefer the approach of #56344, because it’s consistent with what channels like Telegram allow and would also cover other useful file types like .patch, .md, .adoc, etc.

Because of that, I think this issue could be closed in favor of #56344. The PR for that broader change is #57707.

This works because it doesn’t just say “duplicate” or “related”. It explains the relationship between the issues: one is narrower, the other is broader, and the broader one already has an implementation path.

The useful pattern is:

Name the related issue.
Explain how it’s related.
Explain which one should stay open and why.
Point to the PR, if one exists.

Explaining that a PR is only partial

From PR #54248:

This PR is incomplete, because it only covers the UI side of the upload flow.

Files other than images would still not be handled properly by the backend, so this would not fully solve the problem. I think this PR could be closed in favor of #57707, because that one covers both the frontend and backend parts of the same issue.

This works because it leads with the conclusion but still includes the reason that matters. The problem isn’t that the PR is bad. The problem is that it only fixes one side of the flow.

The useful pattern is:

State that the PR is incomplete.
Say exactly which part it covers.
Say exactly which part is still missing.
Link to the more complete PR.

Pointing issue readers to the implementation

From PR #57707:

Implements #56344 and #58423.

It also includes the smaller change requested in #50337, since allowing all file types also covers allowing documents in addition to images through the Web UI.

This works because it makes the scope of the PR explicit. Someone reading one of the issues can understand that this implementation covers more than one request, and why the smaller request is included in the broader one.

The useful pattern is:

List the issues the PR implements.
Mention smaller related requests that are also covered.
Explain why they are covered, instead of assuming that the link is obvious.

Asking for the one detail that matters next

Adapted from Issue #56375:

The upload button isn’t just decorative. Uploading image files through it works for me.

You’re on 2026.3.24. Can you check whether this still happens on 2026.4.2?

It would help if you could share the file type you are trying to upload, the actual file if you can attach it, whether this only affects one file or different file types, whether it also happens in another browser, what OS/environment OpenClaw runs on, and whether you use an ad blocker, VPN, router-level blocking, or something similar.

Since the screenshot shows a custom API setup, it would also help to know whether this happens with other providers/models and to see the relevant part of your openclaw.json, with secrets redacted.

One important detail: the Web UI currently only supports image uploads. So if your file picker lets you choose a non-image file, that could explain what you are seeing. This may also change with #57707, which adds support for all file types in the Web UI.

This works because it doesn’t just ask for “more information”. It asks for the specific information that would help separate several possible causes: an old version, an unsupported file type, a browser issue, an environment issue, a blocking extension, or a provider/model configuration problem.

The useful pattern is:

State what you could verify yourself.
Ask the issue author to test the newest relevant version, if they aren’t already using it.
Ask for the smallest useful set of missing details.
Explain any limitations that might already explain the issue report.
Link to the PR that may change that behavior.

The pattern is the same in all four cases: say what you think should happen, give enough context to make it actionable, and avoid turning the comment into another discussion thread.

Good triage comments aren’t short because information is missing. They’re short because everything unrelated to the next decision has been removed.

The fastest ways to make triage worse

Bad triage is worse than no triage because it adds noise and false confidence.

The fastest ways to make a busy issue tracker worse are usually these:

opening a new issue or PR before checking what already exists
posting bare links like Related: #123 without saying why the link matters, or posting no links at all
posting comments like “I have this issue too” without providing any relevant context or reproduction steps
guessing from titles instead of reading the diff
assuming that just because two issues touch the same part of the UI, they must be the exact same bug
asking for information that’s already in the issue body or screenshot
sounding more certain than you really are
treating “it was my mistake” as the end of the story, instead of turning the misunderstanding into a docs improvement or follow-up issue
pasting AI-generated comments without reviewing them first

Remember: triage is supposed to reduce work, not increase it!

AI makes human triage more important, not less

AI is genuinely useful for triage. It can help with things like:

finding related issues and PRs
checking whether an issue or PR follows the repository template
suggesting better search terms
summarizing the overlap between two issues
mapping the surrounding repository context

But AI is also very good at sounding certain when it shouldn’t be. That makes it useful as an assistant and dangerous as a substitute for judgment. A simple way I think about it is this:

AI is a speed multiplier. It multiplies good process and bad process.

In my OpenClaw case, the assistant quickly understood the code and was ready to fix it. What it didn’t naturally do was the human part: slow down, inspect the issue tracker carefully, and figure out whether a new PR would actually help.

Instead of letting AI blindly post comments for you, the best way to use it for triage is to map the territory first.

You can use AI to scan the repository, identify similar issues, check recent PRs, read contribution guidelines, and inspect issue or PR templates before you ever write a line of code.

One tedious part of that work is checking whether an existing issue or PR actually follows the repository’s own template. To make that easier, I wrote a reusable prompt. You paste in the issue or PR URL, and it asks the agent to find the relevant template, compare the body against it, classify the result, flag possible inconsistencies, and draft a concise comment for you to review, if one is needed.

I contributed that prompt to the Good OSS Citizen skills, so if you use an AI coding agent, Good OSS Citizen is the more convenient version: same idea, less copy-pasting, and more structure.

From the cloned fork of the open source project you plan to work from, install it with one of these commands (requires Node.js or Bun):

# npm
npx tessl i tessl-labs/good-oss-citizen

# Yarn
yarn dlx tessl i tessl-labs/good-oss-citizen

# pnpm
pnpm dlx tessl i tessl-labs/good-oss-citizen

# Bun
bunx tessl i tessl-labs/good-oss-citizen

If your coding agent has internet access and can run shell commands, you can also point it to the Good OSS Citizen repository and ask it to install the tool in your fork. Review the command before running it.

Then ask your agent:

Triage this issue:
https://github.com/example/project/issues/123

That’s it.

The triage skill in Good OSS Citizen does a bit more than the raw prompt. It can fetch the already-open issue or PR body, fetch the matching templates, apply a reusable rubric, write a triage_comment.md handoff, and explicitly tell the agent not to post to GitHub. It drafts; you decide whether to post.

Good OSS Citizen also includes broader open source contribution checks through its rules, skills, and scripts: contribution guidelines, AI policies, prior rejected PRs, claimed issues, DCO requirements, and changelog expectations. For triage, the important part is that the agent does the boring checks first and leaves the judgment to you.

Use AI for the heavy lifting. Let it search. Let it summarize. Let it prepare a draft.

But don’t outsource the judgment.

Final thoughts

Triage isn’t glamorous. It doesn’t give you the same dopamine hit as opening a PR, seeing green CI checks, and getting something merged. But on busy open source projects, it’s often the most impactful contribution you can make.

Issue trackers don’t usually get messy because of a single big mistake. They become messy the same way a kitchen junk drawer does. One day, you toss a vague bug report in. The next day, an unlinked PR. Then an overconfident comment. Nobody cleans it out, and six months later, no one can find the batteries.

Good triage works in the opposite direction. It makes the issue tracker easier to trust. It makes the next decision easier. It helps maintainers spend more time reviewing the right work and less time reconstructing context that should already be there.

And if you’re not sure what kind of help a project needs, ask.

Most projects link to their community from the README.md, CONTRIBUTING.md, or documentation. Look for words like “Community”, “Contributing”, “Support”, “Chat”, or “Getting help”. That might lead you to Discord, Slack, Matrix, Zulip, a forum, a mailing list, or GitHub Discussions.

Once you find the most relevant place, ask a simple question:

I like this project and would like to contribute in a way that actually helps. Is this the right place to ask what would be most useful right now?

Even if it isn’t the perfect place, this makes it easy for someone to point you in the right direction.

So the next time you want to contribute to open source, don’t just ask:

“What can I code?”

Also ask:

“What can I clarify?”

On busy projects, that’s triage. And very often, that’s exactly the contribution maintainers need most.

If you’d like to share your own experiences with triage, want a second opinion on a messy issue tracker, or need specific advice, feel free to reach out via email or connect with me on LinkedIn.

Jfokus 2026: 20 Years of Java, Community, and Innovation

2026-04-22T00:00:00+00:00

At Karakun, we closely follow trends in Java and modern software engineering. Jfokus 2026 marked 20 years of one of Europe’s leading developer conferences, covering topics from core Java to AI and cloud technologies. This article summarizes key insights, themes, and observations from the event.

A Milestone for the Global Java Community
From Java Conference to Multi-Track Developer Conference
Two Decades of Growth in the Java Community
A Unique Atmosphere: Where Tech Meets Nordic Mythology
Key Topics: Java, AI, and Modern Software Engineering Trends
Expo and Networking at a Leading Developer Conference
Personal Highlight: The Mentoring Hub
Beyond the Conference: Exploring Nordic Culture
Conclusion: Setting the Standard for Modern Developer Conferences
Let’s Discuss

A Milestone for the Global Java Community

Jfokus 2026 at the Stockholm Waterfront Congress Centre offered a unique experience for anyone with a professional interest in Java. The event marked a milestone for the Swedish developer community, as Jfokus celebrated its 20th anniversary.

Since its humble beginnings in January 2007, when just over 450 Java enthusiasts gathered in Stockholm for the very first edition, the conference has grown into one of Europe’s premier developer events, drawing around 2,000 attendees annually from across the globe. The 2026 edition, held from February 2–4, was not just another conference – it was a celebration of twenty years of community, code, and continuous learning.

From Java Conference to Multi-Track Developer Conference

Founded by Mattias Karlsson and organised in partnership with Javaforum Stockholm, Jfokus was driven by a passion to create an unparalleled experience for the global developer community.

Over the past two decades, it has remained at the forefront of software development – evolving from a tightly Java-focused gathering into a broad, multi-track conference covering AI/ML, DevOps, Cloud, and emerging technologies, all while staying true to its developer-first roots.

Two Decades of Growth in the Java Community

The first Jfokus was held in January 2007 and was an immediate success. With more than 450 participants, it became the biggest meeting place in Sweden for Java professionals. This marked the beginning of a format that has since gained recognition within the professional developer community.

Attending in 2026 offered a rare opportunity to meet the people who have shaped Jfokus over the years, like Johan Rhedin, who has been instrumental in crafting the conference’s branding and digital identity, or Jeanne Göthberg, who has been bringing people, ideas, and projects together to move Jfokus forward. Nowhere else have I encountered such a high concentration of Java Champions in one room as at the Jfokus Speaker Dinner.

Since its launch in 2007, the conference has developed into an established fixture in the global developer events calendar. For two decades, Jfokus has played a significant role in the evolution of Java conferences. Over the years, the conference has covered a wide range of Java-related topics and consistently engaged its audience.

A Unique Atmosphere: Where Tech Meets Nordic Mythology

The 20th anniversary highlighted this unique blend – where else do a Viking-inspired atmosphere and modern Java innovation come together so naturally?

This year, world-leading Java experts took the stage, contributing to an atmosphere that felt more like a celebration than a typical conference. The opening talk marked the beginning of a saga of fire and ice, creating a distinctive Nordic winter atmosphere with elements inspired by Nordic mythology – including visual effects and fire shows that complemented the theme.

The combination of cutting-edge software engineering innovations, imaginative conference design, and top-class performances has attracted an ever-growing number of visitors, including locals, international software developers, and world-class speakers.

The 20th anniversary of Jfokus was a significant milestone, enriched by its Nordic theme. Despite the wintery setting in Stockholm, the atmosphere was warm, driven by engaging conversations and a strong sense of community. The organization was smooth, the attendees were amazing, and the Viking spirit of the conference made the event stand out.

Key Topics: Java, AI, and Modern Software Engineering Trends

Over the years, the conference broadened its topics beyond core Java to include Frontend & Web development, Android & Mobile, Continuous Delivery & DevOps, Cloud & Big Data, Security, and alternative JVM languages.

One key takeaway was how rapidly AI is evolving and increasingly shaping the work of software engineers – as well as the pace at which software technologies themselves are advancing.

The talks and booths at Jfokus were both inspiring and insightful: attendees could learn about the latest features and innovations in Java and AI and how leading industry experts apply them in practice – across all areas of modern software development.

Expo and Networking at a Leading Developer Conference

One of the key advantages of attending in person was the exhibition floor, where some of the biggest names in the tech industry had set up booths. Companies from the Java ecosystem were represented, making it easy to walk up, start a conversation, and get direct insights from the engineers and advocates who actually build the tools developers use every day.

For many attendees, those informal booth chats turned out to be just as valuable as the sessions themselves.

Personal Highlight: The Mentoring Hub

My personal highlight was the Mentoring Hub: there were so many opportunities to exchange ideas with leading experts – either one-on-one or in small groups – on a wide range of professional development topics.

The mentors were eager to share their experience and knowledge. It proved to be an excellent way to gain meaningful advice for career development.

Beyond the Conference: Exploring Nordic Culture

Attending the conference also offered the opportunity to explore Viking history, culture, and way of life, for example by visiting the Viking Museum in Stockholm, a true Nordic highlight.

Conclusion: Setting the Standard for Modern Developer Conferences

The Jfokus conference series combines emerging technology topics, a distinctive Nordic-inspired atmosphere, excellent speakers, and a strong community focus, and it continues to set a high standard for developer conferences.

Let’s discuss!

Do you have questions about Jfokus, other developer conferences, or specific developer topics or AI? Feel free to reach out. I’m always happy to exchange knowledge, ideas, and experiences.

Swiss Testing Day 2026 – Reflections on Testing AI and Non-Deterministic Systems

2026-04-01T00:00:00+00:00

At Karakun, we are closely following how software engineering evolves in the age of AI – especially when it comes to testing and reliability. The Swiss Testing Day 2026 brought together a range of perspectives on exactly this topic: from classical software verification to emerging approaches for testing non-deterministic systems. I, Mike Mannion, attended the conference and captured a set of reflections and observations from selected talks.

Opening Keynote: Software Verification in the Age of AI
Good AI Testing Strategy / Bad AI Testing Strategy. The difference and why it matters
Agentic testing in banking: From hype to governed practice
Why AI Is Useless for Compliance
Key Takeaways
Karakun Perspective
Let’s Discuss

Opening Keynote: Software Verification in the Age of AI

Bertrand Meyer – OO and Software Correctness Pioneer

Bertrand is a personal hero of mine. His work on software correctness shaped my profile as a software developer the moment I came in contact with it. In this keynote he covers a wide range of issues, but comes back to the central idea: proving that the software does what it promises; a challenge which LLMs, with their always-present non-determinism, have only made more difficult.

The second line of the following slide is absolutely crucial and a key aspect of the probabilistic testing framework PUnit.

The performance of stochastic features – which especially includes LLMs – must be measured, because a simple correct/not correct is not sufficient to gauge performance.

Despite this observation, Bertrand does not go into detail about how to measure such systems. Instead, he reiterates the message he has been delivering for decades: program correctness must be built into the software.

This was, in fact, the genius of his Eiffel language, whose contract-based approach was unfortunately not adopted by any mainstream language that followed. But the principle stands – even if it does not yet fully answer the question of performance of stochastic systems.

He ends on an optimistic note, stressing that the need for good software engineers will not disappear any time soon. Generated code must still be verified, and human understanding of the code remains key.

Good AI Testing Strategy / Bad AI Testing Strategy. The difference and why it matters

Iosif Itkin

A very philosophical but worthwhile talk. Iosif asks: What is strategy? What is it not?

He challenges us to think about this carefully, and reminds us not to confuse strategy with a list of goals. It is also not QA; QA requires a different mindset than testing – and that mindset is critical.

He frequently references the book Good Strategy/Bad Strategy.

A useful distinction: QA is not testing.

Iosif is adamant that testing is about identifying bugs. He does not explicitly include other outputs such as usability feedback, responsiveness data, or success rates for stochastic services.

When I asked him about this, he suggested that these aspects could also be interpreted as “bugs”. I’m not sure I agree – a suggestion for improvement is not necessarily a bug, but a claim that needs to be evaluated and may evolve into a requirement.

Despite this difference of opinion, the talk is an important reminder: organisations need to think carefully about their testing strategy – not just their tooling.

Agentic testing in banking: From hype to governed practice

J. Reitermayer, S. Baumberger, M. Hause

Reitermayer presents an agent called “Avalon”, which generates synthetic test data.

He quickly dives into product details, which can be difficult to follow without prior context. However, one thing is clear: this is a sophisticated agent-based system that delivers measurable productivity gains.

What stands out is the transparency of the system. The execution is visualised in real time in the UI, exposing LLM interactions, tool calls, and decision steps.

Why AI Is Useless for Compliance

Nick Gushchin – AI Transformation Manager

Nick structures risk into different levels, each requiring different types of tooling.

He presents a matrix using likelihood and impact to systematically assess risks associated with AI systems. This is not a new concept, but it is just as important in the context of AI.

One open question remains: how do we quantify “likelihood” in non-deterministic systems? Frameworks like PUnit may offer part of the answer here.

This was an outstanding talk – highly practical, with no hype, and extremely relevant for anyone working on testing and reliability in AI-driven systems.

Key Takeaways

Across the talks, a few recurring themes emerged:

Non-deterministic systems require new testing approaches: Traditional binary correctness is not sufficient for AI-based systems.
Measurement becomes critical: Observability, probabilities, and performance metrics are central to evaluating stochastic behaviour.
Testing is strategic, not just operational: Organisations need to actively design how they approach testing – not just execute it.
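
To make the point about measurement concrete, here is a minimal Java sketch. It is not PUnit's API: the class PassRateCheck, its method names, and the 0.90 threshold are assumptions chosen purely for illustration. The idea is to run a check many times and judge the feature by its observed pass rate instead of a single correct/not correct verdict.

```java
import java.util.Random;
import java.util.function.BooleanSupplier;

/** Hypothetical helper: judges a stochastic feature by its pass rate, not by a single run. */
final class PassRateCheck {

    /** Runs the trial the given number of times and returns the fraction of successful runs. */
    static double passRate(BooleanSupplier trial, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        int passed = 0;
        for (int i = 0; i < runs; i++) {
            if (trial.getAsBoolean()) {
                passed++;
            }
        }
        return (double) passed / runs;
    }

    /** Returns true if the observed pass rate reaches the required minimum. */
    static boolean meetsThreshold(BooleanSupplier trial, int runs, double minPassRate) {
        return passRate(trial, runs) >= minPassRate;
    }

    public static void main(String[] args) {
        // Stand-in for an LLM-backed feature that succeeds in roughly 95% of calls.
        Random random = new Random(42); // fixed seed keeps the demo repeatable
        BooleanSupplier flakyFeature = () -> random.nextDouble() < 0.95;

        double rate = passRate(flakyFeature, 1_000);
        System.out.printf("observed pass rate: %.3f%n", rate);
        System.out.println("meets 0.90 threshold: " + (rate >= 0.90));
    }
}
```

A serious evaluation would also account for sample size, for example by reporting a confidence interval around the observed rate, which this sketch deliberately leaves out.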

Karakun Perspective

For us at Karakun, these discussions reinforce a key observation: As AI systems become part of real-world engineering systems, testing can no longer rely on deterministic assumptions.

Instead, we need:

new models for evaluating system behaviour
transparent and explainable execution
and engineering practices that integrate correctness and probabilistic performance

This is particularly relevant in domains such as automotive, aerospace, and other safety-critical environments – where reliability is non-negotiable.

Let’s discuss!

The Swiss Testing Day 2026 made one thing very clear: AI does not eliminate the need for engineering discipline – it increases it.

What are your thoughts on AI and engineering discipline? Feel free to reach out. I’m always happy to exchange knowledge, ideas, and experiences.

Migrating from Elasticsearch 7.17 to 8.19: A Practical Guide

2026-03-26T00:00:00+00:00

Migrating from Elasticsearch 7.17 to 8.x introduces significant changes in client APIs, security defaults, and index management. This article provides a practical migration guide, covering the transition from HLRC to the Java API Client, structured error handling, composable index templates, and production-ready testing strategies.

Why Upgrade to Elasticsearch 8.x Now?
Elasticsearch Migration Overview
Replacing HLRC with the Elasticsearch Java API Client
Structured Error Handling in Elasticsearch Java Client
Elasticsearch 8 Index Templates and Mappings Changes
Bulk Operations and Response Handling in Elasticsearch 8 Java Client
Testing Elasticsearch 8 with Testcontainers
Spring Boot Elasticsearch Health Indicator Migration
Administrative Operations
Elasticsearch Migration Checklist (7.17 to 8.x)
Key Lessons from the Migration
Elasticsearch Migration Resources
Let’s connect

Elasticsearch 7.x has reached end-of-life, with maintenance ending in April 2025 and support ending in January 2026, prompting many teams to migrate to version 8.x. This migration is more than a simple version bump—it requires rethinking how your Java application interacts with Elasticsearch. The Java High Level REST Client (HLRC), the primary client library for ES 7.x, is now deprecated in favor of a completely redesigned Java API Client that embraces modern patterns such as builders, functional composition, and strong typing.

This article documents our journey migrating a production Spring Boot application from Elasticsearch 7.17 to 8.19.3, covering the key technical challenges, code transformations, and lessons learned along the way.

Why Upgrade Now?

Beyond the requirement to remain on a supported version, Elasticsearch 8.19 brings:

Security by default: TLS and basic authentication are now enabled out of the box
Performance improvements: Leveraging Lucene 9.12.2 with numerous bug fixes and optimizations
Modern API design: The new Java client offers type-safe requests and responses, reducing runtime errors
Future-proofing: Access to vector search, inference APIs, and other 8.x-exclusive features

Most importantly, continuing with HLRC means living with a frozen, unmaintained codebase while the ecosystem moves forward.

The Migration Landscape

Our migration touched four major areas:

Client library replacement: Swapping HLRC for the new typed Java API Client
Security configuration: Adapting to Elasticsearch’s security-first defaults
Index templates and mappings: Updating to composable templates and changed analyzer semantics
Error handling: Reworking exception handling for the new client’s error model

Part 1: Replacing the Java Client

Dependency Updates

The first step was updating our Gradle dependencies:

// Old (ES 7.17)
implementation "org.elasticsearch.client:elasticsearch-rest-high-level-client:7.17.0"

// New (ES 8.19)
implementation "org.elasticsearch.client:elasticsearch-rest-client:8.19.3"
implementation "co.elastic.clients:elasticsearch-java:8.19.3"
implementation "jakarta.json:jakarta.json-api:2.1.1"

Note that the new client requires a JSON-P implementation. We chose Jackson’s JSON-P mapper for seamless integration with our existing Jackson setup.

Client Initialization

The old HLRC used a simple builder pattern:

// ES 7.17 approach
RestHighLevelClient client = new RestHighLevelClient(
    RestClient.builder(
        new HttpHost("localhost", 9200, "http")
    )
);

The new client separates concerns between transport and the client itself:

// ES 8.19 approach
RestClient restClient = RestClient.builder(
    new HttpHost("localhost", 9200, "http")
).build();

ElasticsearchTransport transport = new RestClientTransport(
    restClient, 
    new JacksonJsonpMapper()
);

ElasticsearchClient client = new ElasticsearchClient(transport);

Authentication and Security

Elasticsearch 8.x enables security by default. For production environments, we added basic authentication:

RestClientBuilder builder = RestClient.builder(httpHosts)
    .setRequestConfigCallback(cfg -> 
        cfg.setSocketTimeout(timeoutInSeconds * 1000)
    );

if (authEnabled) {
    CredentialsProvider credentials = new BasicCredentialsProvider();
    credentials.setCredentials(
        AuthScope.ANY,
        new UsernamePasswordCredentials(username, password)
    );
    builder.setHttpClientConfigCallback(cb -> 
        cb.setDefaultCredentialsProvider(credentials)
    );
}

For local development, or as a temporary measure, security can be disabled in docker-compose.yml:

elasticsearch:
  image: docker.elastic.co/elasticsearch/elasticsearch:8.19.3
  environment:
    - xpack.security.enabled=false
    - discovery.type=single-node
  ports:
    - "9200:9200"

Request/Response Pattern Changes

The new client’s biggest shift is from generic maps to strongly-typed builders and response objects.

Old search operation (ES 7.17):

SearchRequest request = new SearchRequest(indexName);
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.query(QueryBuilders.matchQuery("field", "value"));
request.source(sourceBuilder);

SearchResponse response = client.search(request, RequestOptions.DEFAULT);
SearchHit[] hits = response.getHits().getHits();

New search operation (ES 8.19):

SearchResponse<MyDocument> response = client.search(s -> s
    .index(indexName)
    .query(q -> q
        .match(m -> m
            .field("field")
            .query("value")
        )
    ),
    MyDocument.class
);

List<Hit<MyDocument>> hits = response.hits().hits();
for (Hit<MyDocument> hit : hits) {
    MyDocument doc = hit.source();
    // Strongly typed access to your document
}

The functional builder pattern takes some getting used to, but it eliminates entire categories of errors by enforcing type safety at compile time.
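
For readers new to this style, the following self-contained sketch shows the mechanics behind a builder method that takes a lambda. SearchSpec and MatchQuery are hypothetical stand-ins, not classes from the Elasticsearch client; the real client applies the same idea with its own builder types.

```java
import java.util.function.Function;

/** Hypothetical request object whose builder accepts a lambda for the nested query. */
final class SearchSpec {
    final String index;
    final MatchQuery match;

    private SearchSpec(Builder b) {
        this.index = b.index;
        this.match = b.match;
    }

    /** Entry point: the caller configures a fresh builder inside the lambda. */
    static SearchSpec of(Function<Builder, Builder> fn) {
        return fn.apply(new Builder()).build();
    }

    static final class Builder {
        private String index;
        private MatchQuery match;

        Builder index(String value) { this.index = value; return this; }

        /** Creates the nested builder, hands it to the lambda, and builds the result. */
        Builder match(Function<MatchQuery.Builder, MatchQuery.Builder> fn) {
            this.match = fn.apply(new MatchQuery.Builder()).build();
            return this;
        }

        SearchSpec build() { return new SearchSpec(this); }
    }

    public static void main(String[] args) {
        SearchSpec spec = SearchSpec.of(s -> s
            .index("my-index")
            .match(m -> m.field("title").query("migration")));
        // prints "my-index / title = migration"
        System.out.println(spec.index + " / " + spec.match.field + " = " + spec.match.query);
    }
}

/** Hypothetical immutable query object; required values are checked when it is built. */
final class MatchQuery {
    final String field;
    final String query;

    private MatchQuery(Builder b) {
        if (b.field == null || b.query == null) {
            throw new IllegalStateException("field and query are required");
        }
        this.field = b.field;
        this.query = b.query;
    }

    static final class Builder {
        private String field;
        private String query;

        Builder field(String value) { this.field = value; return this; }
        Builder query(String value) { this.query = value; return this; }
        MatchQuery build() { return new MatchQuery(this); }
    }
}
```

Because each nested builder only exposes the setters that make sense at that level, a misplaced or misspelled option is a compile error, not a malformed request discovered at runtime.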

Handling Field Values

One subtle change: the new client introduces FieldValue as a wrapper for all dynamic values in queries, aggregations, and scripts.

Search-after tokens must now be explicitly converted:

// Old
searchAfter(Arrays.asList("value1", 123, timestamp))

// New
searchAfter(Arrays.asList(
    FieldValue.of("value1"),
    FieldValue.of(123),
    FieldValue.of(timestamp.toEpochMilli())
))

Script parameters need similar wrapping when passing variables to Painless scripts in aggregations or updates:

// New
Map<String, JsonData> params = Map.of(
    "boost", JsonData.of(1.5),
    "field_value", JsonData.of("some_text")
);

Part 2: Structured Error Handling

The HLRC threw generic ElasticsearchException instances that required parsing error messages as strings. The new client provides structured error information through ErrorCause.

We created an enum to classify error types systematically:

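import co.elastic.clients.elasticsearch._types.ErrorCause;
import java.util.List;

public enum ElasticsearchErrorKind {
    INDEX_NOT_FOUND("index_not_found_exception", false),
    INDEX_CLOSED("index_closed_exception", false),
    CLUSTER_BLOCK("cluster_block_exception", true),
    VERSION_CONFLICT("version_conflict_engine_exception", true),
    ES_REJECTED_EXECUTION("es_rejected_execution_exception", true),
    TIMEOUT("timeout_exception", true),
    UNKNOWN("_unknown", false);

    private final String type;
    private final boolean recoverable;

    ElasticsearchErrorKind(String type, boolean recoverable) {
        this.type = type;
        this.recoverable = recoverable;
    }

    public boolean isRecoverable() {
        return recoverable;
    }

    // Assumed lookup (not shown in the original): match the Elasticsearch
    // error type string, falling back to UNKNOWN for anything unrecognised
    public static ElasticsearchErrorKind fromType(String type) {
        for (ElasticsearchErrorKind kind : values()) {
            if (kind.type.equals(type)) {
                return kind;
            }
        }
        return UNKNOWN;
    }

    public static ElasticsearchErrorKind fromErrorCause(ErrorCause cause) {
        if (cause == null) return UNKNOWN;
        String type = cause.type();

        // Prefer the first nested root cause when one is present
        List<ErrorCause> rootCause = cause.rootCause();
        if (rootCause != null && !rootCause.isEmpty()) {
            type = rootCause.get(0).type();
        }

        return fromType(type);
    }
}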

This allowed us to build intelligent retry logic:

try {
    return operation.execute();
} catch (ElasticsearchException e) {
    ElasticsearchErrorKind kind = 
        ElasticsearchErrorKind.fromErrorCause(e.error());
    
    if (kind.isRecoverable() && retryCount < maxRetries) {
        Thread.sleep(backoffMs);
        return retryOperation(operation, retryCount + 1);
    }
    throw e;
}
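
The snippet above is a fragment. As a rough, self-contained sketch of the same idea, the loop below retries only when a caller-supplied predicate marks the failure as recoverable. The class RetryExecutor, its parameters, and the doubling backoff are assumptions for illustration, not our production code; in the real application the predicate would delegate to ElasticsearchErrorKind.

```java
import java.util.concurrent.Callable;
import java.util.function.Predicate;

/** Hypothetical helper: retries an operation while failures are classified as recoverable. */
final class RetryExecutor {

    static <T> T execute(Callable<T> operation,
                         Predicate<Exception> isRecoverable,
                         int maxRetries,
                         long initialBackoffMs) throws Exception {
        long backoffMs = initialBackoffMs;
        for (int attempt = 0; ; attempt++) {
            try {
                return operation.call();
            } catch (Exception e) {
                // Give up on non-recoverable errors or once the retry budget is spent.
                if (!isRecoverable.test(e) || attempt >= maxRetries) {
                    throw e;
                }
                try {
                    Thread.sleep(backoffMs);
                } catch (InterruptedException interrupted) {
                    Thread.currentThread().interrupt(); // preserve the interrupt flag
                    throw interrupted;
                }
                backoffMs *= 2; // simple exponential backoff (an assumption)
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        String result = execute(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("simulated timeout");
            }
            return "ok";
        }, e -> e instanceof IllegalStateException, 5, 10);
        System.out.println(result + " after " + calls[0] + " calls"); // prints "ok after 3 calls"
    }
}
```

A loop avoids the growing call stack of a recursive retry and keeps the backoff state in one place. With maxRetries set to 5, the operation runs at most six times in total.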

Part 3: Index Templates and Mappings

Elasticsearch 8.x favors composable index templates (available since 7.8) over the deprecated legacy template format. While our templates were relatively straightforward, we had to:

Update analyzer configurations: Some token filters changed names (e.g., french_elision syntax)
Switch to explicit normalizers: Keyword fields now use explicit normalizer definitions
Fix deprecated syntax: Date histogram intervals like 1M must now be spelled out as month

Example template structure for ES 8.19:

{
  "index_patterns": ["my-index-*"],
  "template": {
    "settings": {
      "number_of_shards": 1,
      "number_of_replicas": 1,
      "refresh_interval": "1s",
      "analysis": {
        "normalizer": {
          "lowercase_normalizer": {
            "type": "custom",
            "filter": ["lowercase", "asciifolding"]
          }
        },
        "analyzer": {
          "custom_analyzer": {
            "type": "custom",
            "tokenizer": "standard",
            "filter": ["lowercase", "stop", "synonym_graph"]
          }
        }
      }
    },
    "mappings": {
      "properties": {
        "title": {
          "type": "text",
          "analyzer": "custom_analyzer"
        },
        "status": {
          "type": "keyword",
          "normalizer": "lowercase_normalizer"
        }
      }
    }
  },
  "version": 1
}

Template Versioning Strategy

We implemented automatic template updates by tracking versions. While template versioning existed in ES 7.x, the API for accessing version metadata changed with the new client.

Old approach (ES 7.17 with HLRC):

GetIndexTemplatesResponse response = client.indices()
    .getIndexTemplate(request, RequestOptions.DEFAULT);

Long remoteVersion = response.getIndexTemplates().get(0).version();

New approach (ES 8.19 with new client):

GetIndexTemplateResponse response = client.indices()
        .getIndexTemplate(request);

Long remoteVersion = response.indexTemplates().stream()
        .findFirst()
        .map(t -> t.indexTemplate().version())  // Strongly typed access
        .orElse(0L);

long localVersion = extractVersionFromTemplate(templateContent);

if (localVersion > remoteVersion) {
    client.indices().putIndexTemplate(t -> t
        .name(templateName)
        .indexPatterns(patterns)
        .template(templateBody)
    );
}

Part 4: Bulk Operations and Response Handling

Bulk indexing saw significant API changes. The new client provides cleaner separation between successful and failed operations:

BulkResponse response = client.bulk(b -> {
    for (Document doc : documents) {
        b.operations(op -> op
            .index(idx -> idx
                .index(indexName)
                .id(doc.getId())
                .document(doc)
            )
        );
    }
    return b;
});

if (response.errors()) {
    for (BulkResponseItem item : response.items()) {
        if (item.error() != null) {
            ElasticsearchErrorKind kind = 
                ElasticsearchErrorKind.fromErrorCause(item.error());
            
            if (kind == ElasticsearchErrorKind.VERSION_CONFLICT) {
                // Handle version conflict specifically
            } else {
                logger.error("Bulk operation failed for {}: {}", 
                    item.id(), item.error().reason());
            }
        }
    }
}

Part 5: Testing Infrastructure

We updated our testing stack to use Elasticsearch 8 Testcontainers:

@Container
static ElasticsearchContainer elasticsearchContainer = 
    new ElasticsearchContainer(
        "docker.elastic.co/elasticsearch/elasticsearch:8.19.3"
    )
    .withEnv("xpack.security.enabled", "false")
    .withEnv("ES_JAVA_OPTS", "-Xms512m -Xmx512m");

@DynamicPropertySource
static void elasticsearchProperties(DynamicPropertyRegistry registry) {
    registry.add("elasticsearch.nodes[0].host", 
        elasticsearchContainer::getHost);
    registry.add("elasticsearch.nodes[0].port", 
        elasticsearchContainer::getFirstMappedPort);
}

Testing Pitfall: Wildcard Deletes

Elasticsearch 8 rejects wildcard index deletions by default. Our test cleanup code needed updating:

// Old approach (fails in ES 8)
client.indices().delete(d -> d.index("test-*"));

// New approach
GetIndexResponse indices = client.indices().get(g -> g.index("test-*"));
for (String indexName : indices.result().keySet()) {
    client.indices().delete(d -> d.index(indexName));
}

Part 6: Health Indicator Updates

Spring Boot’s default ElasticsearchHealthIndicator still relies on HLRC. We replaced it with a custom implementation:

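import co.elastic.clients.elasticsearch.ElasticsearchClient;
import co.elastic.clients.transport.rest_client.RestClientTransport;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import org.elasticsearch.client.Request;
import org.elasticsearch.client.Response;
import org.elasticsearch.client.RestClient;
import org.springframework.boot.actuate.health.Health;
import org.springframework.boot.actuate.health.HealthIndicator;
import org.springframework.stereotype.Component;

@Component("elasticsearch")
public class CustomElasticsearchHealthIndicator implements HealthIndicator {

    private final ElasticsearchClient client;
    // Assumption: a Jackson ObjectMapper bean is available for injection
    private final ObjectMapper mapper;

    public CustomElasticsearchHealthIndicator(ElasticsearchClient client, ObjectMapper mapper) {
        this.client = client;
        this.mapper = mapper;
    }

    @Override
    public Health health() {
        try {
            if (!client.ping().value()) {
                return Health.down()
                    .withDetail("error", "Ping failed")
                    .build();
            }

            // Query cluster health through the low-level client;
            // the cast assumes the client was built on a RestClientTransport
            RestClient lowLevel =
                ((RestClientTransport) client._transport()).restClient();
            Request req = new Request("GET", "/_cluster/health");
            Response resp = lowLevel.performRequest(req);

            JsonNode json = mapper.readTree(resp.getEntity().getContent());
            String status = json.path("status").asText("red");

            // Treat green and yellow as UP, anything else as DOWN
            boolean up = "green".equalsIgnoreCase(status) ||
                        "yellow".equalsIgnoreCase(status);

            return (up ? Health.up() : Health.down())
                .withDetail("status", status)
                .build();

        } catch (Exception e) {
            return Health.down(e).build();
        }
    }
}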

Don’t forget to disable the default indicator in application.yml:

management:
  health:
    elasticsearch:
      enabled: false

Part 7: Administrative Operations

A critical change in ES 8 involves the ignore_unavailable parameter. Previously, setting this to true for admin operations would silently succeed even if indices didn’t exist—useful for idempotent cleanup scripts but dangerous for user-triggered actions.

We now explicitly set ignore_unavailable=false for user-facing operations:

client.indices().delete(d -> d
    .index(indexName)
    .ignoreUnavailable(false)  // Fail loudly if index doesn't exist
);

This surfaces proper errors to the UI when users attempt invalid operations.

Migration Checklist

Based on our experience, here’s a practical checklist for teams undertaking this migration:

Pre-Migration

Audit all usages of RestHighLevelClient in your codebase
Document custom analyzer and token filter configurations
Review security requirements (TLS certificates, authentication)
Plan for breaking changes in REST API responses

Code Changes

Update Gradle/Maven dependencies to ES 8.19.3
Replace RestHighLevelClient with ElasticsearchClient
Refactor all search operations to use fluent builders
Wrap dynamic values with FieldValue or JsonData
Update bulk operation handling for new response structure
Implement structured error classification
Replace Spring Boot’s default Elasticsearch health indicator

Configuration

Update index templates to composable format
Validate and update analyzer configurations
Configure authentication for production environments
Disable security for local development (if appropriate)
Set explicit ignore_unavailable values for admin operations

Testing

Upgrade Testcontainers to use Elasticsearch 8.19.3
Fix test cleanup to avoid wildcard deletes
Add tests for highlighting, aggregations, and spellcheck
Verify security configuration in integration tests
Test error handling for all error kinds

Deployment

Update Docker Compose files for local development
Plan production rollout (rolling restart vs. reindex)
Monitor cluster health during initial deployment
Verify application logs for migration-related warnings

Lessons Learned

Strong typing prevents runtime surprises: While the functional builder syntax felt verbose initially, it caught numerous bugs at compile time that would have been production incidents.
Error handling needs a strategy: Don’t treat all Elasticsearch exceptions the same. Classify them, log context-rich messages, and implement smart retry logic for recoverable errors.
Security isn’t optional anymore: ES 8’s security-first approach is the right move, but it requires thoughtful configuration management across environments.
Test with the real version: Don’t rely on in-memory fake implementations. Use Testcontainers with the exact Elasticsearch version you’ll run in production.
Index templates matter: Small changes in analyzer behavior can subtly break search quality. Diff your templates carefully and test with production-like data volumes.

Looking Forward

With Elasticsearch 8.19 in place, we’re positioned to explore capabilities that were painful or impossible in 7.x:

Vector search for semantic similarity
Inference endpoints for ML-powered features
Runtime fields for schema flexibility
Improved aggregation performance for analytics workloads

This type of migration is substantial, typically touching 50-150+ files depending on your codebase size. But the result is a more maintainable, type-safe, and future-proof integration with Elasticsearch.

Let’s Connect!

Do you have questions about our migration to Elasticsearch 8.19? Or would you like to discuss the best migration path for your installation? Have you already migrated and run into other pitfalls? Feel free to reach out. I’m always happy to exchange knowledge, ideas, and experiences.

Resources

Accessibility: Going from Couch to Marathon

2026-03-06T00:00:00+00:00

Accessibility matters in 2025 because inclusive digital experiences are no longer optional — they’re required by law and expected by users. As the European Accessibility Act nears enforcement, developers and designers must adopt WCAG 2.2 and inclusive design practices to ensure equal access, usability, and long-term compliance.

Getting Started: Information and Research
Putting Accessibility into Practice
Share Your Accessibility Journey
Let’s connect

“Next year I will run a marathon.” What a wonderful and challenging New Year’s resolution. On the first possible day (not too cold, not too wet, not too sunny, not on a weekend) I went out in my new running gear. The first kilometer felt ok. The second was already quite hard. By the third, I stopped running and walked back home. Never again. There went my New Year’s resolution. I went running a few more times but finally gave up on it. Sounds familiar to you? What made you give up your resolution?

For me, it was this massive mountain I saw in front of me. This marathon thing was huge, and I could not even run a kilometer without almost dying. My goal was too ambitious, and even though I started, I was never able to pull through.

Let’s take my sporty ambitions into the software world. Imagine someone (for example, the EU and their laws) saying that from now on you have to implement accessibility. For me, that sounds almost as impossible as my marathon mountain. We are sitting on the couch, bag of crisps in our hand, watching it all happen. But how can we start moving? I mean, it already is the law in Europe. And by the way, accessibility does not only mean catering to blind or deaf people. There are so many more disabilities which are not always visible or even permanent. In the end, it helps all users when our products are more accessible because they become more user-friendly. Now let’s get our running gear together.

Getting Started: Information and Research

You have to have some basics, some knowledge, a little bit of background. You will find a ton of blogs, videos, and tutorials out there, and I will give you a short list to get started on the topic. You wouldn’t go running in your flip-flops, would you?

Understanding WCAG 2.0 - For those of you who would like to figure it out by themselves.
Blog by Leonie Watson https://tink.uk/ - Topics around web accessibility.
Udemy course by Liz Brown - Not free but definitely worth it.
Web Accessibility Cookbook by Manuel Matuzovic - You can also book Manuel for in-house workshops (highly recommended) and watch his talk at beyond tellerrand 2023.

Putting Accessibility into Practice

Ok, that’s the list. The second step: just go out and do it. Once you’ve got your gear together, you have to take the first step. Go out and run. Go out and program. You don’t need to restructure your whole website or application. That’s for the next level, when you start on a new project. But for now, let’s go one ticket at a time. How do your focus styles look? Are there any? No? Create some. Do you have alt text for your images? No? Start from here. Are all your input fields connected to labels? Go ahead and check. These are all very simple tasks that can be done in a short time, but they get you started. And once you start, you’ll notice it’s hard to stop. There’s more to learn and more to do out there.

Ok, I said the list was finished by #2, but this is an easy one. Go out and talk about what you did. You implemented that one style? You added a descriptive label to a button? You checked out the interface using a screen reader? Tell your colleagues. Let them get inspired.

Now we’re off the couch, slowly heading towards this mountain. But we’ll tackle it one step at a time, because we’re in it for the long run. To be honest, accessibility is a marathon. It takes time to implement and might seem like it never finishes. And it is not about checking the box but about the mindset. Once you have the running bug (or the accessibility bug), it is hard not to do it.

Ready? Steady? Go!

Let’s Connect!

Do you have questions about the European Accessibility Act? Or would you like to discuss the best starting point for accessibility? Feel free to reach out. I’m always happy to exchange knowledge, ideas, and experiences.

Retrofitting an Existing Spring Application with AI Capabilities Using Spring AI

2026-02-20T00:00:00+00:00

Adding AI-powered capabilities to existing enterprise systems is often complex, especially when modernization or migration to new frameworks is not immediately feasible. However, it is possible to retrofit an application with a natural language interface while keeping the original business logic untouched.

This post walks through how to integrate AI-based request generation for an existing search API, focusing on structured outputs, tooling, and validation, while discussing some real-world obstacles. This example represents a simpler case where tool calls are fast and have no side effects. It allows us to focus on the interaction between the LLM, the tools, and the structured output without introducing external dependencies, complex state handling, or repeated tool calls.

1. The Idea: Let the AI Build Your Request Objects
2. Technology Setup
- Gradle Dependencies
- Configuration
3. Converting Prompts into Valid Search Requests
4. Adding Domain-Specific Tools
- Why Tools Matter
5. Testing AI-Assisted Code
6. Business Value and Constraints of AI Integration
- Practical Constraints
7. Takeaways
8. Expert Support for AI Integration Projects

1. The Idea: Let the AI Build Your Request Objects

Existing systems often have well-defined APIs for search, analytics, or operations. They usually expect strongly typed input models, such as a SearchRequestModel. Retrofitting them for AI input means giving the user a natural language interface and letting the LLM create valid request objects automatically.

With Spring AI, this becomes practical through:

Structured output handling – the LLM generates JSON matching a given class.
Tools – annotated functions that the LLM can call to retrieve external data or validate intermediate results.

This approach bridges free-text prompts with typed data structures, enabling human-like queries while keeping the backend stable.

2. Technology Setup

For this article, we use our own HIBU platform as an example. HIBU provides an API library that includes request and response classes annotated with OpenAPI metadata, which makes it well suited for generating structured outputs.

Gradle Dependencies

You can retrofit without heavy dependencies, provided your project already runs on Spring Boot 3.x (required for Spring AI) or you are maintaining this code in a separate module/project.

dependencies {
    implementation platform("org.springframework.boot:spring-boot-dependencies:3.5.7")

    implementation 'org.springframework.boot:spring-boot-starter-web'

    implementation platform("org.springframework.ai:spring-ai-bom:1.0.3")
    implementation 'org.springframework.ai:spring-ai-starter-model-openai'

    implementation "com.karakun.hibu:hibu-api:3.6.1"

    testImplementation 'org.springframework.boot:spring-boot-starter-test'
    testImplementation 'org.assertj:assertj-core'
    testRuntimeOnly 'org.junit.platform:junit-platform-launcher'
}

Configuration

spring:
  ai:
    openai:
      api-key: add your key
      chat:
        options:
          model: gpt-4.1-mini
          temperature: 0

A low temperature reduces output variance and improves reproducibility for testing and validation.

3. Converting Prompts into Valid Search Requests

The main service delegates prompt interpretation to the LLM. The ChatClient and your own @Tool definitions drive this.

package com.karakun.hibu.promptassistance;

@Service
public class AiPromptAssistanceService {
    private final ChatClient chatClient;
    private final HibuTools hibuTools;

    public AiPromptAssistanceService(ChatClient chatClient, HibuTools hibuTools) {
        this.chatClient = chatClient;
        this.hibuTools = hibuTools;
    }

    public SearchRequestModel getRequestFromPrompt(String prompt, List<String> facetFields, String container) {
        return chatClient.prompt()
            .system(u -> u.text("""
                    Task: Given a user prompt, produce a SearchRequest JSON for our search API.
                    Use synonyms and simple_query_string syntax for "query".
                    Allowed filter fields: {filterFields}
                    Container to pass to the tools: {container}
                    Use the tools to fetch keyword filter values and validate the result object.
                    """
                )
                .param("filterFields", String.join(",", facetFields))
                .param("container", container)
            )
            .tools(hibuTools)
            .user(u -> u.text(prompt))
            .call()
            .entity(SearchRequestModel.class);
    }
}

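This example uses the Spring AI fluent API. The LLM receives a system prompt describing how to construct a valid SearchRequestModel. The .entity(SearchRequestModel.class) call makes Spring AI derive a JSON schema from the record, append it to the prompt as format instructions, and deserialize the response into a SearchRequestModel instance. Bean Validation constraints are not evaluated at this point, which is why a dedicated validation tool follows in the next section. When generating the schema, Spring AI takes existing annotations such as @Nullable, @Schema, and @JsonProperty into account.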
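
To make the target of .entity(...) more tangible, here is a minimal sketch of what such a record could look like. The name SearchRequestSketch and its two components are assumptions, modeled on the query() and filters() accessors used in the test in section 5. The real SearchRequestModel from the HIBU API library carries more fields and OpenAPI annotations that are not shown in this article.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for SearchRequestModel; the real class lives in the HIBU API library.
record SearchRequestSketch(String query, Map<String, List<String>> filters) {

    // Compact constructor: reject a missing query and normalize a missing filter map,
    // so downstream code never has to deal with null.
    SearchRequestSketch {
        if (query == null || query.isBlank()) {
            throw new IllegalArgumentException("query must not be blank");
        }
        filters = (filters == null) ? Map.of() : Map.copyOf(filters);
    }
}
```

A record like this gives the LLM output a fixed shape: anything that cannot be bound to these components fails at deserialization time instead of deep inside the search backend.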

Note: The actual system prompt most likely contains many more instructions and restrictions for the LLM, such as “DO NOT invent or change filter values.” I have kept it brief for this article.
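
An instruction such as "DO NOT invent or change filter values" can also be backed by a check in code. The sketch below is an assumption and not part of Spring AI or the HIBU API: a hypothetical FilterValueGuard compares the filter values proposed by the LLM with the values the backend actually knows and reports everything else.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Spring AI or HIBU.
final class FilterValueGuard {

    private FilterValueGuard() {
    }

    /**
     * Returns one message per proposed filter value that is not in the allowed set
     * for its field. Fields without an entry in allowedValues (for example date
     * ranges) are skipped. An empty list means no invented value was found.
     */
    static List<String> findInventedValues(Map<String, List<String>> proposedFilters,
                                           Map<String, Set<String>> allowedValues) {
        List<String> problems = new ArrayList<>();
        proposedFilters.forEach((field, values) -> {
            Set<String> allowed = allowedValues.get(field);
            if (allowed == null) {
                return; // not a keyword field, nothing to compare against
            }
            for (String value : values) {
                if (!allowed.contains(value)) {
                    problems.add(field + ": unknown value '" + value + "'");
                }
            }
        });
        return problems;
    }
}
```

The messages are plain strings on purpose: they could be returned to the LLM as feedback or simply logged before the request is rejected.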

4. Adding Domain-Specific Tools

The @Tool annotation exposes methods of ordinary Spring beans as functions the LLM can call. In this example, two tools support validation and controlled value selection.

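package com.karakun.hibu.promptassistance;

import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import jakarta.validation.ConstraintViolation;
import jakarta.validation.Validator;
import jakarta.validation.constraints.NotNull;
import org.springframework.ai.tool.annotation.Tool;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.core.ParameterizedTypeReference;
import org.springframework.stereotype.Service;
import org.springframework.web.client.RestClient;
// HIBU API types (SearchRequest, SearchResponse, ...) come from the hibu-api library (imports omitted here).

@Service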
public class HibuTools {
    private final RestClient client;
    private final String filtersUrl;
    private final Validator validator;

    public HibuTools(Validator validator, RestClient.Builder builder,
                     @Value("${tools.hibu.fetchAvailableFilterValuesUrl}") String fetchUrl) {
        this.filtersUrl = fetchUrl;
        this.validator = validator;
        this.client = builder.build();
    }

    @Tool(name = "isValidSearchRequestModel",
          description = "Validates JSON against the SearchRequestModel class. Returns 'true' if valid, otherwise the error messages.")
    public String isValidSearchRequestModel(String searchRequestModel) {
        try {
            SearchRequestModel model = new ObjectMapper().readValue(searchRequestModel, SearchRequestModel.class);
            Set<ConstraintViolation<SearchRequestModel>> violations = validator.validate(model);
            if (!violations.isEmpty()) {
                return violations.stream().map(ConstraintViolation::getMessage).collect(Collectors.joining("\n"));
            }
        } catch (JsonProcessingException e) {
            return e.getMessage();
        }
        return "true";
    }

    @Tool(name = "fetchAvailableFilterValues",
          description = "Fetches available filter values for a given keyword-based field.")
    public List<String> fetchAvailableFilterValues(@NotNull String container, @NotNull String fieldName) {
        // Query the existing API
        SearchRequest request = new SearchRequest(container, "", List.of(), null, Map.of(), 0, 0, null, false, null, List.of(fieldName));
        var type = new ParameterizedTypeReference<SearchResponse<ObjectMapCustomData>>() {};
        var resp = client.post().uri(filtersUrl).body(request).retrieve().body(type);
        if (resp == null || resp.getFacets() == null) return List.of();
        return resp.getFacets().stream()
            .filter(f -> f.getFieldName().equals(fieldName))
            .flatMap(f -> f.getValues().stream().map(FacetValue::getValue))
            .toList();
    }
}

Why Tools Matter

The validation tool (isValidSearchRequestModel) enables the LLM to correct invalid JSON through iterative tool-assisted regeneration.
The fetch tool limits the model to known keyword values, avoiding invented filters and producing robust output.

These patterns greatly reduce runtime errors and make the integration resilient to AI hallucinations.
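
The correction loop behind the validation tool can be pictured in plain Java. The sketch below is a simplification under stated assumptions: Spring AI drives tool calls itself, so RetryingGenerator, its generator function, and the attempt limit are hypothetical stand-ins that only mirror the "true or error message" contract of isValidSearchRequestModel.

```java
import java.util.Optional;
import java.util.function.UnaryOperator;

// Hypothetical illustration of validate-and-regenerate; not Spring AI's internal loop.
final class RetryingGenerator {

    private RetryingGenerator() {
    }

    /**
     * @param generator stand-in for the LLM: receives the previous validation feedback
     *                  (empty string on the first attempt) and returns a new candidate
     * @param validator returns "true" for a valid candidate, otherwise an error message
     * @return the first valid candidate, or empty if maxAttempts was exhausted
     */
    static Optional<String> generate(UnaryOperator<String> generator,
                                     UnaryOperator<String> validator,
                                     int maxAttempts) {
        String feedback = "";
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            String candidate = generator.apply(feedback);
            String verdict = validator.apply(candidate);
            if ("true".equals(verdict)) {
                return Optional.of(candidate);
            }
            feedback = verdict; // hand the error back for the next attempt
        }
        return Optional.empty();
    }
}
```

The attempt limit matters in practice: without it, a model that keeps producing the same invalid structure would loop until a timeout or a token budget stops it.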

5. Testing AI-Assisted Code

LLMs like ChatGPT do not guarantee deterministic replay, so defining explicit testing expectations is essential. You can use mocks to isolate behavior and assert that generated requests meet certain structural and semantic criteria.

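package com.karakun.hibu.promptassistance;

import static org.assertj.core.api.Assertions.assertThat;
import static org.mockito.ArgumentMatchers.any;
import static org.mockito.Mockito.verify;
import static org.mockito.Mockito.when;

import java.util.List;

import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.test.context.bean.override.mockito.MockitoBean;

public class AiPromptAssistanceServiceTest extends SpringBaseTest {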

    @Autowired
    private AiPromptAssistanceService service;

    @MockitoBean
    private HibuTools mockedHibuTools;

    @Test
    public void getRequestFromPrompt() {
        when(mockedHibuTools.fetchAvailableFilterValues(any(), any()))
            .thenReturn(List.of("Karakun AG", "Another AG"));

        SearchRequestModel result = service.getRequestFromPrompt(
            "Search for all company presentations of Karakun created in the last three months of each of the last five years.",
            List.of("metadata.creation_date", "metadata.companyName_string"),
            "foo");

        verify(mockedHibuTools).fetchAvailableFilterValues("foo", "metadata.companyName_string");
        assertThat(result.query()).contains("presentation");
        assertThat(result.filters()).containsKey("metadata.creation_date");
        assertThat(result.filters().get("metadata.creation_date")).hasSize(5);
    }
}

Tip: Always verify tool calls and expected key fields. This ensures your prompt and model configuration are aligned with predictable outcomes.
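
Because a single green run says little about a probabilistic component, one option is to repeat the call and require every result to satisfy the same structural checks. The helper below is a hypothetical sketch in plain Java. In the test above, the producer would wrap service.getRequestFromPrompt(...) and the invariant would hold the structural assertions.

```java
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical test helper for non-deterministic producers such as an LLM call.
final class RepeatedCheck {

    private RepeatedCheck() {
    }

    /** Calls the producer `runs` times and counts how many results violate the invariant. */
    static <T> int countViolations(Supplier<T> producer, Predicate<T> invariant, int runs) {
        int violations = 0;
        for (int i = 0; i < runs; i++) {
            if (!invariant.test(producer.get())) {
                violations++;
            }
        }
        return violations;
    }
}
```

Every repetition is a real model call, so the number of runs is a trade-off between confidence and API cost.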

6. Business Value and Constraints of AI Integration

Adding AI features on top of existing systems provides several advantages:

Faster experimentation – you can test AI-driven interfaces without refactoring the core logic.
Lower risk – tools isolate the AI layer, so failures do not affect critical paths.
Improved UX – users interact in natural language while the backend remains unchanged.

Practical Constraints

Spring Boot 3.x required: Spring AI does not run on older Spring Boot generations. Legacy projects either need upgrade work before integration or can keep the retrofitted component in a separate module or project.
Validation tools improve reliability: Without them, structured output tends to break on minor syntax issues.
Model selection and cost: Smaller models like gpt-4.1-mini often suffice. Larger ones may be cost-prohibitive for frequent use.
Testing discipline: Because LLMs behave probabilistically, regression tests are critical to detect subtle prompt changes or API behavior shifts. At Karakun, we are building an infrastructure that enables consistent testing across multiple models and helps us curate a maintainable collection of prompt patterns and best practices.

7. Takeaways

Retrofitting an existing Spring application with AI features is possible and often valuable.
Spring AI’s Tools and Structured Output simplify controlled AI integration.
Custom validation tools make AI-generated structures robust and retryable.
Expect some migration effort to Spring Boot 3.x and ensure your LLM configuration is “deterministic” for repeatable tests.
Test, observe, and iterate - AI integration is not a one-time setup but a continuous process.

By combining Spring AI, careful tool definitions, and disciplined validation, teams can extend legacy systems with intelligent interfaces while maintaining technical and business stability.

8. Expert Support for AI Integration Projects

While this example keeps things simple with side-effect-free tool calls, real-world applications often involve more complex integrations. Integrating AI into existing software ecosystems requires architectural expertise and experience balancing maintainability and business objectives.

At karakun.com, we help organizations analyze their current solutions and design the best way to integrate AI - whether that means lightweight retrofitting, full-stack modernization, or targeted use of AI capabilities.

If you are exploring how to introduce intelligent features into your existing systems, reach out to us. Together we can identify where AI delivers measurable value without disrupting stable systems.

Beyond Productivity: Impacts & Risks of AI Coding Tools

2026-02-10T00:00:00+00:00

AI coding assistants can feel like supportive pair programming without social friction. But that comfort has trade-offs worth examining.

Why AI Coding Assistants Feel So Good — and Why This Should Make Us Wary
Bias, Perception, and The Hidden Competence Penalty
The Next Hidden Trap in AI Coding: Why Your Assistant Might Be Reinforcing Your Misconceptions
The Need for Ongoing Self-Reflection: Power, Privacy, and Asymmetry
What Developers Can Do Today
Let’s Connect
References

Why AI Coding Assistants Feel So Good — and Why This Should Make Us Wary

Have you ever caught yourself genuinely enjoying a digital pat on the back? You’ve been using AI assistants in your daily work for a while now, and suddenly everything feels lighter. You feel encouraged, supported, maybe even empowered. The next prototype comes together in record time. You write software and develop solutions in technologies that once felt unfamiliar or intimidating. Mastery of a specific programming language seems less important than before. Instead, your ability to think creatively and develop solutions takes center stage.

You start experiments you’ve been postponing for months. Long-abandoned projects finally get finished. Maybe you’ve even realigned your company’s vision. You feel more creative, more confident, full of new ideas. New business models, new projects, professional goals suddenly feel within reach. You might even dare to contribute seriously to an open source project, because time and mental pressure are no longer the limiting factors they once were.

If this sounds familiar, you’re not alone. Many developers, managers, and CTOs report similar experiences. Early studies suggest that coding assistants can provide short-term emotional support and increase satisfaction with everyday software development tasks [1]. What we don’t yet understand, however, are the long-term effects.

AI coding assistants are not neutral tools. They are carefully designed dialogue systems built to encourage and affirm. They appear understanding, patient, and endlessly available. They also remove many small frictions common in human collaboration: eye-rolling, implicit judgment, repeated questioning, and time pressure. For many developers, this feels deeply liberating.

In traditional pair programming, interpersonal barriers are part of the experience, and learning to navigate and overcome them can take years. With AI support, a sense of psychological safety can emerge. This kind of safety is often hard to achieve with real people, because human collaboration operates in a different social and emotional mode.

With an AI coding assistant, you can voice imperfect ideas, make naïve suggestions, or even delegate an entire task. Modern AI assistants simulate emotional intelligence: they respond empathetically, provide structured explanations, and appear supportive. For many, this feels like pair programming without friction or frustration. There is no noticeable knowledge gap.

In human pair programming, such gaps often push us out of our comfort zone. With AI, that gap feels largely absent – not because it does not exist, but because it does not trigger social comparison. AI tools typically challenge ideas only when explicitly prompted and tend to show less critical resistance than human collaborators [2] [3]. They do not hold genuine opinions, values, or a lived perspective in the way real people do. While certain behaviors and values can be partially simulated through configuration and prompting, this remains fundamentally different from engaging with a human collaborator who brings their own convictions and experience into the discussion.

And This Is Where The Ambivalence Begins

Early research indicates that developers who rely heavily on AI assistants for pair programming may gradually invest less in real workplace relationships. Studies suggest reduced depth of knowledge transfer, fewer mentoring interactions, and a shift away from interpersonal exchange and team-level collaboration toward individual, tool-mediated work [1] [2] [3]. The assumption that this technology is free of disappointment turns out to be an illusion, because disappointment does not disappear - it shifts.

It shows up when no useful solution emerges despite careful prompting, suggestions are shallow or wrong, tools crash, token limits are reached, or costs spiral out of control. It also appears when code with security vulnerabilities makes it into production despite careful reviews, when data leaks occur despite privacy assurances, or when platforms are suddenly discontinued or unavailable.

The List of Risks Is Long — and Growing

AI coding assistants open up enormous opportunities. They are reshaping how we work, learn, and think. But precisely because they feel so good, it’s worth taking a closer look. Not every digital pat on the back is harmless: some distract us, some obscure risks, and some replace something that cannot easily be simulated – genuine collaboration and growing together as a team [4].

Bias, Perception, and The Hidden Competence Penalty

Not all colleagues view AI-assisted work positively. The use of AI in software development is still surrounded by strong biases and preconceived notions. Research shows that individuals who receive help from AI often face a hidden competence penalty: even when the quality of the work is identical, people are perceived as less competent, less diligent, and lazier simply because AI was involved [8].

Experiments consistently demonstrate that engineers believed to have used AI are evaluated more negatively, despite no measurable difference in code quality. This penalty does not target the output, but the perceived ability of the person behind it. The effect is not evenly distributed. Female engineers are penalized significantly more than their male counterparts, and the harshest judgments come from engineers who do not use AI themselves – particularly male non-adopters evaluating women.

As a result, many developers anticipate this social penalty and strategically avoid using AI to protect their professional reputation, as reported in [8]. The authors also note that, ironically, the groups that could benefit most from productivity-enhancing tools – women and older engineers – are the least likely to adopt them. This reflects broader social and organizational structures in which AI assistance is framed not as strategic tool use, but as evidence of inadequacy, especially for already stereotyped groups.

The research highlights a fundamental mismatch in how organizations approach AI adoption. While companies focus on access, tooling, and training, they often ignore the social dynamics that determine whether AI is actually used. Since AI-assisted work shows no inherent quality disadvantage, a more responsible path forward may be to shift evaluation away from perceived competence and toward objective outcomes such as accuracy, defect rates, and delivery time, rather than how the work was produced. The introduction of role models and joint AI hackathons within the organisation can also mitigate these effects.

The Next Hidden Trap in AI Coding: Why Your Assistant Might Be Reinforcing Your Misconceptions

Every software engineer has introduced bugs for mundane reasons – forgetting a test case or a condition, or calling the wrong method. These are small oversights that occur even when we understand the problem reasonably well [5]. This is precisely where AI coding assistants excel. They act as a second set of eyes, catching missing checks, inconsistent logic, or obvious implementation errors.

However, there is a more dangerous category of errors – one that AI assistants may not only fail to prevent, but may actively reinforce: misconceptions.

Beyond Simple Mistakes: The Problem of Misconceptions

A misconception is not a typo, syntax error, or an overlooked edge case. It is a faulty assumption about how a framework, a system, a data structure, or an API works. It occurs when your mental model of the code, commands, or tools is wrong, even though you feel confident in your reasoning. This can happen, for example, when assumptions are reused from previous projects that simply do not hold in a new context.

Correcting bugs in thinking requires replacing an entire mental model with a new one – a significant cognitive shift. Even after learning the correct model, developers may revert to the old misconception under time pressure, cognitive load, or familiarity bias [5].

The AI Dynamic: A Cooperative Risk

This is where the relationship between developers and AI assistants becomes complex.

AI coding tools are deliberately designed to be cooperative. They follow the context you provide, reinforce your framing, and optimize for helpfulness. As a result, they rarely challenge your assumptions unless explicitly instructed to do so. If your prompt or existing code is based on a misconception, the AI may:

Adopt the incorrect assumption and build further logic on top of it
Strengthen the misconception by producing plausible-looking code that appears to confirm your flawed mental model
Introduce additional errors, offering suggestions that look correct but are fundamentally wrong

Unlike a human colleague, an AI assistant does not naturally push back, express doubt, or question intent. It does not notice conceptual inconsistencies unless they are syntactically or statistically obvious. As a result, misconceptions can persist longer, spread further, and become deeply embedded in the codebase, potentially hindering your future self or your team from developing correct solutions.

Why Critical Thinking Is Still Your Most Important Tool

AI can help us write code faster, but it cannot replace critical thinking, sound testing strategies, or shared reasoning.

Traditionally, one of the most effective ways to uncover misconceptions has been collaboration: pair programming, code reviews, or refinements. When different mental models collide, assumptions are exposed, challenged, and refined. A solid and well-designed test suite can also play a crucial role by forcing assumptions to become explicit and verifiable.

Working with AI can subtly shift this dynamic. Instead of challenging our thinking, the assistant often mirrors it. If we are not careful, AI becomes an amplifier of our misconceptions rather than a safeguard against them.

To use AI responsibly, developers must remain aware of its limitations and actively compensate for them – by questioning outputs, seeking alternative explanations, and deliberately inviting dissent into their workflow.

In the end, the most dangerous bugs are not caused by missing code. They are caused by flawed thinking – and no assistant, however powerful, can fix that for us.

The Need for Ongoing Self-Reflection: Power, Privacy, and Asymmetry

All of This Calls for Continuous Self-Reflection

On the one hand, it is important to remember that today’s AI platforms are run by profit-driven companies. These systems are not neutral infrastructures; they are operated by businesses that must generate revenue, and that inevitably means data has economic value. When we work with AI coding assistants, we are often handling proprietary code, architectural decisions, or internal business logic. Even when terms and policies promise safeguards, we are still engaging in an economic relationship where data matters.

On the other hand, the relationship between a developer and an AI pair programmer is inherently asymmetrical. The AI does not need support, mentoring, or feedback. It does not grow into a role, nor does it share responsibility. As a result, there is a risk that genuine team relationships – and especially the mentoring and support of junior developers – are gradually deprioritized. The more frictionless coding becomes through AI assistance, the easier it is to substitute human collaboration with tool-based interaction.

Modern technology actively reinforces this shift. Tools are becoming more polished, more intuitive, and more responsive. User interfaces improve continuously, interaction feels increasingly natural, and the perceived quality of results keeps rising. This is not accidental – it is a rapidly growing market with millions of users and strong economic incentives to optimize for adoption and dependency.

When Convenience Turns into A Security Risk

Recent findings underline why this reflection is necessary. A security research report by Koi Research revealed that several popular browser extensions have been secretly harvesting private AI conversations from millions of users [7]. While these tools claimed to protect user privacy, they injected scripts into the browser to intercept AI dialogues and sell the collected data to data brokers.

More than eight million users were affected, with sensitive information harvested for marketing and analytics purposes. Particularly alarming is the fact that some of these extensions received official recommendations from major platforms such as Google and Microsoft – despite their covert surveillance behavior.

The investigation shows how severe the security risks of browser add-ons can be. Even when inactive, these extensions were capable of exfiltrating data in the background. This practice represents a data-broker business model centered on monetizing highly sensitive user interactions. Once a user visits one of several supported AI platforms - such as ChatGPT, Claude, or Gemini - the extension injects code that overrides native browser functionality. Prompts, AI responses, timestamps, and metadata are captured and transmitted to the provider’s servers continuously, regardless of whether VPN features are enabled.

In the past, such models primarily relied on clickstream data. Today, the focus has shifted to AI conversations – data that is far more revealing. These interactions may contain personal dilemmas, medical questions, financial information, or proprietary source code. From a data monetization perspective, this information is exceptionally valuable.

Between Progress and Moral Cost

Few would deny the potential benefits of using AI, for example in medical research for the early detection of breast or skin cancer. If the price of mass deployment and tool optimization is that a large corporation gains access to the medical records of millions of people, a careful moral trade-off is required. Even if the use of such technology can still be ethically justified after weighing the benefits, society must remain conscious of the price it is paying [6]. That price should be openly acknowledged, debated, and – where possible – actively negotiated, rather than silently accepted or surrendered to opaque data-collection practices.

The same applies to software development. AI tools offer undeniable advantages, but their social, organizational, and ethical implications do not disappear simply because the tools are useful and user-friendly. Continuous reflection is not a luxury – it is a responsibility.

What Developers Can Do Today

First of all, leading companies should be aware of existing biases and ethical implications, and the development of AI tools must actively take them into account. Nevertheless, it would be unrealistic to assume that these challenges can be resolved very soon and solely at the platform level. The effects described above emerge in everyday practice — and they can also be addressed there.

Two concrete strategies are particularly effective and can be applied immediately.

Actively Demand Critique from The Assistant

When using AI to formulate requirements, designs, or solutions, developers can deliberately counteract its tendency toward affirmation. Instead of inviting help, they can proactively require critical review with prompts like these:

Before proceeding with anything else, evaluate this requirement for ambiguities. Ask clarifying questions if you have any.

Tell me if there are any open source libraries out there that already solve this problem.

Critique this design from the perspective of best-practice Java development. Check for clean code criteria. Check whether the code has any security gaps or vulnerabilities. Do not give compliments.

The more specific, the better. Used this way, the assistant becomes less reassuring and more adversarial, reintroducing friction and reflection.

Do Not Replace Pair Programming or Peer Review — Augment Them

Rather than forgoing human collaboration, AI tools should be used alongside it. While an LLM can review code faster than any colleague, it lacks domain-specific context, architectural history, and shared responsibility. A human reviewer brings judgment and lived experience that cannot be simulated. Combined, both forms of feedback serve complementary roles — and preserving that balance is essential.

Let’s Connect

Do you have questions about the impact and risks of AI coding tools? Or would you like to discuss the latest developments in AI-based software engineering? Feel free to reach out. I’m always happy to exchange knowledge, ideas, and experiences.

References

[1] Xiao, Q., Hu, X. E., Whiting, M. E., Karunakaran, A., Shen, H., & Cao, H. (2025). AI hasn’t fixed teamwork, but it shifted collaborative culture: A longitudinal study in a project-based software development organization (2023–2025). arXiv. https://arxiv.org/abs/2509.10956
[2] Welter, A., Schneider, N., Dick, T., Weis, K., Tinnes, C., Wyrich, M., & Apel, S. (2025). From developer pairs to AI copilots: A comparative study on knowledge transfer. arXiv. https://arxiv.org/abs/2506.04785
[3] Apel, S., et al. (2025). Software developers show less constructive skepticism when using AI assistants than when working with human colleagues. The 40th IEEE/ACM International Conference on Automated Software Engineering (ASE 2025). Reported by TechXplore (edited by Stephanie Baum, reviewed by Andrew Zinin). https://techxplore.com/news/2025-11-software-skepticism-ai-human-colleagues.html
[4] Dishop, C. R., Brown, A. S., Chao, P. Y., et al. (2025). Machines in the Middle: Using Artificial Intelligence (AI) While Offering Help Affects Warmth, Felt Obligations, and Reciprocity. Journal of Business and Psychology. https://doi.org/10.1007/s10869-025-10068-x
[5] Hermans, F. (2021). The Programmer’s Brain: What Every Programmer Needs to Know About Cognition. Manning Publications.
[6] Rosengrün, S. (2023). Künstliche Intelligenz zur Einführung. Junius Verlag.
[7] Dardikman, I. (2025, December 15). 8 Million Users’ AI Conversations Sold for Profit by “Privacy” Extensions. Koi Research Blog. https://www.koi.ai/blog/urban-vpn-browser-extension-ai-conversations-data-collection
[8] Acar, O. A., Gai, P. J., Tu, Y., & Hou, J. (2025, August 1). Research: The Hidden Penalty of Using AI at Work. Harvard Business Review Generative AI. https://hbr.org/2025/08/research-the-hidden-penalty-of-using-ai-at-work

LIGHTS – A Lightweight Global Collaboration for Methodological Research

2026-01-30T00:00:00+00:00

With a very small budget, the LIGHTS project demonstrates how a practical, globally distributed research infrastructure can be built using familiar, readily available tools. Health researchers across continents collaborate efficiently on a highly specialized collection for methodological guidance - without complex IT operations or expensive software.

Behind LIGHTS: An Infrastructure for Research on Methodological Guidelines
From Concept to Practice: Collaboration Between University of Basel and Karakun
The Adapter: Three Research Data Pipelines, One Workflow
Key Benefits of the LIGHTS Research Architecture
Component Schema Overview
Conclusion
Let’s Connect

Behind LIGHTS: An Infrastructure for Research on Methodological Guidelines

The Library of Guidance for Health Scientists (LIGHTS) is a living inventory of more than 2,000 hand-selected methods guidance papers. Its mission is to help clinical researchers find the best available methodological guidance to design and conduct high-quality studies.

Many clinical studies have avoidable limitations due to poor methodological decisions - such as inadequate study design, measurement bias, or flawed statistical analysis - even though suitable guidance documents exist. LIGHTS addresses this gap by systematically identifying, classifying, and making available methodological guidance documents.

From Concept to Practice: Collaboration Between University of Basel and Karakun

The project is led by Dr. Stefan Schandelmaier at the University of Basel and technically supported by Karakun AG.

At the heart of the platform lies HIBU, Karakun’s search and AI platform, providing an intuitive and interactive search experience. To feed HIBU with structured and continuously updated metadata, Karakun developed a special adapter system - a set of Java-based tools orchestrated through GitHub CI/CD pipelines.

The Adapter: Three Research Data Pipelines, One Workflow

1. Paperpile and Google Drive Integration

Researchers collect and curate literature in Paperpile, a web-based reference manager.
A BibTeX export is automatically uploaded to a GitHub repository, triggering the first pipeline.
This pipeline converts the BibTeX file into a CSV export, commits it to Git, and synchronizes it to a Google Drive folder.
In Google Sheets, scientific collaborators enrich the data with domain-specific metadata.

2. Data Transformation for HIBU

When the team is ready to publish, a second pipeline is manually triggered.
It fetches the curated CSV from Google Drive and merges it with Paperpile metadata.
The combined data is transformed into JSON objects, conforming to HIBU’s flexible index schema based on naming conventions (e.g., text, dates, multilingual fields).
The resulting JSON becomes the search index input for HIBU.
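
The transformation step can be sketched as follows. The column names and the _string and _date suffixes are assumptions for illustration; the suffixes are borrowed from field names such as metadata.companyName_string and metadata.creation_date in the Spring AI article above. The actual adapter code and HIBU's full schema rules, including multilingual fields, are not shown in this article.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical sketch: turns one curated CSV row into a JSON object whose
// field names carry a type suffix. Suffixes and columns are assumptions.
final class RowToJson {

    private RowToJson() {
    }

    static String toJson(Map<String, String> row, Map<String, String> suffixByColumn) {
        StringJoiner json = new StringJoiner(",", "{", "}");
        row.forEach((column, value) -> {
            // Columns without a configured suffix fall back to plain text.
            String suffix = suffixByColumn.getOrDefault(column, "_string");
            json.add(quote(column + suffix) + ":" + quote(value));
        });
        return json.toString();
    }

    // Escapes quotes, backslashes, and line feeds only; a real adapter would use a JSON library.
    private static String quote(String text) {
        StringBuilder out = new StringBuilder("\"");
        for (char c : text.toCharArray()) {
            switch (c) {
                case '"' -> out.append("\\\"");
                case '\\' -> out.append("\\\\");
                case '\n' -> out.append("\\n");
                default -> out.append(c);
            }
        }
        return out.append('"').toString();
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("title", "Guidance on sample size");
        row.put("published", "2021-05-01");
        System.out.println(toJson(row, Map.of("published", "_date")));
    }
}
```

Encoding the type in the field name keeps the index schema flexible: a new column in the curated sheet only needs a suffix rule, not a schema migration.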

3. Quality Assurance

A third pipeline runs automated tests whenever code changes occur.
It validates syntax, checks for duplicates, and ensures backward compatibility with data formats.
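
A duplicate check of the kind mentioned above might look like the following sketch. Using a normalized DOI as the identity of a record is an assumption; the article does not say which key the real tests compare.

```java
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical duplicate detection over record identifiers (assumed here to be DOIs).
final class DuplicateCheck {

    private DuplicateCheck() {
    }

    /** Returns the normalized identifiers that occur more than once, in order of first repetition. */
    static Set<String> findDuplicates(List<String> identifiers) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicates = new LinkedHashSet<>();
        for (String id : identifiers) {
            // Normalize so that case and surrounding whitespace do not hide duplicates.
            String key = id.trim().toLowerCase(Locale.ROOT);
            if (!seen.add(key)) {
                duplicates.add(key);
            }
        }
        return duplicates;
    }
}
```

In a pipeline, a non-empty result would fail the build and list the offending identifiers, so curators can resolve them in the sheet before the next publish.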

Key Benefits of the LIGHTS Research Architecture

Version-Controlled Collaboration – Every artifact (BibTeX, CSV, JSON) is tracked in Git without researchers having to handle Git directly.
Automated Validation – Pipelines detect structural or semantic issues early.
Rapid Deployment – New or corrected records are integrated into HIBU within minutes.
Low Maintenance – Built entirely from existing cloud tools and open standards.

Component Schema Overview

            +----------------+
            |   Paperpile    |
            |  (References)  |
            +--------+-------+
                     |
                     v
           +---------+----------+
           |  GitHub Repository |
           |  (Artifacts, CI/CD)|
           +----+-------+-------+
                |       |
     (Pipeline 1)       (Pipeline 3)
                |       v
                |   +-----------+
                |   |   Tests   |
                |   | Validation|
                |   +-----------+
                v
       +--------+---------+
       | Google Drive/    |
       | Google Sheets    |
       +--------+---------+
                |
         (Pipeline 2)
                v
           +----+----+
           |  JSON   |
           | (HIBU   |
           |  Input) |
           +----+----+
                |
                v
           +----+----+
           |   HIBU  |
           |  Search |
           +---------+

Conclusion

LIGHTS showcases how lean, automated data pipelines can empower international research projects. By combining common tools such as GitHub, Paperpile, and Google Sheets with Karakun’s HIBU platform, the team created a robust, low-cost ecosystem that turns methodological research into a truly global, living collaboration.

Let’s Connect

Want to learn how your organization can build similar intelligent data workflows? Visit karakun.com or reach out to the HIBU team.

Devoxx Morocco 2025: A Java Conference with a Unique Community Spirit

2025-12-18T00:00:00+00:00

Devoxx Morocco matters in 2025 because it shows how quickly developer ecosystems in the MEA region are maturing: it blends deep Java and cloud-native expertise with a strong community-driven culture, attracts global industry leaders, and provides a platform where emerging talent and established experts collaborate on modern engineering challenges.

Introduction: Inside Devoxx Morocco, the leading Java conference in MEA
A Brief History of Devoxx Morocco
First Impressions of Devoxx Morocco 2025
Conference Tracks at Devoxx Morocco 2025
My Personal Takeaways
Let’s connect

Inside Devoxx Morocco

Devoxx Morocco is the largest developer conference in the MEA region. The 2025 edition took place in mid-November in Marrakech. The event was characterised by energy, curiosity, and passion for technology, and offered many opportunities to expand one’s network. In previous editions, the founder even made memorable entrances on stage - once on a bike, another time with a camel - highlighting the playful spirit behind the event.

The conference provided three incredible days of learning, innovation, and community. The Devoxx Morocco conference left a lasting impression on me and felt more like an exhilarating party than a conventional conference. The atmosphere was high-energy, respectful, and relaxed, marked by a free-spirited vibe and friendly face-to-face conversations, even with high-level industry players. In addition, there were many opportunities to learn about Morocco’s culture and history and to interact with locals - whether it was about current technological challenges and practices, or conversations about the best places to visit. When we immerse ourselves in new cultures, we gain the ability to see the bigger picture and a deeper understanding of the problems and challenges that we actually want to tackle with the help of technology.

What impressed me most, however, was the fact that participants and speakers were brought together with peers and professionals working on similar projects and topics. This was an incredible opportunity to learn and broaden one’s technological perspective. It demonstrated how well-thought-out the conference organisation was and how knowledgeable the organisers were. I, for example, had the opportunity to meet developers from the Miro team as well as OpenFeature community leaders.

A Brief History of Devoxx Morocco

The 2025 edition marked the 12th iteration of the conference. It was a milestone in its remarkable growth. Devoxx Morocco began in 2014 under the name JMaghreb, a local Arabic-language community event dedicated to strengthening the regional developer ecosystem. Historically, the conference has roots in Java. In 2017, the conference officially joined the Devoxx family and, thus, became part of one of the world’s most respected networks of developer events.

The first Devoxx-branded edition was held in Casablanca. Over the years, the conference has travelled across Morocco. In 2018, the conference went to Marrakech. In the following years, the conference took place in Agadir. In 2024 and 2025, the event returned to Marrakech, reconnecting with the city where some of its most memorable editions took place. From a local gathering to a major international developer conference, Devoxx Morocco has grown into a vibrant hub for knowledge exchange, innovation, and community in the MEA region.

First Impressions of Devoxx Morocco 2025

The dreamlike venue, an impressive congress and hotel hall in modern Moroccan style, was spacious and open, facilitating interaction. Meeting highly skilled international professionals was a highlight. All of them were remarkably accessible and eager to engage in direct conversations, further enhancing the dynamic and interactive spirit of the three-day event. These conversations broadened my perspectives because we discussed the challenges, benefits, and problems that developers encounter in their daily work, and gained insights into developers’ practices as well as into companies’ strategies and structures.

Conference Tracks at Devoxx Morocco 2025

Devoxx Morocco 2025 featured eight distinct tracks covering the full spectrum of today’s technological landscape: GenAI, cloud, security, people & culture, and more. The “People & Culture” track is considered one of the most important because it highlights the human side of technology and how teams, careers, and communities evolve. Standout topics in that track included keeping children safe on the internet and the realities of digital nomad life. The schedule also included career development sessions, such as the rise to engineering manager, and even guidance on how to create your first conference talk - delivered by a remarkable 16-year-old speaker.

Quality remains at the heart of Devoxx Morocco’s programme. The programme committee pays particular attention to the background, expertise, and relevance of speakers, so that every invited speaker delivers valuable, high-level content. The event also serves as an important platform for public-speaking opportunities, especially for emerging voices in the tech community.

My Personal Takeaways

The Moroccan community is incredibly talented. Excellent computer science education and unparalleled networking opportunities enable valuable exchanges with top minds in the industry, and many young talents seize the opportunity to join leading companies in Europe and the US. Devoxx Morocco supports this by giving young talents a stage for their own talks. This year, the youngest speaker was 16 years old.

An official post-conference event included karaoke, a guided tour of Marrakech, and a dinner in a desert restaurant. These activities created a unique environment, allowing speakers to see their peers in action, express themselves spontaneously, and enjoy a highly social experience. Overall, Devoxx Morocco was a breath of fresh air in today’s IT event landscape, combining insightful content with strong networking opportunities. The conference left a lasting impression on me, broadening my horizons and connecting me with incredible members of the Java community.

Let’s connect!

Do you have questions about Devoxx Morocco, other developer conferences, specific developer topics, or AI? Feel free to reach out. I’m always happy to exchange knowledge, ideas, and experiences.

Open Source Doesn’t Need Another Pull Request. It Needs Triage.

Table of Contents

Triage is debugging the issue tracker

Why this matters more than most people think

Duplicate work is easier than people realize

A merged PR isn’t the same as a solved issue

A messy issue tracker lies to people

It’s one of the best ways to start contributing

Related isn’t the same as duplicate

The 5-minute workflow I wish more people used

1. Read the whole thing

2. Ask whether there is enough information to act

3. Search for existing context

4. Decide the one job of your comment

5. Only state what you have verified

What good triage comments sound like

Linking a narrower issue to a broader one

Explaining that a PR is only partial

Pointing issue readers to the implementation

Asking for the one detail that matters next

The fastest ways to make triage worse

AI makes human triage more important, not less

Final thoughts

Jfokus 2026: 20 Years of Java, Community, and Innovation

Table of Contents

A Milestone for the Global Java Community

From Java Conference to Multi-Track Developer Conference

Two Decades of Growth in the Java Community

A Unique Atmosphere: Where Tech Meets Nordic Mythology

Key Topics: Java, AI, and Modern Software Engineering Trends

Expo and Networking Opportunities at a leading Developer Conference

Personal Highlight: The Mentoring Hub

Beyond the Conference: Exploring Nordic Culture

Let’s discuss!

Swiss Testing Day 2026 – Reflections on Testing AI and Non-Deterministic Systems

Table of Contents

Opening Keynote: Software Verification in the Age of AI

Good AI Testing Strategy / Bad AI Testing Strategy. The difference and why it matters

Agentic testing in banking: From hype to governed practice

Why AI Is Useless for Compliance

Key Takeaways

Karakun Perspective

Let’s discuss!

Migrating from Elasticsearch 7.17 to 8.19: A Practical Guide

Table of Contents

Why Upgrade Now?

The Migration Landscape

Part 1: Replacing the Java Client

Dependency Updates

Client Initialization

Authentication and Security

Request/Response Pattern Changes

Handling Field Values

Part 2: Structured Error Handling

Part 3: Index Templates and Mappings

Template Versioning Strategy

Part 4: Bulk Operations and Response Handling

Part 5: Testing Infrastructure